The standard Linux executable format is named Executable and Linking Format (ELF). It was developed by Unix System Laboratories and is now the most widely used format in the Unix world. Several well-known Unix operating systems, such as System V Release 4 and Sun's Solaris 2, have adopted ELF as their main executable format.
Older Linux versions supported another format named Assembler OUTput Format (a.out); actually, there were several versions of that format floating around the Unix world. It is seldom used now, since ELF is much more practical.
Linux supports many other formats for executable files; in this way, it can run programs compiled for other operating systems, such as MS-DOS EXE programs or BSD Unix's COFF executables. A few executable formats, like Java or bash scripts, are platform-independent.
An executable format is described by an object of type linux_binfmt, which essentially provides three methods:
load_binary
Sets up a new execution environment for the current process by reading the information stored in an executable file.
load_shlib
Dynamically binds a shared library to an already running process; it is activated by the uselib( ) system call.
core_dump
Stores the execution context of the current process in a file named core. This file, whose format depends on the type of executable of the program being executed, is usually created when a process receives a signal whose default action is "dump" (see Section 10.1.1).
All linux_binfmt objects are included in a singly linked list, and the address of the first element is stored in the formats variable. Elements can be inserted into and removed from the list by invoking the register_binfmt( ) and unregister_binfmt( ) functions. The register_binfmt( ) function is executed during system startup for each executable format compiled into the kernel. This function is also executed when a module implementing a new executable format is being loaded, while the unregister_binfmt( ) function is invoked when the module is unloaded.
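The list handling described above can be modeled in a few lines of user-space Python. This is only a sketch under stated assumptions: the BinFmt class is a toy stand-in for linux_binfmt that carries a single method, the byte signatures used by the two sample formats are illustrative, and inserting new elements at the head of the list is a choice made for this example rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(eq=False)
class BinFmt:
    """Toy stand-in for linux_binfmt: a name, one method, and a next pointer."""
    name: str
    load_binary: Callable[[bytes], bool]
    next: Optional["BinFmt"] = None


formats: Optional[BinFmt] = None  # address of the first list element


def register_binfmt(fmt: BinFmt) -> None:
    """Link fmt into the list (head insertion is an assumption of this sketch)."""
    global formats
    fmt.next = formats
    formats = fmt


def unregister_binfmt(fmt: BinFmt) -> bool:
    """Unlink fmt from the list; return False if it was not registered."""
    global formats
    prev, cur = None, formats
    while cur is not None:
        if cur is fmt:
            if prev is None:
                formats = cur.next
            else:
                prev.next = cur.next
            cur.next = None
            return True
        prev, cur = cur, cur.next
    return False


def registered_names() -> list:
    """Walk the list from the head and collect the format names."""
    names, cur = [], formats
    while cur is not None:
        names.append(cur.name)
        cur = cur.next
    return names


# Two sample formats; the signatures they test for are illustrative only.
script = BinFmt("script", lambda head: head.startswith(b"#!"))
elf = BinFmt("elf", lambda head: head.startswith(b"\x7fELF"))

register_binfmt(script)
register_binfmt(elf)
print(registered_names())
unregister_binfmt(elf)
print(registered_names())
```

Because the list is singly linked, removal has to walk from the head while remembering the previous element, which is why unregister_binfmt( ) in this sketch keeps a prev reference.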
The last element in the formats list is always an object describing the executable format for interpreted scripts. This format defines only the load_binary method. The corresponding load_script( ) function checks whether the executable file starts with the #! pair of characters. If so, it interprets the rest of the first line as the pathname of another executable file and tries to execute it by passing the name of the script file as a parameter.[5]
[5] It is possible to execute a script file even if it doesn't start with the #! characters, as long as the file is written in the language recognized by a command shell. In this case, however, the script is interpreted either by the shell on which the user types the command or by the default Bourne shell sh; therefore, the kernel is not directly involved.
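The #! handling can be illustrated with a small user-space sketch. It does not reproduce load_script( ); the function name script_argv and the way the line is split are assumptions made for this example. In particular, the sketch treats the first word after #! as the interpreter pathname and anything that follows as one optional argument, then appends the script's own name.

```python
from typing import List, Optional


def script_argv(first_bytes: bytes, script_path: str) -> Optional[List[str]]:
    """Return the argument vector for the interpreter, or None if the
    data does not begin with a usable #! line."""
    if not first_bytes.startswith(b"#!"):
        return None
    # Keep only the first line, then drop the "#!" marker and blanks.
    rest = first_bytes[2:].split(b"\n", 1)[0].strip().decode()
    if not rest:
        return None
    # Simplification: first word is the interpreter pathname, the
    # remainder (if any) is passed along as a single argument.
    parts = rest.split(None, 1)
    interpreter = parts[0]
    extra = [parts[1]] if len(parts) > 1 else []
    # The script's own name is handed to the interpreter last.
    return [interpreter, *extra, script_path]


print(script_argv(b"#!/bin/sh\necho hello\n", "./hello.sh"))
print(script_argv(b"#! /usr/bin/awk -f\n", "report.awk"))
```

A file that does not start with #! yields None here, which corresponds to the case discussed in the footnote, where the kernel is not involved and the invoking shell decides what to do.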
Linux allows users to register their own custom executable formats. Each such format may be recognized either by means of a magic number stored in the first 128 bytes of the file, or by a filename extension that identifies the file type. For example, MS-DOS extensions consist of three characters separated from the filename by a dot: the .exe extension identifies executable programs, while the .bat extension identifies shell scripts.
Each custom format is associated with an interpreter program, which is automatically invoked by the kernel with the original custom executable filename as a parameter. The mechanism is similar to the script format, but it's more powerful since it doesn't impose any restrictions on the custom format. To register a new format, the user writes a string of the following form into the /proc/sys/fs/binfmt_misc/register file:
:name:type:offset:string:mask:interpreter:
where each field has the following meaning:
name
An identifier for the new format
type
The type of recognition (M for magic number, E for extension)
offset
The starting offset of the magic number inside the file (0 if the field is left empty)
string
The byte sequence to be matched either in the magic number or in the extension
mask
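An optional byte string, of the same length as string, that selects which bits of string take part in the comparison; if the field is left empty, all bits are compared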
interpreter
The full pathname of the program interpreter
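To make the field layout concrete, here is a minimal user-space sketch that splits such a registration string into its six fields. The Registration type, the parse_registration name, and the sample string are hypothetical and exist only for this example; the sketch also assumes that ':' is the separator and that an empty offset means 0, and it performs none of the kernel's further validation.

```python
from typing import NamedTuple


class Registration(NamedTuple):
    name: str
    type: str
    offset: int
    string: str
    mask: str
    interpreter: str


def parse_registration(line: str) -> Registration:
    """Split ':name:type:offset:string:mask:interpreter:' into fields."""
    # The string starts and ends with ':', so splitting produces an empty
    # field at each edge in addition to the six real ones.
    fields = line.strip().split(":")
    if len(fields) != 8 or fields[0] != "":
        raise ValueError("expected :name:type:offset:string:mask:interpreter:")
    _, name, kind, offset, string, mask, interpreter, _tail = fields
    if kind not in ("M", "E"):
        raise ValueError("type must be M (magic number) or E (extension)")
    # Assumption of this sketch: an empty offset is treated as 0.
    return Registration(name, kind, int(offset or "0"), string, mask, interpreter)


# Hypothetical sample registration, used only to exercise the parser.
reg = parse_registration(":Demo:M:0:MZ::/usr/local/bin/wine:")
print(reg.name, reg.type, reg.offset, reg.string, reg.interpreter)
```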
For example, the following command performed by the superuser enables the kernel to recognize the Microsoft Windows executable format:
# echo ':DOSWin:M:0:MZ::/usr/local/bin/wine:' > /proc/sys/fs/binfmt_misc/register
A Windows executable file has the MZ magic number in the first two bytes, and it is executed by the /usr/local/bin/wine program interpreter.
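The magic-number test itself amounts to a masked byte comparison at a given offset. The following sketch shows one way to express it in user space; the function name magic_matches and its signature are assumptions made for this example, not the kernel's interface.

```python
from typing import Optional


def magic_matches(data: bytes, magic: bytes, offset: int = 0,
                  mask: Optional[bytes] = None) -> bool:
    """Compare magic with the bytes of data at offset, looking only at
    the bits that are set in mask."""
    if mask is None:
        mask = b"\xff" * len(magic)   # no mask given: compare every bit
    if len(mask) != len(magic):
        raise ValueError("mask and magic must have the same length")
    window = data[offset:offset + len(magic)]
    if len(window) < len(magic):
        return False                  # too short to contain the magic number
    return all((w & m) == (g & m) for w, g, m in zip(window, magic, mask))


# A header starting with the two bytes "MZ" matches; other data does not.
print(magic_matches(b"MZ\x90\x00\x03", b"MZ"))
print(magic_matches(b"\x7fELF\x02\x01", b"MZ"))
```

With a mask of b"\xff\xdf", for instance, bit 0x20 of the second byte is ignored, so both "MZ" and "Mz" would be accepted.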