The Linux kernel supports many different types of filesystems. In the following, we introduce a few special types of filesystems that play an important role in the internal design of the Linux kernel.
Next, we shall discuss filesystem registration—that is, the basic operation that must be performed, usually during system initialization, before using a filesystem type. Once a filesystem is registered, its specific functions are available to the kernel, so that type of filesystem can be mounted on the system's directory tree.
While network and disk-based filesystems enable the user to handle information stored outside the kernel, special filesystems may provide an easy way for system programs and administrators to manipulate the data structures of the kernel and to implement special features of the operating system. Table 12-8 lists the most common special filesystems used in Linux; for each of them, the table reports its mount point and a short description.
Notice that a few filesystems have no fixed mount point (keyword "any" in the table). These filesystems can be freely mounted and used by the users. Moreover, some other special filesystems do not have a mount point at all (keyword "none" in the table). They are not for user interaction, but the kernel can use them to easily reuse some of the VFS layer code; for instance, we'll see in Chapter 19 that, thanks to the pipefs special filesystem, pipes can be treated in the same way as FIFO files.
Table 12-8. Most common special filesystems

| Name | Mount point | Description |
|---|---|---|
| bdev | none | Block devices (see Chapter 13) |
| binfmt_misc | any | Miscellaneous executable formats (see Chapter 20) |
| devfs | /dev | Virtual device files (see Chapter 13) |
| devpts | /dev/pts | Pseudoterminal support (Open Group's Unix98 standard) |
| pipefs | none | Pipes (see Chapter 19) |
| proc | /proc | General access point to kernel data structures |
| rootfs | none | Provides an empty root directory for the bootstrap phase |
| shm | none | IPC-shared memory regions (see Chapter 19) |
| sockfs | none | Sockets (see Chapter 18) |
| tmpfs | any | Temporary files (kept in RAM unless swapped) |
Special filesystems are not bound to physical block devices. However, the kernel assigns to each mounted special filesystem a fictitious block device that has the value 0 as its major number and an arbitrary value (different for each special filesystem) as its minor number. The get_unnamed_dev( ) function returns a new fictitious block device identifier, while the put_unnamed_dev( ) function releases it. The unnamed_dev_in_use array contains a mask of 256 bits that records which minor numbers are currently in use. Although some kernel designers dislike the fictitious block device identifiers, they help the kernel to handle special filesystems and regular ones in a uniform way.
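The bitmask bookkeeping just described can be sketched in user-space C. This is only an illustration of the idea, not the kernel's code: the function names carry a sketch_ prefix to mark them as hypothetical, the 8-bit major/minor packing in SKETCH_MKDEV is an assumption, and so is the choice to never hand out minor 0 so that a return value of 0 can signal failure. Locking is omitted.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_MINORS      256
#define SKETCH_WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS       ((SKETCH_MINORS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)
/* Assumed encoding: major number in bits 8 and up, minor in the low 8 bits. */
#define SKETCH_MKDEV(major, minor) (((major) << 8) | (minor))

/* One bit per minor number; a set bit means "in use". */
static unsigned long unnamed_dev_in_use[SKETCH_WORDS];

/* Claim the lowest free minor (1..255) and return a device identifier
 * with major number 0, or 0 if every minor is taken. */
unsigned int sketch_get_unnamed_dev(void)
{
    unsigned int minor;

    for (minor = 1; minor < SKETCH_MINORS; minor++) {
        unsigned long *word = &unnamed_dev_in_use[minor / SKETCH_WORD_BITS];
        unsigned long bit = 1UL << (minor % SKETCH_WORD_BITS);

        if (!(*word & bit)) {
            *word |= bit;
            return SKETCH_MKDEV(0u, minor);
        }
    }
    return 0;
}

/* Release an identifier previously returned by sketch_get_unnamed_dev(). */
void sketch_put_unnamed_dev(unsigned int dev)
{
    unsigned int minor = dev & 0xff;

    assert((dev >> 8) == 0 && minor != 0);   /* must be a major-0 device */
    unnamed_dev_in_use[minor / SKETCH_WORD_BITS] &=
        ~(1UL << (minor % SKETCH_WORD_BITS));
}
```

Under these assumptions, a released minor number becomes the next candidate handed out again, because the scan always starts from the lowest bit.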
We will see a practical example of how the kernel defines and initializes a special filesystem in the later Section 12.4.1.
Often, the user configures Linux to recognize all the filesystems needed when compiling the kernel for her system. However, the code for a filesystem may be either included in the kernel image or dynamically loaded as a module (see Appendix B). The VFS must keep track of all filesystem types whose code is currently included in the kernel. It does this by performing filesystem type registration.
Each registered filesystem is represented as a file_system_type object whose fields are illustrated in Table 12-9.
Table 12-9. The fields of the file_system_type object

| Type | Field | Description |
|---|---|---|
| const char * | name | Filesystem name |
| int | fs_flags | Filesystem type flags |
| struct super_block *(*)( ) | read_super | Method for reading superblock |
| struct module * | owner | Pointer to the module implementing the filesystem (see Appendix B) |
| struct file_system_type * | next | Pointer to the next list element |
| struct list_head | fs_supers | Head of a list of superblock objects |
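To make the table more concrete, the following sketch declares a structure with the same six fields and statically initializes one instance. It is an approximation for illustration: the structure is named file_system_type_sketch to avoid suggesting it is the kernel's definition, the parameter list of read_super is an assumption (the table leaves it unspecified), and "examplefs" together with its stub method is entirely hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct module;        /* opaque here; see Appendix B in the text */
struct super_block;   /* opaque here */

struct list_head {
    struct list_head *next, *prev;
};

/* Mirrors the fields of Table 12-9. */
struct file_system_type_sketch {
    const char *name;                          /* filesystem name */
    int fs_flags;                              /* filesystem type flags */
    /* Assumed parameter list: superblock to fill, mount data, "silent" flag. */
    struct super_block *(*read_super)(struct super_block *, void *, int);
    struct module *owner;                      /* NULL when built into the kernel image */
    struct file_system_type_sketch *next;      /* next registered type */
    struct list_head fs_supers;                /* dummy head of the superblock list */
};

/* Hypothetical method: a real one would read the on-disk superblock. */
static struct super_block *examplefs_read_super(struct super_block *sb,
                                                void *data, int silent)
{
    (void)data;
    (void)silent;
    return sb;   /* pretend the superblock object was filled in */
}

/* An empty circular list is a head that points to itself in both directions. */
struct file_system_type_sketch examplefs_type = {
    .name       = "examplefs",
    .fs_flags   = 0,
    .read_super = examplefs_read_super,
    .owner      = NULL,
    .next       = NULL,
    .fs_supers  = { &examplefs_type.fs_supers, &examplefs_type.fs_supers },
};

/* Returns 1 when no superblock of this type is linked to fs_supers. */
int sketch_has_no_supers(const struct file_system_type_sketch *type)
{
    assert(type != NULL);
    return type->fs_supers.next == &type->fs_supers;
}
```

Initializing fs_supers to point to itself is the usual convention for an empty doubly linked list with a dummy head; the next field stays NULL until the type is linked into the list of registered types.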
All filesystem-type objects are inserted into a singly linked list. The file_systems variable points to the first item, while the next field of the structure points to the next item in the list. The file_systems_lock read/write spin lock protects the whole list against concurrent accesses.
The fs_supers field represents the head (first dummy element) of a list of superblock objects corresponding to mounted filesystems of the given type. The backward and forward links of a list element are stored in the s_instances field of the superblock object.
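The relationship between fs_supers and s_instances can be illustrated with a small user-space sketch of an embedded doubly linked list. Only the field names fs_supers and s_instances come from the text; the structure names, the s_minor payload field, and every sketch_ helper are hypothetical, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

struct sketch_super_block {
    int s_minor;                    /* hypothetical payload for the example */
    struct list_head s_instances;   /* links to the other superblocks of this type */
};

struct sketch_supers_owner {
    const char *name;
    struct list_head fs_supers;     /* dummy head: holds no superblock itself */
};

/* Recover the superblock that embeds a given s_instances element. */
#define sketch_sb_entry(ptr) \
    ((struct sketch_super_block *) \
     ((char *)(ptr) - offsetof(struct sketch_super_block, s_instances)))

void sketch_list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

/* Insert entry right after the dummy head. */
void sketch_list_add(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Count the mounted instances by walking until the head is reached again. */
int sketch_count_supers(struct sketch_supers_owner *type)
{
    struct list_head *p;
    int n = 0;

    for (p = type->fs_supers.next; p != &type->fs_supers; p = p->next)
        n++;
    return n;
}

/* Find the superblock with the given minor, or NULL if none matches. */
struct sketch_super_block *sketch_find_super(struct sketch_supers_owner *type,
                                             int minor)
{
    struct list_head *p;

    for (p = type->fs_supers.next; p != &type->fs_supers; p = p->next) {
        struct sketch_super_block *sb = sketch_sb_entry(p);
        if (sb->s_minor == minor)
            return sb;
    }
    return NULL;
}
```

Because the links live inside each superblock, the list needs no separate node allocation, and the dummy head lets the traversal loops treat the empty and non-empty cases identically.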
The read_super field points to the filesystem-type-dependent function that reads the superblock from the disk device and copies it into the corresponding superblock object.
The fs_flags field stores several flags, which are listed in Table 12-10.
Table 12-10. The filesystem type flags

| Name | Description |
|---|---|
| FS_REQUIRES_DEV | Any filesystem of this type must be located on a physical disk device. |
| FS_NO_DCACHE | No longer used. |
| FS_NO_PRELIM | No longer used. |
| FS_SINGLE | There can be only one superblock object for this filesystem type. |
| FS_NOMOUNT | Filesystem has no mount point (see Section 12.3.1). |
| FS_LITTER | Purge dentry cache after unmounting (for special filesystems). |
| FS_ODD_RENAME | "Rename" operations are "move" operations (for network filesystems). |
During system initialization, the register_filesystem( ) function is invoked for every filesystem specified at compile time; the function inserts the corresponding file_system_type object into the filesystem-type list.
The register_filesystem( ) function is also invoked when a module implementing a filesystem is loaded. In this case, the filesystem may also be unregistered (by invoking the unregister_filesystem( ) function) when the module is unloaded.
The get_fs_type( ) function, which receives a filesystem name as its parameter, scans the list of registered filesystems looking at the name field of their descriptors, and returns a pointer to the corresponding file_system_type object, if it is present.
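The three operations described above can be sketched over a singly linked list as follows. This is a simplified user-space illustration, not the kernel's implementation: the sketch_ names and the reduced structure are hypothetical, new types are appended at the tail and duplicate names are rejected by assumption, the return values 0 and -1 are placeholders, and the file_systems_lock protection is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_fs_type {
    const char *name;
    int fs_flags;
    struct sketch_fs_type *next;
};

/* Head of the list of registered filesystem types. */
static struct sketch_fs_type *file_systems;

/* Return the address of the link that points to the type called name,
 * or of the terminating NULL link when no such type is registered. */
static struct sketch_fs_type **sketch_find(const char *name)
{
    struct sketch_fs_type **p;

    for (p = &file_systems; *p; p = &(*p)->next)
        if (strcmp((*p)->name, name) == 0)
            break;
    return p;
}

/* Link fs into the list; fails if it is already linked or the name is taken. */
int sketch_register_filesystem(struct sketch_fs_type *fs)
{
    struct sketch_fs_type **p;

    assert(fs != NULL && fs->name != NULL);
    if (fs->next)
        return -1;
    p = sketch_find(fs->name);
    if (*p)
        return -1;
    *p = fs;          /* p is the tail link, so this appends */
    return 0;
}

/* Unlink fs from the list; fails if it was not registered. */
int sketch_unregister_filesystem(struct sketch_fs_type *fs)
{
    struct sketch_fs_type **p;

    for (p = &file_systems; *p; p = &(*p)->next) {
        if (*p == fs) {
            *p = fs->next;
            fs->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Look a type up by name; returns NULL when it is not registered. */
struct sketch_fs_type *sketch_get_fs_type(const char *name)
{
    return *sketch_find(name);
}
```

Walking the list through a pointer to the link, rather than a pointer to the element, lets insertion and removal update either the list head or a next field with the same assignment, so the head needs no special case.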