Book: LPI Linux Certification in a Nutshell
Section: Chapter 4.  Devices, Linux Filesystems, and the Filesystem Hierarchy Standard (Topic 2.4)



4.8 Objective 8: Find System Files and Place Files in the Correct Location

In 1993, the Linux community formed a project to provide a standardized filesystem layout for all general-purpose distributions of Linux. The intent of this standardization was to provide advice on how to create a low-maintenance filesystem, and to reduce the proliferation of proprietary Linux filesystem layouts and their possible contribution to market fragmentation.

The project released a document describing the Linux Filesystem Standard, usually abbreviated FSSTND, in 1994. The following year, the group began to reduce Linux-specific content and to refine the standard to include other Unix or Unix-like operating systems. As the FSSTND attracted broader appeal, it was renamed the Filesystem Hierarchy Standard, or FHS. Although the FHS is not a requirement of Linux developers and distributors, the Linux community understands the importance of standards, and all major distributions support the standard.

4.8.1 Data Types

In order to frame its recommendations, the FHS defines two categories of data use, each with two opposing subtypes:

Data sharing

This category defines the scope of data use in a networked environment:

Sharable

Sharable data can be used by multiple host systems on a network. Sharable files contain general-purpose information, without ties to any specific host. Examples include user datafiles, many executable program files, and common configuration files such as hosts.

Non-sharable

Data is not sharable when linked to a specific host, such as a unique configuration file. Examples include the passwd file, network configuration files, and system logs.

Data modification

This category specifies how data changes:

Variable

Data is considered variable when changed by natural, frequent processes. Examples include user files and system log files, such as /var/log/messages.

Static

Static data is left alone for the most part, remaining the same from day to day or even year to year. Examples include binary programs such as ls and bash, which change only when the system administrator performs an upgrade.

Some directories in the Linux filesystem are intended to hold specific types of data. For example, the executable files in /usr are rarely changed and thus can be considered static. Because these files are needed by all users on a network and are not tied to any one host, they were often mounted from remote servers to preserve local disk space in the days before disks were as large as they are today. Thus, in addition to being static, /usr is said to be sharable. Keeping files organized with respect to these attributes can simplify file sharing, system administration, and backups, as well as reduce storage requirements. The FHS arranges the preceding data categories into a 2 x 2 matrix, as shown with a few example directories in Table 4-6.

Table 4-6. FHS Data Types

            Sharable       Non-sharable

Static      /usr           /etc
            /usr/local     /boot

Variable    /var/mail      /var/log
            /home          /proc

On many networks, /usr and /usr/local are mounted by individual workstations from an NFS server. This can save a considerable amount of local storage on the workstations. More important, placing these directories on another system can make upgrades and additions much simpler. These directories are usually shared as read-only filesystems because they are never modified by most end users. The /var/mail and /home directories, on the other hand, are shared but must be changed regularly by users. The /etc and /boot directories contain files that are static in the sense that only the administrator changes them, but sharing them is not necessary or advised, since they are local configuration files. The /var/log and /proc directories are very dynamic but also of local interest only.

4.8.2 The root Filesystem

The FHS offers a significant level of detail describing the exact locations of files, using rationale derived from the static/variable and sharable/non-sharable definitions. However, knowledge of the location of every file is not required for Exam 101. This section discusses the major portions of the FHS directory hierarchy, with specific example files offered as illustrations.

While the FHS is a defining document for the Linux filesystem, it does not follow that all directories described in the FHS will be present in all Linux installations. Some directory locations cited in the FHS are package-dependent or open to customization by the vendor.

The root filesystem is located at the top of the entire directory hierarchy. The FHS defines these goals for the root filesystem:

  • It must contain utilities and files sufficient to boot the operating system, including the ability to mount other filesystems. This includes utilities, device files, configuration, boot loader information, and other essential start-up data.

  • It should contain the utilities needed by the system administrator to repair or restore a damaged system.

  • It should be relatively small. Small partitions are less likely to be corrupted due to a system crash or power failure than large ones are. In addition, the root partition should contain non-sharable data to maximize the remaining disk space for sharable data.

  • Software should not create files or directories in the root filesystem.

While a Linux system with everything in a single root partition may be created, doing so would not meet these goals. Instead, the root filesystem should contain only essential system directories, along with mount points for other filesystems. Essential root filesystem directories include:

/bin

The /bin directory contains executable system commands such as cp, date, ln, ls, mkdir, and more. These commands are deemed essential to system administration in case of a problem.

/dev

Device files, necessary for accessing disks and other devices, are stored in /dev. Examples include disk partitions, such as hda1, and terminals, such as tty1. Devices must be present at boot time for proper mounting and configuration.

/etc

The /etc directory contains configuration information unique to the system and is required at boot time. No binary executable programs are stored here.[9] Example files include passwd, hosts, and login.defs.

[9] Prior practice in various versions of Unix had administrative executable programs stored in /etc. These have been moved to /sbin under the FHS.

/lib

The /lib directory contains shared libraries and kernel modules, both essential for system initialization.

/mnt

This directory is empty except for some mount points for temporary partitions, including cdrom and floppy.

/root

The typical home directory for the superuser is /root. While it is not absolutely essential for /root to be on the root filesystem, it is customary and convenient, because doing so keeps root's configuration files available for system maintenance or recovery.

/sbin

Essential utilities used for system administration are stored in /sbin. Examples include fdisk, fsck, and mkfs.

The remaining top-level directories in the root filesystem are considered non-essential for emergency procedures:

/boot

The /boot directory contains files for LILO. Because it is typically small, it can be left in the root filesystem. However, it is often separated to keep the boot loader files within the first 1024 cylinders of a physical disk.

/home

The /home directory contains home directories for system users. This is usually a separate filesystem and is often the largest variable filesystem in the hierarchy.

/opt

The /opt directory is intended for the installation of software other than that packaged with the operating system. This is often the location selected by third-party software vendors for their products.

/tmp

The /tmp directory is for the storage of temporary files. The contents are typically deleted at every system boot, so programs should not rely on files in /tmp persisting across reboots.

/usr

The /usr directory contains a significant hierarchy of executable programs deemed nonessential for emergency procedures. It is usually contained in a separate partition. It contains sharable, read-only data, and is often mounted locally read-only and shared via NFS read-only. /usr is described in detail in the next section.

/var

Like /usr, the /var directory contains a large hierarchy and is usually contained in a separate partition. It holds data that varies over time, such as logs, mail, and spools.

4.8.2.1 The /usr filesystem

The /usr filesystem hierarchy contains system utilities and programs that do not appear in the root partition. For example, user programs such as awk, less, and tail are found in /usr/bin. /usr/sbin contains system administration commands such as adduser and traceroute, and a number of daemons needed only on a normally operating system. No host-specific or variable data is stored in /usr. Also disallowed is the placement of directories directly under /usr for large software packages. An exception to this rule is made for X11, which has a strong precedent for this location. The following subdirectories can be found under /usr:

/usr/X11R6

This directory contains files for XFree86. Because X is deployed directly under /usr on many Unix systems, X breaks the rule that usually prohibits a custom /usr directory for a software package.

/usr/bin

The /usr/bin directory is the primary location for user commands that are not considered essential for emergency system maintenance (and thus are stored here rather than in /bin).

/usr/games

It's unlikely that you'll find anything of significant interest here. This location was used for older console (text) games and utilities.

/usr/include

/usr/include is the standard location for include or header files, used for C and C++ programming.

/usr/lib

This directory contains shared libraries that support various programs. FHS also allows the creation of software-specific directories here. For example, /usr/lib/perl5 contains the standard library of Perl modules that implement programming functions in that language.

/usr/local

/usr/local is the top level of another hierarchy of binary files, intended for use by the system administrator. It contains subdirectories much like /usr itself, such as /bin, /include, /lib, and /sbin. After a fresh Linux installation, this directory contains no files but may contain an empty directory hierarchy. Example items that may be found here are locally created documents in /usr/local/doc or /usr/local/man, and executable scripts and binary utilities provided by the system administrator in /usr/local/bin.

/usr/sbin

The /usr/sbin directory is the primary location for system administration commands that are not considered essential for emergency system maintenance (and thus are stored here rather than in /sbin).

/usr/share

/usr/share contains a hierarchy of datafiles that are independent of, and thus can be shared among, various hardware architectures and operating system versions. This is in sharp contrast to architecture-dependent files such as those in /usr/bin. For example, in an enterprise that uses both i386- and Alpha-based Linux systems, /usr/share could be offered to all systems via NFS. However, since the two processors are not binary-compatible, /usr/bin would have two NFS shares, one for each architecture.

The information stored in /usr/share is static data, such as the GNU info system files, dictionary files, and support files for software packages.

/usr/src

/usr/src contains Linux source code, if installed. For example, if kernel development files are installed, /usr/src/linux contains the complete tree of source and configuration files necessary to build a custom kernel.

4.8.2.2 The /var filesystem

The /var filesystem contains data such as printer spools and log files that vary over time. Since variable data is always changing and growing, /var is usually contained in a separate partition to prevent the root partition from filling. The following subdirectories can be found under /var:

/var/account

Some systems maintain process accounting data in this directory.

/var/cache

/var/cache is intended for use by programs for the temporary storage of intermediate data, such as the results of lengthy computations. Programs using this directory must be capable of regenerating the cached information at any time, which allows the system administrator to delete files as needed. Because it holds transient data, /var/cache never has to be backed up.

/var/crash

This directory holds crash dumps for systems that support that feature.

/var/games

Older games may use this directory to store state information, user score data, and other transient items.

/var/lock

Lock files, used by applications to signal their existence to other processes, are stored here. Lock files usually contain no data.

/var/log

The /var/log directory is the main repository for system log files, such as those created by the syslog system. For example, the default system log file is /var/log/messages.

/var/mail

This is the system mailbox, with mail files for each user. /var/mail is a replacement for /var/spool/mail and aligns FHS with many other Unix implementations. You may find that your Linux distribution still uses /var/spool/mail.

/var/opt

This directory is defined as a location for the variable data of programs stored in /opt.

/var/run

/var/run contains various files describing the present state of the system. All such files may be deleted at system boot time. This is the default location for PID files, which contain the PIDs of the processes for which they are named. For example, if the Apache web server, httpd, is running as process number 534, /var/run/httpd.pid will contain that number:

# cat /var/run/httpd.pid
534

Such files are needed by utilities that must be able to find a PID for a running process. Also located here is the utmp file, used by commands such as who to display logged-in users.

/var/spool

The /var/spool directory contains information that is queued for processing. Examples include print queues, outgoing mail, and crontab files.

/var/state

The /var/state directory is intended to contain information that helps applications preserve state across multiple invocations or multiple instances.

/var/tmp

As with /tmp in the root filesystem, /var/tmp is used for storage of temporary files. Unlike /tmp, the files in /var/tmp are expected to survive across multiple system boots. The information found in /var/tmp could be considered more persistent than information in /tmp.

/var/yp

This directory contains the database files of the Network Information Service (NIS), if implemented. NIS was formerly known as the yellow pages (not to be confused with the big yellow book).
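The PID files described under /var/run are typically read back by control scripts that need to check on or signal a running process. The following is a minimal sketch of that pattern, not the method of any particular daemon. A private temporary directory stands in for /var/run so that no root access is needed, a background sleep stands in for the daemon, and the name sleeper.pid is hypothetical.

```shell
#!/bin/sh
# A temporary directory stands in for /var/run (assumption for this sketch).
rundir=$(mktemp -d)
pidfile="$rundir/sleeper.pid"   # hypothetical PID file name

# Start a stand-in "daemon" and record its PID, as a real daemon would.
sleep 60 &
echo $! > "$pidfile"

# A control script reads the PID back and checks that the process exists.
# kill -0 sends no signal; it only tests whether the PID can be signaled.
pid=$(cat "$pidfile")
status=stopped
if kill -0 "$pid" 2>/dev/null; then
    status=running
fi
echo "process $pid is $status"

# Stop the process using the recorded PID, then clean up.
kill "$pid"
wait "$pid" 2>/dev/null || true
rm -rf "$rundir"
```

Because the PID file is the only link between the control script and the process, a stale file left behind by a crash can name the wrong process, which is one reason such files may be deleted at boot time.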

Figure 4-5 depicts an example filesystem hierarchy. This figure is a graphical depiction of the partitioning scheme listed in Table 4-1 earlier in this chapter. The root partition contains full directories for /bin, /dev, /etc, /lib, /mnt, /root, and /sbin. Top-level directories /boot, /home, /opt, /tmp, /usr, and /var exist on the root filesystem, but they are empty and act as mount points for other filesystems.
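To see how a running system is actually divided, the mount point behind each top-level directory can be queried with df. The sketch below assumes a df that supports the POSIX -P output format and mount points without spaces in their names; the mount_point helper is a name introduced here for illustration.

```shell
#!/bin/sh
# mount_point DIR: print the mount point of the filesystem that holds DIR.
mount_point() {
    # df -P prints a header line plus one data line in POSIX format;
    # the last field of the data line is the mount point.
    df -P "$1" | tail -n 1 | awk '{print $NF}'
}

for dir in /bin /etc /boot /home /opt /tmp /usr /var; do
    if [ -d "$dir" ]; then
        echo "$dir is on the filesystem mounted at $(mount_point "$dir")"
    else
        echo "$dir is not present on this system"
    fi
done
```

A directory that reports / as its mount point lives in the root partition, while one that reports its own name is a separate filesystem.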

Figure 4-5. Example filesystem hierarchy

4.8.2.3 Linux annex

Since FHS migrated away from being a Linux-only document and expanded to cover other operating systems, information specific to any one operating system was moved to an annex. The only annex listed in v2.0 of FHS is the Linux annex, which mentions a few guidelines and makes allowances for the placement of additional program files in /sbin. The Linux annex also mentions and supports the use of the /proc filesystem for the processing of kernel, memory, and process information.

4.8.2.4 Where's that binary?

Compiled executable files, called binary files, or just binaries, can be located in a number of places in an FHS-compliant filesystem. However, it's easy to become a little confused over why a particular executable file is placed where it is in the FHS. This is particularly true for bin and sbin directories, which appear in multiple locations. Table 4-7 lists these directories and shows how each is used.

Table 4-7. Binary File Locations

                                                   User Commands     System Administration Commands

Vendor-supplied, essential (root filesystem)       /bin              /sbin

Vendor-supplied, nonessential (/usr filesystem)    /usr/bin          /usr/sbin

Locally supplied, nonessential (/usr filesystem)   /usr/local/bin    /usr/local/sbin
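Whether a command in one of these directories can be run by name depends on whether the directory appears in the PATH variable. The following sketch checks each directory from Table 4-7 against the current PATH; the in_path helper is a name introduced here for illustration and is not a standard utility.

```shell
#!/bin/sh
# in_path DIR PATHSTRING: succeed if DIR is one of the colon-separated
# entries in PATHSTRING.
in_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

for dir in /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin; do
    if in_path "$dir" "$PATH"; then
        echo "$dir: searched by this shell"
    else
        echo "$dir: not in PATH"
    fi
done
```

On some systems the sbin directories are included only in the superuser's PATH, so the output may differ between root and an ordinary user.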

4.8.3 Locating Files

FHS offers the Linux community an excellent resource that assures consistency across distributions and other operating systems. In practice, however, file location problems can be frustrating, and the need arises to find files in the system quickly. These file location tools are required for Exam 101: which, find, locate, updatedb, whatis, and apropos.

which uses the PATH variable to locate executable files. find searches specified areas in the filesystem. locate, whatis, and apropos utilize databases to do quick searches to identify and locate files, and updatedb builds the database that locate searches. locate offers a quick alternative to find for filename searches and is suited for locating files that are not moved around in the filesystem. Without a fresh database to search, locate is not suitable for files recently created or renamed.

whatis and apropos work similarly to locate but use a different database. The whatis database is a set of files containing short descriptions of system commands, created by makewhatis. Note that these commands are not specifically mentioned in this Objective but may appear on Exam 101.

which

Syntax

which command

Description

Determine the location of command and display the full pathname of the executable program that the shell would launch to execute it. which is normally used without options.

Example

Determine the shell that would be started by entering the tcsh command:

# which tcsh
/bin/tcsh

which is small and does only one thing: determines what executable program will be found and called by the shell. Such a search is particularly useful if you're having trouble with the setup of your PATH environment variable or if you are creating a new version of an existing utility and want to be certain you're executing the experimental version.
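The experimental-version case can be sketched as follows. The script creates a throwaway command in a temporary directory, places that directory first in PATH for a single invocation, and asks which to report what would be executed. The command name hello-demo is hypothetical, and the sketch assumes that which is installed as an external program.

```shell
#!/bin/sh
# Create a hypothetical experimental command in a private directory.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho experimental\n' > "$workdir/hello-demo"
chmod +x "$workdir/hello-demo"

# With the directory placed first in PATH, which reports the
# experimental copy because PATH is searched from left to right.
found=$(PATH="$workdir:$PATH" which hello-demo)
echo "$found"

rm -rf "$workdir"
```

If a second copy of the command existed later in PATH, it would be shadowed by the first match, which is exactly the situation which helps to diagnose.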

find

Syntax

find paths expression 

Description

Locate files that match an expression starting at paths and continuing recursively. The find command has a rich set of expression directives for locating just about anything in the filesystem.

Example

To find files by name located in the /usr directory hierarchy that might have something to do with the csh shell or its variants, you might use the -name filename directive:

# find /usr -name "*csh*"
/usr/bin/sun-message.csh
/usr/doc/tcsh-6.08.00
/usr/doc/tcsh-6.08.00/complete.tcsh
/usr/doc/vim-common-5.3/syntax/csh.vim
/usr/man/man1/tcsh.1
/usr/share/apps/ktop/pics/csh.xpm
/usr/share/apps/ktop/pics/tcsh.xpm
/usr/share/emacs/20.3/etc/emacs.csh
/usr/share/vim/syntax/csh.vim
/usr/src/linux-2.2.5/fs/lockd/svcshare.c

Some of these results are clearly related to csh or to tcsh, while others are questionable. In addition, this command may take a while because find must traverse the entire /usr hierarchy, examining each filename for a match. This example demonstrates that if filename wildcards are used, the entire string must be quoted to prevent expansion by the shell prior to launching find.

find is among the most useful commands in the Linux administrator's toolkit and offers a wide variety of options. It is particularly handy in the following cases:

  • You need to limit a search to a particular location in the filesystem.

  • You must search for an attribute other than the filename.

  • Files you are searching for were recently created or renamed, in which case locate may not be appropriate.

Unfortunately, find can take a long time to run. Refer to Section 3.1 for additional information on the find command.
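The cases listed above can be illustrated with a small, self-contained sketch. It builds a scratch directory tree, limits the search to that tree, and matches on attributes other than the filename: file type, permission bits, and modification time. The file names are hypothetical, and only widely supported find tests (-type, -perm, and -mtime) are used.

```shell
#!/bin/sh
# Build a small scratch tree to search (names are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.txt" "$demo/sub/b.txt"
printf '#!/bin/sh\n' > "$demo/sub/run.sh"
chmod 755 "$demo/sub/run.sh"

# Search only under $demo for regular files whose owner-execute bit is set.
executables=$(find "$demo" -type f -perm -100)
echo "executable files: $executables"

# Count regular files modified within the last day; these were just
# created, so a locate database could not know about them yet.
recent=$(find "$demo" -type f -mtime -1 | wc -l)
echo "recently modified files: $recent"

rm -rf "$demo"
```

Neither search depends on a prebuilt index, which is why find remains the right tool when files are new or when the criterion is something other than a name.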

On the Exam

You should have a general understanding of find. Remember that by default, find prints matching directory entries to the screen. However, detailed knowledge of find options and usage is beyond the scope of LPIC Level 1 exams.

locate


Syntax

locate patterns

Description

Locate files whose names match one or more patterns by searching an index of files previously created.

Example

Locate files by name in the entire directory hierarchy that might have something to do with the csh shell or its variants:

# locate "*csh*"
/home/jdean/.tcshrc
/root/.cshrc
/root/.tcshrc
/usr/bin/sun-message.csh
/usr/doc/tcsh-6.08.00
/usr/doc/tcsh-6.08.00/FAQ
/usr/doc/tcsh-6.08.00/NewThings
/usr/doc/tcsh-6.08.00/complete.tcsh
/usr/doc/tcsh-6.08.00/eight-bit.txt
/usr/doc/vim-common-5.3/syntax/csh.vim
/usr/man/man1/tcsh.1
/usr/share/apps/ktop/pics/csh.xpm
/usr/share/apps/ktop/pics/tcsh.xpm
/usr/share/emacs/20.3/etc/emacs.csh
/usr/share/vim/syntax/csh.vim
/usr/src/linux-2.2.5/fs/lockd/svcshare.c
/etc/csh.cshrc
/etc/profile.d/kde.csh
/etc/profile.d/mc.csh
/bin/csh
/bin/tcsh 

The locate command must have a recent database to search, and that database must be updated periodically to incorporate changes in the filesystem. If the database is stale, using locate yields a warning:

# locate tcsh
locate: warning: database /var/lib/slocate/slocate.db' is 
   more than 8 days old

updatedb

Syntax

updatedb [options]

Description

Refresh (or create) the slocate database in /var/lib/slocate/slocate.db.

Option

-e directories

Exclude a comma-separated list of directories from the database.

Example

Refresh the slocate database, excluding files in temporary locations:

# updatedb -e "/tmp,/var/tmp,/usr/tmp,/afs,/net,/proc"

updatedb is typically executed periodically via cron.

Some Linux distributions (Debian, for example) come with a version of updatedb that accepts additional options that can be specified on the command line:

Additional options

--netpaths='path1 path2 ...'

Add network paths to the search list.

--prunepaths='path1 path2 ...'

Eliminate paths from the search list.

--prunefs='filesystems ...'

Eliminate entire types of filesystems, such as NFS.

These options modify the behavior of updatedb on some Linux systems by prohibiting the parsing of certain filesystem locations and by adding others. There are a few more of these options than those listed here, but these three are special in that they can also be specified through the use of environment variables set prior to updatedb execution. The variables are NETPATHS, PRUNEPATHS, and PRUNEFS. These variables and the options to updatedb are discussed here because this Objective makes specific mention of updatedb.conf, a sort of control file for updatedb. Despite its name, updatedb.conf isn't really a configuration file, but rather a fragment of a Bourne shell script that sets these environment variables. Example 4-2 shows a sample updatedb.conf file.

Example 4-2. Sample updatedb.conf File
# This file sets environment variables used by updatedb
# filesystems which are pruned from updatedb database:
PRUNEFS="NFS nfs afs proc smbfs autofs auto iso9660"
export PRUNEFS
# paths which are pruned from updatedb database:
PRUNEPATHS="/tmp /usr/tmp /var/tmp /afs /amd /alex"
export PRUNEPATHS
# netpaths which are added:
NETPATHS="/mnt/fs3"
export NETPATHS

In this example, the PRUNEFS and PRUNEPATHS variables cause updatedb to ignore types of filesystems and particular paths, respectively. NETPATHS is used to add network paths from remote directory /mnt/fs3.

updatedb.conf doesn't directly control updatedb, but eliminates the need for lengthy options on the updatedb command line, which can make crontab files a bit cleaner.
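Because updatedb.conf is a shell fragment, its effect can be demonstrated without running updatedb at all. The following sketch writes a shortened copy of such a file to a temporary location, reads it into the current shell with the dot command, and confirms that a child process inherits the exported variable. The temporary file is an assumption of this sketch; how a given distribution's updatedb or cron job actually reads its configuration may differ.

```shell
#!/bin/sh
# Write a shortened, hypothetical updatedb.conf to a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
PRUNEPATHS="/tmp /usr/tmp /var/tmp"
export PRUNEPATHS
EOF

# Read the fragment into the current shell, as a wrapper script might.
. "$conf"

# An exported variable is visible to child processes such as updatedb.
seen=$(sh -c 'echo "$PRUNEPATHS"')
echo "child process sees PRUNEPATHS=$seen"

rm -f "$conf"
```

Without the export line, the variable would be set only in the shell that read the file, and a program started from that shell would not see it.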

On the Exam

Remember that updatedb does not require configuration to execute. On systems that provide for configuration, updatedb.conf can specify a few extra options to updatedb by way of environment variables.

whatis


Syntax

whatis keywords

Description

Search the whatis database for exact matches to keywords and display results.

Example

# whatis mksw
mksw: nothing appropriate

apropos

Syntax

apropos keywords

Description

Search the whatis database for partial word matches to keywords and display results.

Example

# apropos mksw
mkswap (8)           - set up a Linux swap area

On the Exam

You must be familiar with the FHS concept and the contents of its major directories. Be careful about the differences between (and reasons for) /bin and /sbin, root partition and /usr partition, and locally supplied commands. Also practice with various file location techniques and be able to differentiate among them.