Book: LPI Linux Certification in a Nutshell
Section: Chapter 3.  GNU and Unix Commands (Topic 1.3)



3.3 Objective 3: Perform Basic File Management

This section covers basic file and directory management, including filesystems, files and directories, standard file management commands, their recursive capabilities where applicable, and wildcard patterns.

3.3.1 Filesystem Objects

Nearly every operating system that has ever been devised structures its collection of stored objects in a hierarchy,[10] which is a tree of objects containing other objects. This hierarchy allows a sane organization of objects and allows identically named objects to appear in multiple locations -- this is essential for multiuser systems like Linux. Information about each object in the filesystem is stored in a table (which itself is part of the filesystem), and each object is numbered uniquely within that table. Although there are a few special object types on Linux systems, the two most common are directories and files.

[10] However, it wasn't so long ago that MS-DOS was "flat" and had no hierarchy.

3.3.1.1 Directories and files

A directory is an object intended to contain other objects, while a file is an object intended to contain information. At the top of all Linux filesystem hierarchies is a directory depicted simply by /; this is known as the root directory.[11] Beneath / are named directories and files in an organized and well-defined tree. To describe these objects, you simply refer to them by name separated by the / character. For example, the object ls is an executable program stored in a directory called /bin under the root directory; it is depicted simply as /bin/ls.

[11] Not to be confused with the username root, which is separate and distinct. There's also often a directory named /root for the root user. Keeping /, /root, and the root user straight in a conversation can be a challenge.

3.3.1.2 Inodes

The identification information for a filesystem object is known as its inode. Inodes carry information about objects, such as where they are located on disk, their modification time, security settings, and so forth. Each Linux ext2 filesystem is created with a finite number of inodes, a number calculated from the size of the filesystem when it is created. Multiple names in the filesystem can refer to the same inode; this concept is called linking.
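
Linking can be observed directly from the shell. The following is a minimal sketch, assuming a scratch directory made with mktemp; the filenames original.txt and alias.txt are hypothetical. It gives one file a second name with ln and uses ls -i to show that both names report the same inode number.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1

echo "hello" > original.txt
ln original.txt alias.txt        # add a second name (hard link) for the same inode

# ls -i prints the inode number in front of each name.
ls -i original.txt alias.txt

# Capture the two inode numbers and compare them.
inode_a=$(ls -i original.txt | awk '{print $1}')
inode_b=$(ls -i alias.txt | awk '{print $1}')
[ "$inode_a" = "$inode_b" ] && echo "both names share inode $inode_a"
```

Because both names point at one inode, the content written through original.txt is also what alias.txt returns.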

3.3.1.3 File and directory management commands

Once a hierarchy is defined, there is a constant need to manage the objects in the filesystem. Objects are constantly created, read, modified, copied, moved, and deleted, and wisely managing the filesystem is one of the most important tasks of a system administrator. In this section, we discuss the basic command-line utilities used for file and directory management. While the GUI has tools for this task, the spirit of the Linux system and the requirements of Exam 101 require your understanding of these commands.

cp

Syntax

cp [options] file1 file2
cp [options] files directory

Description

In the first command form, copy file1 to file2. If file2 exists and you have appropriate privileges, it will be overwritten without warning (unless you use the -i option). Both file1 and file2 can be any valid filename, either fully qualified or in the local directory. In the second command form, copy one or more files to directory. Note that the presence of multiple files implies that you wish to copy files to a directory. If directory doesn't exist, an error message will be printed. This command form can get you in trouble if you attempt to copy a single file into a directory that doesn't exist, as the command will be interpreted as the first form and you'll end up with file2 instead of directory.

Frequently used options

-f

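Force an overwrite of existing files in the destination. If an existing destination file cannot be opened for writing, it is removed and the copy is attempted again.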

-i

Prompt interactively before overwriting destination files. It is common practice (and advised) to alias the cp command to cp -i to prevent accidental overwrites. You may find that this is already done for you for user root on your Linux system.

-p

Preserve all information, including owner, group, permissions, and timestamps. Without this option, the copied file or files will have the present date and time, default permissions, owner, and group.

-r, -R

Recursively copy directories. You may use either upper- or lowercase for this option. If file1 is actually a directory instead of a file and the recursive option is specified, file2 will be a copy of the entire hierarchy under directory file1.

-v

Display the name of each file verbosely before copying.

Example 1

Copy the messages file to the local directory (specified by .):

$ cp /var/log/messages .

Example 2

Make an identical copy, including preservation of file attributes, of directory src in new directory src2:

$ cp -Rp src src2

Example 3

Copy file1, file2, file5, file6, and file7 from the local directory into your home directory (under bash):

$ cp file1 file2 file[567] ~
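
The difference between the two command forms is easiest to see by trying both. This is a small sketch under stated assumptions: it runs in a scratch directory from mktemp, and the names report.txt, archive, backup, and preserved.txt are hypothetical. It copies a file into an existing directory, shows what happens when the intended directory does not exist, and then uses -p to keep the source's timestamp.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

echo "data" > report.txt
mkdir archive

# Second form: the destination is an existing directory, so the file lands inside it.
cp report.txt archive
ls archive                       # lists report.txt

# First form by accident: "backup" does not exist, so cp creates a FILE named backup.
cp report.txt backup
[ -f backup ] && echo "backup is a regular file, not a directory"

# -p keeps the source's timestamps and permissions on the copy.
touch -t 200101121845 report.txt
cp -p report.txt preserved.txt
ls -l report.txt preserved.txt
```

Checking that the destination directory exists (for example with ls -d) before copying a single file avoids the accidental first form.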

On the Exam

Be sure to know the difference between a file destination and a directory destination and how to force an overwrite of existing objects.

mkdir


Syntax

mkdir [options] directories

Description

Create one or more directories. You must have write permission in the directory where directories are to be created.

Frequently used options

-m mode

Set the access mode for directories.

-p

Create intervening parent directories if they don't exist.

Examples

Create a read-only directory named personal:

$ mkdir -m 444 personal

Create a directory tree in your home directory, as indicated with a leading tilde (~), using a single command:

$ mkdir -p ~/dir1/dir2/dir3

In this case, all three directories are created. This is faster than creating each directory individually.
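
The effect of -p and -m can be checked in a scratch directory. The sketch below assumes a temporary directory from mktemp and a hypothetical directory name, private; mode 750 is chosen only for illustration.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Without -p this fails, because the parent dir1/dir2 does not exist yet.
mkdir dir1/dir2/dir3 2>/dev/null || echo "plain mkdir failed: parents are missing"

# With -p the whole chain is created in one step.
mkdir -p dir1/dir2/dir3
[ -d dir1/dir2/dir3 ] && echo "dir1/dir2/dir3 now exists"

# -m sets the mode at creation time; 750 is rwx for the owner and r-x for the group.
mkdir -m 750 private
perms=$(ls -ld private | cut -c1-10)   # keep only the type and permission columns
echo "$perms"
```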

On the Exam

Verify your understanding of the tilde (~) shortcut for the home directory.

mv


Syntax

mv [options] source target

Description

Move or rename files and directories. For targets on the same filesystem (partition), moving a file doesn't relocate the contents of the file itself. Rather, the directory entry for the target is updated with the new location. For targets on different filesystems, such a change can't be made, so files are copied to the target location and the original sources are deleted.

Note that mv is used to rename files and directories, because a rename operation requires the same directory entry update as a move.

If a target file or directory does not exist, source is renamed to target. If a target file already exists, it is overwritten with source. If target is an existing directory, source is moved into that directory. If source is one or more files and target is a directory, the files are moved into the directory.

Frequently used options

-f

Force the move even if target exists, suppressing warning messages.

-i

Query interactively before overwriting an existing target.

On the Exam

Remember that, from the filesystem's point of view on a single partition, renaming a file and moving it to a different location are nearly identical operations. This eliminates the need for a rename command.
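
One way to see that a move within a single filesystem only updates directory entries is to compare inode numbers before and after. This is a sketch, assuming a scratch directory from mktemp (so source and target are on the same filesystem) and hypothetical names notes.txt, renamed.txt, and subdir.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir subdir
echo "contents" > notes.txt

before=$(ls -i notes.txt | awk '{print $1}')

mv notes.txt renamed.txt         # rename in place
mv renamed.txt subdir            # move into an existing directory

after=$(ls -i subdir/renamed.txt | awk '{print $1}')

# Same filesystem, so only directory entries changed; the inode is unchanged.
echo "inode before: $before, after: $after"
```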

rm


Syntax

rm [options] files

Description

Delete one or more files from the filesystem. To remove a file, you must have write permission in the directory that contains the file, but you do not need write permission on the file itself. The rm command also removes directories when the -d, -r, or -R option is used.

Frequently used options

-d

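Remove directories. In current versions of GNU rm, -d removes only empty directories. Older versions attempted to unlink a directory even if it was not empty, an operation reserved for privileged users.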

-f

Force removal of write-protected files without prompting.

-i

Query interactively before removing files.

-r, -R

If the file is a directory, recursively remove the entire directory and all of its contents, including subdirectories.

rmdir

Syntax

rmdir [option] directories

Description

Delete directories, which must be empty.

Frequently used option

-p

Remove directories and any intervening parent directories that become empty as a result. This is useful for removing subdirectory trees.

On the Exam

Remember that recursive remove using rm -R removes directories too, even if they're not empty.
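
The contrast between rmdir and rm -r can be sketched as follows. The example assumes a scratch directory from mktemp; the names project, src, main.c, and a/b/c are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

mkdir -p project/src
echo "x" > project/src/main.c

# rmdir refuses because project still has contents.
rmdir project 2>/dev/null || echo "rmdir refused: project is not empty"

# rm -r removes the directory and everything beneath it.
rm -r project
[ ! -e project ] && echo "project removed by rm -r"

# rmdir -p removes c, then b, then a, as each one becomes empty.
mkdir -p a/b/c
rmdir -p a/b/c
[ ! -e a ] && echo "a/b/c removed by rmdir -p"
```

Because rm -r does not stop at non-empty directories, it is worth double-checking the path before pressing Enter.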

touch


Syntax

touch [options] files

Description

Change the access and/or modification times of files. This command is used to refresh timestamps on files. Doing so may be necessary, for example, to cause a program to be recompiled using the date-dependent make utility.

Frequently used options

-a

Change only the access time.

-m

Change only the modification time.

-t timestamp

Instead of the current time, use timestamp in the form of [[CC]YY]MMDDhhmm[.ss]. For example, the timestamp for January 12, 2001, at 6:45 p.m. is 200101121845.
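
The timestamp format can be verified by setting a time and reading it back. This sketch assumes GNU date, whose -r option reports a file's modification time (other date implementations may treat -r differently), and a hypothetical file named stamp.txt in a scratch directory.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

touch stamp.txt                       # creates the file if it does not exist
touch -t 200101121845 stamp.txt       # January 12, 2001, 6:45 p.m. local time

# GNU date assumption: -r FILE prints the file's modification time.
mtime=$(date -r stamp.txt +%Y%m%d%H%M)
echo "$mtime"
```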

3.3.2 File-Naming Wildcards

When working with files on the command line, you'll often run into situations in which you need to perform operations on many files at once. For example, if you are developing a C program, you may want to touch all of your .c files in order to be sure to recompile them the next time you run the make utility to build your program. There will also be times when you need to move or delete all the files in a directory or at least a selected group of files. At other times, filenames may be long or difficult to type, and you'll want to find an abbreviated alternative to typing the filenames for each command you issue.

In order to make these operations simpler, all shells[12] on Linux offer file-naming wildcards (Table 3-3). Rather than explicitly specifying every file or typing long filenames, specifying wildcard characters in place of portions of the filenames can usually do the work for you. For example, the shell expands things like *.txt to a list of all the files that end in .txt. File wildcard constructs like this are called file globs, and their use is awkwardly called globbing. Using file globs to specify multiple files is certainly a convenience, and in many cases is required to get anything useful accomplished.

[12] Wildcards are expanded by the shell, not by commands. When a command is entered with wildcards included, the shell first expands all the wildcards (and performs other types of expansion) and passes the full result on to the command. This process is invisible to you.

Table 3-3. Common File-Naming Wildcards

Wildcard

Description

*

Commonly thought to "match anything." It actually will match zero or more characters (which includes "nothing"!). For example, x* matches files or directories x, xy, xyz, x.txt, xy.txt, xyz.c, and so on.

?

Match exactly one character. For example, x? matches files or directories xx, xy, xz, but not x and not xyz. The specification x?? matches xyz, but not x or xy.

[characters]

Match any single character from among characters listed between the brackets. For example, x[yz] matches xy and xz.

[!characters]

Match any single character other than characters listed between the brackets. For example, x[!yz] matches xa and x1 but does not match xy and does not match xz.

[a-z]

Match any single character from among the range of characters listed between the brackets and indicated by the dash (the dash character is not matched). For example, x[0-9] matches x0 and x1, but does not match xx. Note that to match both upper- and lowercase letters,[13] you specify [a-zA-Z]. Using x[a-zA-Z] matches xa and xA.

[!a-z]

Match any single character from among the characters not in the range listed between the brackets.

{frag1,frag2,frag3...}

Create strings frag1, frag2, frag3, etc. For example, file_{one,two,three} yields the strings file_one, file_two, and file_three. This is a special operator named brace expansion that can be used to match filenames but isn't specifically a file wildcard operator and does not examine directories for existing files to match. Instead, it will expand any string.

For example, it can be used with echo to yield strings totally unrelated to existing filenames:

$ echo string_{a,b,c}

string_a string_b string_c

[13] Linux filenames are case-sensitive.
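
The wildcards in Table 3-3 can be tried safely in an empty scratch directory. The following is a minimal sketch, assuming a temporary directory from mktemp and a handful of hypothetical filenames taken from the table's examples. Each pattern is handed to echo, which simply prints whatever names the shell substituted for the pattern.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch x xy xz xa x1 xyz x.txt

# echo never sees the wildcard; the shell replaces it with matching names first.
star=$(echo x*)          # zero or more characters after x
one=$(echo x?)           # exactly one character after x
set_yz=$(echo x[yz])     # y or z after x
not_yz=$(echo x[!yz])    # any single character except y or z
digit=$(echo x[0-9])     # one digit after x

printf '%s\n' "x*     -> $star" "x?     -> $one" "x[yz]  -> $set_yz" \
              "x[!yz] -> $not_yz" "x[0-9] -> $digit"
```

Previewing a pattern with echo or ls in this way is a cheap safeguard before passing the same pattern to rm or mv.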

Here are a few common applications for wildcards:

  • If you remember part of a filename but not the whole thing, use wildcards with the portion you remember to help find the file. For example, if you're working in a directory with a large number of files and you know you're looking for a file named for Linux, you may enter a command like this:

    $ ls -l *inux* 
  • When working with groups of related files, wildcards can be used to help separate the groups. For example, suppose you have a directory full of scripts you've written. Some are Perl scripts, for which you've used an extension of .pl, and some are Python, with a .py extension. You may wish to separate them into new separate directories for the two languages like this:

    $ mkdir perl python
    $ mv *.pl perl
    $ mv *.py python
  • Wildcards match directory names as well. Suppose you have a tree of directories starting with contracting, where you've created a directory for each month (that is, contracting/january, contracting/february, through contracting/december). In each of these directories are stored invoices, named simply invoice_custa_01.txt, invoice_custa_02.txt, invoice_custb_01.txt, and so on, where custa and custb are customer names of some form. To display all of the invoices, wildcards can be used:

    $ ls con*/*/inv*.txt

    The first * matches tracting. The second matches all directories under the contracting directory (january through december). The last matches all the customers and each invoice number for each customer.
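
The invoice example can be reproduced on a small scale. This sketch assumes a scratch directory from mktemp and builds only two hypothetical months with two invoices each, plus one unrelated file, to show that the pattern matches one path component at a time.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Build a small, hypothetical version of the tree described above.
for month in january february; do
    mkdir -p "contracting/$month"
    touch "contracting/$month/invoice_custa_01.txt" \
          "contracting/$month/invoice_custb_01.txt"
done
touch contracting/january/notes.txt   # not an invoice, so inv*.txt skips it

# One pattern per path component: directory, month, invoice file.
ls con*/*/inv*.txt

# The shell hands ls one argument per match; count them the same way.
set -- con*/*/inv*.txt
count=$#
echo "matched $count files"
```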

See the bash man or info pages for additional information on how bash handles expansions and on other expansion forms.