Book: LPI Linux Certification in a Nutshell
Section: Chapter 3.  GNU and Unix Commands (Topic 1.3)



3.1 Objective 1: Work Effectively on the Unix Command Line

Every computer system requires a human interface component. For Linux system administration, a text interface is typically used. The system presents the administrator with a prompt, which at its simplest is a single character such as $ or #. The prompt signifies that the system is ready to accept typed commands, which usually occupy one or more lines of text. This interface is generically called the command line.

It is the job of a program called a shell to provide the command prompt and to interpret commands. The shell provides an interface layer between the Linux kernel and the human user, which is how it gets its name. The original shell for Unix systems was written by Steve Bourne and was called simply sh. The default Linux shell is bash, the Bourne-Again Shell, which is a GNU variant of sh. The popular tcsh shell, a variant of the original csh (or C shell), is also provided. The bash shell is the subject of an entire LPI Topic, covered in Chapter 17. At this point, we are primarily concerned with our interaction with bash and the effective use of commands.

3.1.1 The Interactive Shell

The shell is a powerful programming environment, capable of automating nearly anything you can imagine on your Linux system. The shell is also your interactive interface to your system. When you first start a shell, it does some automated housekeeping to get ready for your use, and then presents a command prompt. The command prompt tells you that the shell is ready to accept commands from its standard input device, which is usually the keyboard. Shells can run standalone, as on a physical terminal, or within a window in a GUI environment. Whichever the case, their use is the same.

3.1.1.1 Shell variable basics

During execution, bash maintains a set of shell variables that contain information important to the execution of bash. Most of these variables are set when bash starts, but they can be set manually at any time.

The first shell variable of interest in this Topic is called PS1 (which simply stands for Prompt String 1). This special variable holds the contents of the command prompt that are displayed when bash is ready to accept commands (there is also a PS2 variable, used when bash needs multiple-line input to complete a command). You can easily display the contents of PS1, or any other shell variable, by using the echo command with the variable name preceded by the $ symbol:

$ echo $PS1
\$

The \$ output tells us that PS1 contains the two characters \ and $. The backslash character tells the shell not to interpret the dollar symbol in any special way (that is, as a metacharacter, described later in this section). A simple dollar sign such as this was the default prompt for sh, but bash offers options to make the prompt much more informative. On your system, the default prompt stored in PS1 is probably something like:

[\u@\h \W]\$ 

Each of the characters preceded by a backslash has a special meaning to bash, while those without backslashes are interpreted literally. In this example, \u is replaced by the username, \h is replaced by the system's hostname, \W is replaced by the "bottom" portion of the current working directory, and \$ is replaced by a $ character.[1] This yields a prompt of the form:

[1] Unless you are root, in which case \$ is replaced by #.

[jdean@linuxpc jdean]$ 

How your prompt is formulated is really just a convenience and does not affect how the shell interprets your commands. However, adding information to the prompt, particularly regarding system, user, and directory location, can make life easier when hopping from system to system and logging in as multiple users (as yourself and root, for example). See the documentation on bash for more information on customizing prompts.
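As a minimal sketch of prompt customization, the following assignments could be typed at the prompt or placed in a bash startup file. The name OLD_PS1 is an assumption made for this illustration; it is not a variable that bash itself uses.

```shell
# Save the existing prompt string (empty if none is set) so it can be restored.
OLD_PS1=${PS1-}

# \u = username, \h = hostname, \W = last component of the working directory,
# \$ = "$" for ordinary users and "#" for root.
PS1='[\u@\h \W]\$ '

# PS2 is displayed when bash needs more lines to complete a command.
PS2='> '
```

The single quotes keep the backslash sequences intact so that bash can interpret them each time it draws the prompt. Assigning PS1=$OLD_PS1 brings back the earlier prompt.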

Another shell variable that is extremely important during interactive use is PATH, which contains a list of all the directories that hold commands or other programs you are likely to execute. A default path is set up for you when bash starts. You may wish to modify the default to add other directories that hold programs you need to run.

Every file in the Linux filesystem can be specified in terms of its location. The less program, for example, is located in the directory /usr/bin. Placing /usr/bin in your PATH enables you to execute less by simply typing less rather than the explicit /usr/bin/less.
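Modifying the search path might look like the sketch below. The directory /opt/mytools/bin is hypothetical and stands in for wherever your additional programs live.

```shell
# Display the current search path, a colon-separated list of directories.
echo "$PATH"

# Append a hypothetical directory so that programs stored there can be run by name.
PATH="$PATH:/opt/mytools/bin"
echo "$PATH"
```

A change made this way lasts only for the current shell session. Because the directories are searched from left to right, appending the new directory leaves the standard system commands first in line.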

In order for bash to find and execute the command you enter at the prompt, the command must be either:

  • A bash built-in command that is part of bash itself

  • An executable program located in a directory listed in the PATH variable

  • Explicitly specified by its pathname (for example, /usr/bin/less)
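One way to see which of these cases applies to a given name is the shell's type command. The sketch below assumes an ordinary Linux system on which ls is installed and /bin/sh exists; the exact wording of the output varies between shells.

```shell
# cd is part of the shell itself, so it is reported as a builtin.
type cd

# ls is an executable file located by searching the directories in PATH.
type ls

# A name that contains a slash is taken as a pathname and used as given.
type /bin/sh
```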

The shell holds PATH and other variables for its own use. However, many of the shell's variables are needed during the execution of programs launched from the shell (including other shells). For these variables to be available, they must be exported, at which time they become environment variables. Environment variables are passed on to programs and other shells, and together they are said to form the environment in which the programs execute. PATH is always made into an environment variable.[2] Exporting a shell variable to turn it into an environment variable is done using the export command:

[2] In the case of csh and tcsh, there are both shell and environment variables for PATH; the shell takes care of keeping them synchronized.

$ export MYVAR

When a variable is exported to the environment, it is passed into the environment of all child processes. That is, it will be available to all programs run by your shell.
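The difference between a shell variable and an environment variable can be observed by asking a child shell what it sees. This is only an illustrative sketch; the names before and after are assumptions chosen for the example.

```shell
# Start from a clean slate; MYVAR is a shell variable only.
unset MYVAR
MYVAR="hello"

# A child shell does not see an unexported variable.
before=$(sh -c 'echo "[$MYVAR]"')

# After export, MYVAR is part of the environment passed to child processes.
export MYVAR
after=$(sh -c 'echo "[$MYVAR]"')

echo "before export: $before"   # before export: []
echo "after export:  $after"    # after export:  [hello]
```

The single quotes around the child's command matter here: they stop the parent shell from expanding $MYVAR itself, so the value printed really is the one the child received.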

3.1.1.2 Entering commands at the command prompt

Commands issued to the shell on a Linux system generally consist of four components:

  • A valid command (a shell built-in, a program or script found among directories listed in the PATH, or a program specified by an explicit pathname)

  • Command options, usually preceded by a dash

  • Arguments

  • Line acceptance (i.e., pressing the Enter key), which we assume in the examples

Each command has its own unique syntax, though most follow a fairly standard form. At minimum, a command is necessary:

$ ls

This simple command lists files in the current working directory. It requires neither options nor arguments. Generally, options are letters or words preceded by a single or double dash and are added after the command and separated from it by a space:

$ ls -l

The -l option modifies the behavior of the ls program by listing files in a longer, more detailed format. In most cases, single-dash options can be either combined or specified separately. To illustrate this, consider these two equivalent commands:

$ ls -l -a
$ ls -la

By adding the -a option, ls does not hide files beginning with a dot (which it does by default). Adding that option by specifying -la yields the same result. Some commands offer alternative forms for the same option. In the preceding example, the -a option can be replaced with --all:

$ ls -l --all

These double-dash full-word options are frequently found in programs from the GNU project. They cannot be combined as the single-dash options can. Both types of options can be freely intermixed. Although the longer GNU-style options require more typing, they are easier to remember and easier to read in scripts than the single-letter options.

Adding an argument further refines the command's behavior:

$ ls -l *.c

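Now the command will give a detailed listing only of C program source files (those with the .c extension), if they exist, in the current working directory. In this example, if no .c files exist, bash leaves the pattern *.c unexpanded and ls reports that it cannot find a file by that name; by contrast, running ls in an empty directory produces no output at all.[3] Sometimes, options and arguments can be mixed in order: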

[3] If a Unix or GNU command has nothing of significance to tell you, it most likely will remain silent. This brevity may take some users by surprise, particularly if they are used to systems that yield messages indicating something like "successful completion, but sorry, no results."

$ ls --all *.c -l

In this case, ls was able to determine that -l is an option and not another filename argument.

Some commands, such as tar and ps, don't require the dash preceding an option because at least one option is expected or required. Also, an option often instructs the command that the subsequent item on the command line is a specific argument. For example:

$ tar cf mytarfile file1 file2 file3
$ tar -cf mytarfile file1 file2 file3

These equivalent commands use tar to create an archive file named mytarfile and put three files (file1, file2, and file3) into it. In this case, the f option tells tar that archive filename mytarfile follows immediately after the option.
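To check that the two option styles really behave the same way, you could build both archives in a scratch directory and compare their contents. This is a sketch under a few assumptions: the archive names old-style.tar and dash-style.tar are invented for the example, and tar's -C option (change to a directory before adding files) is used only to keep the scratch files out of your working directory.

```shell
# Create a scratch directory holding three empty files.
workdir=$(mktemp -d)
touch "$workdir/file1" "$workdir/file2" "$workdir/file3"

# Create the same archive twice: once without the dash, once with it.
tar cf  "$workdir/old-style.tar"  -C "$workdir" file1 file2 file3
tar -cf "$workdir/dash-style.tar" -C "$workdir" file1 file2 file3

# The t option lists an archive's contents; f again names the archive file.
old_listing=$(tar tf "$workdir/old-style.tar")
dash_listing=$(tar -tf "$workdir/dash-style.tar")
echo "$old_listing"

# Remove the scratch directory.
rm -r "$workdir"
```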

Just as any natural language contains exceptions and variations, so does the syntax used for GNU and Unix commands. You should have no trouble learning the essential syntax for the commands you need to use often. The capabilities of the command set offered on Linux are extensive, making it highly unlikely that you'll memorize all of the command syntax you need. Most systems administrators are constantly learning about features they've never used in commands they use regularly. It is standard practice to regularly refer to man or info pages and other documentation on commands you're using, so feel free to explore and learn as you go.

3.1.1.3 Entering commands not in the PATH

Occasionally, you will need to execute a command that is not in your path and not built into your shell. If this need arises often, it may be best to simply add the directory that contains the command to your path. However, there's nothing wrong with explicitly specifying a command's location and name completely. For example, the ls command is located in /bin. This directory is most certainly in your PATH variable (if not, it should be!), which allows you to enter the ls command by itself on the command line:

$ ls

The shell will look for an executable file named ls in each successive directory listed in your PATH variable and will execute the first one it finds. Specifying the fully qualified filename for the command eliminates the directory search and yields identical results:

$ /bin/ls

Any executable file on your system may be started in this way. However, it is important to remember that some programs depend on what is listed in your PATH during execution. Such a program may launch normally but then fail if an incomplete PATH prevents it from finding a required resource.
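If you are unsure where a command lives, the shell can report the file it would run, and that pathname can then be used directly. The sketch below assumes ls is not defined as an alias or shell function in the current shell; the variable name ls_path is an assumption for the example.

```shell
# Ask the shell which file it would run for ls (the first match found in PATH).
ls_path=$(command -v ls)
echo "$ls_path"

# Run that file by its explicit pathname; no PATH search takes place.
"$ls_path" -d /
```

The -d option tells ls to list the directory itself rather than its contents, which keeps the output of the example to a single line.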

3.1.1.4 Entering multiple-line commands interactively

In addition to its interactive capabilities, the shell also has a complete programming language of its own. Many programming features can be very handy at the interactive command line as well. Looping constructs, including for, until, and while, are often used this way. When you begin one of these commands, which normally spans multiple lines, bash prompts you for the subsequent lines until a valid command has been completed. The prompt you receive in this case is stored in shell variable PS2, which by default is >. For example, if you wanted to repetitively execute a series of commands each time with a different argument from a known series, you could enter the following:

$ ...series of commands on arg1...
command output
$ ...series of commands on arg2...
command output
$ ...series of commands on arg3...
command output

Rather than entering each command manually, you can interactively use bash's for loop construct to do the work for you. Note that indented style, such as what you might use in traditional programming, isn't necessary when working interactively with the shell:

$ for var in arg1 arg2 arg3
> do
> echo $var
> ...series of commands...
> done
arg1
command output
arg2
command output
arg3
command output

Mixing the command-line world with the shell-scripting world in this way can make certain tasks surprisingly efficient.
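A concrete version of such a loop, written as you would type it without the prompts, might look like the following sketch. The argument names and the count variable are placeholders invented for the example.

```shell
# Run the loop body once for each word in the list.
count=0
for name in alpha beta gamma
do
    echo "working on $name"
    count=$((count + 1))
done
echo "iterations: $count"   # iterations: 3

# The same kind of loop written on a single line.
for name in alpha beta gamma; do echo "working on $name"; done
```

The indentation is there only for readability; bash treats both forms the same way.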

3.1.1.5 Entering command sequences

There may be times when it is convenient to place multiple commands on a single line. Normally, bash assumes you have reached the end of a command (or the end of the first line of a multiple-line command) when you press Return. To add more than one command to a single line, the commands can be separated and entered sequentially with the command separator, a semicolon. Using this syntax, the following commands:

$ ls
$ ps

are, in essence, identical to and will yield the same result as the following single-line command that employs the command separator:

$ ls; ps  
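Because the commands in a sequence run one after another in the same shell, a later command sees the effect of an earlier one. The short sketch below illustrates this; the variable start_dir is an assumption used only to return to where you began.

```shell
# Remember the starting directory.
start_dir=$PWD

# Three commands on one line, run left to right:
# change to the root directory, print the working directory, then return.
cd /; pwd; cd "$start_dir"
```

Here pwd prints / because the cd that precedes it has already taken effect.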

On the Exam

Command syntax and the use of the command line are very important. Pay special attention to the use of options and arguments and how they are differentiated. Also be aware that some commands expect options to be preceded by a dash while other commands do not.

3.1.2 Command History and Editing

If you consider interaction with the shell as a kind of conversation, it's a natural extension to refer back to things "mentioned" previously. You may type a long and complex command that you want to repeat, or perhaps you need to execute a command multiple times with slight variation.

If you work interactively with the original Bourne shell, maintaining such a "conversation" can be a bit difficult. Each repetitive command must be entered explicitly, each mistake must be retyped, and if your commands scroll off the top of your screen, you have to recall them from memory. Modern shells such as bash and tcsh include a significant feature set called command history, expansion, and editing. Using these capabilities, referring back to previous commands is painless, and your interactive shell session becomes much simpler and more effective.

The first part of this feature set is command history. When bash is run interactively, it provides access to a list of commands previously typed. The commands are stored in the history list prior to any interpretation by the shell. That is, they are stored before wildcards are expanded or command substitutions are made. The size of the history list is controlled by the HISTSIZE shell variable. By default, HISTSIZE is set to 500 lines, but you can control that number by simply adjusting HISTSIZE's value. In addition to commands entered in your current bash session, commands from previous bash sessions are stored by default in a file called ~/.bash_history (or the file named in shell variable HISTFILE).[4] To view your command history, use the bash built-in history command. A line number will precede each command. This line number may be used in subsequent history expansion. History expansion uses either a line number from the history or a portion of a previous command to reexecute that command.[5] Table 3-1 lists the basic history expansion designators. In each case, using the designator as a command causes a command from the history to be executed again.

[4] If you use multiple shells in a windowed environment (as just about everyone does), the last shell to exit will write its history to ~/.bash_history. For this reason you may wish to use one shell invocation for most of your work.

[5] History expansion also allows a fair degree of command editing using syntax you'll find in the bash documentation.
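The two history variables mentioned above are ordinary shell variables, so adjusting them is a matter of simple assignment. The following is a sketch of what such settings might look like in a bash startup file; the value 1000 is an arbitrary example.

```shell
# Keep up to 1000 commands in the in-memory history list.
HISTSIZE=1000

# Name the file that receives the history when the shell exits
# (this is the usual bash default, shown here explicitly).
HISTFILE=~/.bash_history

echo "HISTSIZE=$HISTSIZE HISTFILE=$HISTFILE"
```

At an interactive prompt, history 5 then lists the five most recent entries along with their line numbers.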

Table 3-1. Command History Expansion Designators

Designator          Description
!!                  Often called bang-bang,[6] this command refers to the most recent command.
!n                  Refer to command n from the history. You'll use the history command to display these numbers.
!-n                 Refer to the current command minus n from the history.
!string             Refer to the most recent command starting with string.
!?string            Refer to the most recent command containing string.
^string1^string2    Quick substitution. Repeat the last command, replacing the first occurrence of string1 with string2.

[6] The exclamation point is often called bang on Linux and Unix systems.

While using history substitution can be useful for executing repetitive commands, command history editing is much more interactive. To envision the concept of command history editing, think of your entire bash history (including that obtained from your ~/.bash_history file) as the contents of an editor's buffer. In this scenario, the current command prompt is the last line in an editing buffer, and all of the previous commands in your history lie above it. All of the typical editing features are available with command history editing, including movement within the "buffer," searching, cutting, pasting, and so on. Once you're used to using the command history in an editing style, everything you've done on the command line becomes available as retrievable, reusable text for subsequent commands. The more familiar you become with this concept, the more useful it can be.

By default, bash uses key bindings like those found in the Emacs editor for command history editing.[7] If you're familiar with Emacs, moving around in the command history will be familiar and very similar to working in an Emacs buffer. For example, the key command Ctrl-p (depicted as C-p) will move up one line in your command history, displaying your previous command and placing the cursor at the end of it. This same function is also bound to the up arrow key. The opposite function is bound to C-n (and the down arrow). Together, these two key bindings allow you to examine your history line by line. You may reexecute any of the commands shown simply by pressing Return when it is displayed. For the purposes of Exam 101, you'll need to be familiar with this editing capability, but detailed knowledge is not required. Table 3-2 lists some of the common Emacs key bindings you may find useful in bash. Note that C- indicates the Ctrl key, while M- indicates the Meta key, which is usually Alt on PC keyboards.[8]

[7] An editing style similar to the vi editor is also available.

[8] In unusual circumstances, such as on a terminal, using the meta key means pressing the Escape (Esc) key, releasing it, and then pressing the defined key. The Esc key is not a modifier that is held down with another key; it is pressed and released first, and it stands in for the meta key when an Alt-style key is unavailable.

Table 3-2. Basic Command History Editing Emacs Key Bindings

Key         Description
C-p         Previous line (also up arrow)
C-n         Next line (also down arrow)
C-b         Back one character (also left arrow)
C-f         Forward one character (also right arrow)
C-a         Beginning of line
C-e         End of line
C-l         Clear the screen, leaving the current line at the top of the screen
M-<         Top of history
M->         Bottom of history
C-d         Delete the character under the cursor
C-k         Delete (kill) text from cursor to end of line
C-y         Paste (yank) text previously cut (killed)
M-d         Delete (kill) word
C-r text    Reverse search for text
C-s text    Forward search for text

3.1.2.1 Command substitution

bash offers a handy ability to do command substitution. This feature allows you to replace a command with its output, either at the command line or in a script. For example, wherever $(command) is found, its output will be substituted. This output could be assigned to a variable, as in this example, which stores the line count of the .bashrc file (wc -l prints the count followed by the filename):

$ RCSIZE=$(wc -l ~/.bashrc)

Another form of command substitution is `command`. The result is the same, except that the backquote syntax has some special rules regarding metacharacters that the $(command) syntax avoids.
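Both forms can be tried side by side on a small scratch file, as in the sketch below. The variable names are assumptions for the example. Note that wc -l prints the filename after the count when given a file argument; reading the file from standard input avoids that, so only the number is captured.

```shell
# Create a three-line scratch file to measure.
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# The $(command) form.
lines=$(wc -l < "$tmpfile")

# The older backquote form gives the same result.
lines_bq=`wc -l < "$tmpfile"`

# $(...) nests without extra escaping: dirname yields /usr/bin, basename yields bin.
dir_name=$(basename "$(dirname /usr/bin/less)")

echo "lines=$lines lines_bq=$lines_bq dir_name=$dir_name"
rm -f "$tmpfile"
```

Nesting is where the two forms differ most in practice: the inner backquotes of a nested backquote substitution would each need a backslash, while $(...) needs none.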

3.1.2.2 Applying commands recursively through a directory tree

There are many times when it is necessary to execute commands recursively. That is, you may need to repeat a command throughout all the branches of a directory tree. Recursive execution is very useful but also can be dangerous. It gives a single interactive command the power to operate over a much broader range of your system than your current directory, and the appropriate caution is necessary. Think twice before using these capabilities, particularly when operating as the superuser.

Some of the GNU commands on Linux systems have built-in recursive capabilities as an option. For example, chmod modifies permissions on files in the current directory:

$ chmod g+w *.c

In this example, all files with the .c extension in the current directory are modified with the group-write permission. However, there may be a number of directories and files in hierarchies that require this change. chmod contains the -R option (note the uppercase option letter; you may also use --recursive), which instructs the command to operate not only on files and directories specified on the command line, but also on all files and directories contained under the specified directories. For example, this command gives the group-write permission to all files in a source-code tree named src:

$ chmod -R g+w src

Provided you have the correct privileges, this command will descend into each subdirectory in the src directory and add the requested permission to each file and directory it finds. Other example commands with this ability include cp (copy), ls (list files), and rm (remove files).
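A safe way to experiment with recursive commands is to build a throwaway directory tree first. The sketch below does that; the file names main.c and util.c and the variable names are invented for the example.

```shell
# Build a small scratch tree: src/main.c and src/lib/util.c.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
touch "$workdir/src/main.c" "$workdir/src/lib/util.c"

# Start with group write removed everywhere so the change is visible.
chmod -R g-w "$workdir/src"

# Add group write to src and everything beneath it.
chmod -R g+w "$workdir/src"

# Character 6 of the ls -l mode string is the group-write flag.
group_flag=$(ls -l "$workdir/src/lib/util.c" | cut -c6)
echo "group write on src/lib/util.c: $group_flag"   # group write on src/lib/util.c: w

# Remove the scratch tree.
rm -r "$workdir"
```

The file two levels down picks up the permission even though only src was named on the command line.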

A more general approach to recursive execution through a directory is available by using the find command. This is an extremely powerful command because it can tell you a lot about your system's file structure. find is inherently recursive and is intended to descend through directories looking for files with certain attributes or executing commands. At its simplest, find displays an entire directory hierarchy when you simply enter the command with a target directory:

$ find src
...files and directories are listed recursively...

To get more specific, add the -name option to search the same directories for C files:

$ find src -name "*.c"
... .c files are listed recursively[9] ...

[9] This can be done recursively with the ls command as well.

find can also execute commands against its results with the -exec option, which can execute any command against each successive element listed by find. During execution, the special placeholder {} is replaced by the name of each file that find locates. The command entered after the -exec option must be terminated by a semicolon; any metacharacters used, including the semicolon, must be either quoted or escaped. To take the previous example a little further, rather than execute the chmod recursively against all files in the src directory, find can execute it against the C files only, like this:

$ find src -name "*.c" -exec chmod g+w {} \;

The find command is capable of much more than this simple example and can locate files with particular attributes such as dates, protections, file types, access times, and others. While the syntax can be confusing, the results are worth some study of find.
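The selective behavior of find with -exec can be confirmed on a throwaway tree, as in this sketch. The file names and the variable changed are assumptions for the example; the -perm -020 test matches entries that have the group-write bit set.

```shell
# Build a scratch tree containing two C files and one other file.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
touch "$workdir/src/main.c" "$workdir/src/lib/util.c" "$workdir/src/README"

# Remove group write everywhere so the effect of the next command is visible.
chmod -R g-w "$workdir/src"

# Add group write to the C files only.
find "$workdir/src" -name "*.c" -exec chmod g+w {} \;

# List the entries that now carry the group-write bit (octal 020).
changed=$(find "$workdir/src" -perm -020 | sort)
echo "$changed"

# Remove the scratch tree.
rm -r "$workdir"
```

Only the two .c files appear in the listing; README and the directories themselves are left untouched, which is the difference from chmod -R.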