Book: LPI Linux Certification in a Nutshell
Section: Chapter 14.  Linux Installation and Package Management (Topic 2.2)



14.3 Objective 3: Make and Install Programs from Source

Open source software is credited with offering value that rivals or even exceeds that of proprietary vendors' products. While binary distributions make installation simple, you sometimes won't have access to a binary package. In these cases, you'll have to compile the program from scratch.

14.3.1 Getting Open Source and Free Software

Source code for the software that makes up a Linux distribution is available from a variety of places. Your distribution media contain both source code and compiled binary forms of many software projects. Much of the code that comes with Linux originates from the Free Software Foundation (FSF), so the GNU web site offers a huge array of software.[5] Major projects, such as Apache (http://www.apache.org/), distribute their own code. Whatever outlet you choose, the source code must be packaged for your use, and among the most popular packaging methods for source code is the tarball.

[5] Not just for Linux, either. Although Linux distributions are largely made up of GNU software, that software runs on many other Unix and Unix-like operating systems, including the various flavors of BSD (e.g., FreeBSD, NetBSD, and OpenBSD).

14.3.1.1 What's a tarball?

Code for a significant project that a software developer wishes to distribute is originally stored in a hierarchical tree of directories. Included are the source code (often in the C language), a Makefile, and some documentation. In order to share the code, the entire tree must be encapsulated in a way that is efficient and easy to send and store electronically. A common method of doing this is to use tar to create a single tarfile containing the directory's contents, and then use gzip to compress it for efficiency. The resulting compressed file is referred to as a tarball. This method of distribution is popular because both tar and gzip are widely available and understood, ensuring a wide audience. A tarball is usually indicated by the use of the multiple extensions .tar and .gz, put together into .tar.gz. A combined single extension of .tgz is also popular.
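
The two steps described above can be sketched as a short script. This is only an illustration: the project name myproject and its files are made up, and the final tar czf command assumes a tar (such as GNU tar) that supports the z option.

```shell
#!/bin/sh
# Sketch: build a tarball from a small, made-up project tree.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# A hypothetical project directory with source, a Makefile, and docs.
mkdir -p myproject/doc
echo 'int main(void) { return 0; }' > myproject/main.c
echo 'all:' > myproject/Makefile
echo 'Read me first.' > myproject/doc/README

# Step 1: archive the tree into a single tarfile.
tar cf myproject.tar myproject
# Step 2: compress it; gzip replaces the file with myproject.tar.gz.
gzip myproject.tar

# Both steps at once, using the single .tgz extension.
tar czf myproject.tgz myproject

# List the contents of the tarball without extracting it.
tarball_listing=$(gzip -dc myproject.tar.gz | tar tf -)
echo "$tarball_listing"

cd / && rm -rf "$workdir"
```

The listing shows every path stored under myproject/, which is what a recipient gets back when the tarball is opened.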

14.3.1.2 Opening a tarball

The contents of a tarball are obtained through a two-step process. The file is first uncompressed with gzip and then extracted with tar. Following is an example, starting with tarball.tar.gz:

# gzip -d tarball.tar.gz
# tar xvf tarball.tar

The -d option to gzip indicates "decompress mode." If you prefer, you can use gunzip in place of gzip -d to do the same thing:

# gunzip tarball.tar.gz
# tar xvf tarball.tar

You can also avoid the intermediate uncompressed file by piping the output of gzip straight into tar:

# gzip -dc tarball.tar.gz | tar xvf -

In this case, the -c option tells gzip to write the uncompressed data to standard output and leave the compressed file in place, and the f - argument tells tar to read the archive from standard input. Because the full-sized tarfile is never written to disk, space is saved. For even more convenience, avoid using gzip entirely and use the decompression capability in tar:[6]

[6] GNU tar offers compression; older tar programs didn't.

# tar zxvf tarball.tar.gz
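
To convince yourself that these methods are interchangeable, you could run something like the following sketch. It builds a small sample tarball (the directory sample and its file are made up for the demonstration), extracts it four ways in separate directories, and prints the extracted file each time. The helper function try is hypothetical, and method 4 assumes a tar that supports the z option.

```shell
#!/bin/sh
# Sketch: extract one tarball four ways and compare the results.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a small sample tarball (hypothetical contents).
mkdir sample
echo 'hello' > sample/file.txt
tar cf tarball.tar sample
gzip tarball.tar
rm -rf sample

# try N: extract a copy of the tarball in directory mN using method N,
# then print the extracted file.
try() {
  mkdir "m$1"
  cp tarball.tar.gz "m$1/"
  ( cd "m$1"
    case $1 in
      1) gzip -d tarball.tar.gz; tar xf tarball.tar ;;
      2) gunzip tarball.tar.gz;  tar xf tarball.tar ;;
      3) gzip -dc tarball.tar.gz | tar xf - ;;
      4) tar zxf tarball.tar.gz ;;
    esac
    cat sample/file.txt )
}

results=""
for n in 1 2 3 4; do
  results="$results$(try "$n") "
done
echo "$results"

cd / && rm -rf "$workdir"
```

Each method should print the same file contents. Only methods 1 and 2 leave an uncompressed tarball.tar behind; methods 3 and 4 leave the original tarball.tar.gz untouched.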

On the Exam

All of these methods achieve the same result. Be sure you understand that tar can archive directly to files (not just to a tape drive) and that a compressed version of a tarfile is made with gzip. Be familiar with the various ways you could extract files from a tarball: gzip -d followed by tar, gunzip followed by tar, gzip -dc piped into tar, and tar with the z option. You should be comfortable using tar and gzip and their more common options.

14.3.2 Compiling Open Source Software

Once you've extracted the source code, you're ready to compile it. You'll need to have the appropriate tools available on your system, namely the GNU C compiler, gcc, and the dependency checker, make. Most larger packages also supply a configure script, which you run before compiling.

14.3.2.1 configure

Most larger source code packages include a configure script[7] located at the top of the source code tree. This script needs no modification or configuration from the user. When it executes, it examines your system to verify the existence of a compiler, libraries, utilities, and other items necessary for a successful compile. It uses the information it finds to produce a custom Makefile for the software package on your particular system. If configure finds that something is missing, it fails and gives you a terse but descriptive message. configure succeeds in most cases, leaving you ready to begin the actual compile process.

[7] configure is produced for you by the programmer using the autoconf utility. autoconf is beyond the scope of the LPIC Level 1 exams.
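
To give a feel for the kind of probing configure does, here is a minimal sketch that checks for two tools and reports the result in a similar style. It is not how autoconf-generated scripts are actually implemented; the helper check_tool and the messages it prints are made up for illustration.

```shell
#!/bin/sh
# Sketch of configure-style probing; check_tool is a made-up helper.
check_tool() {
  printf 'checking for %s... ' "$1"
  if command -v "$1" >/dev/null 2>&1; then
    echo yes
    return 0
  fi
  echo no
  return 1
}

# Count how many of the required tools cannot be found in PATH.
missing=0
for tool in gcc make; do
  check_tool "$tool" || missing=$((missing + 1))
done

if [ "$missing" -gt 0 ]; then
  echo "check failed: $missing tool(s) missing"
else
  echo "all tools found; a real configure would now write the Makefile"
fi
```

A real configure script performs many more tests (header files, libraries, compiler features) and records the answers in the Makefile it generates.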

14.3.2.2 make

make is a utility for compiling software. When multiple source-code files are used in a project, it is rarely necessary to compile all of them for every build of the executable. Instead, only the source files that have changed since the last compilation really need to be compiled again.

make works by defining targets and their dependencies. The ultimate target in a software build is the executable file or files. They depend on object files, which in turn depend on source-code files. When a source file is edited, its modification time becomes more recent than that of the object file last compiled from it. make compares these times and rebuilds only the targets that are out of date with respect to their dependencies.
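
The timestamp comparison at the heart of this process can be imitated by hand. The following sketch is only an analogy for what make does internally: the helper needs_rebuild is hypothetical, the file names are borrowed from the example that follows, and the -nt ("newer than") test is assumed to be supported by your shell, as it is in bash and dash.

```shell
#!/bin/sh
# Sketch: the "is the target out of date?" test, done by hand.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# needs_rebuild TARGET SOURCE: succeed if TARGET is missing or is
# older than SOURCE. (A hypothetical helper, not part of make.)
needs_rebuild() {
  [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

# Set explicit modification times: the object was built after the source.
touch -t 200001010000 printit.c
touch -t 200001020000 printit.o
if needs_rebuild printit.o printit.c; then before=rebuild; else before=up-to-date; fi

# "Editing" the source makes it newer than the object.
touch -t 200001030000 printit.c
if needs_rebuild printit.o printit.c; then after=rebuild; else after=up-to-date; fi

echo "before edit: $before, after edit: $after"
cd / && rm -rf "$workdir"
```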

To illustrate the basic idea, consider this trivial and silly example. Suppose you're writing a program with code in two files. The C file, main.c, holds the main( ) function:

void printit(void);  /* defined in printit.c */

int main(void) {
  printit();
  return 0;
}

and printit.c contains the printit( ) function, which is called by main( ):

#include <stdio.h>
void printit(void) {
  printf("Hello, world\n");
}

Both source files must be compiled into objects main.o and printit.o, and then linked together to form an executable application called hw. In this scenario, hw depends on the two object files, a relationship that could be defined like this:

hw: main.o printit.o

Using this syntax, the dependency of the object files on the source files would look like this:

main.o: main.c
printit.o: printit.c

With these three lines, there is a clear picture of the dependencies involved in the project. The next step is to add the commands necessary to satisfy each of the dependencies. These are the compiler commands that build each target:

gcc -c main.c
gcc -c printit.c
gcc -o hw main.o printit.o

To allow for a change of compilers in the future, a variable can be defined to hold the actual compiler name:

CC = gcc

To use the variable, write $(variable), which make replaces with the variable's contents. Each command goes on its own line beneath the target it builds, and that line must begin with a tab character, not spaces. Combining all this, the result is:

CC = gcc

hw: main.o printit.o
	$(CC) -o hw main.o printit.o

main.o: main.c
	$(CC) -c main.c

printit.o: printit.c
	$(CC) -c printit.c

This illustrates a simple Makefile, the default control file for make. It defines three targets: hw (the application), and main.o and printit.o (the two object files). A full compilation of the hw program is invoked by running make and specifying hw as the desired target:

# make hw
gcc -c main.c
gcc -c printit.c
gcc -o hw main.o printit.o

make automatically expects to find its instructions in Makefile. If a subsequent change is made to one of the source files, make will handle the dependency:

# touch printit.c
# make hw
gcc -c printit.c
gcc -o hw main.o printit.o

This trivial example only hints at real-world use of make and the Makefile syntax. make also has powerful built-in rule sets that allow commands for known dependency relationships, such as building an object file from a C source file, to be issued automatically. These rules would shorten even this tiny Makefile.
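
As a rough illustration of that shortening, the sketch below writes a Makefile that contains only the link rule and relies on make's built-in rule to compile each .c file into its .o file. The script recreates the two source files in a scratch directory and builds only if both make and gcc are installed; the use of printf to write the Makefile is simply a way to guarantee the required tab.

```shell
#!/bin/sh
# Sketch: the hw build again, relying on make's built-in .c -> .o rule.
set -e
workdir=$(mktemp -d)
cd "$workdir"

cat > main.c <<'EOF'
void printit(void);
int main(void) { printit(); return 0; }
EOF
cat > printit.c <<'EOF'
#include <stdio.h>
void printit(void) { printf("Hello, world\n"); }
EOF

# Only the link rule is written out; \t supplies the leading tab.
printf 'CC = gcc\n\nhw: main.o printit.o\n\t$(CC) -o hw main.o printit.o\n' > Makefile
makefile_text=$(cat Makefile)

# Build and run only if the tools are present on this system.
if command -v make >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
  make hw
  program_output=$(./hw)
  echo "$program_output"
else
  program_output=skipped
  echo "make or gcc not found; build skipped"
fi

cd / && rm -rf "$workdir"
```

The rules for main.o and printit.o are gone, yet make still compiles both files because it already knows how to produce an object file from a C source file of the same name.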

14.3.2.3 Installing the compiled software

Most mature source-code projects come with a predetermined location in the filesystem for the executable files created by compilation. In many cases, they're expected to go to /usr/local/bin. To facilitate installation to these default locations, many Makefiles contain a special target called install. By executing the make install command, files are copied and set with the appropriate attributes.
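
What an install target does can be approximated with the install command. The sketch below is an assumption-laden illustration, not any particular project's install rule: it copies a stand-in program named hw into a scratch directory tree rather than the real /usr/local/bin, so it can be run safely without root privileges.

```shell
#!/bin/sh
# Sketch: what an install target typically does, aimed at a scratch
# directory instead of the real /usr/local (an assumption for safety).
set -e
stage=$(mktemp -d)

# Stand-in for a freshly built program (hypothetical name "hw").
printf '#!/bin/sh\necho "Hello, world"\n' > "$stage/hw"

# Create the destination directory, then copy the file with mode 0755.
mkdir -p "$stage/usr/local/bin"
install -c -m 0755 "$stage/hw" "$stage/usr/local/bin/hw"

# Record the permission string of the installed copy.
installed_mode=$(ls -l "$stage/usr/local/bin/hw" | cut -c1-10)
echo "$installed_mode"

rm -rf "$stage"
```

Unlike a plain cp, install sets the permissions in the same step, which is why Makefiles commonly use it.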

The default installation directory included in a project's Makefile may differ from that defined by your Linux distribution. If you upgrade software you are already using, this could lead to confusion over versions.

On the Exam

A basic understanding of make is sufficient for Exam 102. In addition, be prepared to add to or modify the contents of variables in a Makefile, such as include directories or paths. This could be necessary, for example, if additional libraries must be included in the compilation or if a command must be customized.
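
As a hedged illustration of that kind of change, the following sketch appends an include directory and a library directory to two variables in a Makefile. The Makefile fragment is made up, the variable names CFLAGS and LDFLAGS are conventional but not universal, and the /opt/example paths are purely hypothetical. In practice you would usually make the same change in a text editor.

```shell
#!/bin/sh
# Sketch: append a hypothetical include directory and library path
# to variables in a made-up Makefile.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# A made-up Makefile fragment with conventional variable names.
cat > Makefile <<'EOF'
CC = gcc
CFLAGS = -g -O2
LDFLAGS =
EOF

# In a sed replacement, & stands for the matched line, so each
# expression appends to the existing value. /opt/example is assumed.
sed -e 's|^CFLAGS = .*|& -I/opt/example/include|' \
    -e 's|^LDFLAGS =.*|& -L/opt/example/lib|' Makefile > Makefile.new
mv Makefile.new Makefile

new_cflags=$(grep '^CFLAGS' Makefile)
new_ldflags=$(grep '^LDFLAGS' Makefile)
echo "$new_cflags"
echo "$new_ldflags"

cd / && rm -rf "$workdir"
```

A variable can also be overridden for a single run without editing the file, by giving it on the make command line, as in make CC=cc.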

14.3.2.4 Example: Compiling bash

GNU's bash shell is presented here as an example of the process of compiling. You can find a compressed tarball of the bash source at the GNU FTP site, ftp://ftp.gnu.org/gnu/bash/. Multiple versions might be available. Version 2.03 is used in this example (you will find more recent versions). The compressed tarball is bash-2.03.tar.gz. As you can see by its name, it is a tar file that has been compressed with gzip. To uncompress and extract the contents in one step, use the z option in tar:

# tar zxvf bash-2.03.tar.gz
bash-2.03/
bash-2.03/CWRU/
bash-2.03/CWRU/misc/
bash-2.03/CWRU/misc/open-files.c
bash-2.03/CWRU/misc/sigs.c
bash-2.03/CWRU/misc/pid.c
... (extraction continues) ...

Next move into the new directory, take a look around, and read some basic documentation:

# cd bash-2.03
# ls
AUTHORS        NEWS
CHANGES        NOTES
COMPAT         README
COPYING        Y2K
CWRU           aclocal.m4
INSTALL        alias.c
MANIFEST       alias.h
Makefile.in    ansi_stdlib.h
... (listing continues) ...
# less README

The build process for bash is started by using the dot-slash prefix to launch configure:

# ./configure
creating cache ./config.cache
checking host system type... i686-pc-linux-gnu
Beginning configuration for bash-2.03 for i686-pc-linux-gnu
checking for gcc... gcc
checking whether the C compiler (gcc  ) works... yes
checking whether the C compiler (gcc  ) is a 
  cross-compiler... no
checking whether we are using GNU C... yes
checking whether gcc accepts -g... yes
checking whether large file support needs explicit 
  enabling... yes
checking for POSIXized ISC... no
checking how to run the C preprocessor... gcc -E
... (configure continues) ...

Next, compile:

# make
/bin/sh ./support/mkversion.sh -b -s release -d 2.03 \
  -p 0 -o newversion.h && mv newversion.h version.h

***********************************************************
*                                                         *
* Making Bash-2.03.0-release for a i686 running linux-gnu
*                                                         *
***********************************************************

rm -f shell.o
gcc  -DPROGRAM='"bash"' -DCONF_HOSTTYPE='"i686"' \
  -DCONF_OSTYPE='"linux-gnu"' \
  -DCONF_MACHTYPE='"i686-pc-linux-gnu"' \
  -DCONF_VENDOR='"pc"' -DSHELL -DHAVE_CONFIG_H \
  -D_FILE_OFFSET_BITS=64  -I.  -I. -I./lib \
  -I/usr/local/include -g -O2 -c shell.c
rm -f eval.o
... (compile continues) ...

If the compile yields fatal errors, make terminates and the errors must be addressed before installation. Errors might include problems with the source code (unlikely), missing header files or libraries, and other problems. Error messages will usually be descriptive enough to lead you to the source of the problem.

The final step of installation requires that you are logged in as root in order to copy the files to the system directories:

# make install
/usr/bin/install -c -m 0755 bash /usr/local/bin/bash
/usr/bin/install -c -m 0755 bashbug /usr/local/bin/bashbug
( cd ./doc ; make  \
        man1dir=/usr/local/man/man1 man1ext=1 \
        man3dir=/usr/local/man/man3 man3ext=3 \
        infodir=/usr/local/info install )
make[1]: Entering directory `/home/ftp/bash-2.03/doc'
test -d /usr/local/man/man1 || /bin/sh ../support/mkdirs /usr/local/man/man1
test -d /usr/local/info || /bin/sh ../support/mkdirs /usr/local/info
/usr/bin/install -c -m 644 ./bash.1 /usr/local/man/man1/bash.1
/usr/bin/install -c -m 644 ./bashbug.1 /usr/local/man/man1/bashbug.1
/usr/bin/install -c -m 644 ./bashref.info /usr/local/info/bash.info
if /bin/sh -c 'install-info --version' >/dev/null 2>&1; then \
  install-info --dir-file=/usr/local/info/dir /usr/local/info/bash.info; \
else true; fi
make[1]: Leaving directory `/home/ftp/bash-2.03/doc'

The installation places the new version of bash in /usr/local/bin. Now, two working versions of bash are available on the system:

# which bash
/bin/bash
# /bin/bash -version
GNU bash, version 1.14.7(1)
# /usr/local/bin/bash -version
GNU bash, version 2.03.0(1)-release (i686-pc-linux-gnu)
Copyright 1998 Free Software Foundation, Inc. 
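
Which of two same-named programs runs by default depends on the order of directories in PATH; in the session above, which reports /bin/bash because that directory is found first. The following sketch demonstrates the effect with two stand-in programs named demo (a made-up name) placed in a scratch directory tree, so nothing on the real system is touched.

```shell
#!/bin/sh
# Sketch: PATH order decides which of two same-named programs is found.
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/bin" "$workdir/usr/local/bin"

# Two stand-in programs named "demo" (hypothetical), one per directory.
printf '#!/bin/sh\necho old\n' > "$workdir/bin/demo"
printf '#!/bin/sh\necho new\n' > "$workdir/usr/local/bin/demo"
chmod 755 "$workdir/bin/demo" "$workdir/usr/local/bin/demo"

# The first directory in PATH that contains the program wins.
first=$( PATH="$workdir/bin:$workdir/usr/local/bin:$PATH"; command -v demo )
second=$( PATH="$workdir/usr/local/bin:$workdir/bin:$PATH"; command -v demo )

# Strip the scratch prefix so the results read like system paths.
first_rel=${first#"$workdir"}
second_rel=${second#"$workdir"}
echo "bin first:       $first_rel"
echo "usr/local first: $second_rel"

rm -rf "$workdir"
```

To run the newly installed bash regardless of PATH order, give its full path, as was done with /usr/local/bin/bash above.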

On the Exam

Familiarize yourself with the acquisition, configuration, compilation, and installation of software from source. Be prepared to answer questions on make and Makefile, the function of the configure utility, gzip, and tar.