[ Team LiB ] |
5.3 Using kill to Control Processes

Linux and other Unix-like operating systems support a form of interprocess communication called signals. The kill command is used to send a signal to a running process. How a process responds to a signal, if it responds at all, depends on the specific signal sent and on the handler set by the process. If you are familiar with Unix signal handling, you will find that Apache adheres to the usual conventions, and you can probably skip this section. This section describes the use of kill in relation to Apache for readers who aren't accustomed to working with signals.

The name "kill" is a misnomer; it sounds as if the command is inherently destructive, but kill simply sends signals to programs. Only a few signals will actually kill the process by default. Most signals can be caught by the process, which may choose to either perform a specific action or ignore the signal. When a process is in a zombie or uninterruptible sleep state, it might not react to any signals.

The following example will help dispel any fear of using this command. Most people who are familiar with the command line know that pressing Ctrl-C will usually terminate a process running in a console. For example, it is common to execute:

    panic% tail -f /home/httpd/httpd_perl/logs/error_log

to monitor the Apache server's error_log file. The usual way to stop tail is by pressing Ctrl-C in the console in which the process is running. The same result can be achieved by sending the INT (interrupt) signal to this process. For example:

    panic% kill -INT 17084

When this command is run, the tail process is aborted, assuming that the process identifier (PID) of the tail process is 17084.

Every process running in the system has its own PID. kill identifies processes by their PIDs. If kill were to use process names and there were two tail processes running, it might send the signal to the wrong process.
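To make the point that kill only delivers a signal, and that the receiving process decides what to do with it, here is a minimal sketch in POSIX shell. It is an illustration only and has nothing to do with Apache itself: a child shell installs a handler for USR1 with the trap builtin, and the parent script then signals it by PID. The variable names child and status are made up for the example.

```shell
#!/bin/sh
# Start a child shell that handles USR1 by printing a message and
# exiting cleanly, instead of taking the signal's default action.
sh -c 'trap "echo got USR1; exit 0" USR1; while :; do sleep 1; done' &
child=$!                # $! holds the PID of the last background process

sleep 1                 # give the child time to install its handler
kill -USR1 "$child"     # deliver the signal by PID; nothing is "killed" here

wait "$child"           # collect the child's exit status
status=$?
echo "child exit status: $status"
```

Because the child caught the signal and called exit 0 itself, the status collected by wait reflects the handler's choice rather than a termination by signal.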
The most common way to determine the PID of a process is to use ps to display information about the current processes on the machine. The arguments to this utility vary depending on the operating system. For example, on BSD-family systems, the following command works:

    panic% ps auxc | grep tail

On a System V Unix flavor such as Solaris, the following command may be used instead:

    panic% ps -eaf | grep tail

In the first part of the command, ps prints information about all the current processes. This is then piped to a grep command that prints lines containing the text "tail". Assuming only one such tail process is running, we get the following output:

    root 17084 0.1 0.1 1112 408 pts/8 S 17:28 0:00 tail

The first column shows the username of the account running the process, the second column shows the PID, and the last column shows the name of the command. The other columns vary between operating systems.

Processes are free to ignore almost all signals they receive, and there are cases when they will. Let's run the less command on the same error_log file:

    panic% less /home/httpd/httpd_perl/logs/error_log

Neither pressing Ctrl-C nor sending the INT signal will kill the process, because the implementers of this utility chose to ignore that signal. The way to kill the process is to type q.

Sometimes numerical signal values are used instead of their symbolic names. For example, 2 is normally the numeric equivalent of the symbolic name INT. Hence, these two commands are equivalent on Linux:

    panic% kill -2 17084
    panic% kill -INT 17084

On Solaris, the -s option is used when working with symbolic signal names:

    panic% kill -s INT 17084

To find the numerical equivalents, either refer to the signal(7) manpage, or ask Perl to help you:

    panic% perl -MConfig -e 'printf "%6s %2d\n", $_, $sig++ for split / /, $Config{sig_name}'

If you want to send a signal to all processes with the same name, you can use pkill on Solaris or killall on Linux.
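The mapping between signal numbers and names can also be checked from the shell itself. The sketch below relies on the -l option of kill, which POSIX specifies as printing the name that corresponds to a signal number. The helper name signal_name is hypothetical. Only numbers 1, 2, 9, and 15 are queried, because those are the ones used in this section that are the same on the common platforms.

```shell
#!/bin/sh
# Hypothetical helper: print the symbolic name for a signal number.
signal_name() {
    kill -l "$1"
}

# Print a small number-to-name table for the signals discussed here.
for num in 1 2 9 15; do
    printf '%2d %s\n' "$num" "$(signal_name "$num")"
done
```

Numbers above 15 are best looked up on the machine in question rather than memorized, since they differ between operating systems.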
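5.3.1 kill Signals for Stopping and Restarting Apache

Apache performs certain actions in response to the KILL, TERM, HUP, and USR1 signals (as arguments to kill). All Apache system administrators should be familiar with the use of these signals to control the Apache web server.

By referring to the signal.h file, we learn the numerical equivalents of these signals:

    #define SIGHUP     1    /* hangup, generated when terminal disconnects */
    #define SIGKILL    9    /* last resort */
    #define SIGTERM    15   /* software termination signal */
    #define SIGUSR1    30   /* user defined signal 1 */

These values are taken from one system's signal.h. The value of SIGUSR1 in particular differs between platforms (on Linux/x86 it is 10, for example), so it is safer to use the symbolic names.

The four types of signal are:

KILL signal (forced shutdown): the process is terminated unconditionally and cannot catch the signal, so Apache gets no chance to clean up. It should be sent to the parent only as a last resort.

TERM signal (stop now): the parent process immediately attempts to kill all of its children and then exits itself.

HUP signal (restart now): the parent process kills its children, rereads its configuration files, reopens its log files, and spawns a new set of children.

USR1 signal (graceful restart): the parent process advises the children to exit after they finish serving their current request, rereads its configuration files, reopens its log files, and replaces the children with new ones as they exit.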
By default, if a server is restarted using the USR1 or the HUP signal and mod_perl is not compiled as a DSO, Perl scripts and modules are not reloaded. To reload modules pulled in via PerlRequire, PerlModule, or use, and to flush the Apache::Registry cache, either completely stop the server and then start it again, or use this directive in httpd.conf:

    PerlFreshRestart On

(This directive is not always recommended. See Chapter 22 for further details.)

5.3.2 Speeding Up Apache's Termination and Restart

Restart or termination of a mod_perl server may sometimes take quite a long time, perhaps even tens of seconds. The reason for this is a call to the perl_destruct( ) function during the child exit phase, which is also known as the cleanup phase. In this phase, the Perl END blocks are run and the DESTROY method is called on any global objects that are still around.

Sometimes this will produce a series of messages in the error_log file, warning that certain child processes did not exit as expected. This happens when a child process, after a few attempts have been made to terminate it, is still in the middle of perl_destruct( ). So when you shut down the server, you might see something like this:

    [warn] child process 7269 still did not exit, sending a SIGTERM
    [error] child process 7269 still did not exit, sending a SIGKILL
    [notice] caught SIGTERM, shutting down

First, the parent process sends the TERM signal to all of its children, without logging a thing. If any of the processes still hasn't quit after a short period, the parent sends it a second TERM, logs the PID of the process, and marks the event as a warning. Finally, if the process still hasn't terminated, the parent sends the KILL signal, which unconditionally terminates the process, aborting any operation in progress in the child. This event is logged as an error.
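The escalation just described (a silent TERM, a second TERM logged as a warning, then KILL logged as an error) can be imitated from the shell. The following is a rough sketch of that pattern, not Apache's own implementation: the function names stop_with_escalation and wait_for_exit are hypothetical, and the grace period of three seconds is an assumed default rather than the interval Apache uses.

```shell
#!/bin/sh
# Hypothetical helper: poll for up to $2 seconds; succeed once PID $1 is gone.
wait_for_exit() {
    wf_pid=$1
    wf_tries=$2
    while [ "$wf_tries" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the PID exists.
        kill -0 "$wf_pid" 2>/dev/null || return 0
        sleep 1
        wf_tries=$((wf_tries - 1))
    done
    ! kill -0 "$wf_pid" 2>/dev/null
}

# Hypothetical helper: TERM, then TERM with a warning, then KILL.
stop_with_escalation() {
    se_pid=$1
    se_grace=${2:-3}    # assumed grace period in seconds

    # First attempt: silent TERM. If the process is already gone, we are done.
    kill -TERM "$se_pid" 2>/dev/null || return 0
    wait_for_exit "$se_pid" "$se_grace" && return 0

    # Second attempt: TERM again, this time noting it as a warning.
    echo "[warn] process $se_pid still did not exit, sending a SIGTERM" >&2
    kill -TERM "$se_pid" 2>/dev/null
    wait_for_exit "$se_pid" "$se_grace" && return 0

    # Last resort: KILL cannot be caught or ignored.
    echo "[error] process $se_pid still did not exit, sending a SIGKILL" >&2
    kill -KILL "$se_pid" 2>/dev/null
}
```

A call such as stop_with_escalation 7269 5 would give the process two five-second chances to finish its cleanup before it is killed outright, which is exactly the situation in which END blocks and DESTROY methods may be cut short.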
If the mod_perl scripts do not contain any END blocks or DESTROY methods that need to be run during shutdown, or if the ones they have are nonessential, this step can be avoided by setting the PERL_DESTRUCT_LEVEL environment variable to -1. (The -1 value for PERL_DESTRUCT_LEVEL is special to mod_perl.) For example, add this setting to the httpd.conf file:

    PerlSetEnv PERL_DESTRUCT_LEVEL -1

What constitutes a significant cleanup? Any change of state outside the current process that cannot be handled by the operating system itself. Committing database transactions and removing the lock on a resource are significant operations, but closing an ordinary file is not. For example, if DBI is used for persistent database connections, Perl's destructors should not be switched off.

5.3.3 Finding the Right Apache PID

In order to send a signal to a process, its PID must be known. But in the case of Apache, there are many httpd processes running. Which one should be used? The parent process is the one that must be signaled, so it is the parent's PID that must be identified.

The easiest way to find the Apache parent PID is to read the httpd.pid file. To locate this file, open httpd.conf and look for the PidFile directive. Here is the line from our httpd.conf file:

    PidFile /home/httpd/httpd_perl/logs/httpd.pid

When Apache starts up, it writes its own process ID in httpd.pid in a human-readable format. When the server is stopped, httpd.pid should be deleted, but if Apache is killed abnormally, httpd.pid may still exist even if the process is not running any more.

Of course, the PID of the running Apache can also be found using the ps(1) and grep(1) utilities (as shown previously). Assuming that the binary is called httpd_perl, the command would be:

    panic% ps auxc | grep httpd_perl

or, on System V:

    panic% ps -ef | grep httpd_perl

This will produce a list of all the httpd_perl (parent and child) processes.
If the server was started by the root user account, it will be easy to locate, since it will belong to root. Here is an example of the sort of output produced by one of the ps command lines given above:

    root   17309 0.9 2.7 8344 7096 ? S 18:22 0:00 httpd_perl
    nobody 17310 0.1 2.7 8440 7164 ? S 18:22 0:00 httpd_perl
    nobody 17311 0.0 2.7 8440 7164 ? S 18:22 0:00 httpd_perl
    nobody 17312 0.0 2.7 8440 7164 ? S 18:22 0:00 httpd_perl

In this example, it can be seen that all the child processes are running as user nobody whereas the parent process runs as user root. There is only one root process, and this must be the parent process. Any kill signals should be sent to this parent process.

If the server is started under some other user account (e.g., when the user does not have root access), the processes will belong to that user. The only truly foolproof way to identify the parent process is to look for the process whose parent process ID (PPID) is 1 (use ps to find out the PPID of the process).

If you have the GNU tools installed on your system, there is a nifty utility that makes it even easier to discover the parent process. The tool is called pstree, and it is very simple to use. It lists all the processes showing the family hierarchy, so if we grep the output for the wanted process's family, we can see the parent process right away. Running this utility and grepping for httpd_perl, we get:

    panic% pstree -p | grep httpd_perl
      |-httpd_perl(17309)-+-httpd_perl(17310)
      |                   |-httpd_perl(17311)
      |                   |-httpd_perl(17312)

And this one is even simpler:

    panic% pstree -p | grep 'httpd_perl.*httpd_perl'
      |-httpd_perl(17309)-+-httpd_perl(17310)

In both cases, we can see that the parent process has the PID 17309.

ps's f option, available on many Unix platforms, produces a tree-like report of the processes as well. For example, you can run ps axfwwww to get a tree of all processes.
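The two approaches from this section, reading the PidFile and looking for the process whose PPID is 1, can be scripted along the following lines. This is a sketch under stated assumptions rather than a finished tool: the function names pidfile_from_conf, pidfile_is_live, and parent_pids are hypothetical, the PidFile path is assumed to be absolute and unquoted, and the ps invocation in the comments assumes that the comm column prints the bare command name, which is not the case on every platform.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Apache or mod_perl.

# Print the PidFile path from an httpd.conf-style file (first match wins).
# Assumes an unquoted absolute path; directive names are case-insensitive.
pidfile_from_conf() {
    awk 'tolower($1) == "pidfile" { print $2; exit }' "$1"
}

# Succeed only if the PID stored in file $1 can be signaled right now.
# This guards against a stale httpd.pid left behind by an abnormal exit.
pidfile_is_live() {
    [ -r "$1" ] || return 1
    # kill -0 sends no signal; it only reports whether the process exists
    # and whether we are permitted to signal it.
    kill -0 "$(cat "$1")" 2>/dev/null
}

# Read "PID PPID COMMAND" rows on stdin and print the PID of every
# process named $1 whose parent PID is 1.
parent_pids() {
    awk -v name="$1" '$2 == 1 && $3 == name { print $1 }'
}

# Possible usage on a live system (the httpd.conf path is a placeholder):
#   pidfile=$(pidfile_from_conf /path/to/httpd.conf)
#   pidfile_is_live "$pidfile" && kill -USR1 "$(cat "$pidfile")"
#   ps -e -o pid= -o ppid= -o comm= | parent_pids httpd_perl
```

Note that the liveness check cannot tell whether the PID has been reused by an unrelated process since Apache died, so comparing its result with the ps-based lookup is a sensible cross-check before sending a signal.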