[66]Before you start deleting core files, you should figure out who or what is dropping them and see if the owner wants these files. In some cases this core file may be their only means of debugging.
#!/usr/local/bin/perl

# Finds and deletes core files. It sends traps upon completion and
# errors. Arguments are:
#   -path directory   : search directory (and subdirectories); default /
#   -lookfor filename : filename to search for; default core
#   -debug value      : debug level

while ($ARGV[0] =~ /^-/)
{
    if    ($ARGV[0] eq "-path")    { shift; $PATH    = $ARGV[0]; }
    elsif ($ARGV[0] eq "-lookfor") { shift; $LOOKFOR = $ARGV[0]; }
    elsif ($ARGV[0] eq "-debug")   { shift; $DEBUG   = $ARGV[0]; }

    shift;
}

#################################################################
##########################  Begin Main  #########################
#################################################################

require "find.pl";     # This gives us the find function.

$LOOKFOR = "core" unless ($LOOKFOR); # If we don't have something
                                     # in $LOOKFOR, default to core

$PATH    = "/"    unless ($PATH);    # Let's use / if we don't get
                                     # one on the command line

(-d $PATH) || die "$PATH is NOT a valid dir!";    # We can search
                                                  # only valid
                                                  # directories

&find("$PATH");

#################################################################
######################  Begin SubRoutines  ######################
#################################################################

sub wanted
{
    if (/^$LOOKFOR$/)
    {
        if (!(-d $name))    # Skip the directories named core
        {
            &get_stats;
            &can_file;
            &send_trap;
        }
    }
}

sub can_file
{
    print "Deleting :$_: :$name:\n" unless (!($DEBUG));

    $ERROR = 0;    # Reset for each file so that an earlier failure
                   # isn't reported again for later files
    $RES = unlink "$name";
    if ($RES != 1) { $ERROR = 1; }
}

sub get_stats
{
    chop ($STATS = `ls -l $name`);
    chop ($FILE_STATS = `/bin/file $name`);
    $STATS =~ s/\s+/ /g;
    $FILE_STATS =~ s/\s+/ /g;
}

sub send_trap
{
    if ($ERROR == 0) { $SPEC = 1535; }
    else             { $SPEC = 1536; }

    print "STATS: $STATS\n" unless (!($DEBUG));
    print "FILE_STATS: $FILE_STATS\n" unless (!($DEBUG));

# Sending a trap using Net-SNMP
#
#system "/usr/local/bin/snmptrap nms public .1.3.6.1.4.1.2789.2500 '' 6 $SPEC ''
#.1.3.6.1.4.1.2789.2500.1535.1 s \"$name\"
#.1.3.6.1.4.1.2789.2500.1535.2 s \"$STATS\"
#.1.3.6.1.4.1.2789.2500.1535.3 s \"$FILE_STATS\"";

# Sending a trap using Perl
#
use SNMP_util "0.54";  # This will load the BER and SNMP_Session for us
snmptrap("public\@nms:162", ".1.3.6.1.4.1.2789.2500", mylocalhostname, 6, $SPEC,
".1.3.6.1.4.1.2789.2500.1535.1", "string", "$name",
".1.3.6.1.4.1.2789.2500.1535.2", "string", "$STATS",
".1.3.6.1.4.1.2789.2500.1535.3", "string", "$FILE_STATS");

# Sending a trap using OpenView's snmptrap
#
#system "/opt/OV/bin/snmptrap -c public nms
#.1.3.6.1.4.1.2789.2500 \"\" 6 $SPEC \"\"
#.1.3.6.1.4.1.2789.2500.1535.1 octetstringascii \"$name\"
#.1.3.6.1.4.1.2789.2500.1535.2 octetstringascii \"$STATS\"
#.1.3.6.1.4.1.2789.2500.1535.3 octetstringascii \"$FILE_STATS\"";

}

The logic is simple, though it's somewhat hard to see since most of it happens implicitly. The key is the call to find(), which does most of the work. It descends into every directory underneath the directory specified by $PATH and automatically sets $_ to the current filename (so the if statement at the beginning of the wanted() subroutine works). Furthermore, it sets the variable $name to the full pathname of the current file; this allows us to test whether the current file is really a directory, which we wouldn't want to delete.

Therefore, we loop through all the files, looking for files with the name specified on the command line (or named core, if no -lookfor option is specified). When we find one, we store its statistics, delete the file, and send a trap to the NMS reporting the file's name and other information. We use the variable $SPEC to store the specific trap ID. We use two specific IDs: 1535 if the file was deleted successfully and 1536 if we tried to delete the file but couldn't. Again, we wrote the trap code to use either native Perl, Net-SNMP, or OpenView. Uncomment the version of your choice. We pack the trap with three variable bindings, which contain the name of the file, the results of ls -l on the file, and the results of running /bin/file. Together, these give us a fair amount of information about the file we deleted. Note that we had to define object IDs for all three of these variables; furthermore, although we placed these object IDs under 1535, nothing prevents us from using the same objects when we send specific trap 1536.

Now we have a program to delete core files and send traps telling us about what was deleted; the next step is to tell our trap receiver what to do with these incoming traps. Let's assume that we're using OpenView. To inform it about these traps, we have to add two entries to trapd.conf, mapping these traps to events. Here they are:
EVENT foundNDelCore .1.3.6.1.4.1.2789.2500.0.1535 "Status Alarms" Warning
FORMAT Core File Found :$1: File Has Been Deleted - LS :$2: FILE :$3:
SDESC
This event is called when a server using cronjob looks for core
files and deletes them.

  $1 - octetstringascii   - Name of file
  $2 - octetstringascii   - ls -l listing on the file
  $3 - octetstringascii   - file $name
EDESC
#
#
#
EVENT foundNNotDelCore .1.3.6.1.4.1.2789.2500.0.1536 "Status Alarms" Minor
FORMAT Core File Found :$1: File Has Not Been Deleted For Some Reason - LS :$2: FILE :$3:
SDESC
This event is called when a server using cronjob looks for core
files and then CANNOT delete them for some reason.

  $1 - octetstringascii   - Name of file
  $2 - octetstringascii   - ls -l listing on the file
  $3 - octetstringascii   - file $name
EDESC
#
#
#

For each trap, we have an EVENT statement specifying an event name, the trap's specific ID, the category into which the event will be sorted, and the severity. The FORMAT statement defines a message to be used when we receive the trap; it can be spread over several lines and can use the parameters $1, $2, etc. to refer to the variable bindings that are included in the trap.

Although it would be a good idea, we don't need to add our variable bindings to our private MIB file; trapd.conf contains enough information for OpenView to interpret the contents of the trap. Here are some sample traps[67] generated by the throwcore script:
[67]We've removed most of the host and date/time information.
Core File Found :/usr/sap/HQD/DVEBMGS00/work/core: File Has Been \
Deleted - LS :-rw-rw---- 1 hqdadm sapsys 355042304 Apr 27 17:04 \
/usr/sap/HQD/DVEBMGS00/work/core: \
FILE :/usr/sap/HQD/DVEBMGS00/work/core: ELF 32-bit MSB core file \
SPARC Version 1, from 'disp+work':

Core File Found :/usr/sap/HQI/DVEBMGS10/work/core: File Has Been \
Deleted - LS :-rw-r--r-- 1 hqiadm sapsys 421499988 Apr 28 14:29 \
/usr/sap/HQI/DVEBMGS10/work/core: \
FILE :/usr/sap/HQI/DVEBMGS10/work/core: ELF 32-bit MSB core file \
SPARC Version 1, from 'disp+work':

Here is root's crontab, which runs the throwcore script at specific intervals. Notice that we use the -path switch, which allows us to check the development area every hour:
# Check for core files every night and every hour on special dirs
27 * * * * /opt/local/mib_programs/scripts/throwcore.pl -path /usr/sap
23 2 * * * /opt/local/mib_programs/scripts/throwcore.pl
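The throwcore script gets its find() function from find.pl, a Perl 4-era library that recent Perl releases no longer bundle. As a rough illustration of the same search logic on a current Perl, here is a minimal sketch built on the standard File::Find module. It only reports matching files; it deliberately leaves out deletion and trap sending. The helper name find_named and its arguments are assumptions made for this example and are not part of throwcore.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# find_named($dir, $lookfor) is a hypothetical helper for this sketch.
# It returns the full pathname of every entry under $dir whose name is
# exactly $lookfor and which is not a directory.
sub find_named {
    my ($dir, $lookfor) = @_;
    my @hits;

    find(
        sub {
            # Inside the callback, $_ holds the bare filename and
            # $File::Find::name holds the full pathname, much like
            # $_ and $name in the find.pl version.
            return unless $_ eq $lookfor;
            return if -d $_;    # Skip directories named core
            push @hits, $File::Find::name;
        },
        $dir
    );

    my @sorted = sort @hits;
    return @sorted;
}

# Report only: print each match when a directory is given on the
# command line. The second argument defaults to "core".
if (@ARGV) {
    my $dir     = $ARGV[0];
    my $lookfor = defined $ARGV[1] ? $ARGV[1] : "core";
    (-d $dir) || die "$dir is NOT a valid dir!";
    print "$_\n" for find_named($dir, $lookfor);
}
```

Saved under a name of your choosing, the sketch could be run with a directory and an optional filename, for example /usr/sap and core, to list what throwcore would act on before you let the real script delete anything. Matching the name with eq rather than a regular expression is a design choice in this sketch: it avoids surprises when the filename contains characters that are special in patterns.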
Copyright © 2002 O'Reilly & Associates. All rights reserved.