External Polling (Essential SNMP)

9.2.2. OpenView Graphing

Figure 9-3 shows the sort of graph you can create with NNM. To create this graph, we started the browser (Figure 8-2) and clicked down through the MIB tree until we found the .iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry list. Once there, we clicked on ifInOctets; then, while holding down the Ctrl key, we clicked on ifOutOctets. After both were selected and we verified that the "Name or IP Address" field displayed the node we wanted to poll, we clicked on the "Graph" button.

Figure 9-3. OpenView xnmgraph of octets in/out

Once the graph has started, you can change the polling interval and the colors used to display different objects. You can also turn off the display of some or all of the object instances. The menu item "View Figure 9.2.2

Line Configuration" lets you specify which objects you would like to display; it can also set multipliers for different items. For example, to display everything in K, multiply the data by .001. There is also an option ("View Figure 9.2.2

Statistics") that shows a statistical summary of your graph. Figure 9-4 shows some statistics from the graph in Figure 9-3. While the statistics menu is up, you can left-click on the graph; the statistics window will display the values for the specific date and time to which you are pointing with the mouse.

Figure 9-4. xnmgraph statistics

WARNING: Starting xnmgraph from the command line allows you to start the grapher at a specific polling period and gives you several other options. By default, OpenView polls at 10-second intervals. In most cases this is fine, but if you are polling a multiport router to check if some ports are congested, a 10-second polling interval may be too quick and could cause operational problems. For example, if the CPU is busy answering SNMP queries every 10 seconds, the router might get bogged down and become very slow, especially if the router is responsible for OSPF or other CPU-intensive tasks. You may also see messages from OpenView complaining that another poll has come along while it is still waiting for the previous poll to return. Increasing the polling interval usually gets rid of these messages.

Some of NNM's default menus let you use the grapher to poll devices depending on their type. For example, you can select the object type "router" on the NNM and generate a graph that includes all your routers. Whether you start from the command line or from the menu, there are times when you will get a message back that reads "Requesting more lines than number of colors (25). Reducing number of lines." This message means that there aren't enough colors available to display the objects you are trying to graph. The only good ways to avoid this problem are to break up your graphs so that they poll fewer objects or to eliminate object instances you don't want. For example, you probably don't want to graph router interfaces that are down (for whatever reason) and other "dead" objects. We will soon see how you can use a regular expression as one of the arguments to the xnmgraph command to graph only those interfaces that are up and running.

Although the graphical interface is very convenient, the command-line interface gives you much more flexibility. The following script displays the graph in Figure 9-3 (i.e., the graph we generated through the browser):

#!/bin/sh
# filename: /opt/OV/local/scripts/graphOctets
# syntax: graphOctets <hostname>
/opt/OV/bin/xnmgraph -c public -mib \
".iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifInOctets::::::::,\
.iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifOutOctets::::::::" \
$1

You can run this script with the command:

$ /opt/OV/local/scripts/graphOctets orarouter1

The worst part of writing the script is figuring out what command-line options you want -- particularly the long strings of nine colon-separated options. All these options give you the ability to refine what you want to graph, how often you want to poll the objects, and how you want to display the data. (We'll discuss the syntax of these options as we go along, but for the complete story, see the xnmgraph(1) manpage.) In this script, we're graphing the values of two MIB objects, ifInOctets and ifOutOctets. Each OID we want to graph is the first (and in this case, the only) option in the string of colon-separated options. On our network, this command produces eight traces: input and output octets for each of our four interfaces. You can add other OIDs to the graph by adding sets of options, but at some point the graph will become too confusing to be useful. It will take some experimenting to use the xnmgraph command efficiently, but once you learn how to generate useful graphs you'll wonder how you ever got along without it.

WARNING: Keeping your scripts neat is not only good practice, but also aesthetically pleasing. Using a "\" at the end of a line indicates that the next line is a continuation of the current line. Breaking your lines intelligently makes your scripts more readable. Be warned that the Unix shells do not like extra whitespace after the "\". The only character after each "\" should be one carriage return.

Now, let's modify the script to include more reasonable labels -- in particular, we'd like the graph to show which interface is which, rather than just showing the index number. In our modified script, we've used numerical object IDs, mostly for formatting convenience, and we've added a sixth option to the ugly sequence of colon-separated options: .1.3.6.1.2.1.2.2.1.2 (this is the ifDescr, or interface description, object in the interface table). This option says to poll each instance and use the return value of snmpget .1.3.6.1.2.1.2.2.1.2.INSTANCE as the label. This should give us meaningful labels. Here's the new script:

#!/bin/sh
# filename: /opt/OV/local/scripts/graphOctets
# syntax: graphOctets <hostname>
/opt/OV/bin/xnmgraph -c public -title Bits_In_n_Out -mib \
".1.3.6.1.4.1.9.2.2.1.1.6:::::.1.3.6.1.2.1.2.2.1.2:::,\
.1.3.6.1.4.1.9.2.2.1.1.8:::::.1.3.6.1.2.1.2.2.1.2:::" $1

To see what we'll get for labels, here's the result of walking .1.3.6.1.2.1.2.2.1.2:

$ snmpwalk orarouter1 .1.3.6.1.2.1.2.2.1.2
interfaces.ifTable.ifEntry.ifDescr.1 : DISPLAY STRING- (ascii):  Ethernet0
interfaces.ifTable.ifEntry.ifDescr.2 : DISPLAY STRING- (ascii):  Serial0
interfaces.ifTable.ifEntry.ifDescr.3 : DISPLAY STRING- (ascii):  Serial1

Figure 9-5 shows our new graph. With the addition of this sixth option, the names and labels are much easier to read.

Figure 9-5. OpenView xnmgraph with new labels

Meaningful labels and titles are important, especially if management is interested in seeing the graphs. A label that contains an OID and not a textual description is of no use. Some objects that are useful in building labels are ifType (.1.3.6.1.2.1.2.2.1.3) and ifOperStatus (.1.3.6.1.2.1.2.2.1.8). Be careful when using ifOperStatus; if the status of the interface changes during a poll, the label does not change. The label is evaluated only once.

One of the most wasteful things you can do is poll a useless object. This often happens when an interface is administratively down or not configured. Imagine that you have 20 serial interfaces, but only one is actually in use. If you are looking for octets in and out of your serial interfaces, you'll be polling 40 times and 38 of the polls will always read 0. OpenView's xnmgraph allows you to specify an OID and regular expression to select what should be graphed. To put this feature to use, let's walk the MIB to see what information is available:

$ snmpwalk orarouter1 .1.3.6.1.2.1.2.2.1.8
interfaces.ifTable.ifEntry.ifOperStatus.1 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.2 : INTEGER: up
interfaces.ifTable.ifEntry.ifOperStatus.3 : INTEGER: down

This tells us that only two interfaces are currently up. By looking at ifDescr, we see that the live interfaces are Ethernet0 and Serial0; Serial1 is down. Notice that the type of ifOperStatus is INTEGER, but the return value looks like a string. How is this? RFC 1213 defines string values for each possible return value:

ifOperStatus OBJECT-TYPE
    SYNTAX  INTEGER {
                up(1),       -- ready to pass packets
                down(2),
                testing(3)   -- in some test mode
            }
    ACCESS  read-only
    STATUS  mandatory
    DESCRIPTION
        "The current operational state of the interface. The testing(3)
         state indicates that no operational packets can be passed."
    ::= { ifEntry 8 }

It's fairly obvious how to read this: the integer value 1 is converted to the string up. We can therefore use the value 1 in a regular expression that tests ifOperStatus. For every instance we will check the value of ifOperStatus; we will poll that instance and graph the result only if the status returns 1. In pseudocode, the operation would look something like this:

if (ifOperStatus == 1) {
    pollForMIBData;
    graphOctets;
}

Here's the next version of our graphing script. To put this logic into a graph, we use the OID for ifOperStatus as the fourth colon option, and the regular expression (1) as the fifth option:

#!/bin/sh
# filename: /opt/OV/local/scripts/graphOctets
# syntax: graphOctets <hostname>
/opt/OV/bin/xnmgraph -c public \
-title Octets_In_and_Out_For_All_Up_Interfaces \
-mib ".1.3.6.1.2.1.2.2.1.10:::.1.3.6.1.2.1.2.2.1.8:1::::, \
.1.3.6.1.2.1.2.2.1.16:::.1.3.6.1.2.1.2.2.1.8:1::::" $1

This command graphs the ifInOctets and ifOutOctets of any interface that has a current operational state equal to 1, or up. It therefore polls and graphs only the ports that are important, saving on network bandwidth and simplifying the graph. Furthermore, we're less likely to run out of colors while making the graph because we won't assign them to useless objects. Note, however, that this selection happens only during the first poll and stays effective throughout the entire life of the graphing process. If the status of any interface changes after the graph has been started, nothing in the graph will change. The only way to discover any changes in interface status is to restart xnmgraph.

Finally, let's look at:

How to add a label to each of the OIDs we graph
How to multiply each value by a constant
How to specify the polling interval

The cropped graph in Figure 9-6 shows how the labels change when we run the following script:

#!/bin/sh
# filename: /opt/OV/local/scripts/graphOctets
# syntax: graphOctets <hostname>
/opt/OV/bin/xnmgraph -c public -title Internet_Traffic_In_K -poll 68 -mib \
".1.3.6.1.4.1.9.2.2.1.1.6:Incoming_Traffic::::.1.3.6.1.2.1.2.2.1.2::.001:,\
.1.3.6.1.4.1.9.2.2.1.1.8:Outgoing_Traffic::::.1.3.6.1.2.1.2.2.1.2::.001:" \
$1

The labels are given by the second and sixth fields in the colon-separated options (the second field provides a textual label to identify the objects we're graphing and the sixth uses the ifDescr field to identify the particular interface); the constant multiplier (.001) is given by the eighth field; and the polling interval (in seconds) is given by the -poll option.

Figure 9-6. xnmgraph with labels and multipliers

By now it should be apparent how flexible OpenView's xnmgraph program really is. These graphs can be important tools for troubleshooting your network. When a network manager receives complaints from customers regarding slow connections, he can look at the graph of ifInOctets generated by xnmgraph to see if any router interfaces have unusually high traffic spikes.

Graphs like these are also useful when you're setting thresholds for alarms and other kinds of traps. The last thing you want is a threshold that is too triggery (one that goes off too many times) or a threshold that won't go off until the entire building burns to the ground. It's often useful to look at a few graphs to get a feel for your network's behavior before you start setting any thresholds. These graphs will give you a baseline from which to work. For example, say you want to be notified when the battery on your UPS is low (which means it is being used) and when it is back to normal (fully charged). The obvious way to implement this is to generate an alarm when the battery falls below some percentage of full charge, and another alarm when it returns to full charge. So the question is: what value can we set for the threshold? Should we use 10% to indicate that the battery is being used and 100% to indicate that it's back to normal? We can find the baseline by graphing the device's MIBs.[39] For example, with a few days' worth of graphs, we can see that our UPS's battery stays right around 94-97% when it is not in use. There was a brief period when the battery dropped down to 89%, when it was performing a self-test. Based on these numbers, we may want to set the "in use" threshold at 85% and the "back to normal" threshold at 94%. This pair of thresholds gives us plenty of notification when the battery's in use, but won't generate useless alarms when the device is in self-test mode. The appropriate thresholds depend on the type of devices you are polling, as well as the MIB data that is gathered. Doing some initial testing and polling to get a baseline (normal numbers) will help you set thresholds that are meaningful and useful.

[39]Different vendors have different UPS MIBs. Refer to your particular vendor's MIB to find out which object represents low battery power.

Before leaving xnmgraph, we'll take a final look at the nastiest aspect of this program: the sequence of nine colon-separated options. In the examples, we've demonstrated the most useful combinations of options. In more detail, here's the syntax of the graph specification:

object:label:instances:match:expression:instance-label:truncator:multiplier:nodes

The parameters are:

object

label: A string to use in making the label for all instances of this object. This can be a literal string or the OID of some object with a string value. The label used on the graph is made by combining this label (for all instances of the object) with instance-label, which identifies individual instances of an object in a table. For example, in Figure 9-6, the labels are Incoming_Traffic and Outgoing_Traffic; instance-label is 1.3.6.1.2.1.2.2.1.2, or the ifDescr field for each object being graphed.

instances: A regular expression that specifies which instances of object to graph. If this is omitted, all instances are graphed. For example, the regular expression 1 limits the graph to instance 1 of object; the regular expression [4-7] limits the graph to instances 4 through 7. You can use the match and expression fields to further specify which objects to match.

match: The OID of an object (not including the instance ID) to match against a regular expression (the match-expression), to determine which instances of the object to display in the graph.