10.3 Killing Sniped SessionsA sniped session occurs when a user has exceeded his idle time. The situation has been noted in the database, and the user's actual database session has been suspended. However, the user is still consuming a dedicated server resource that has not yet been allocated to someone else. This situation can have an adverse effect on overall database performance. 10.3.1 Limiting Resource ConsumptionOracle provides the ability to limit resource consumption via the user PROFILE, a collection of limits holding resource hogs in check. We've found that some of these limits are very useful, particularly IDLE_TIME. The IDLE_TIME limit disconnects user sessions if they remain unused for too long. When a session is disconnected in this manner, Oracle changes the status of the session to SNIPED in the V$SESSION system view. We've used this limit effectively — especially in data warehouse applications where a session may be consuming large swathes of memory even it is when idling. Here's how you create a PROFILE called IDLE_LIMIT with an IDLE_TIME of 1 minute.[4] We'll assign it to scott:
SQL> create profile idle_limit limit idle_time 1; SQL> alter user scott profile idle_limit; This following displays all non-default profile parameters: SELECT * FROM DBA_PROFILES WHERE PROFILE != 'DEFAULT' AND LIMIT != 'DEFAULT'; PROFILE RESOURCE_NAME RESOURCE LIMIT ---------- -------------------------------- -------- ---------- IDLE_LIMIT IDLE_TIME KERNEL 1 1 row selected. This next example shows all users with nondefault profile assignments: SELECT B.USERNAME, A.RESOURCE_NAME, A.LIMIT FROM DBA_PROFILES A, DBA_USERS B WHERE A.PROFILE = B.PROFILE AND A.PROFILE != 'DEFAULT' AND A.LIMIT != 'DEFAULT'; USERNAME RESOURCE_NAME LIMIT ---------- --------------- --------------- SCOTT IDLE_TIME 1 SCOTT_DUP IDLE_TIME 1 2 rows selected. When a session has been idle for longer than IDLE_TIME, Oracle changes the session status to SNIPED. The user typically notices this session suspension in the following way when he tries to run some more SQL commands, perhaps after a very long lunch break: SQL> select USER from dual; select USER from dual * ERROR at line 1: ORA-02396: exceeded maximum idle time, please connect again Even though the session has timed out, memory resources are still being consumed, as you can see in Example 10-12. The ps -fp18471 command shows that the Oracle dedicated session server is still in place. The SQL used to select this information from the V$SESSION view is shown in Example 10-13. Example 10-12. Status of timed-out session in V$SESSION13:24:31 SQL> / SRVR USERNAME SID SERIAL# STATUS PID LOGON TIME IDLE TIME -------- --- ------- -------- ------ ----------------- ----------- SCOTT 16 1321 INACTIVE 18471 02/17/02 13:23:53 00:00:00:41 1 row selected. 13:24:33 SQL> / SRVR USERNAME SID SERIAL# STATUS PID LOGON TIME IDLE TIME -------- --- ------- -------- ------ ----------------- ----------- SCOTT 16 1321 SNIPED 18471 02/17/02 13:23:53 00:00:02:10 1 row selected. 13:26:02 SQL> !ps -fp18471 UID PID PPID C STIME TTY TIME CMD oracle 18471 18470 0 13:23 ? 00:00:00 oraclets01 (DESCRIPTION=(LOCAL=YES) 13:26:11 SQL> Example 10-13. Displaying status of sessionsSELECT S.USERNAME, S.SID, S.SERIAL#, S.STATUS, TO_CHAR(LOGON_TIME, 'MM/DD/YY HH24:MI:SS') LOGON_TIME, SUBSTR('0'||TRUNC(LAST_CALL_ET/86400),-2,2) || ':' || SUBSTR('0'||TRUNC(MOD(LAST_CALL_ET,86400)/3600),-2,2) || ':' || SUBSTR('0'||TRUNC(MOD(MOD(LAST_CALL_ET,86400),3600)/60),-2,2)|| ':' || SUBSTR('0'||MOD(MOD(MOD(LAST_CALL_ET,86400),3600),60),-2,2) IDLE_TIME FROM V$SESSION S, V$PROCESS P WHERE S.USERNAME = 'SCOTT' AND P.ADDR(+) = S.PADDR ORDER BY USERNAME, SID;
On a busy database with frequently created sessions, lapsed memory resource consumption may be tolerable. Sniped sessions are reused by newly logged-in sessions, minimizing resource wastage. However, on databases with infrequently created sessions, snipes can remain with us for quite some time. We've seen this happen in data warehouses with plenty of memory wastage, so we decided to go snipe hunting[5] — and lo, the kss.pl (Kill Sniped Sessions) script was born.
When removing a session from Oracle, the ALTER SYSTEM KILL SESSION command is often sufficient. In the toolkit, we've taken it one step further, though, and we actually kill[6] the session's dedicated server process. We then use ALTER SYSTEM KILL SESSION if the session still exists with a status of KILLED.
So why the literal overkill? On numerous occasions over the years — and through several versions of Oracle — we've run into serious problems when killing sessions. Nearly every time, the standard method works just fine, but every once in a while it fails. The sessions may possess a status of KILLED, but they're never actually removed. When this happens, and the session is holding a vital table lock, it seems that nothing less than a plasma cannon[7] is sufficient to remove these poltergeist sessions (or a database bounce, but we'd rather avoid going there). We've seen this happen on every version of Oracle from 7.0.16 through to 8.1.6 and on both Unix and Win32. It's sporadic enough that we've never learned how to reproduce it, but regular enough so we've learned how to cope with it. We cope by killing the dedicated server process on Unix (or the thread on Win32). We've never experienced a hanging session using this method. Remember, though, that this method is inappropriate for Multi-Threaded Servers (MTS), where you would end up disconnecting a great many other sessions by killing the MTS dispatcher.
Enough rationale. Now it's time to set up kss.pl and run it. Because this script runs as a daemon on Unix and a service on Win32, we'll show you how to set it up for both. 10.3.2 Installing kss.pl on UnixInstalling this script on Unix systems is easy. Simply make sure the kss.conf file is in the right place. If it is still uninstalled, copy it in like this from the toolkit installation directory: $ cp /u01/build/PDBA-1.00/routine_tasks/kss.conf $PDBA_HOME That's it — we're done! Running the kss.pl daemon is equally simple. Here's the command we used on our Linux server; Table 10-6 summarizes the command-line options: $ kss.pl -machine sherlock -database ts01 -username system Without the password server (see Chapter 9), add the password argument: $ kss.pl -machine sherlock -database ts01 -username system \
-password manager
10.3.3 Installing kss_NT.pl on Win32Follow these steps on Win32 systems:
Figure 10-2. The kss service on Windows 200010.3.4 Configuring kss.plThe configuration file for the kss.pl script requires little editing. The contents of kss.conf are shown in Example 10-14. Note the following about this example:
Example 10-14. kss.confpackage kss; use PDBA; use File::Spec; use vars qw(%config); %config = ( sleepTime => 180, killSql => q(ALTER SYSTEM KILL SESSION '<<SID>>,<<SERIAL>>'), snipeSql => Q( SELECT S.USERNAME USERNAME, S.SID SID, S.STATUS STATUS, S.SERIAL# SERIAL, P.SPID SPID FROM V$SESSION S, V$PROCESS P WHERE S.USERNAME IS NOT NULL AND P.ADDR(+) = S.PADDR AND S.STATUS = 'SNIPED' ORDER BY USERNAME, SID ), killCmd => PDBA->osname( ) eq 'unix' ? q(/bin/kill -9 <<PID>> >/dev/null 2>&1) : File::Spec->catfile( PDBA->oracleHome, 'bin','orakill') . q{ <<ORACLE_SID>> <<PID>> }, ); 1; After starting the Oracle_ts20_kss_monitor service, on Win32, we created an IDLE_LIMIT profile with a one-minute threshold for IDLE_TIME. The profile was then assigned to scott. Example 10-15 shows the action taken by kss_NT.pl after scott's session timed out. At marker 20020217184600 the thread for scott's session is first killed with Oracle's orakill utility, then terminated with ALTER SYSTEM KILL SESSION. Example 10-15. scott session cleaned up by kss_NT.pl20020217184355:attempting to load Win32::Daemon 20020217184356:password retrieved for user system 20020217184356:Service running 20020217184359:SCANNING 20020217184359:SLEEP: 30 20020217184429:Service running 20020217184430:SCANNING 20020217184430:SLEEP: 30 20020217184500:Service running 20020217184500:SCANNING 20020217184500:SLEEP: 30 20020217184530:Service running 20020217184530:SCANNING 20020217184530:SLEEP: 30 20020217184600:Service running 20020217184600:SCANNING 20020217184600:STATUS:SCOTT:8:9:1384 20020217184600:OSKILL:SCOTT:8:9:1384:D:\oracle\ora81\bin\orakill ts20 1384 20020217184600:DBKILL:SCOTT:8:9:1384:alter system kill session '8,9' 20020217184600:SLEEP: 30 20020217184630:Service running 20020217184630:SCANNING 20020217184630:SLEEP: 30 20020217184700:Service running |