Interpreting Wait Events

What are Wait Events?
Common Wait Events
Collecting Wait Events
Tracing Wait Events
Event Wait Analysis with Statspack
How Wait Events Can Be Used to Improve Performance (with Examples)

What are Wait Events?

"Wait Event" Defined

At any given moment, every Oracle process is either busy servicing a request or waiting for something specific to happen. By "busy" we mean that the process wishes to use the CPU. For example, a dedicated server process might be searching the buffer cache to see if a certain data block is in memory. This process would be said to be busy and not waiting. The ARC0 process, meanwhile, might be waiting for LGWR to signal that an online redo log needs archiving. In this case, the ARC0 process is waiting.
The kernel developers at Oracle have defined a list of every possible event that an Oracle process could be waiting for. In Oracle 8i there were 217 such wait events. In Oracle 9i Release 2 there are 363. At any moment, every Oracle process that is not busy is waiting for one of these events to occur. Suppose an application has submitted a COMMIT statement and the server process is waiting for the LGWR process to signal that the redo log buffer has been flushed to disk. This wait event is called "log file sync." Another dedicated server process might be waiting for a row-level lock on the INVOICES table to be freed so that a SELECT FOR UPDATE statement can continue. That wait event is called "enqueue."
By querying v$ views, we can see what events processes are waiting on to an amazing level of detail. For example, we might learn that a dedicated server process has been waiting 30 milliseconds for the operating system to read eight blocks from data file 42, starting at block 18042. We can also see summary information of how much time each Oracle process has spent waiting on each type of wait event for the duration of the process. In addition, we can direct an Oracle process to write detailed wait event data to a trace file for later analysis using TKPROF.

Why Wait Event Information is Useful

Using the wait event interface, you can get insights into where time is being spent. If a report takes four hours to complete, for example, the wait event interface will tell you how much of that four hours was spent waiting for disk reads caused by full table scans, disk reads caused by index lookups, latch contention, and so on.
The wait event interface provides much more information to work with than cache hit ratios do. You get data touching many different areas of the database, such as disk I/O, latches, parallel processing, network traffic, checkpoints, and row-level locking. At the same time, you get remarkable detail, such as the file number and block number of a block being read from disk, or the name of a latch being waited on along with the number of retries.
The wait event interface will not always give you all of the information you need in order to diagnose and solve a problem, but it will certainly point you in the right direction. You might think the buffer cache is too small because the cache hit ratio is only 70%, but in fact, the application's slow response time could be caused by latch contention in the shared pool, a bottleneck in the log writer, or any of a number of other things.


Common Wait Event Names and What They Mean

Although there are many different types of wait events, the majority of them come up very infrequently or tend not to be significant. In practice, only a few dozen wait events tend to be of interest to most DBAs. The rest are rather obscure, pertain to Oracle features not in use, or occur so infrequently that you don't need to worry about them. You'll see different wait events surfacing in different environments based on which Oracle features have been implemented and which capabilities of the database are being taxed the most. For example, the PX wait events won't appear if you aren't using parallel query, and the virtual circuit status wait event won't appear if you are not using the multi-threaded server architecture (or shared server architecture as it is more recently called). Along those lines, the log file sync and enqueue wait events probably won't be prevalent in a primarily read-only system.
Here are some of the most common wait events and what they mean:

buffer busy waits: The session wants to access a data block that is either 1) currently not in memory, but another process has already issued an I/O request to read the block into memory, or 2) in memory but in an incompatible mode (current versus consistent, for example).

control file parallel write: The session has issued multiple I/O requests in parallel to write blocks to all control files, and is waiting for all of the writes to complete.

control file sequential read: The session is waiting for blocks to be read from a control file.

db file parallel read: The session has issued multiple I/O requests in parallel to read blocks from data files into memory and is waiting for all requests to complete. This may occur during recovery or during regular activity when a session batches many single block I/O requests together and issues them in parallel.

db file parallel write: The process, typically DBWR, has issued multiple I/O requests in parallel to write dirty blocks from the buffer cache to disk and is waiting for all requests to complete.

db file scattered read: The session has issued an I/O request to read a series of contiguous blocks from a data file into the buffer cache and is waiting for the operation to complete. This typically happens during a full table scan or fast full index scan. Oracle reads up to DB_FILE_MULTIBLOCK_READ_COUNT consecutive blocks at a time and scatters them into buffers in the buffer cache.

db file sequential read: The session has issued an I/O request to read one block from a data file into the buffer cache and is waiting for the operation to complete. This typically happens during an index lookup or a fetch from a table by ROWID when the required data block is not already in memory. This call differs from "db file scattered read" in that a sequential read reads data into contiguous memory (whilst a scattered read reads multiple blocks and scatters them into different buffers in the SGA).

direct path read, direct path write: The session has issued asynchronous I/O requests that bypass the buffer cache and is waiting for them to complete. These wait events often involve temporary segments.

enqueue: The session is waiting on an enqueue (a lock you can see in v$lock). This commonly occurs when one user is trying to update a row in a table that is currently being updated by another user.

free buffer waits: The session needs a free buffer so it can bring a data block into the buffer cache and is waiting for a buffer that is not dirty to become available. This can occur if DBWR is not writing dirty buffers to disk fast enough.

latch free: The session is waiting for a latch held by another session. (This event does not apply to processes that are spinning while waiting for a latch; when a process is spinning, it is not waiting.)

library cache load lock: The session is waiting for the opportunity to load an object or a piece of an object into the library cache. (Only one process can load an object or a piece of an object at a time.)

library cache pin: The session wants to pin an object in memory in the library cache for examination, ensuring no other processes can update the object at the same time. This happens when you are compiling or parsing a PL/SQL object or a view.

log buffer space: The session is waiting for space in the log buffer. (Space becomes available only after LGWR has written the current contents of the log buffer to disk.) This typically happens when applications generate redo faster than LGWR can write it to disk.

log file parallel write: The session is waiting for blocks to be written to all online redo log members in one group. LGWR is typically the only process to see this wait event. It will wait until all blocks have been written to all members.

log file sequential read: The session is waiting for blocks to be read from the online redo log into memory. This primarily occurs at instance startup and when the ARCH process archives filled online redo logs.

log file switch completion: The session is waiting for a log file switch to complete, typically so more redo can be generated.

log file sync: The session is waiting for LGWR to finish flushing the log buffer to disk. This occurs when a user commits a transaction. (A transaction is not considered committed until all of the redo to recover the transaction has been successfully written to disk.) Frequent waits on this event often indicate that the application is committing too often.

undo segment extension: The session is waiting for an undo segment to be extended or shrunk.

write complete waits: The session is waiting for a requested buffer to be written to disk; the buffer cannot be used while it is being written.

There are several wait events that we call "idle events" because each of these wait events typically occurs when the Oracle process has nothing to do and is waiting for somebody to give it a task. Idle events are usually not very interesting from a tuning standpoint, so we usually overlook them when evaluating data extracted from the wait event interface. The common idle events are as follows:

Idle Events
client message
dispatcher timer
gcs for action
gcs remote message
ges remote message
i/o slave wait
jobq slave wait
lock manager wait for remote message
null event
parallel query dequeue
pipe get
PL/SQL lock timer
pmon timer
PX Deq Credit: need buffer
PX Deq Credit: send blkd
PX Deq: Execute Reply
PX Deq: Execution Msg
PX Deq: Signal ACK
PX Deq: Table Q Normal
PX Deque wait
PX Idle Wait
queue messages
rdbms ipc message
slave wait
smon timer
SQL*Net message from client
SQL*Net message to client
SQL*Net more data from client
virtual circuit status
wakeup time manager

What Wait Event Information Does Not Provide

If an Oracle process has work to do but must wait for something to happen before it can continue, then the process will be waiting on a non-idle wait event. If a process has nothing to do, it will be waiting on an idle wait event. So what happens if a process has work to do and is busy doing it? When a process is busy, there will be no information in the wait event interface since the process is not waiting.

When we look at the wait event information extracted from an Oracle instance, we see detailed information about how many times and how much time was spent waiting for specific events to occur. But we do not see anything about the time periods in which the process requested use of the CPU. This means that the wait event interface is not able to provide information about the following:

- CPU time consumed performing logical reads of blocks that are already in the buffer cache
- CPU time spent parsing SQL statements
- CPU time spent spinning while waiting for a busy latch

This is important to keep in mind because there is an easy trap to fall into. You could be troubleshooting a very slow SELECT statement and learn from the wait event interface that the process does not have significant wait events when running the query. This could lead you to think that the statement is optimal, and that it just takes a long time to run. In fact, however, the query might be performing huge numbers of logical reads and the number of buffer gets could be reduced by rewriting the query.

When Oracle needs to access a data block and the block is already in the buffer cache, a logical read occurs with no physical read. The process is able to read the block without the occurrence of any wait events. Large amounts of CPU time could be consumed on significant numbers of logical reads, and the wait event interface will have nothing to show for the elapsed time.
Statement parsing and spinning while waiting for a latch to become available are two other examples of activities not accounted for by the wait event interface. An Oracle process uses CPU time while parsing a statement or spinning on a busy latch; since no waiting is involved, the wait event interface does not have anything to report.
Beginning in Oracle 9i, the portion of the Statspack report that lists wait event information also lists CPU usage. This is very helpful information. It allows us to easily compare time spent waiting to time spent processing, so we know where to focus our tuning efforts. However, it should be pointed out that the CPU usage information in this section of the Statspack report does not come from the wait event interface. Instead, Statspack merges data collected from the wait event interface with CPU usage information collected from the v$sysstat dynamic performance view.

A Note About the timed_statistics Parameter

The Oracle kernel is capable of timing many activities including wait events. In Oracle 9i Release 2 timed statistics are collected by default, and some timings are collected in units of microseconds. The timed_statistics instance parameter is used to enable and disable timed statistics collection. When timed_statistics is set to FALSE, all times in the wait event interface will appear as zero. You may enable timed statistics collection at the instance or the session level with the following commands:
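
        ALTER SYSTEM SET timed_statistics = TRUE;   -- instance level
        ALTER SESSION SET timed_statistics = TRUE;  -- session level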

You may enable timed statistics at the instance level on the next and all subsequent instance restarts by adding the following line to the instance parameter file:

        timed_statistics = true

In practice, the overhead of collecting timed statistics is extremely small. In most cases, the benefit you'll get from having timing information at your disposal will outweigh the performance overhead. For several years, many DBAs have been running their production systems with timed statistics enabled at all times. With Oracle 9i Release 2, this is now the default behavior.


Collecting Wait Event Information

The wait event interface consists of four dynamic performance views (also known as "v$ views") and a tracing facility. Any user with the SELECT ANY TABLE system privilege or the SELECT_CATALOG_ROLE role can query the v$ views. Only users who can connect to the database as SYSDBA or execute the DBMS_SUPPORT or DBMS_SYSTEM packages can activate wait event tracing in other database sessions. In a typical environment, DBAs have access to the wait event interface but general users do not.
The v$system_event and v$session_event views provide cumulative wait event information for the instance as a whole and for each process, respectively. The v$session_wait view provides detailed information about the active or most recent wait event for each process. The contents of the v$event_name view, meanwhile, do not change. This view lists all wait events built into the Oracle kernel and the parameters for each one.
In addition to the four dynamic performance views, Oracle provides a tracing facility where extremely detailed wait event information is written to a trace file for later evaluation. In more recent releases of Oracle, a PL/SQL package is provided for activating the wait event tracing facility. In older releases of Oracle, the wait event tracing facility is activated by setting debug event number 10046 (not to be confused with a wait event).

The v$system_event View
The v$system_event view shows one row for each wait event name, along with the total number of times a process has waited for this event, the number of timeouts, the total amount of time waited, and the average wait time. All of these figures are cumulative for all Oracle processes since the instance started. Wait events that have not occurred at least once since instance startup do not appear in this view.
The v$system_event view has the following columns:

        Name                            Null?    Type
        ------------------------------- -------- ----
        EVENT                                    VARCHAR2(64)
        TOTAL_WAITS                              NUMBER
        TOTAL_TIMEOUTS                           NUMBER
        TIME_WAITED                              NUMBER
        AVERAGE_WAIT                             NUMBER
        TIME_WAITED_MICRO                        NUMBER

EVENT is the name of the wait event. You can see a list of all wait event names known to your Oracle kernel with the following query:
        SELECT name FROM v$event_name;

TOTAL_WAITS is the total number of times a process has waited for this event since instance startup. This includes daemon processes like PMON and SMON, in addition to dedicated server and shared server processes. It also includes processes from database sessions that have subsequently ended.

TOTAL_TIMEOUTS is the total number of times a process encountered a timeout while waiting for an event. When a process begins to wait for an event, it specifies a timeout period after which the operating system should wake it up if the event has not yet transpired. For example, when an Oracle process issues an I/O request to read a block from a data file (the db file sequential read wait event), the process sets a timeout of one second. Usually the I/O request will complete in less than one second and no timeout will occur. But if the read should take longer than one second for whatever reason, a timeout will occur and the process will wake up. The process might do some minor housekeeping, but it will likely just begin another timeout period of one second and continue waiting for the same event.

TIME_WAITED and AVERAGE_WAIT show the cumulative and average time spent by processes waiting for this event, in centiseconds. Divide these figures by 100 in order to get the wait time in seconds. These two columns will show as zero if timed statistics are not enabled.

TIME_WAITED_MICRO is the same as TIME_WAITED, except that the time is in microseconds. Divide this figure by 1000000 in order to get the wait time in seconds. Note that this column is not present in Oracle 8i and earlier.

Consider the following query from v$system_event:

        SELECT event, time_waited/100 "TIME_WAITED(sec)"
        FROM v$system_event
        WHERE event IN ('SQL*Net message from client', 'smon timer',
                        'db file sequential read', 'log file parallel write');

        EVENT                           TIME_WAITED(sec)
        ------------------------------- ----------------
        log file parallel write                  1596.92
        db file sequential read                   286.57
        smon timer                            1306738.37
        SQL*Net message from client            165289.89

Since instance startup, processes on this system have waited a total of 286.57 seconds while reading single data file blocks from disk, and over 1,596 seconds (26 minutes) while writing redo to the online redo logs. A huge amount of time has been spent waiting on the smon timer and SQL*Net message from client events, but these are both idle wait events so we don't worry about them. (The SMON process spends a lot of time sleeping between consecutive checks of the system, and many dedicated server processes spend a lot of their time waiting for the application to submit a SQL statement for processing.)
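
A simple way to get an overview of where the instance spends its time is to rank events by total time waited, filtering out the idle events listed earlier. This is a sketch; extend the filter to cover the idle events relevant to your environment:

        SELECT event, time_waited
        FROM v$system_event
        WHERE event NOT IN ('smon timer', 'pmon timer', 'rdbms ipc message',
                            'SQL*Net message from client', 'client message')
        ORDER BY time_waited DESC;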

The v$session_event View

The v$session_event view is a lot like the v$system_event view, except that it shows separate rows of information for each Oracle process. As with v$system_event, event names do not appear in this view if the process has not waited for them at least once. Also, when an Oracle process terminates (as in the case of when a user logs off the database) all of the rows in v$session_event for that process permanently disappear.
The v$session_event view has the following columns:

        Name                            Null?    Type
        ------------------------------- -------- ----
        SID                                      NUMBER
        EVENT                                    VARCHAR2(64)
        TOTAL_WAITS                              NUMBER
        TOTAL_TIMEOUTS                           NUMBER
        TIME_WAITED                              NUMBER
        AVERAGE_WAIT                             NUMBER
        MAX_WAIT                                 NUMBER
        TIME_WAITED_MICRO                        NUMBER

SID indicates the session ID of the process waiting for the event. You can query v$session in order to determine the SID of the session whose wait events you want to investigate. The next five columns in the v$session_event view are the same as in the v$system_event view, except that now they pertain to the one specific process instead of all processes.

MAX_WAIT indicates the maximum amount of time the process had to wait for the event. Like TIME_WAITED and AVERAGE_WAIT, the unit of measure is centiseconds and will display as zero if timed statistics are not enabled.

As in v$system_event, TIME_WAITED_MICRO is the same as TIME_WAITED except that the time is in microseconds. Again, this column is not present in Oracle 8i and earlier.
Consider the following query which displays all wait event information for the current SQL*Plus session:

        SELECT event, total_waits, time_waited_micro, max_wait
        FROM v$session_event
        WHERE sid =
              (SELECT sid FROM v$session
               WHERE audsid = USERENV ('sessionid') );

        EVENT                          TOTAL_WAITS TIME_WAITED_MICRO   MAX_WAIT
        ------------------------------ ----------- ----------------- ----------
        log file sync                            1             19629          2
        db file sequential read                  3             26128          1
        db file scattered read                 678           1154632          8
        SQL*Net message to client               76               352          0
        SQL*Net more data to client              1                63          0
        SQL*Net message from client             75        1118656279      26930
        SQL*Net break/reset to client            4             22281          2

You can see that the Oracle process serving this session has spent 1.180760 seconds waiting for disk I/O. Although there have been 76 instances where the process waited for the networking software to allow it to send a message to the client (the SQL*Plus program), each of these waits was shorter than the 0.01 second resolution of the MAX_WAIT column in v$session_event. (All 76 waits combined added up to only 0.000352 seconds.) Far and away, the Oracle process has spent the vast majority of its time (over 18 minutes) waiting for a SQL statement to be entered at the SQL*Plus prompt.

If you run the query against v$session_event shown above and get back no rows, it is possible that you are encountering Oracle bug number 2429929. In some releases of Oracle 9i, including 9.2.0.1, the SID column in v$session_event actually shows the SID minus one instead of the SID.

The v$event_name View
The v$event_name view shows reference information rather than up-to-the-moment information about instance performance. This view shows one row for each wait event known to the Oracle kernel. Associated with each wait event are up to three parameters: additional pieces of information that provide more detail about a wait situation. The v$event_name view has the following columns:

        Name                            Null?    Type
        ------------------------------- -------- ----
        EVENT#                                   NUMBER
        NAME                                     VARCHAR2(64)
        PARAMETER1                               VARCHAR2(64)
        PARAMETER2                               VARCHAR2(64)
        PARAMETER3                               VARCHAR2(64)

EVENT# indicates a unique number assigned to each wait event name. These numbers frequently change from one release of Oracle to the next, so it is better to reference wait events by their names and not their numbers. Wait event names appear to stay consistent across Oracle releases.

The v$system_event and v$session_event views show cumulative information about past waits in summary form, leaving out parameter information from each individual wait. As we will see in the next sections, wait event parameters come into play in the v$session_wait view and wait event trace files.

The v$session_wait View
The v$session_wait view shows one row for each Oracle process. The row indicates the name of the wait event and additional parameters that provide further information about exactly what the process is waiting for (or information about the most recent wait event if the process is not currently waiting). While the v$system_event and v$session_event views show cumulative wait event information, the v$session_wait view shows information as of the present moment only.
The v$session_wait view has the following columns:

        Name                            Null?    Type
        ------------------------------- -------- ----
        SID                                      NUMBER
        SEQ#                                     NUMBER
        EVENT                                    VARCHAR2(64)
        P1TEXT                                   VARCHAR2(64)
        P1                                       NUMBER
        P1RAW                                    RAW(8)
        P2TEXT                                   VARCHAR2(64)
        P2                                       NUMBER
        P2RAW                                    RAW(8)
        P3TEXT                                   VARCHAR2(64)
        P3                                       NUMBER
        P3RAW                                    RAW(8)
        WAIT_TIME                                NUMBER
        SECONDS_IN_WAIT                          NUMBER
        STATE                                    VARCHAR2(19)

SID indicates the process. SEQ# is a sequentially increasing number that starts at one for each process and increments each time the process begins a new wait.

The STATE column indicates how we should interpret the data in this row of the view. If the value in the STATE column is WAITING, then the process is currently waiting for an event. In this case, we can see information about the event and how long the process has been waiting so far. Otherwise, the process is currently not waiting, but we can see information about the last event that the process waited for.

EVENT is the name of a wait event. P1TEXT is the name of a parameter for the wait event, P1 is the value of the parameter, and P1RAW is the value in binary form. The P2 and P3 columns provide additional parameter information.

When the value in the STATE column is WAITING, the value in the WAIT_TIME column will be zero and SECONDS_IN_WAIT will show the number of seconds the process has been waiting for the event thus far. Note that SECONDS_IN_WAIT shows the time in seconds, not centiseconds or microseconds.

When the value in the STATE column is WAITED KNOWN TIME, WAIT_TIME will show the length of the last wait (in centiseconds) and SECONDS_IN_WAIT will not be relevant. (It appears to be the number of seconds since the last wait began, but this is not clear.) The STATE could also be WAITED UNKNOWN TIME or WAITED SHORT TIME, the latter indicating that the last wait was less than one centisecond in duration.
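
Putting this together, you might poll v$session_wait to see what every session is waiting on at this moment. This is a sketch; add any further idle events you want excluded from the output:

        SELECT sid, event, p1text, p1, p2text, p2, seconds_in_wait
        FROM v$session_wait
        WHERE state = 'WAITING'
        AND event NOT LIKE 'SQL*Net%'
        ORDER BY seconds_in_wait DESC;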

The following query shows the parameters associated with the db file scattered read wait event:

        SELECT *
        FROM v$event_name
        WHERE name = 'db file scattered read';

            EVENT# NAME                   PARAMETER1   PARAMETER2  PARAMETER3
        ---------- ---------------------- ------------ ----------- -----------
               188 db file scattered read file#        block#      blocks

This tells us that when a process is waiting for a multi-block read from disk to complete (as in the case of a full table scan where the data blocks were not already in the buffer cache), we can see the file from which the blocks are being read as well as the starting block and number of blocks.

The following query was run while a session was performing a full table scan:

        SQL> SELECT * FROM v$session_wait WHERE sid = 16;

               SID       SEQ# EVENT
        ---------- ---------- ------------------------------
        P1TEXT                                 P1 P1RAW
        ------------------------------ ---------- --------
        P2TEXT                                 P2 P2RAW
        ------------------------------ ---------- --------
        P3TEXT                                 P3 P3RAW     WAIT_TIME SECONDS_IN_WAIT
        ------------------------------ ---------- -------- ---------- ---------------
        STATE
        -------------------
                16        303 db file scattered read
        file#                                  17 00000011
        block#                               2721 00000AA1
        blocks                                  8 00000008         -1               0
        WAITED SHORT TIME

You can see that the process was not waiting at the moment this query was run, but its last wait had been for an I/O request to read eight blocks from file 17 starting at block 2721. That I/O request had completed in less than 0.01 second. (Note the -1 in the WAIT_TIME column when the STATE is WAITED SHORT TIME.) Why did Oracle choose to read eight blocks at a time? Because the db_file_multiblock_read_count instance parameter was set to eight.


Tracing Wait Event Activity (Debug Event 10046)

DBAs should be familiar with the SQL trace facility built into Oracle. By using the commands below, you can enable SQL trace for your session or another session:
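
        ALTER SESSION SET sql_trace = TRUE;        -- your own session

        EXECUTE dbms_system.set_sql_trace_in_session(<sid>, <serial#>, TRUE);
                         -- another session; get <sid> and <serial#> from v$session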

When SQL trace is enabled for a session, the Oracle process writes detailed trace information (including timing data if timed statistics are enabled) to a trace file in a directory specified by the user_dump_dest instance parameter. These trace files are plain text files and human readable, but rather tedious and repetitive. You can optionally run trace files through a processor such as TKPROF instead of looking at them directly.

Oracle has the ability to direct a process to include additional information in the trace file, including wait event information. In earlier releases of Oracle, we activated the wait event tracing facility by setting debug event 10046. Debug event 10046 still works in Oracle 9i, but recent releases of Oracle include a built-in PL/SQL package that provides a friendlier means of activating wait event tracing.

Setting debug events allows a DBA to instruct an Oracle instance to take on a special, atypical behavior. Debug events can be used, for example, to cause Oracle to write a system level dump file whenever an ORA-600 error occurs, or to skip over corrupted blocks in a table when performing a full table scan. Most debug events should never be set unless you are directed to do so by Oracle Support. Some debug events can put your database at risk.

Debug event 10046 affects the amount of information written to trace files. It is a very safe debug event, and one of the few that you are allowed to set without special permission from Oracle Support. Debug event 10046 can be set to the following values:

Debug Event Setting                          Effect
10046 trace name context forever, level 1    Enables ordinary SQL trace
10046 trace name context forever, level 4    Enables SQL trace with bind variable values included in trace file
10046 trace name context forever, level 8    Enables SQL trace with wait event information included in trace file
10046 trace name context forever, level 12   Equivalent of level 4 and level 8 together
10046 trace name context off                 Turns off tracing

You can set the 10046 debug event to trace your session and collect wait event information in the trace file with either of the following commands:
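
        ALTER SESSION SET events '10046 trace name context forever, level 8';

        EXECUTE dbms_support.start_trace(waits => TRUE, binds => FALSE);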

You can set the 10046 debug event to trace another session on the database with any of the following commands:
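
(Substitute the target session's SID and serial number, which you can look up in v$session.)

        EXECUTE dbms_system.set_ev(<sid>, <serial#>, 10046, 8, '');

        EXECUTE dbms_support.start_trace_in_session(<sid>, <serial#>,
                                                    waits => TRUE, binds => FALSE);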

Calling DBMS_SUPPORT as shown here is equivalent to activating debug event 10046 at level 8. You may include optional additional parameters in the procedure call to activate level 4 or 12. The DBMS_SUPPORT package is not installed in the database by default. You need to run the dbmssupp.sql script found in the rdbms/admin directory as SYS before you can call DBMS_SUPPORT. This script was left out of many releases of Oracle 8i by mistake, but it is available separately as a patch.

It should also be pointed out that the SET_EV procedure in the DBMS_SYSTEM package is not officially supported by Oracle Corporation; they prefer that you use DBMS_SUPPORT instead. The reason is that DBMS_SYSTEM.SET_EV allows you to set any debug event, including the potentially dangerous ones mentioned above, in any session.

Unlike timed statistics, tracing consumes a significant amount of system resources. Therefore it is important to use the tracing facility sparingly. Trace only the sessions you need to trace, and only for as long as you need. Turn off tracing as soon as it is no longer needed with any of the following commands:
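
        ALTER SESSION SET events '10046 trace name context off';

        EXECUTE dbms_support.stop_trace;

        EXECUTE dbms_support.stop_trace_in_session(<sid>, <serial#>);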

When you set debug event 10046 to a level of 8 or 12 (or call START_TRACE or START_TRACE_IN_SESSION in DBMS_SUPPORT), the Oracle process will write a line into the trace file every time it finishes waiting for an event. The line in the trace file will contain almost the same information that would have appeared in the v$session_wait view, but perhaps in a slightly less friendly format. You can also see in the trace file which cursor (and therefore which SQL statement) the wait event was associated with.

Here is an excerpt from a trace file generated by setting debug event 10046 to level 12 on an Oracle 8i database:

        =====================
PARSING IN CURSOR #1 len=80 dep=0 uid=502 oct=3 lid=502
tim=2293771931 hv=2293373707 ad='511dca20'
SELECT /*+ FULL */ SUM (LENGTH(notes))
FROM customer_calls
WHERE status = :x
END OF STMT
PARSE #1:c=0,e=0,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=0,tim=2293771931
BINDS #1:
bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0
size=24 offset=0
bfp=09717724 bln=22 avl=02 flg=05
value=43
EXEC #1:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=4,tim=2293771931
WAIT #1: nam='SQL*Net message to client' ela= 0 p1=675562835 p2=1 p3=0
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=931 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=939 p3=8
WAIT #1: nam='db file sequential read' ela= 0 p1=17 p2=947 p3=1
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1657 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1665 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=1673 p3=8
WAIT #1: nam='db file scattered read' ela= 0 p1=17 p2=1681 p3=8
WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=1761 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1769 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1777 p3=8
WAIT #1: nam='db file scattered read' ela= 0 p1=17 p2=1785 p3=8
WAIT #1: nam='db file scattered read' ela= 2 p1=17 p2=1841 p3=8
WAIT #1: nam='db file scattered read' ela= 1 p1=17 p2=1849 p3=8

The "ela=" figures indicate the duration of the wait. In trace files generated by Oracle 8i and earlier, the elapsed time is shown in centiseconds. Beginning in Oracle 9i, the elapsed time is shown in microseconds.

You can see that when the session executed the query, there was a wait shorter than one centisecond for a message to be sent to the client, followed by a bunch of waits for I/O requests against file 17. Most of the I/O requests were multi-block reads, reading eight blocks at a time while performing a full table scan of the CUSTOMER_CALLS table.

You can probably imagine how large and overwhelming trace files can get when anything but a trivial application is traced. This is why Oracle provides the TKPROF utility. TKPROF takes a trace file as input and generates a nicely formatted, easy-to-read report. Unfortunately, the TKPROF that comes with Oracle 8i and earlier releases ignores the wait event information written in the trace file. To analyze the wait event information collected by a trace in an Oracle 8i or earlier environment, you will need to either pore through the trace file manually or write your own trace file parser and formatter. (We've heard of people writing Perl scripts to do this, but we have never come across them.)
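If you do need to roll your own, a minimal sketch of such a parser might total the elapsed wait time per event from the WAIT lines shown above. This sketch is in Python rather than Perl, and the function name is our own invention, not an Oracle-supplied tool:

```python
import re
from collections import defaultdict

# Matches trace lines such as:
# WAIT #1: nam='db file scattered read' ela= 3 p1=17 p2=923 p3=8
WAIT_LINE = re.compile(r"WAIT #\d+: nam='([^']+)' ela=\s*(\d+)")

def summarize_waits(lines, unit=1.0 / 100):
    """Total elapsed wait time (in seconds) and wait count per event name.

    unit converts the raw ela= figure to seconds: 1/100 for the
    centisecond figures in Oracle 8i and earlier trace files,
    1/1000000 for the microsecond figures written by Oracle 9i.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        m = WAIT_LINE.match(line.strip())
        if m:
            totals[m.group(1)] += int(m.group(2)) * unit
            counts[m.group(1)] += 1
    return dict(totals), dict(counts)
```

You would run this over an open trace file, for example `summarize_waits(open('ora_1234.trc'))`, and print the resulting dictionaries sorted by total time.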

The TKPROF utility that comes with Oracle 9i is capable of reporting wait event information summarized by distinct statement. By default Oracle 9i TKPROF ignores wait event information in the trace file as earlier releases did. However, if the "waits=yes" command line argument is provided, a separate table of wait event statistics will appear in the report for each distinct statement. Here is a sample excerpt from a TKPROF report showing wait event information:

        ********************************************************************************
SELECT A.customer_id, A.customer_name, COUNT (*) purchases,
MAX (B.transaction_date) last_purchase_date
FROM customers A, purchase_history B
WHERE B.customer_id = A.customer_id
GROUP BY A.customer_id, A.customer_name
ORDER BY A.customer_id

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.06       0.12          0         27          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch     1461     21.41      25.80       7801       5905          5      21893
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total     1463     21.47      25.93       7801       5932          5      21893

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 17

   Rows  Row Source Operation
-------  ---------------------------------------------------
  21893  SORT GROUP BY
1525494   HASH JOIN
  31212    TABLE ACCESS FULL CUSTOMERS
1525494    INDEX FAST FULL SCAN PURCHASE_HISTORY_PK (object id 7824)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL*Net message to client 1461 0.00 0.00
db file sequential read 4 0.00 0.01
db file scattered read 750 0.07 2.25
direct path write 271 0.83 2.79
direct path read 254 0.00 0.01
SQL*Net message from client 1461 1.28 22.40
********************************************************************************

You can see that the query was parsed and executed once, and there were 1461 fetches to read 21893 rows. The Oracle process used 21.47 CPU seconds and 25.93 seconds of elapsed time to parse, execute, and fetch the query. The elapsed time was greater than the CPU time because the Oracle process had to wait on several wait events. The listing at the bottom of the report shows the process waited on network roundtrips to the client and reads and writes to disk.

All of these waits seem to make sense: There was one network roundtrip to the client for each fetch call, and it took SQL*Plus a long time (over 22 seconds) to display the 21,893 rows of data on the client display. The db file sequential reads and db file scattered reads resulted from the table access and index fast full scan. The direct path writes and reads resulted from the hash join and sort operations. If you add the CPU time of 21.47 seconds and the wait times (5.06 seconds, not including the SQL*Net message from client waits), you come up with an elapsed time of 26.53 seconds. However, the report shows the total elapsed time as 25.93 seconds. This is a good example of the round-off error that is not unusual when thousands of very short time intervals are added together.

In some situations you may not be able to identify exactly which session to trace. This happens frequently when your database is supporting a web application that has frequent short connections, or an application server tier that maintains a pool of database connections all logged on as the same Oracle user. You may have a specific query you want to collect wait event information for, but you may not know which session will be executing the query. One way to deal with this situation is to trace all of the sessions being started by your application for a brief period. This can be done easily with the following SQL*Plus script:

        SET HEADING OFF FEEDBACK OFF
SPOOL traceall.sql
SELECT 'EXECUTE SYS.dbms_system.set_ev (' || TO_CHAR (sid) ||
', ' || TO_CHAR (serial#) || ', 10046, 8, '''')'
FROM v$session
WHERE username = 'WEB_USER';
SPOOL OFF
SET HEADING ON FEEDBACK ON
@traceall.sql

This script starts wait event tracing on all sessions connected to the database as the WEB_USER user. To stop tracing, simply change the 8 after the 10046 to a 0 and run the script again. Before running this on a production system, consider how many sessions the script will trace and how that will affect server load.
After you've traced a number of sessions, you can scan the trace files for occurrences of a specific query with a command like the following:

        grep -n 1234567890 *.trc

Replace 1234567890 with the hash value or address of the statement you are interested in. (You can get these values by querying v$sql.) The output of the above grep command will tell you which trace files the query appears in and on which lines. You can then go to that point in the trace files and get detailed wait event information.
Trace files are easiest to work with when you use a dedicated server connection to the database. If you connect using Oracle's shared server architecture, different SQL statements from one session may be executed by different server processes, and each server process writes to its own trace file. Thus the trace information for a single session can be spread across several trace files.


Event Wait Analysis with Statspack

With Statspack, you can take a snapshot of Oracle wait events every hour and plot changes in wait behavior over time. You can also set thresholds and report only on wait events that exceed your predefined threshold. Here is a script commonly used for exception reporting of wait events.

-- prompt
-- prompt
-- prompt ***********************************************************
-- prompt Excessive waits on background events
-- prompt ***********************************************************
-- prompt

ttitle 'High waits on background events|Rollup by hour'

column mydate heading 'Yr. Mo Dy Hr' format a13;
column event format a30;
column total_waits heading 'tot waits' format 999,999;
column time_waited heading 'time wait' format 999,999;
column total_timeouts heading 'timeouts' format 9,999;

break on to_char(snap_time,'yyyy-mm-dd') skip 1;

select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
e.event,
e.total_waits - nvl(b.total_waits,0) total_waits,
e.time_waited - nvl(b.time_waited,0) time_waited,
e.total_timeouts - nvl(b.total_timeouts,0) total_timeouts
from
stats$bg_event_summary b,
stats$bg_event_summary e,
stats$snapshot sn
where snap_time > sysdate-&1
and e.event not like '%timer'
and e.event not like '%message%'
and e.event not like '%slave wait%'
and e.snap_id = sn.snap_id
and b.snap_id = e.snap_id-1
and b.event = e.event
and e.total_timeouts > 100
and (e.total_waits - b.total_waits > 100
or
e.time_waited - b.time_waited > 100
);

Here is the output from this script. We get a time-series result showing the days and hours when our thresholds are exceeded. Looking at this listing, we see that every evening between 10:00 p.m. and midnight we experience high waits on the redo logs.

Wed Aug 21                                                            page    1
High waits on background events
Rollup by hour

Yr. Mo Dy Hr EVENT tot waits time wait timeouts
------------- ------------------------------ --------- --------- --------
2002-08-18 22 LGWR wait for redo copy 9,326 1,109 286
2002-08-18 23 LGWR wait for redo copy 8,506 316 8
2002-08-18 23 buffer busy waits 214 21,388 206
2002-08-19 00 LGWR wait for redo copy 498 5 0
2002-08-19 01 LGWR wait for redo copy 497 15 0
2002-08-19 22 LGWR wait for redo copy 9,207 1,433 427
2002-08-19 22 buffer busy waits 529 53,412 515
2002-08-19 23 LGWR wait for redo copy 9,066 367 9
2002-08-19 23 buffer busy waits 250 24,479 236
2002-08-20 00 LGWR wait for redo copy 771 16 0
2002-08-20 22 LGWR wait for redo copy 8,030 2,013 634
2002-08-20 22 buffer busy waits 356 35,583 343
2002-08-20 23 LGWR wait for redo copy 8,021 579 86
2002-08-20 23 buffer busy waits 441 44,677 432
2002-08-21 00 LGWR wait for redo copy 1,013 26 1
2002-08-21 00 rdbms ipc reply 160 30,986 149
2002-08-21 01 LGWR wait for redo copy 541 17 0

Using Queries Against v$system_event

Here is the most commonly used script for displaying system-wide events, called current_waits.sql.

set pages 999
set lines 90

column c1 heading 'Event|Name' format a30
column c2 heading 'Total|Waits' format 999,999,999
column c3 heading 'Seconds|Waiting' format 999,999
column c4 heading 'Total|Timeouts' format 999,999,999
column c5 heading 'Average|Wait|(in secs)' format 99.999

ttitle 'System-wide Wait Analysis|for current wait events'

select
event c1,
total_waits c2,
time_waited / 100 c3,
total_timeouts c4,
average_wait /100 c5
from
sys.v_$system_event
where
event not in (
'dispatcher timer',
'lock element cleanup',
'Null event',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'pmon timer',
'rdbms ipc message',
'slave wait',
'smon timer',
'SQL*Net break/reset to client',
'SQL*Net message from client',
'SQL*Net message to client',
'SQL*Net more data to client',
'virtual circuit status',
'WMON goes to sleep'
)
AND event not like 'DFS%'
and event not like '%done%'
and event not like '%Idle%'
AND event not like 'KXFX%'
order by c2 desc;

Below is the output from this script. As we see, the report shows all cumulative waits since instance startup time.

Tue Aug 20                                                          page    1
System-wide Wait Analysis
for current wait events

Average
Event Total Seconds Total Wait
Name Waits Waiting Timeouts (in secs)
------------------------------ ------------ -------- ------------ ---------
db file sequential read 2,902,850 3,829 0 .001
latch free 2,248,864 496 1,524,235 .000
PX Deq: Table Q Normal 2,080,463 4,469 248 .002
PX Deq Credit: send blkd 1,321,019 52,251 23,032 .040
direct path read 986,951 6,931 0 .007
PX Deq Credit: need buffer 800,970 1,091 84 .001
log file parallel write 415,175 5,078 2 .012
direct path write 321,096 9,342 0 .029
PX Deq: Execution Msg 198,768 56,384 26,020 .284
log file sequential read 118,480 164 0 .001
PX Deq: Execute Reply 92,487 5,628 2,407 .061
log file sync 87,295 1,327 7 .015
db file scattered read 70,112 162 0 .002
enqueue 44,335 1,468 354 .033
PX Deq: Join ACK 42,503 77 0 .002
file open 28,545 8 0 .000
PX Deq: Signal ACK 26,753 253 6,705 .009
log file switch completion 54 18 5 .341
control file parallel write 23,785 340 0 .014
SQL*Net more data from client 19,847 878 0 .044
PX Deq: Parse Reply 17,885 26 0 .001
db file parallel write 17,745 1,293 4 .073
PX Deq: Msg Fragment 15,612 18 8 .001
rdbms ipc reply 11,013 167 66 .015
PX Deq: Table Q qref 7,778 2 0 .000
LGWR wait for redo copy 7,608 2 43 .000
control file sequential read 4,098 1 0 .000
buffer busy waits 2,970 9 0 .003
PX Deq: Table Q Sample 2,053 25 27 .012
library cache pin 1,353 6 0 .004
PX Deq: Table Q Get Keys 530 1 0 .001
local write wait 317 177 163 .560
file identify 315 0 0 .000
refresh controlfile command 214 0 0 .002
PX qref latch 135 105 102 .780
log file single write 80 2 0 .021
imm op 67 0 0 .000
process startup 61 2 0 .036
write complete waits 33 34 33 1.032
library cache load lock 25 6 2 .256
log buffer space 20 6 0 .286
db file single write 6 0 0 .007
row cache lock 5 0 0 .006
db file parallel read 2 0 0 .010
instance state change 2 0 0 .000
reliable message 1 0 0 .010
library cache lock 1 0 0 .020

This gives a list of cumulative wait events, but some events are more important than others; the heavily waited-on events near the top of the report are the ones to watch out for.

Once we have a system-wide picture of wait events, we can drill down into the wait events that are directly associated with individual sessions and background processes.

set pages 999
set lines 90

column c1 heading 'System|ID' format 999
column c2 heading 'Prcs' format a10
column c3 heading 'Event|Name' format a30
column c4 heading 'Total|Waits' format 999,999
column c5 heading 'Time|Waited|(in secs)' format 999,999
column c6 heading 'Total|Timeouts' format 999,999
column c7 heading 'Avg|Wait|secs' format 999
column c8 heading 'Max|Wait|(in secs)' format 999


ttitle 'System-wide Wait Analysis|Detail Analysis|for current wait events'

select
b.sid c1,
decode(b.username,NULL,c.name,b.username) c2,
event c3,
a.total_waits c4,
round((a.time_waited / 100),2) c5,
a.total_timeouts c6,
round((average_wait / 100),2) c7,
round((a.max_wait / 100),2) c8
from
sys.v_$session_event a,
sys.v_$session b,
sys.v_$bgprocess c
where event NOT LIKE 'DFS%'
and event NOT LIKE 'KXFX%'
and a.sid = b.sid
and b.paddr = c.paddr (+)
and event NOT IN
('lock element cleanup',
'pmon timer',
'rdbms ipc message',
'smon timer',
'SQL*Net message from client',
'SQL*Net break/reset to client',
'SQL*Net message to client',
'SQL*Net more data to client',
'dispatcher timer',
'Null event',
'io done',
'parallel query dequeue wait',
'parallel query idle wait - Slaves',
'pipe get',
'PL/SQL lock timer',
'slave wait',
'virtual circuit status',
'WMON goes to sleep')
order by 4 desc
;

Here is the output from this script, showing each process and the time it has spent waiting within Oracle.

Tue Aug 20                                                                      page    1
System-wide Wait Analysis
Detail Analysis
for current wait events

                                                    Time            Avg      Max
System            Event                     Total  Waited   Total   Wait     Wait
    ID Prcs       Name                      Waits (in secs) Timeouts secs (in secs)
------ ---------- ------------------------------ -------- --------- ---- ---- ---------
3 LGWR log file parallel write 415,082 5,076 2 0 1
11 ARC0 log file sequential read 108,834 139 0 0 0
4 CKPT control file parallel write 23,035 327 0 0 1
4 CKPT file open 20,842 4 0 0 0
2 DBW0 db file parallel write 17,684 1,292 4 0 1
4 CKPT direct path read 9,745 106 0 0 6
25 ARC1 log file sequential read 9,603 25 0 0 0
3 LGWR LGWR wait for redo copy 7,608 2 43 0 0
4 CKPT direct path write 3,743 2 0 0 0
5 SMON db file scattered read 1,824 7 0 0 0
4 CKPT control file sequential read 1,355 0 0 0 0
5 SMON db file sequential read 794 2 0 0 0
156 CPA1_OWNER db file sequential read 677 3 0 0 0
3 LGWR control file sequential read 468 0 0 0 0
11 ARC0 control file sequential read 433 0 0 0 0
3 LGWR control file parallel write 271 6 0 0 0
2 DBW0 control file sequential read 253 0 0 0 0
3 LGWR file open 190 0 0 0 0
160 CPA1_USER log file sync 152 2 0 0 0
25 ARC1 control file sequential read 143 0 0 0 0
2 DBW0 file open 112 0 0 0 0
5 SMON file open 111 0 0 0 0
2 DBW0 direct path read 108 0 0 0 0
3 LGWR direct path read 108 2 0 0 0
2 DBW0 file identify 108 0 0 0 0
3 LGWR direct path write 107 0 0 0 0
11 ARC0 control file parallel write 103 2 0 0 0
11 ARC0 file identify 102 0 0 0 0
12 file open 91 0 0 0 0
16 file open 90 1 0 0 0
20 file open 89 1 0 0 0
3 LGWR log file single write 80 2 0 0 0
3 LGWR file identify 80 0 0 0 0
183 CPA1_USER log file sync 73 1 0 0 0
11 ARC0 file open 70 0 0 0 0
3 LGWR imm op 67 0 0 0 0
3 LGWR log file sequential read 43 0 0 0 0
15 file open 38 0 0 0 0
6 RECO db file sequential read 36 0 0 0 0
156 CPA1_OWNER latch free 34 0 21 0 0
160 CPA1_USER latch free 30 0 20 0 0
2 DBW0 latch free 29 1 29 0 0
5 SMON latch free 17 0 11 0 0
125 SYS latch free 14 0 7 0 0
183 CPA1_USER latch free 14 0 9 0 0
3 LGWR latch free 11 0 11 0 0
156 CPA1_OWNER file open 11 0 0 0 0
25 ARC1 control file parallel write 9 0 0 0 0
183 CPA1_USER db file sequential read 9 0 0 0 0
25 ARC1 file identify 9 0 0 0 0
115 CPA1_USER log file sync 8 0 0 0 0
25 ARC1 file open 8 0 0 0 0
4 CKPT file identify 4 0 0 0 0
15 latch free 3 0 3 0 0
2 DBW0 process startup 3 0 0 0 0
160 CPA1_USER db file sequential read 3 0 0 0 0
11 ARC0 enqueue 2 4 0 2 3
132 CPA1_USER log file sync 2 0 0 0 0
125 SYS db file sequential read 2 0 0 0 0
4 CKPT latch free 1 0 1 0 0
12 latch free 1 0 1 0 0
11 ARC0 process startup 1 0 0 0 0
3 LGWR enqueue 1 3 0 3 3
6 RECO file open 1 0 0 0 0
183 CPA1_USER file open 1 0 0 0 0
125 SYS file open 1 0 0 0 0
160 CPA1_USER file open 1 0 0 0 0
166 ANAND file open 1 0 0 0 0
166 ANAND db file sequential read 1 0 0 0 0
125 SYS log file sync 1 0 0 0 0
5 SMON buffer busy waits 1 0 0 0 0
25 ARC1 enqueue 1 0 0 0 0
3 LGWR process startup 1 0 0 0 0
16 latch free 1 0 1 0 0
6 RECO latch free 1 0 1 0 0

From all of the above reports, we see the following areas for improvement:

1 - Event waits for parallel query dequeues. We need to check the default degree of parallelism on database objects and turn off parallelism at the system level.

2 - Event waits for "db file sequential read." This is most likely due to segment header contention on the indexes, but it could also be due to disk contention. We will start by increasing the number of freelists on the indexes. If the waits persist, the offending indexes should be striped across multiple disk spindles.

These are very general wait conditions, but they can sometimes be fixed by changing initialization parameters or object characteristics.


Session Detail Event Waits
Once we have covered the system-wide and background waits, we can find the sessions that are waiting right now. This query is tricky because it must be run at the exact moment of a wait, so some Oracle professionals run it every 60 seconds, sending an e-mail when an important wait occurs:

wait_detail_process.sql script.
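The wait_detail_process.sql script itself is not reproduced here; the following is a minimal sketch of the kind of query such a script might run against v$session_wait. The column choices and the exclusion list are our assumptions, not the original script:

```sql
set pages 999
set lines 132

select w.sid,
       nvl(s.username, s.program) session_name,
       w.event,
       w.wait_time,
       w.state,
       w.p1text, w.p1,
       w.p2text, w.p2,
       w.p3text, w.p3
from   sys.v_$session_wait w,
       sys.v_$session      s
where  w.sid = s.sid
and    w.event not in ('SQL*Net message from client',
                       'pmon timer',
                       'smon timer',
                       'rdbms ipc message')
order by w.sid;
```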


Here is the output from this script, showing two types of wait conditions: a single-block I/O wait and a group of parallel query waits.

               Individual process wait times

SID Name Name Wait Time state P1 text Val Text
---- ---------------- ---------- ----- ----- ---------- ---------- ------------ ----------
P2 P3 P3
Val Text Val
------------ ---------- ------------
75 CPA1_USER db file se .00 .00 WAITING file# 116 block#
quential r
ead
51,253 blocks 1

52 IUD1_READ_ONLY PX Deq Cre .00 .00 WAITING sleeptime/ 268,566,527 passes
dit: send senderid
blkd
284 qref 0

253 IUD1_READ_ONLY PX Deq Cre .00 .00 WAITING sleeptime/ 268,566,527 passes
dit: send senderid
blkd
284 qref 0

151 IUD1_READ_ONLY PX Deq Cre .00 .00 WAITING sleeptime/ 268,566,527 passes
dit: send senderid
blkd
284 qref 0

147 IUD1_READ_ONLY PX Deq Cre .00 .00 WAITING sleeptime/ 268,566,527 passes
dit: send senderid
blkd
284 qref 0


Methods for Using Event Wait Data
Because this is a short article, let's confine our wait discussion to the most onerous wait event, db file sequential read.
In our example from the system-wide reports, we noted that the number one wait in v$system_event was "db file sequential read." This wait event usually indicates index reads. If we look at the detail listing for these wait events, we see that the script returns the file number and block number where Oracle was waiting for the event.

SID Name             Name        Wait  Time state      P1 text             Val Text
---- ---------------- ---------- ----- ----- ---------- ---------- ------------ ----------
P2 P3 P3
Val Text Val
------------ ---------- ------------
75 CPA1_USER db file se .00 .00 WAITING file# 116 block#
quential r
ead
51,253 blocks
1

Here we see that the CPA1_USER session was waiting on access to file number 116, block number 51,253. Given this information, we can drill down into dba_extents and dba_data_files to see the data file and segment that are causing the waits.
Here is a simple script that will display the file name for any file number:

accept filenbr prompt 'Enter the file#: '

select tablespace_name, file_name
from dba_data_files
where file_id = &&filenbr;

When we run the script we can quickly see the name of the file that is experiencing the "db file sequential reads" wait:

Enter the file#: 10

TABLESPACE_NAME
------------------------------
FILE_NAME
--------------------------------------
USERS
C:\ORACLE\ORADATA\DIOGENES\USERS02.DBF

To properly locate the source of the wait, we need more than just the file name. We can use the get_object_by_block_nbr.sql script to display the object that owns the target block:

set pages 999
set verify off
set lines 80

col c1 heading 'Segment|Name' format a40
col c2 heading 'Segment|Type' format a10
col c3 heading 'First|Block|of Segment' format 999,999,999,999
col c4 heading 'Last|Block|of Segment' format 999,999,999,999

accept filenbr prompt 'Enter the file number: '
accept inblock prompt 'Enter the block number: '


select segment_name c1,
segment_type c2,
block_id c3
-- block_id + blocks - 1 c4
from dba_extents
where &filenbr = file_id
and &inblock >= block_id
and &inblock <= block_id + blocks - 1;

Here we see the exact source of the wait event is an index called IDX_EVENTCASE_STATUS_OVERAGE.

Enter the file number: 116
Enter the block number: 51253
First
Segment Segment Block
Name Type of Segment
---------------------------------------- ---------- ----------------
IDX_EVENTCASE_STATUS_OVERAGE INDEX 51,205

So, why are sessions waiting on index access? The most common cause is a freelist shortage. Prior to Oracle's automatic segment space management (bitmap freelists), segment header contention can occur when multiple tasks compete to update the same index simultaneously. The following script checks the number of freelists for this object:

clear columns

col c1 heading 'Table|Name' format a20
col c2 heading 'Table|Freelists' format 99
col c3 heading 'Index|Name' format a20
col c4 heading 'Index|Freelists' format 99

select distinct
t.table_name c1,
t.freelists c2,
i.index_name c3,
i.freelists c4
from dba_tables t, dba_indexes i
where t.table_name = i.table_name
and i.index_name = 'IDX_EVENTCASE_STATUS_OVERAGE';

Of course, there could be other problems, such as heavy I/O contention on the disk that contains the index, but we should always try adding freelists first. If more freelists do not solve the wait issue, the index should be moved to a faster or less busy disk, or striped across multiple disks.

Also, note that because we have the SID for the waiting task, we can use the SID to join into v$session and v$sql to see the SQL that was causing the wait condition.
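This join can be sketched as follows. In Oracle 8i and 9i, v$session carries the address and hash value of the SQL statement the session is currently executing; &waiting_sid is the SID captured from the wait listing:

```sql
select s.sid,
       s.username,
       q.sql_text
from   sys.v_$session s,
       sys.v_$sql     q
where  s.sql_address    = q.address
and    s.sql_hash_value = q.hash_value
and    s.sid = &waiting_sid;
```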



How Wait Event Information Can Be Used to Boost System Performance

In the examples below, we will apply the techniques described above for collecting wait event information, and we will evaluate the data in order to better understand how the system is functioning and determine what we can do to boost system performance. Hopefully these examples will give you insights into researching performance and other types of problems on your systems.

Example #1: Buffer Busy Waits
A magazine publisher had developed a web site that offered web access to many of the articles in the magazine. At times, the website would get completely bogged down and response time would become very poor. Operating system statistics on the database server indicated that during busy times, server idle time would be near zero. We used Statspack to gather database statistics on an hourly basis. Excerpts from one Statspack report are as follows:

              Snap Id      Snap Time     Sessions
              ------- ------------------ --------
Begin Snap:        61 11-Dec-02 13:00:52      145
  End Snap:        71 11-Dec-02 14:00:26      145
   Elapsed:              59.57 (mins)

Cache Sizes
~~~~~~~~~~~
db_block_buffers: 38400 log_buffer: 1048576
db_block_size: 8192 shared_pool_size: 80000000

Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 6,390.41 5,731.32
Logical reads: 33,962.40 30,459.63
Block changes: 44.72 40.11
Physical reads: 536.77 481.41
Physical writes: 163.02 146.21
User calls: 397.75 356.73
Parses: 237.64 213.13
Hard parses: 23.84 21.38
Sorts: 248.38 222.77
Logons: 0.41 0.36
Executes: 213.52 191.50
Transactions: 1.11

% Blocks changed per Read: 0.13 Recursive Call %: 86.07
Rollback per transaction %: 0.23 Rows per Sort: 30.73

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 98.39 Redo NoWait %: 99.97
Buffer Hit %: 98.42 In-memory Sort %: 100.00

Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
buffer busy waits 1,962,372 1,278,649 50.03
db file sequential read 1,336,870 1,050,878 41.12
db file scattered read 47,717 49,326 1.93
direct path write 8,070 40,574 1.59
latch free 38,220 31,012 1.21

Buffer Pool Statistics for DB:
-> Pools D: default pool, K: keep pool, R: recycle pool
Free Write Buffer
Buffer Consistent Physical Physical Buffer Complete Busy
P Gets Gets Reads Writes Waits Waits Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D 12,292 49,863,696 470 3,530 0 0 32
K 1,509,385 20,691,972 1,509,366 11 0 0 1,962,117
R 5 13 0 0 0 0 0

We noticed that the two dominant wait events were buffer busy waits and db file sequential read. These two wait events accounted for over 23,000 seconds of wait time during the one hour sample period. We decided to drill down to learn more about these wait events. Looking at v$event_name, we see that the parameters for the buffer busy waits event are as follows:

SELECT * FROM v$event_name WHERE name = 'buffer busy waits';

EVENT# NAME PARAMETER1 PARAMETER2 PARAMETER3
---------- -------------------- --------------- --------------- ---------------
75 buffer busy waits file# block# id

For this wait event, the parameter information from v$event_name points you in the right direction but does not necessarily give you the full story. The first and second parameters tell you the file and block number of the data block in the database that the session was trying to access when the wait event occurred. The third parameter is a numeric reason code indicating why the buffer in the buffer cache was busy and the process had to wait. Metalink bulletin 34405.1 explains what the different reason codes mean.
The following query lists all sessions waiting on a given wait event at the instant the query is run:

SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
FROM v$session_wait
WHERE event = 'buffer busy waits'
ORDER BY sid;

Because only a couple of sessions (if any) are likely to be encountering this wait at any one moment, we usually need to run this query several times in quick succession. On one of those runs, we got the following output:

         SID EVENT             STATE SECONDS_IN_WAIT WAIT_TIME     P1     P2     P3
---- ----------------- ----- --------------- --------- ------ ------ ------
12 buffer busy waits WAITE 1 0 30 62157 130
31 buffer busy waits WAITE 1 0 30 23558 130

We ran the following query for each session wait found, supplying the P1 value as the file number and the P2 value as the block number:

SELECT owner, segment_name, segment_type 
FROM dba_extents
WHERE file_id = &absolute_file_number
AND &block_number BETWEEN block_id AND block_id + blocks -1;

Using this query we found that session 12 above was waiting on a busy buffer in the SAMPLES table in the PRODMGR schema:

        Enter value for absolute_file_number: 30
Enter value for block_number: 62157

OWNER SEGMENT_NAME SEGMENT_TYPE
----------------- ------------------- ------------
PRODMGR SAMPLES TABLE

In order to figure out what reason code 130 means, we look at a reference table listed in Metalink bulletin 34405.1. For convenience this table appears below. We can see that the buffer is busy because the block is being read by another session.

Reason Code (Id)
<=8.0.6  >=8.1.6 (P3)  Reason
-------  ------------  -----------------------------------------------------
      0             0  A block is being read.
   1003           100  We want to NEW the block but the block is currently
                       being read by another session (most likely for undo).
   1007           200  We want to NEW the block but someone else is using
                       the current copy so we have to wait for them to
                       finish.
   1010           230  Trying to get a buffer in CR/CRX mode, but a
                       modification has started on the buffer that has not
                       yet been completed.
   1012             -  A modification is happening on a SCUR or XCUR
                       buffer, but has not yet completed.
   1012           231  CR/CRX scan found the CURRENT block, but a
                       modification has started on the buffer that has not
                       yet been completed.
   1013           130  Block is being read by another session and no other
                       suitable block image was found, so we wait until the
                       read is completed. This may also occur after a
                       buffer cache assumed deadlock: the kernel cannot get
                       a buffer in a certain amount of time, assumes a
                       deadlock, and therefore reads the CR version of the
                       block.
   1014           110  We want the CURRENT block either shared or exclusive
                       but the block is being read into cache by another
                       session, so we have to wait until their read() is
                       completed.
   1014           120  We want to get the block in current mode but someone
                       else is currently reading it into the cache. Wait
                       for them to complete the read. This occurs during
                       buffer lookup.
   1016           210  The session wants the block in SCUR or XCUR mode. If
                       this is a buffer exchange or the session is in
                       discrete TX mode, the session waits for the first
                       time and the second time escalates the block as a
                       deadlock and so does not show up as waiting very
                       long. In this case the statistic "exchange
                       deadlocks" is incremented and we yield the CPU for
                       the "buffer deadlock" wait event.
   1016           220  During buffer lookup for a CURRENT copy of a buffer
                       we have found the buffer but someone holds it in an
                       incompatible mode so we have to wait.
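For interactive diagnosis, the two-step drill-down shown above (v$session_wait, then dba_extents) can be collapsed into a single query. This is a convenience sketch we are adding, not part of the original procedure; note that the join against dba_extents can be expensive on databases with many extents, so it is best run only while actively troubleshooting:

```sql
-- Map in-flight buffer busy waits directly to the segments involved.
-- Caution: the dba_extents lookup can be slow on large databases.
SELECT w.sid, e.owner, e.segment_name, e.segment_type, w.p3 reason_code
FROM   v$session_wait w, dba_extents e
WHERE  w.event = 'buffer busy waits'
AND    e.file_id = w.p1
AND    w.p2 BETWEEN e.block_id AND e.block_id + e.blocks - 1
ORDER BY w.sid;
```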

Next we looked for sessions waiting on the db file sequential read event, that is, sessions waiting on index lookups or table accesses by ROWID:

SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
FROM v$session_wait
WHERE event = 'db file sequential read'
ORDER BY sid;

Following the same approach as used with the buffer busy waits, we then drilled down to find which segments in the database sessions were accessing with the single block I/Os. In this case it was again the SAMPLES table.

From this we concluded that the buffer busy waits appeared related to the single block disk reads: buffer block reads were waiting for other buffer block reads to complete, and those in turn were waiting for sequential reads from disk to complete. Put another way, session A was waiting on a single block disk I/O to bring a block into the buffer cache. Meanwhile, session B wanted to read the same block. Since the block was not yet in the buffer cache and another session was already reading it from disk, session B waited on a buffer busy waits event while session A waited on the db file sequential read event.

When we looked at the SAMPLES table further, we found that it contained a LONG column and there was a fair amount of row chaining; some rows were larger than the database block size. Many queries against multiple tables were doing nested loops joins with the SAMPLES table. This was causing lots of single block reads from the SAMPLES table.

The values stored in the LONG column of the SAMPLES table were quite big. One of the many weaknesses of the LONG data type is that data is stored inline in the table segment regardless of size. This led to a very large SAMPLES table segment even though the table did not have a large number of rows. While most queries against the SAMPLES table were not fetching the LONG column, Oracle still had to bring the LONG data into the buffer cache because it was stored in the same blocks as the other columns. This led to a lot more physical I/O and churn in the buffer cache than should have been necessary.

One way to reduce buffer cache churn and physical I/O would have been to replace the LONG column with a CLOB column. CLOB data can be stored in a separate segment from the rest of the table data. Thus the SAMPLES table segment would be a lot smaller and more likely to fit entirely in the buffer cache. This would reduce the activity rate in the buffer cache and the amount of physical I/O.

Unfortunately, changing the database schema and the application code to replace the LONG data type with CLOB was not possible in a quick enough timeframe to address the immediate performance problem. So we had to develop a quick fix that could be deployed temporarily until the data type could be changed. First we enlarged the KEEP buffer pool, reducing physical reads as all of the SAMPLES table could stay in the buffer cache. Next, we created an index on all of the columns in the SAMPLES table except the LONG column. This allowed Oracle to satisfy most queries against the SAMPLES table without having to read the excessively large table segment, which greatly reduced the amount of data accessed during each query.
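The two quick fixes might have looked something like the following sketch. The index column names are hypothetical, since the actual column list of the SAMPLES table was not given, and the enlarged KEEP pool itself must be configured with the buffer_pool_keep initialization parameter:

```sql
-- Hypothetical sketch of the short-term fix; the index column names
-- are invented for illustration.

-- Cache the SAMPLES table in the (enlarged) KEEP buffer pool.
ALTER TABLE prodmgr.samples STORAGE (BUFFER_POOL KEEP);

-- Index every column except the LONG column so that most queries can
-- be satisfied without reading the oversized table segment.
CREATE INDEX prodmgr.samples_all_cols_ix
ON prodmgr.samples (sample_id, sample_date, sample_status, sample_value);
```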

These two easy and quick changes virtually eliminated the buffer busy waits and reduced db file sequential read wait events as total physical reads dropped 90%. Response time for the most common queries involving the SAMPLES table dramatically improved, and overall web site response times returned to an acceptable level. Excerpts from a Statspack report after the two changes were made are as follows:

            Snap Id     Snap Time      Sessions
            ------- ------------------ --------
Begin Snap:    1192 20-Dec-02 13:00:49      102
  End Snap:    1202 20-Dec-02 14:00:18      102
   Elapsed:           59.48 (mins)

Cache Sizes
~~~~~~~~~~~
  db_block_buffers:      38400    log_buffer:         1048576
     db_block_size:       8192    shared_pool_size:  80000000

Load Profile
~~~~~~~~~~~~                     Per Second    Per Transaction
                                ------------   ---------------
         Redo size:                 2,496.24          3,453.13
     Logical reads:                30,270.86         41,874.69
     Block changes:                    17.08             23.62
    Physical reads:                    36.90             51.04
   Physical writes:                    79.62            110.14
        User calls:                   313.37            433.50
            Parses:                   173.91            240.57
       Hard parses:                    17.04             23.57
             Sorts:                   169.33            234.25
            Logons:                     0.36              0.50
          Executes:                   149.11            206.27
      Transactions:                     0.72

  % Blocks changed per Read:    0.06    Recursive Call %:  84.55
 Rollback per transaction %:    0.31    Rows per Sort:     62.86

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Buffer Nowait %:  100.00    Redo NoWait %:     99.98
      Buffer Hit %:   99.88    In-memory Sort %:  99.89

Top 5 Wait Events
~~~~~~~~~~~~~~~~~                                             Wait   % Total
Event                                            Waits   Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
direct path write                                   6,467       13,545   30.61
log file sync                                       4,914        7,493   16.93
library cache pin                                   1,175        6,090   13.76
direct path read                                    5,488        3,428    7.75
latch free                                         14,528        2,931    6.62

Buffer Pool Statistics for DB
-> Pools   D: default pool,  K: keep pool,  R: recycle pool
                                                    Free    Write   Buffer
       Buffer   Consistent    Physical   Physical  Buffer Complete     Busy
P        Gets         Gets       Reads     Writes   Waits    Waits    Waits
- ----------- ------------- ----------- ---------- ------- -------- ----------
D      10,178    39,561,072         330      1,122       0        0        10
K           1    13,075,944           1          4       0        0         0
R           0             0           0          0       0        0         0

As you can see, with a similar amount of querying activity (roughly 300-400 user calls and 30,000-33,000 logical reads per second) we have reduced total wait event time to about 442 seconds during the one hour sample period. (Direct path writes totaled 13545 centiseconds and represented 30.61% of total waits; 135.45/0.3061 = 442 seconds of total wait time.) This is quite a difference from the 23,000 seconds of wait time during the sample period we were seeing before the KEEP pool was enlarged and the extra index was added to the SAMPLES table.

Over time, application changes were made so that even fewer queries against the SAMPLES table required retrieving the LONG column. Eventually, the LONG column was replaced with a CLOB.

Example #2: More Buffer Busy Waits, Plus Latch Contention for Good Measure
The magazine publisher's application demonstrated buffer busy waits where multiple Oracle processes want to read the same data block from disk at the same time. The first Oracle process would issue the I/O request and wait on a db file sequential read event, while subsequent Oracle processes would wait on a buffer busy waits event until the data block had made its way into the buffer cache. Now we will look at an example where multiple Oracle processes try to write to different copies of the same data block in the buffer cache at the same time. Since Oracle has row-level locking, multiple users can update different rows in the same table at the same time without locking each other out. Multiple users can even update different rows in the same data block concurrently. But, as we will see in this example, contention and buffer busy waits can sometimes occur.

A genetic research company used an Oracle database to store research data. Periodically, they would load large volumes of raw data into the database. Then they would run a processing application on multiple workstations. Each workstation would fetch raw data from the database, process it, and put the processed data back in the database. The company had planned to add more workstations to speed up the processing, but found that adding more workstations actually seemed to reduce the throughput. The Oracle database was being used merely as a repository, we were told. All Oracle had to do was fetch the raw data and then store the processed data. The actual processing was being performed on the workstations outside of the realm of Oracle. So it would seem that the Oracle database should not be the bottleneck.

We found about 100 active database sessions, each initiated by a separate workstation. We modified the application code running on workstation 30 to activate debug event 10046 for wait event tracing. We did this by inserting the following statement into the code just after connecting to the database:

        ALTER SESSION SET events '10046 trace name context forever, level 8';

We also modified the application to exit after iterating through its main loop 500 times. We ran the modified application on the workstation and retrieved the trace file that Oracle had created in the directory specified by the user_dump_dest parameter. The application ran for less than one minute before exiting. Since this was an Oracle 9i database, we were able to run TKPROF to generate a report including wait event statistics with the following command:

        tkprof prodgen_ora_16466.trc report_16466.prf waits=yes

An interesting excerpt from the TKPROF report was as follows:

UPDATE processing_stations
SET    status = 'ACTIVE',
       status_date = SYSDATE,
       data_set_id_being_processed = :b1
WHERE  station_id = 30

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.01       0.00          0          0          0          0
Execute    500      0.23      10.14          0       3616       1010        500
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total      501      0.24      10.14          0       3616       1010        500

Optimizer goal: CHOOSE
Parsing user id: 40

Rows     Row Source Operation
-------  ---------------------------------------------------
    500  UPDATE
    500   INDEX UNIQUE SCAN PROCESSING_STATIONS_PK (object id 9372)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  buffer busy waits                               26        0.71          7.87
  latch free                                      17        0.57          2.08
  log file switch completion                       3        0.09          0.20

The UPDATE statement here seemed quite trivial: changing three column values on one row, given a primary key value. Oracle only needed to do about nine logical reads per execution, and no physical reads were required at all. Yet it still took Oracle over 10 seconds to execute the statement 500 times. If we had not specified the "waits=yes" TKPROF option, we might have concluded that the database server was CPU-starved. But a quick look at the wait event portion of the TKPROF report shows that most of the 10 seconds of elapsed execution time was spent waiting on busy buffers and latches.

Some of the buffer busy waits listed in the trace file were as follows:

WAIT #2: nam='buffer busy waits' ela= 527727 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 498765 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 137611 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 124165 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 5237 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 264050 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 270177 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 330912 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 156317 p1=18 p2=10 p3=220
WAIT #2: nam='buffer busy waits' ela= 710696 p1=18 p2=10 p3=220

Since this trace file was generated by an Oracle 9i database, the elapsed times are shown in units of microseconds. The wait shown at the bottom of the list above had an elapsed time of 710,696 microseconds, or approximately 0.71 seconds. This was the longest buffer busy wait for this statement, as indicated by the "Max. Wait" figure in the TKPROF report.

From the previous example, recall that the parameters for the buffer busy waits event indicate file number, block number, and reason code. All of the buffer busy waits listed above involve block 10 in data file 18, with a reason code of 220. By querying the data dictionary, we found that block 10 in data file 18 was allocated to the PROCESSING_STATIONS table, the table being updated by the statement that encountered the buffer busy waits. According to Metalink bulletin 34405.1, reason code 220 indicates that the buffer is busy because another session is using a copy of the buffer in an incompatible mode.

From the TKPROF report we can see that the application updated the PROCESSING_STATIONS table 500 times during the period traced. Each update affected one row. While performing the updates, there were 26 times that the process had to wait on a buffer busy waits event. From looking at the trace file we know that all of the buffer busy waits occurred because data block 10 in data file 18 was already in the buffer cache, but in an incompatible mode. These 26 waits totaled 7.87 seconds.

A query against the data dictionary revealed the following statistics for the PROCESSING_STATIONS table:

SELECT SYSDATE - last_analyzed, blocks, avg_row_len, avg_space,
       chain_cnt, num_rows
FROM user_tables
WHERE table_name = 'PROCESSING_STATIONS';

SYSDATE-LAST_ANALYZED     BLOCKS AVG_ROW_LEN  AVG_SPACE  CHAIN_CNT   NUM_ROWS
--------------------- ---------- ----------- ---------- ---------- ----------
          2.132118056          1          62       1686          0        100

We can see that optimizer statistics were gathered recently and the table contains 100 very short rows, all stored in one data block. If the application session that we traced updated the PROCESSING_STATIONS table 500 times in less than a minute, we might suspect that the database sessions from the other 99 workstations are doing roughly the same thing. This adds up to over 50,000 updates to the same data block per minute. Since each session is updating a different row in the table, there is no waiting on row-level locks (which would show up in the wait event interface as enqueue waits). However, only one process at a time may write to a buffer in the buffer cache. Therefore, a lot of time is being wasted contending for this one very hot buffer.

All of this raises the question, "Why is the application updating a row in the PROCESSING_STATIONS table 500 times in less than one minute?" It turns out that the application uses this table to coordinate efforts between the workstations. There is one row in the table for each workstation. The application running on each workstation updates its row in PROCESSING_STATIONS to indicate which data set it is processing so that other workstations will not attempt to process the same data set. Also, each workstation periodically updates the status date so that if a workstation crashes, it will be easy to tell when it crashed and what data it was processing during the crash.

Processing of one data set could take anywhere from a fraction of a second to twenty minutes. The application code was written to not only update the PROCESSING_STATIONS table when a new data set was processed, but also to frequently update the timestamp during the processing of one data set. All stakeholders agreed that 500 updates per minute might be excessive. If the application code could be modified to update the table only once per second, then writes to the table would reduce by roughly 90% and contention for the hot buffer in the buffer cache would likely go away.

However, application changes needed to be tested thoroughly before deployment. A short-term fix was desired until the long-term fix of the application could be implemented. Luckily, a temporary fix was available that was simple to implement.

We rebuilt the PROCESSING_STATIONS table with the PCTFREE storage parameter set to 99 and PCTUSED set to 1. This caused Oracle to leave 99% of the available space in each data block empty for future row expansion. This change caused the 100 rows in the table to be spread over 100 data blocks instead of all rows being neatly packed into one data block. A query against the data dictionary confirmed this:

SELECT SYSDATE - last_analyzed, blocks, avg_row_len, avg_space,
       chain_cnt, num_rows
FROM user_tables
WHERE table_name = 'PROCESSING_STATIONS';

SYSDATE-LAST_ANALYZED     BLOCKS AVG_ROW_LEN  AVG_SPACE  CHAIN_CNT   NUM_ROWS
--------------------- ---------- ----------- ---------- ---------- ----------
           .130868056        100          62       8014          0        100
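The rebuild itself might have been done along these lines (a sketch; ALTER TABLE ... MOVE changes ROWIDs, so the primary key index must be rebuilt afterward and statistics regathered):

```sql
-- Spread the 100 short rows over 100 blocks by leaving 99% of each
-- block free for future row expansion.
ALTER TABLE processing_stations MOVE PCTFREE 99 PCTUSED 1;

-- MOVE invalidates dependent indexes; rebuild them.
ALTER INDEX processing_stations_pk REBUILD;

-- Refresh the optimizer statistics for the relocated segment.
ANALYZE TABLE processing_stations COMPUTE STATISTICS;
```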

Tracing another 500 iterations of the main loop in the application on workstation 30 after rebuilding the PROCESSING_STATIONS table yielded the following TKPROF output:

UPDATE processing_stations
SET    status = 'ACTIVE',
       status_date = SYSDATE,
       data_set_id_being_processed = :b1
WHERE  station_id = 30

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute    500      0.20       2.22          0        500       1009        500
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total      501      0.20       2.22          0        500       1009        500

Optimizer goal: CHOOSE
Parsing user id: 40

Rows     Row Source Operation
-------  ---------------------------------------------------
    500  UPDATE
    500   INDEX UNIQUE SCAN PROCESSING_STATIONS_PK (object id 9379)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch free                                       2        0.35          0.61

The application is still updating the PROCESSING_STATIONS table very rapidly. But now each row in the table is stored in a separate data block, so each workstation is working with its own hot buffer in the buffer cache. The sessions are no longer competing with each other for the opportunity to write to the same buffer. Waits on the buffer busy waits event have been eliminated. As a side effect, logical reads have been reduced to one third of what they were originally.

The original TKPROF reports also indicated waits on the latch free wait event, indicating latch contention. A query against v$event_name shows that the parameters to the latch free wait event are as follows:

SELECT * FROM v$event_name WHERE name = 'latch free';

    EVENT# NAME                 PARAMETER1      PARAMETER2      PARAMETER3
---------- -------------------- --------------- --------------- ---------------
         3 latch free           address         number          tries

Some of the latch free wait events in the trace file generated before the PROCESSING_STATIONS table was rebuilt were as follows:

WAIT #2: nam='latch free' ela= 47004 p1=15113593728 p2=97 p3=0
WAIT #2: nam='latch free' ela= 14629 p1=15113593728 p2=97 p3=1
WAIT #2: nam='latch free' ela= 20652 p1=15113593728 p2=97 p3=2
WAIT #2: nam='latch free' ela= 37737 p1=15113593728 p2=97 p3=3

A query against v$latch or v$latchname will give us the name of latch number 97:

SELECT name
FROM v$latch
WHERE latch# = 97;

NAME
-------------------------
cache buffers chains

So we can see that the four wait events listed from the trace file above amount to a wait of approximately 0.12 seconds for the cache buffers chains latch. (Remember that in Oracle 9i trace files, elapsed times are listed in microseconds while in trace files from Oracle 8i and earlier the elapsed times are in centiseconds.) This is not unexpected, since we had 100 sessions contending for the right to modify one buffer in the buffer cache.

Comparing the TKPROF report before rebuilding the PROCESSING_STATIONS table with after, we see that rebuilding the table reduced the latch contention but did not eliminate it. It is likely that the long-term solution of modifying the application to reduce the update frequency will eliminate or further reduce the latch contention.

Example #3: Log File Waits
An application performed occasional loads into a small data warehouse. Unfortunately, the loads were taking a much longer time on the production system than they had in a test environment. To find out what was happening on the production database during a data load, we ran the following script that displays the wait event activity for the entire instance over a 30 second interval. The list of events in the NOT IN clause simply filters out the idle events, making the output a little easier to read:

DROP TABLE previous_events;

CREATE TABLE previous_events
AS
SELECT SYSDATE timestamp, v$system_event.*
FROM v$system_event;

EXECUTE dbms_lock.sleep (30);

SELECT A.event,
A.total_waits - NVL (B.total_waits, 0) total_waits,
A.time_waited - NVL (B.time_waited, 0) time_waited
FROM v$system_event A, previous_events B
WHERE B.event (+) = A.event
AND A.event NOT IN
(
'client message', 'dispatcher timer', 'gcs for action',
'gcs remote message', 'ges remote message', 'i/o slave wait',
'jobq slave wait', 'lock manager wait for remote message',
'null event', 'Null event', 'parallel query dequeue', 'pipe get',
'PL/SQL lock timer', 'pmon timer', 'PX Deq Credit: need buffer',
'PX Deq Credit: send blkd', 'PX Deq: Execute Reply',
'PX Deq: Execution Msg', 'PX Deq: Signal ACK',
'PX Deq: Table Q Normal', 'PX Deque Wait', 'PX Idle Wait',
'queue messages', 'rdbms ipc message', 'slave wait',
'smon timer', 'SQL*Net message to client',
'SQL*Net message from client', 'SQL*Net more data from client',
'virtual circuit status', 'wakeup time manager'
)
ORDER BY time_waited;

Running this script during a data load yielded the following output:

EVENT                               TOTAL_WAITS TIME_WAITED
----------------------------------- ----------- -----------
LGWR wait for redo copy                       0           0
SQL*Net break/reset to client                 2           0
SQL*Net more data to client                   0           0
buffer busy waits                             0           0
db file scattered read                        6           0
direct path read                             27           0
direct path write                            34           0
direct path write (lob)                       0           0
file identify                                 6           0
file open                                    93           0
library cache pin                             0           0
local write wait                              0           0
process startup                               0           0
rdbms ipc reply                               0           0
refresh controlfile command                   0           0
single-task message                           0           0
switch logfile command                        0           0
control file sequential read                 61           1
latch free                                    2           1
db file sequential read                       6           7
control file parallel write                  41          31
log file single write                         6         164
db file parallel write                       13         220
enqueue                                       6         486
log buffer space                             24        2007
log file sequential read                     30        2655
log file switch completion                   33        2883
log file parallel write                      19        3561
log file sync                               113       10249

During the 30 second sample interval, Oracle processes spent roughly 215 seconds waiting on events related to the online redo log. The likely culprit in this situation is some combination of slow disks and contention on the disks holding the online redo log files. After a little research, we found that the redo logs had been striped on a RAID array on the faster test system. On the production system, however, the redo logs were all on a single disk that was shared with another very active database. When the online redo logs were moved to separate disks, the data loads ran more rapidly and the wait events pertaining to the redo log decreased significantly.
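Relocating online redo logs to dedicated disks can be done without downtime by adding new log groups and dropping the old ones. The group numbers, size, and file path below are hypothetical:

```sql
-- Hypothetical sketch: add a new group on a dedicated disk, then
-- retire an old group. A group cannot be dropped while it is CURRENT
-- or ACTIVE, so switch logfiles (and wait for a checkpoint) as needed.
ALTER DATABASE ADD LOGFILE GROUP 4 ('/u05/oradata/prod/redo04.log') SIZE 100M;
ALTER SYSTEM SWITCH LOGFILE;
ALTER DATABASE DROP LOGFILE GROUP 1;
```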

Example #4: Direct Path Read and Write Waits
A company experienced delays when querying summarized information about their orders. Querying individual orders was fast, but queries involving totals or rollups seemed to take unacceptably long. We ran our Database Rx monitoring tool at regular intervals to collect performance information from the system while the users were performing their typical tasks. (Database Rx is a proactive monitoring and alert notification tool that we developed for use with our remote database administration clients. For a free demo, see our website at dbrx.dbspecialists.com/guest.) 

Looking at the Top 5 Wait Events section, we see that the leading wait events are direct path write, db file scattered read, and direct path read. These three wait events combined accounted for roughly 99% of the non-idle wait time on the system during the sample interval. From our knowledge of the queries in the application, the db file scattered read (i.e., full table scan) events didn't surprise us. The absence of db file sequential read waits also did not surprise us as the relevant blocks for index lookups and row fetches by ROWID were likely present in the buffer cache.

We figured we might be able to reduce the db file scattered read waits with a larger buffer cache or judicious use of the KEEP pool in the short term and application code review in the longer term. What seemed amiss, however, were the large numbers of direct path read and write wait events. Querying v$session_wait while applications were running yielded the following:

SELECT   sid, event, state, seconds_in_wait, wait_time, p1, p2, p3
FROM v$session_wait
WHERE event = 'direct path write'
ORDER BY sid;

 SID EVENT             STATE SECONDS_IN_WAIT WAIT_TIME     P1     P2     P3
---- ----------------- ----- --------------- --------- ------ ------ ------
  39 direct path write WAITI               0         0    201     65      7
  47 direct path write WAITI               0         0    201   2248      7

Querying v$event_name shows the parameters for the direct path write event:

SELECT * FROM v$event_name WHERE name = 'direct path write';

    EVENT# NAME                 PARAMETER1      PARAMETER2      PARAMETER3
---------- -------------------- --------------- --------------- ---------------
       115 direct path write    file number     first dba       block cnt

We see that the first parameter is supposed to be the file number of the data file containing the blocks being written, but we couldn't find data file 201 using the following query:

SELECT tablespace_name, file_id "AFN", relative_fno "RFN"
FROM dba_data_files
WHERE file_id = 201;

no rows selected

This is because the database is using temp files in the temporary tablespace, so we used the following query:

SELECT tablespace_name, file_id + value "AFN", relative_fno "RFN"
FROM dba_temp_files, v$parameter
WHERE name = 'db_files';

TABLESPACE_NAME                       AFN        RFN
------------------------------ ---------- ----------
TEMP                                  201          1

This indicated that the TEMP tablespace was experiencing direct path write waits, which is not surprising because direct path reads and writes usually occur in connection with temporary segments. Direct path I/O operations were responsible for most of the waits on the system and appeared to be a bottleneck. Upon further examination, we discovered that the sort_area_size parameter was set to the default value of 65536. Since the database server had a fair amount of RAM free and there generally weren't too many sessions connected simultaneously, we increased the sort_area_size to 10 MB, and the direct path waits disappeared because sorts no longer had to be written to disk.
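The change can be tried for a single session before committing to it instance-wide; instance-wide, sort_area_size is typically set in the init.ora and takes effect for sessions started after a restart. A sketch:

```sql
-- Try the larger sort area for one session first. Remember this is a
-- per-session setting, so total memory use grows with session count.
ALTER SESSION SET sort_area_size = 10485760;   -- 10 MB

-- Instance-wide, set it in the init.ora, e.g.:
--   sort_area_size = 10485760
```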

If increasing the sort_area_size had not been an option, we might have looked at the disk storage where the TEMP tablespace temp files were located to see if there was I/O contention with other systems. We might also have looked at application code to see if the amount of sorting and hashing could be reduced.

This example also demonstrates something that might seem odd at first glance. During the Database Rx sample period, there were more physical reads than logical reads. How can this be? Don't all physical reads also require logical reads? Well, this is where direct path reads get their name. As Oracle says in reference note 50415.1 on Metalink, "Direct path reads are generally used by Oracle when reading directly into PGA memory (as opposed to into the buffer cache)." Since direct path reads bypass the buffer cache, they aren't counted as logical reads. This is why some people question the traditional definition of the buffer cache hit ratio: it doesn't account for direct path reads. We won't get into that debate, but be aware that direct path reads can distort your cache hit ratio. In this example, the cache hit ratio would be negative!
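To illustrate, the traditional hit ratio can be computed from v$sysstat as follows. This is the classic formula, not something taken from the monitoring data above; with enough direct path reads, "physical reads" can exceed total logical reads and the result goes negative:

```sql
-- Classic buffer cache hit ratio. Direct path reads are counted in
-- "physical reads" but bypass the buffer cache, so heavy direct path
-- activity can push this figure below zero.
SELECT ROUND(100 * (1 - phy.value / (cur.value + con.value)), 2) hit_ratio
FROM   v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE  cur.name = 'db block gets'
AND    con.name = 'consistent gets'
AND    phy.name = 'physical reads';
```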

Example #5: Database Link Wait Events
Several of the SQL*Net wait events are considered idle events. For example, the "SQL*Net message from client" event indicates that the process is waiting for the application to submit a SQL statement for processing. However, the SQL*Net wait events pertaining to database links are not idle events; they come into play when queries use database links to access remote databases. The "SQL*Net message from dblink" event, for example, indicates that the local process has submitted a request to a remote database and is waiting for a response.

A company had deployed a data import application for receiving batches of customer transactions from its partners. There were five Oracle databases, one per geographic region. Customer transactions were sorted by geographic location, and loaded into the database serving that region. Due to human error, sometimes the same customer transaction would get imported into multiple Oracle databases. They had developed a reconciliation report to address this. The report was simply a query that compared transactions between two databases and listed records found on both. Unfortunately, the report took over 30 minutes to run and the runtime was expected to get much worse as the customer base grew.

Looking at the report code, we found the main query, which appeared to be the one using the lion's share of resources. We logged onto SQL*Plus as an application user in one window and logged onto SQL*Plus again as a DBA in another window. From the DBA session we queried v$session and found the SID corresponding to our application user SQL*Plus session. Then we launched the query from the application user session and monitored v$session_event from the other. Since the wait event interface does not account for CPU usage, we also monitored v$sesstat.

After the report query completed 31 minutes later, we queried v$session_event one last time and the results were as follows:

SELECT   event, total_waits, time_waited, max_wait
FROM v$session_event
WHERE sid = 47
ORDER BY event;

EVENT                          TOTAL_WAITS TIME_WAITED   MAX_WAIT
------------------------------ ----------- ----------- ----------
SQL*Net message from client             32        4435       2432
SQL*Net message from dblink        1525516      104919         31
SQL*Net message to client               33           0          0
SQL*Net message to dblink          1525516         466          9
db file sequential read              27199        8025         28
latch free                              40           5          4
log file sync                            1           2          2


SELECT A.name, B.value
FROM v$statname A, v$sesstat B
WHERE A.statistic# = 12
AND B.statistic# = A.statistic#
AND B.sid = 47;

NAME                                VALUE
------------------------------ ----------
CPU used by this session            67937

We can ignore the SQL*Net message from client wait event because this just indicates time Oracle spent waiting for us to submit the query in SQL*Plus. However, there were over 1.5 million network roundtrips to a database link totaling over 1053 seconds of wait time. (The 1053 seconds includes both network latency and waiting on the remote database server to respond to requests.) There were over 27,000 waits on single block disk I/Os totaling about 80 seconds of wait. There were also latch free and log file sync wait events, but the wait times were insignificant. The query from v$sesstat shows that about 679 seconds of CPU time were used.
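One caveat about the v$sesstat query above: statistic# 12 happens to correspond to "CPU used by this session" in this release, but statistic numbers can change between Oracle versions. Filtering on the statistic name is more portable:

```sql
-- Version-portable form: look the statistic up by name, not number.
SELECT A.name, B.value
FROM   v$statname A, v$sesstat B
WHERE  A.name = 'CPU used by this session'
AND    B.statistic# = A.statistic#
AND    B.sid = 47;
```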
Knowing nothing about the report query or database size, the number of network roundtrips seemed very high and the number of single block disk I/Os seemed like it might be high. Before we could go any further, we really needed to look at the query and the tables involved. The report query was as follows:

        SELECT   customer_id, batch_serial_number, batch_date, load_date,
                 batch_comment, control_total, control_amount
        FROM     customer_transfer_batches A
        WHERE    EXISTS
                 (
                 SELECT 1
                 FROM   customer_transfer_batches@prdwest B
                 WHERE  B.customer_id = A.customer_id
                 AND    B.batch_serial_number = A.batch_serial_number
                 )
        ORDER BY customer_id, batch_serial_number;

Estimated statistics on the CUSTOMER_TRANSFER_BATCHES table showed that it had approximately 1,526,000 rows spanning approximately 21,825 blocks. Correlating the wait event information with the table size, it was clear that Oracle was reading the local table one block at a time and querying the remote table once per local row. The execution plan for the query confirmed this:

        Execution Plan
        ----------------------------------------------------------
           0      SELECT STATEMENT
           1    0   FILTER
           2    1     TABLE ACCESS (BY INDEX ROWID) OF 'CUSTOMER_TRANSFER_BATCHES'
           3    2       INDEX (FULL SCAN) OF 'CUSTOMER_TRANSFER_BATCHES_PK' (UNIQUE)
           4    1   REMOTE* PRDWEST

           4 SERIAL_FROM_REMOTE            SELECT "CUSTOMER_ID","BATCH_SERIAL_NUMBER"
                                           FROM "CUSTOMER_TRANSFER_BATCHES" "B"
                                           WHERE "BATCH_SERIAL_NUMBER"=:1
                                           AND "CUSTOMER_ID"=:2

Performing 1.5 million network roundtrips and making the remote database execute 1.5 million queries did not seem to be the most efficient way to go. It would probably be more efficient for the local database to pull all of the required data from the remote database in one shot and then join it to the local table. We probably could have added an optimizer hint to the query to get Oracle to rewrite the subquery as a join, but instead we chose to rewrite the query ourselves in a way that would make it easier for Oracle's query optimizer to do the right thing. The rewritten query was as follows:
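For reference, one hint-based approach (not the route we took, and whether the optimizer honors it depends on version and settings) would have been to request a hash semi-join inside the subquery, which makes Oracle fetch the remote key columns in bulk rather than probing the remote table once per local row; something along these lines:

        SELECT   customer_id, batch_serial_number, batch_date, load_date,
                 batch_comment, control_total, control_amount
        FROM     customer_transfer_batches A
        WHERE    EXISTS
                 (
                 SELECT /*+ HASH_SJ */ 1
                 FROM   customer_transfer_batches@prdwest B
                 WHERE  B.customer_id = A.customer_id
                 AND    B.batch_serial_number = A.batch_serial_number
                 )
        ORDER BY customer_id, batch_serial_number;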

        SELECT   customer_id, batch_serial_number, batch_date, load_date,
                 batch_comment, control_total, control_amount
        FROM     customer_transfer_batches
        WHERE    (customer_id, batch_serial_number) IN
                 (
                 SELECT customer_id, batch_serial_number
                 FROM   customer_transfer_batches
                 INTERSECT
                 SELECT customer_id, batch_serial_number
                 FROM   customer_transfer_batches@prdwest
                 )
        ORDER BY customer_id, batch_serial_number;

We ran the revised query and monitored v$session_event and v$sesstat as before, with the following results:

        SELECT   event, total_waits, time_waited, max_wait
        FROM     v$session_event
        WHERE    sid = 49
        ORDER BY event;

        EVENT                          TOTAL_WAITS TIME_WAITED   MAX_WAIT
        ------------------------------ ----------- ----------- ----------
        SQL*Net message from client             46        3680       2481
        SQL*Net message from dblink             24          31         18
        SQL*Net message to client               47           0          0
        SQL*Net message to dblink               24           0          0
        SQL*Net more data from dblink         5978        1337         13
        db file scattered read                3430         675          8
        db file sequential read                182          60          2
        direct path read                       148         233         11
        direct path write                      920        3572         33
        log file sync                            1           2          2


        SELECT   A.name, B.value
        FROM     v$statname A, v$sesstat B
        WHERE    A.statistic# = 12
        AND      B.statistic# = A.statistic#
        AND      B.sid = 49;

        NAME                                VALUE
        ------------------------------ ----------
        CPU used by this session             3227

This time the query completed in 92 seconds. Both elapsed time and CPU usage were reduced by about 95%. The waits on the database link were reduced from about 1053 seconds to under 14 seconds, indicating less network overhead and a lot less demand being placed on the remote database server. Waits on db file sequential read and db file scattered read combined were significantly shorter than the waits on db file sequential read in the original query because large numbers of single block I/Os were replaced with fewer multi-block I/Os.

However, there were about 38 seconds of direct path I/O waits. As we've seen, direct path I/O usually occurs in connection with temporary segments. Oracle had to do quite a bit of sorting to implement the INTERSECT operation added to the rewritten query, so the direct path I/O waits were not unexpected.

The performance gains achieved by the query rewrite were significant, and we were tempted to end the tuning exercise right here. However, the company indicated that this report was critical to their operation and they wanted to make it run even faster if possible. So, our next step was to speed up the sort operation and thereby reduce the direct path I/O waits.

Looking at v$parameter, we saw that sort_area_size was set to 1 MB. A larger sort area would allow Oracle to perform more of the sort in memory and less on disk, thus reducing the direct path I/O. To see the effect of a larger sort area, we repeated the timing exercise, first setting sort_area_size to 100 MB dynamically in the SQL*Plus session where the report query was run. The query completed in 55 seconds this time, and the statistics collected from v$session_event and v$sesstat were as follows:
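The dynamic session-level change amounts to a single ALTER SESSION statement (sort_area_size is specified in bytes):

        ALTER SESSION SET sort_area_size = 104857600;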

        SELECT   event, total_waits, time_waited, max_wait
        FROM     v$session_event
        WHERE    sid = 46
        ORDER BY event;

        EVENT                          TOTAL_WAITS TIME_WAITED   MAX_WAIT
        ------------------------------ ----------- ----------- ----------
        SQL*Net message from client             47         442        287
        SQL*Net message from dblink             25          25         14
        SQL*Net message to client               48           0          0
        SQL*Net message to dblink               25           0          0
        SQL*Net more data from dblink         6050        1378         26
        db file scattered read                3430         945          8
        db file sequential read                191          59          1
        log file sync                            1           3          3


        SELECT   A.name, B.value
        FROM     v$statname A, v$sesstat B
        WHERE    A.statistic# = 12
        AND      B.statistic# = A.statistic#
        AND      B.sid = 46;

        NAME                                VALUE
        ------------------------------ ----------
        CPU used by this session             3296

You can see that increasing the sort area size did not reduce the CPU time, database link wait time, or db file scattered/sequential read times. However, the direct path I/O waits disappeared completely, indicating that all sorting was performed in memory. Of course, it may not be feasible to keep the sort entirely in memory as the CUSTOMER_TRANSFER_BATCHES table grows. However, a larger sort area will still speed up the sort, and thus the overall query, even if the entire sort cannot be performed in memory.

This example demonstrated the practice of iterative tuning: curing one bottleneck often creates a new one, or reveals a performance issue that was masked by the larger problem. Boosting the performance of complex systems is usually an iterative process.

Example #6: Slow Batch Processing
The operations group noticed that its overnight batch processing was taking significantly longer after an additional data feed program was added to the nightly run schedule. The decision was made to add more processors to the database server to speed things up. Unfortunately, this did not do the trick; the programs ran at about the same overall speed as before.

The v$system_event view comes in handy in situations like this for taking a quick look at the system as a whole. By making a copy of the view's contents at two separate points in time, you can get a high-level summary of what processes were waiting for during the time interval. Here is an extremely basic SQL*Plus script that displays the wait event activity for the instance over a 30-second time interval:

        CREATE TABLE previous_events
        AS
        SELECT SYSDATE timestamp, v$system_event.*
        FROM   v$system_event;

        EXECUTE dbms_lock.sleep (30);

        SELECT   A.event, 
                 A.total_waits - NVL (B.total_waits, 0) total_waits,
                 A.time_waited - NVL (B.time_waited, 0) time_waited
        FROM     v$system_event A, previous_events B
        WHERE    B.event (+) = A.event
        ORDER BY A.event;
Of course, you could use Statspack, utlbstat, or any number of other tools out there to capture this information as well. Running the above script while the batch programs were running generated the following output:
        EVENT                          TOTAL_WAITS TIME_WAITED
        ------------------------------ ----------- -----------
        LGWR wait for redo copy                115          41
        buffer busy waits                       53          26
        control file parallel write             45          44
        control file sequential read            25           0
        db file scattered read                 932         107
        db file sequential read              76089        6726
        direct path read                       211          19
        direct path write                      212          15
        enqueue                                 37        5646
        file identify                            2           0
        file open                               37           0
        free buffer waits                       11         711
        latch free                              52          44
        log buffer space                         2           8
        log file parallel write               4388        1047
        log file sequential read               153          91
        log file single write                    2           6
        log file switch completion               2          24
        write complete waits                     6         517
The idle events have been removed for clarity. What stands out here is the enqueue wait event. During the 30 second time interval, processes collectively spent over 56 seconds waiting for enqueues, or locks that you can see in v$lock. The most common types of locks to show up in an enqueue wait are the table-level lock (TM) and the transaction enqueue or row-level lock (TX).

The parameters for the enqueue wait event provide information about the type and location of the lock waited upon, but the information is fairly obscure. The first two bytes of P1RAW are the ASCII codes of the lock type, and the last two bytes are the requested lock mode. P2 and P3 correspond to the ID1 and ID2 columns in the v$lock view.
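For sessions currently stuck on an enqueue wait, the lock type and mode can be decoded from P1 arithmetically; a commonly used form of the query is:

        SELECT chr(bitand(p1, -16777216) / 16777215) ||
               chr(bitand(p1, 16711680) / 65535) lock_type,
               bitand(p1, 65535) lock_mode
        FROM   v$session_wait
        WHERE  event = 'enqueue';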

In order to get a sense of where the locking contention was coming from, I used the oradebug facility in SQL*Plus to activate system event 10046 for one of the batch programs as it was running. Since tracing incurs a lot of overhead, this significantly slowed down batch processing. But this was only done once, and with the greater good of eliminating the contention problems in mind.

To activate event 10046, I looked at v$session and v$process to find the Oracle PID of the dedicated server process running one of the batch programs. Then I ran the following commands in SQL*Plus:

        SQL> oradebug setorapid 13
        Unix process pid: 19751, image: oracle@dbserver.acme.com (TNS V1-V3)
        SQL> oradebug session_event 10046 trace name context forever, level 8
        Statement processed.
        SQL>
This caused the dedicated server process to begin writing a trace file including wait event information. An excerpt from the trace file is as follows:
        EXEC #5:c=0,e=0,p=3,cr=2,cu=1,mis=0,r=1,dep=1,og=4,tim=2313020980
        XCTEND rlbk=0, rd_only=0
        WAIT #1: nam='write complete waits' ela= 11 p1=3 p2=2 p3=0
        WAIT #4: nam='db file sequential read' ela= 4 p1=10 p2=12815 p3=1
        WAIT #4: nam='db file sequential read' ela= 1 p1=10 p2=12865 p3=1
        WAIT #4: nam='db file sequential read' ela= 5 p1=3 p2=858 p3=1
        =====================
        PARSING IN CURSOR #4 len=65 dep=1 uid=502 oct=6 lid=502 
        tim=2313021001 hv=417623354 ad='55855844'
        UPDATE CUSTOMER_CALLS SET ATTR_3 = :b1 WHERE CUSTOMER_CALL_ID=:b2
        END OF STMT
        EXEC #4:c=1,e=10,p=3,cr=2,cu=3,mis=0,r=1,dep=1,og=4,tim=2313021001
        WAIT #5: nam='db file sequential read' ela= 0 p1=10 p2=5789 p3=1
        WAIT #5: nam='enqueue' ela= 307 p1=1415053318 p2=196705 p3=6744
        WAIT #5: nam='enqueue' ela= 307 p1=1415053318 p2=196705 p3=6744
        WAIT #5: nam='enqueue' ela= 53 p1=1415053318 p2=196705 p3=6744
        WAIT #5: nam='db file sequential read' ela= 0 p1=10 p2=586 p3=1
        WAIT #5: nam='db file sequential read' ela= 1 p1=3 p2=858 p3=1
        EXEC #5:c=0,e=668,p=3,cr=5,cu=3,mis=0,r=1,dep=1,og=4,tim=2313021669
We can see the enqueue wait events pretty clearly. Using the magical decoder ring on the P1 values, we can see that Oracle was waiting on TX locks requested in mode 6 (exclusive mode). (1415053318 is 54580006 in hexadecimal. The 54 corresponds to a "T", the 58 to an "X", and 0006 is exclusive mode. Look up the v$lock view in the Server Reference Guide for a list of lock types and modes.)

What is not clear from the excerpt above is what table and row the process was trying to lock when it had to wait. If we were looking at the database while the lock contention was happening, we could look at the v$locked_object view, or the ROW_WAIT_OBJ# and ROW_WAIT_ROW# columns of v$session. But since all I had to work with was the contents of the trace file, I could see that the waits occurred while processing cursor number five. Searching backwards through the trace file I found the exact SQL statement that corresponded to cursor number five at the time of the lock contention.
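Had we been watching live, a join along these lines (substituting the waiting session's SID) would have named the locked object directly:

        SELECT S.sid, O.owner, O.object_name, S.row_wait_row#
        FROM   v$session S, dba_objects O
        WHERE  S.row_wait_obj# = O.object_id
        AND    S.sid = :waiting_sid;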

This turned out to be enough information to diagnose the contention problem and develop a workaround. By using autonomous transactions, we were able to modify the batch programs that were contending with each other so that they held locks on critical path rows for extremely short periods of time. Adjusting the schedules of the programs in the nightly run schedule also helped.
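To illustrate the autonomous transaction technique (the procedure name and columns below are made up for this sketch; the real fix lived in the batch code), the idea is to commit the critical-path update immediately, releasing its lock without ending the caller's main transaction:

        CREATE OR REPLACE PROCEDURE set_call_attr
            (p_call_id IN NUMBER, p_value IN VARCHAR2)
        IS
            PRAGMA AUTONOMOUS_TRANSACTION;
        BEGIN
            UPDATE customer_calls
            SET    attr_3 = p_value
            WHERE  customer_call_id = p_call_id;
            COMMIT;  -- releases the TX lock right away
        END;
        /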

Example #7: A Slow Client/Server Application
A client/server application was taking several seconds to bring up a certain screen. The users complained constantly, and the developers were confounded because the delay was occurring before the user had any chance to kick off a query. The only thing happening in the form on startup was some fetching of basic reference data. All of the SQL had been tuned and was known to work very quickly. Being frequently accessed reference data, the data blocks were likely to be in the buffer cache--eliminating slow I/O as a possible bottleneck.

Looking at v$session_event for a user who had just logged on to the application but not yet begun any activities showed almost 20,000 waits on the SQL*Net message from client event, totaling over 300 seconds of waiting. This is an idle event, which we would typically ignore, but something seemed wrong and warranted further investigation.

A process waits on the SQL*Net message from client event when it has completed a request from a user and now needs to wait for the next request. You typically see high wait times on this event in front-end applications that spend a lot of time waiting for a user to submit a query or initiate some other action that requires database activity. So the 300 seconds of wait time here did not seem unusual, but the high number of waits did not seem right.

The application code was modified in a test environment so that it would disable timed statistics for the session after all initialization was complete. This caused the time information in v$session_event to reflect wait times during the form startup phase only; the time spent waiting for a user to initiate an action would be excluded because timed statistics would be disabled at that point.
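Disabling timed statistics at the session level is a one-line statement, which the application issued through its existing database connection once initialization was complete:

        ALTER SESSION SET timed_statistics = FALSE;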

Running the modified application yielded the following information in v$session_event:

        EVENT                          TOTAL_WAITS TIME_WAITED
        ------------------------------ ----------- -----------
        SQL*Net message to client            18520           6
        SQL*Net message from client          18519        1064
A fetch of a few pertinent session statistics from v$sesstat showed the following:
        NAME                                VALUE
        ------------------------------ ----------
        session logical reads                9295
        CPU used by this session               82
        physical reads                          0
The data suggested that when the application started up, it was spending over 10 seconds performing more than 18,000 network roundtrips. The database server was using a little under one second of CPU time to perform 9,295 logical reads of data blocks that were already in the buffer cache.

A closer inspection of the reference data being retrieved by the application revealed 9,245 rows of reference data. The application code was fetching the data from cursors one row at a time. This resulted in two network roundtrips and one logical read for each row fetched.

The application was modified to use Oracle's array processing interface in order to fetch 100 rows at a time. This sped up the application dramatically by reducing network roundtrips. An additional benefit was that CPU usage on the database server was reduced because only about one logical read per data block was required, instead of one logical read per row retrieved.
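In PL/SQL terms (the actual application used Oracle's array interface from its own client language, and reference_data is a stand-in table name), fetching 100 rows per roundtrip looks something like this:

        DECLARE
            TYPE row_tab IS TABLE OF reference_data%ROWTYPE;
            l_rows  row_tab;
            CURSOR c IS SELECT * FROM reference_data;
        BEGIN
            OPEN c;
            LOOP
                FETCH c BULK COLLECT INTO l_rows LIMIT 100;  -- one roundtrip per 100 rows
                EXIT WHEN l_rows.COUNT = 0;
                NULL;  -- process l_rows here
            END LOOP;
            CLOSE c;
        END;
        /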

When fetching 100 rows at a time, the waits and session statistics looked like this:

        EVENT                          TOTAL_WAITS TIME_WAITED
        ------------------------------ ----------- -----------
        SQL*Net message to client              200           0
        SQL*Net message from client            199          28

        NAME                                VALUE
        ------------------------------ ----------
        session logical reads                 135
        CPU used by this session                3
        physical reads                          0
By fetching rows up to 100 at a time, we were able to dramatically improve response time by reducing network roundtrips--and reduce CPU usage on the database server at the same time.

Conclusion
Because wait events touch all areas of Oracle server operation, the wait event interface can provide insights into just about any kind of performance problem you might experience. Other metrics, such as cache hit ratios, can give you valuable information, but it is the wait event interface that ties everything together. In the examples discussed here, wait event data directed us to a swift diagnosis of each problem.

The wait event interface, with its four dynamic performance views and tracing facility, enables you to see how an Oracle process is spending its time when it is not using the CPU. While wait event information may not give you everything you might need to cure every performance problem, it does give you access to a wealth of information that can guide you on the path to better system performance.

The basic concepts behind the wait event interface have not changed since Oracle 7. But each release of Oracle brings a few more enhancements that make the wait event interface easier to use or more informative. In Oracle 9i, TKPROF has been enhanced to parse wait event information from trace files and format it in a useful report. Also, some timing information is tracked in microseconds for more resolution, and the DBMS_SUPPORT package has been included for easier trace activation. However, these are just bells and whistles.