The SQL Trace Facility and TKPROF

Introduction

The SQL trace facility and TKPROF are two basic performance diagnostic tools that can help you monitor and tune applications running against Oracle. For best results, use these tools together rather than using EXPLAIN PLAN alone.

About the SQL Trace Facility

The SQL trace facility provides performance information on individual SQL statements. It generates the following statistics for each statement:

     parse, execute, and fetch counts
     CPU and elapsed times
     physical reads and logical reads
     number of rows processed
     misses on the library cache
     username under which each parse occurred
     each commit and rollback

You can enable the SQL trace facility for a session or for an instance. When the SQL trace facility is enabled, performance statistics for all SQL statements executed in a user session or in an instance are placed into a trace file.
The additional overhead of running the SQL trace facility against an application with performance problems is normally insignificant, compared with the inherent overhead caused by the application's inefficiency.

How to Use the SQL Trace Facility and TKPROF
Follow these steps to use the SQL trace facility and TKPROF.

   1. Set initialization parameters for trace file management.
        TIMED_STATISTICS must be TRUE
        MAX_DUMP_FILE_SIZE
        USER_DUMP_DEST
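
      For example, these can be set as follows (a sketch; the directory path is illustrative, and on older releases they can also be set in the init.ora file):
        ALTER SESSION SET timed_statistics = TRUE;         -- collect CPU and elapsed timings
        ALTER SYSTEM SET max_dump_file_size = unlimited;   -- do not truncate the trace file
        -- USER_DUMP_DEST controls where trace files are written, for example:
        -- ALTER SYSTEM SET user_dump_dest = '/u01/app/oracle/admin/ORCL/udump';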

   2. Enable the SQL trace facility for the desired session and run your application. This step produces a trace file containing statistics for the SQL statements issued by the application.
        ALTER SESSION SET SQL_TRACE = TRUE;
To disable the SQL trace facility for your session, enter:
        ALTER SESSION SET SQL_TRACE = FALSE;

To trace another session:
First, grant the user privileges to use the DBMS_SYSTEM package:
    GRANT EXECUTE ON dbms_system TO <user/role>;

Using the Unix top command or the V$SESSION view, you can find the SID and SERIAL# of the session to monitor:
select sid, serial#, substr(terminal,1,10) terminal, substr(program,1,40) program
  from v$session
 where terminal in ('OCR04','IMGSVR');
-- or: where username = 'FRAUDGUARD';

Then, you activate the trace. There are several ways to do this, including the following three.

Using SET_SQL_TRACE_IN_SESSION (not recommended because it does not catch extended SQL tracing)
Do not use the DBMS_SYSTEM procedure SET_SQL_TRACE_IN_SESSION if you want to activate extended SQL tracing. This procedure does not allow you to specify that you want waits and binds data in your trace file.

TURN ON:     exec sys.dbms_system.set_sql_trace_in_session(SID, SERIAL#, true);
/* code to be traced executes during this time window */
TURN OFF:    exec sys.dbms_system.set_sql_trace_in_session(SID, SERIAL#, false);


Using DBMS_SUPPORT (this option lets you collect waits and binds data):
Run the files dbmssupp.sql and prvtsupp.plb from $ORACLE_HOME/rdbms/admin as SYS and grant execute on DBMS_SUPPORT to system;
TURN ON:     exec dbms_support.start_trace_in_session(SID, SERIAL, waits => true, binds => true)
/* code to be traced executes during this time window */
TURN OFF:    exec dbms_support.stop_trace_in_session(SID, SERIAL)

Using DBMS_MONITOR (introduced in Oracle 10g)
This package solves many of the complicated diagnostic data collection problems resulting from connection pooling and multithreading. Instead of specifying an Oracle Database session to be traced, you can specify a service, module, or action to trace:

Example using SID and SERIAL:
TURN ON:     exec dbms_monitor.session_trace_enable(SID, SERIAL, waits => true, binds => true)
/* code to be traced executes during this time window */
TURN OFF:    exec dbms_monitor.session_trace_disable(SID, SERIAL)

Example using Service or Module:
TURN ON:  exec dbms_monitor.serv_mod_act_trace_enable(
  service_name  => 'APPS1',
  module_name   => 'PAYROLL',
  action_name   => 'PYUGEN',
  waits         => true,
  binds         => true,
  instance_name => null)
/* code to be traced executes during this time window */
TURN OFF: exec dbms_monitor.serv_mod_act_trace_disable(
  service_name  => 'APPS1',
  module_name   => 'PAYROLL',
  action_name  => 'PYUGEN')

With the DBMS_MONITOR package, Oracle offers a fully supported means of activating and deactivating diagnostic data collection for a specific business action (or actions) you want to trace.
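
DBMS_MONITOR can also key tracing off the client identifier, which is useful when sessions are shared through a connection pool (a sketch; 'JSMITH' is an illustrative identifier that the application would set via DBMS_SESSION.SET_IDENTIFIER):

TURN ON:     exec dbms_monitor.client_id_trace_enable(client_id => 'JSMITH', waits => true, binds => true)
/* code to be traced executes during this time window */
TURN OFF:    exec dbms_monitor.client_id_trace_disable(client_id => 'JSMITH')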

   3. Run TKPROF to translate the trace file created in step 2 into a readable output file. This step can optionally create a SQL script that stores the statistics in the database.
    tkprof trace_file  file_name sys=no explain=user/passwd@connection

   4. Interpret the output file created in step 3.


Tabular Statistics in TKPROF
TKPROF lists the statistics for a SQL statement returned by the SQL trace facility in rows and columns. Each row corresponds to one of three steps of SQL statement processing. Statistics are identified by the value of the CALL column:

CALL: one of the three types of work required to process a SQL statement:

COUNT: Number of times a statement was parsed, executed, or fetched. Generally, a high number for anything is cause for investigation (simply because nothing is 'free' in Oracle); in particular, a high number of parse calls can be a performance killer in multi-user systems.
CPU: Total CPU time in seconds for all parse, execute, or fetch calls. This value is zero (0) if TIMED_STATISTICS is not turned on.
ELAPSED: The elapsed time represents the time (in seconds) taken to identify and modify rows during statement execution on the server. This is not the same as the time taken to return that information to the client.
DISK: Total number of data blocks physically read (or physical I/O) from the datafiles on disk for all parse, execute, or fetch calls. Generally you want to see blocks being read from the buffer cache rather than disk.
QUERY: Total number of buffers retrieved in consistent mode for all parse, execute, or fetch calls. Usually, buffers are retrieved in consistent mode for queries.  A Consistent mode buffer is one that has been generated to give a consistent read snapshot for a long running transaction. The buffer actually contains this status in its header.
CURRENT: Total number of buffers retrieved in current mode. Buffers are retrieved in current mode for statements such as INSERT, UPDATE, and DELETE. This column is incremented if a buffer found in the buffer cache that is new enough for the current transaction and is in current mode (and it is not a CR buffer). This applies to buffers that have been read in to the cache as well as buffers that already exist in the cache in current mode.
ROWS: Total number of rows processed by the SQL statement. This total does not include rows processed by subqueries of the SQL statement.

For SELECT statements, the number of rows returned appears for the fetch step. For UPDATE, DELETE, and INSERT statements, the number of rows processed appears for the execute step.

First, we need to make sure that these are hard and not soft parses. "Hard" parses are real parses, so we want to minimize those. "Soft" parses occur when we simply look the statement up in the library cache and are not as costly. You can use TKPROF to determine whether there is a problem with hard parsing:

Look in the TKPROF output for the lines displayed after the "Total" PARSE, EXECUTE, FETCH block. In particular, we are looking for the following lines:
    "Misses in library cache during parse: n"
    "Misses in library cache during execute: n"

Misses in library cache during parse: 0 -> indicates that the statement was found in the library cache (soft parse).

The true parsing is listed in 'Misses in library cache during parse'. "Misses in library cache during parse: 1" indicates only 1 hard parse. The "Parse" count in TKPROF includes both hard and soft parses. A soft parse does not involve parsing the statement; it just hashes the statement and compares the hash with those already in the SQL area.
The count of parses to executions should be as close to one as possible. If there are a high number of parses per execution, this could indicate the lack of use of bind variables in SQL statements or poor cursor reuse.
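
As a quick cross-check outside of TKPROF, the shared pool can be queried for statements that are parsed almost as often as they are executed (a sketch against V$SQL):

select parse_calls, executions, substr(sql_text, 1, 60) sql_text
  from v$sql
 where parse_calls > executions
 order by parse_calls desc;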

Hard Parse:
The statement has not previously been executed and must be fully parsed: syntax check, semantic check, permissions check, and allocation of space from the shared pool. Requires both a library cache latch and a shared pool latch.

Soft Parse:
The statement has previously been parsed, so a full parse is not required. If the current user differs from the user who previously parsed the statement, a partial semantic check is still required, as object names may resolve to different objects. Requires only a library cache latch.


Understanding Recursive Calls
Sometimes, in order to execute a SQL statement issued by a user, Oracle must issue additional statements. Such statements are called recursive calls or recursive SQL statements. For example, if you insert a row into a table that does not have enough space to hold that row, then Oracle makes recursive calls to allocate the space dynamically. Recursive calls are also generated when data dictionary information is not available in the data dictionary cache and must be retrieved from disk.
If recursive calls occur while the SQL trace facility is enabled, then TKPROF produces statistics for the recursive SQL statements and marks them clearly as recursive SQL statements in the output file. You can suppress the listing of Oracle internal recursive calls (for example, space management) in the output file by setting the SYS command-line parameter to NO. The statistics for a recursive SQL statement are included in the listing for that statement, not in the listing for the SQL statement that caused the recursive call. So, when you are calculating the total resources required to process a SQL statement, consider the statistics for that statement as well as those for recursive calls caused by that statement.


Library Cache Misses in TKPROF
TKPROF also lists the number of library cache misses resulting from parse and execute steps for each SQL statement. These statistics appear on separate lines following the tabular statistics. If the statement resulted in no library cache misses, then TKPROF does not list the statistic. In "Sample TKPROF Output", the statement resulted in one library cache miss for the parse step and no misses for the execute step.

Optional Parameters for TKPROF

There are a large number of sort options, which can be listed by simply typing 'tkprof' at the command prompt. A useful starting point is the 'fchela' sort option, which orders the output by elapsed time spent fetching (remember that timing information is only available with timed_statistics set to true in the "init.ora" file). The resultant .prf file will contain the most time-consuming SQL statement at the start of the file. Another useful parameter is sys, which can be used to prevent SQL statements run as user SYS from being displayed. This can make the output file much shorter and easier to manage.

EXPLAIN = Uses the EXPLAIN PLAN statement to generate the execution plan of each SQL statement traced. TKPROF also displays the number of rows processed by each step of the execution plan.
SYS = With the value of NO, causes TKPROF to omit recursive SQL statements from the output file. In this way you can ignore internal Oracle statements such as temporary table operations.
SORT = Causes TKPROF to sort the SQL statements in order of the sum of the CPU time spent executing and the CPU time spent fetching rows before writing them to the output file. For greatest efficiency, always use SORT parameters.
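
Putting these parameters together, a typical invocation might look like the following (the trace file name is illustrative):

    tkprof orcl_ora_12345.trc report.prf sys=no sort=fchela explain=scott/tiger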


Deciding Which Statements to Tune
You need to find which SQL statements use the most CPU or disk resource.
If the TIMED_STATISTICS parameter is on, then you can find high CPU activity in the CPU column. If TIMED_STATISTICS is not on, then check the QUERY and CURRENT columns.
With the exception of locking problems and inefficient PL/SQL loops, neither the CPU time nor the elapsed time is necessary to find problem statements. The key is the number of block visits, both query (that is, subject to read consistency) and current (that is, not subject to read consistency). Segment headers and blocks that are going to be updated are acquired in current mode, but all query and subquery processing requests the data in query mode. These are precisely the same measures as the instance statistics CONSISTENT GETS and DB BLOCK GETS.
You can find high disk activity in the disk column.

Reading TKPROF Reports

Every TKPROF report starts with a header that lists the TKPROF version, the date and time the report was generated, the name of the trace file, the sort option used, and a brief definition of the column headings in the report. Every report ends with a series of summary statistics. You can see the heading and summary statistics on the sample TKPROF report shown earlier in this paper.

The main body of the TKPROF report consists of one entry for each distinct SQL statement that was executed by the database server while SQL trace was enabled. There are a few subtleties at play in the previous sentence. If an application queries the customers table 50 times, each time specifying a different customer_id as a literal, then there will be 50 separate entries in the TKPROF report. If, however, the application specifies the customer_id as a bind variable, then there will be only one entry in the report with an indication that the statement was executed 50 times. Furthermore, the report will also include SQL statements initiated by the database server itself in order to perform so-called “recursive operations” such as managing the data dictionary and dictionary cache.

The entries for each SQL statement in the TKPROF report are separated by a row of asterisks. The first part of each entry lists the SQL statement and statistics pertaining to the parsing, execution, and fetching of the SQL statement. Consider the following example:

********************************************************************************
SELECT table_name
FROM user_tables
ORDER BY table_name

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       14      0.59       0.99          0      33633          0         194
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       16      0.60       1.01          0      33633          0         194

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: RSCHRAG [recursive depth: 0]

This may not seem like a useful example because it is simply a query against a dictionary view and does not involve application tables. However, this query actually serves the purpose well from the standpoint of highlighting the elements of a TKPROF report.

Reading across, we see that while SQL trace was enabled, the application called on the database server to parse this statement once. 0.01 CPU seconds over a period of 0.02 elapsed seconds were used on the parse call, although no physical disk I/Os or even any buffer gets were required. (We can infer that all dictionary data required to parse the statement were already in the dictionary cache in the SGA.)

The next line shows that the application called on Oracle to execute the query once, with less than 0.01 seconds of CPU time and elapsed time being used on the execute call. Again, no physical disk I/Os or buffer gets were required. The fact that almost no resources were used on the execute call might seem strange, but it makes perfect sense when you consider that Oracle defers all work on most SELECT statements until the first row is fetched.

The next line indicates that the application performed 14 fetch calls, retrieving a total of 194 rows. The 14 calls used a total of 0.59 CPU seconds and 0.99 seconds of elapsed time. Although no physical disk I/Os were performed, 33,633 buffers were gotten in consistent mode (consistent gets). In other words, there were 33,633 hits in the buffer cache and no misses. I ran this query from SQL*Plus, and we can see here that SQL*Plus uses an array interface to fetch multiple rows on one fetch call. We can also see that, although no disk I/Os were necessary, it took quite a bit of processing to complete this query.

The remaining lines on the first part of the entry for this SQL statement show that there was a miss in the library cache (the SQL statement was not already in the shared pool), the CHOOSE optimizer goal was used to develop the execution plan, and the parsing was performed in the RSCHRAG schema.

Notice the text in square brackets concerning recursive depth. This did not actually appear on the report—I added it for effect. The fact that the report did not mention recursive depth for this statement indicates that it was executed at the top level. In other words, the application issued this statement directly to the database server. When recursion is involved, the TKPROF report will indicate the depth of the recursion next to the parsing user.

There are two primary ways in which recursion occurs. Data dictionary operations can cause recursive SQL operations. When a query references a schema object that is missing from the dictionary cache, a recursive query is executed in order to fetch the object definition into the dictionary cache. For example, a query from a view whose definition is not in the dictionary cache will cause a recursive query against view$ to be parsed in the SYS schema. Also, dynamic space allocations in dictionary-managed tablespaces will cause recursive updates against uet$ and fet$ in the SYS schema.

Use of database triggers and stored procedures can also cause recursion. Suppose an application inserts a row into a table that has a database trigger. When the trigger fires, its statements run at a recursion depth of one. If the trigger invokes a stored procedure, the recursion depth could increase to two. This could continue through any number of levels.

So far we have been looking at the top part of the SQL statement entry in the TKPROF report. The remainder of the entry consists of a row source operation list and optionally an execution plan display. (If the explain keyword was not used when the TKPROF report was generated, then the execution plan display will be omitted.) Consider the following example, which is the rest of the entry shown above:

Rows     Row Source Operation
-------  ---------------------------------------------------
    194  SORT ORDER BY
    194   NESTED LOOPS
    195    NESTED LOOPS OUTER
    195     NESTED LOOPS OUTER
    195      NESTED LOOPS
  11146       TABLE ACCESS BY INDEX ROWID OBJ$
  11146        INDEX RANGE SCAN (object id 34)
  11339       TABLE ACCESS CLUSTER TAB$
  12665        INDEX UNIQUE SCAN (object id 3)
     33      INDEX UNIQUE SCAN (object id 33)
    193     TABLE ACCESS CLUSTER SEG$
    387      INDEX UNIQUE SCAN (object id 9)
    194    TABLE ACCESS CLUSTER TS$
    388     INDEX UNIQUE SCAN (object id 7)


Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   GOAL: CHOOSE
    194   SORT (ORDER BY)
    194    NESTED LOOPS
    195     NESTED LOOPS (OUTER)
    195      NESTED LOOPS (OUTER)
    195       NESTED LOOPS
  11146        TABLE ACCESS (BY INDEX ROWID) OF 'OBJ$'
  11146         INDEX (RANGE SCAN) OF 'I_OBJ2' (UNIQUE)
  11339        TABLE ACCESS (CLUSTER) OF 'TAB$'
  12665         INDEX (UNIQUE SCAN) OF 'I_OBJ#' (NON-UNIQUE)
     33       INDEX (UNIQUE SCAN) OF 'I_OBJ1' (UNIQUE)
    193      TABLE ACCESS (CLUSTER) OF 'SEG$'
    387       INDEX (UNIQUE SCAN) OF 'I_FILE#_BLOCK#' (NON-UNIQUE)
    194     TABLE ACCESS (CLUSTER) OF 'TS$'
    388      INDEX (UNIQUE SCAN) OF 'I_TS#' (NON-UNIQUE)

The row source operation listing looks very much like an execution plan. It is based on data collected from the SQL trace file and can be thought of as a “poor man’s execution plan”. It is close, but not complete.

The execution plan shows the same basic information you could get from the autotrace facility of SQL*Plus or by querying the plan table after an EXPLAIN PLAN statement—with one key difference. The rows column along the left side of the execution plan contains a count of how many rows of data Oracle processed at each step during the execution of the statement. This is not an estimate from the optimizer, but rather actual counts based on the contents of the SQL trace file.

Although the query in this example goes against a dictionary view and is not terribly interesting, you can see that Oracle did a lot of work to get the 194 rows in the result: the range scan of the i_obj2 index returned 11,146 rows, leading to 11,146 accesses of the obj$ table. This led to 12,665 non-unique lookups on the i_obj# index, 11,339 accesses on the tab$ table, and so on.

In situations where it is feasible to actually execute the SQL statement you wish to explain (as opposed to merely parsing it as with the EXPLAIN PLAN statement), I believe TKPROF offers the best execution plan display. GUI tools such as TOAD will give you results with much less effort, but the display you get from TOAD is not 100% complete and in certain situations critical information is missing. (Again, my experience is with the free version!) Meanwhile, simple plan table query scripts like my explain.sql presented earlier in this paper or utlxpls.sql display very incomplete information. TKPROF gives the most relevant detail, and the actual row counts on each operation can be very useful in diagnosing performance problems. Autotrace in SQL*Plus gives you most of the information and is easy to use, so I give it a close second place.


Another Example:
The following listing shows TKPROF output for one SQL statement as it appears in the output file:

SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno;

call   count      cpu    elapsed     disk    query current    rows
---- -------  -------  --------- -------- -------- -------  ------
Parse     11     0.08      0.18        0       0       0         0
Execute   11     0.23      0.66        0       3       6         0
Fetch     35     6.70      6.83      100   12326       2       824
------------------------------------------------------------------
total     57     7.01      7.67      100   12329       8       826
Misses in library cache during parse: 0

If it is acceptable to have 7.01 CPU seconds and to retrieve 824 rows, then you need not look any further at this trace output. In fact, a major use of TKPROF reports in a tuning exercise is to eliminate processes from the detailed tuning phase.

You can also see that 10 unnecessary parse calls were made (because there were 11 parse calls for this one statement) and that array fetch operations were performed. You know this because more rows were fetched than there were fetches performed.


TKPROF Reports: More Than Just Execution Plans

The information displayed in a TKPROF report can be extremely valuable in the application tuning process. Of course the execution plan listing will give you insights into how Oracle executes the SQL statements that make up the application, and ways to potentially improve performance. However, the other elements of the TKPROF report can be helpful as well.

Looking at the repetition of SQL statements and the library cache miss statistics, you can determine if the application is making appropriate use of Oracle’s shared SQL facility. Are bind variables being used, or is every query a unique statement that must be parsed from scratch?

From the counts of parse, execute, and fetch calls, you can see if applications are making appropriate use of Oracle’s APIs. Is the application fetching rows one at a time? Is the application reparsing the same cursor thousands of times instead of holding it open and avoiding subsequent parses? Is the application submitting large numbers of simple SQL statements instead of bulking them into PL/SQL blocks or perhaps using array binds?
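
As an illustration of the last point, a loop of single-row inserts can often be collapsed into one bulk-bound statement (a sketch; the table t is hypothetical):

declare
   type t_ids is table of number index by pls_integer;
   l_ids t_ids;
begin
   for i in 1 .. 1000 loop
      l_ids(i) := i;
   end loop;
   -- one execute call binds and inserts all 1000 rows
   forall i in 1 .. l_ids.count
      insert into t (id) values (l_ids(i));
   commit;
end;
/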

Looking at the CPU and I/O statistics, you can see which statements consume the most system resources. Could some statements be tuned so as to be less CPU intensive or less I/O intensive? Would shaving just a few buffer gets off of a statement’s execution plan have a big impact because the statement gets executed so frequently?

The row counts on the individual operations in an execution plan display can help identify inefficiencies. Are tables being joined in the wrong order, causing large numbers of rows to be joined and eliminated only at the very end? Are large numbers of duplicate rows being fed into sorts for uniqueness when perhaps the duplicates could have been weeded out earlier on?

TKPROF reports may seem long and complicated, but nothing in the report is without purpose. (Well, okay, the row source operation listing sometimes isn’t very useful!) You can learn volumes about how your application interacts with the database server by generating and reading a TKPROF report.

Examples and Discussion

Step 1 - Look at the totals at the end of the tkprof output

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call   count      cpu    elapsed     disk    query current    rows
---- -------  -------  --------- -------- -------- -------  ------
Parse   [A] 7     1.87      4.53      385  [G] 553      22       0
Execute [E] 7     0.03      0.11   [P]  0  [C]   0  [D]  0   [F] 0
Fetch   [E] 6     1.39      4.21   [P]182  [C] 820  [D]  3   [F]20
------------------------------------------------------------------
Misses in library cache during parse: 5
Misses in library cache during execute: 1

      8  user  SQL statements in session.    
     12  internal SQL statements in session.
 [B] 54  SQL statements in session.     
      3  statements EXPLAINed in this session.

1. Compare [A] and [B] to spot over-parsing. In this case we have 7 parses for 54 statements, which is OK.

2. You can use [P], [C] & [D] to determine the hit ratio.

Hit Ratio is logical reads/physical reads:
Logical Reads = Consistent Gets + DB Block Gets
Logical Reads = query + current
Logical Reads = Sum[C] + Sum[D]
Logical Reads = 0+820 + 0+3
Logical Reads = 820 + 3
Logical Reads = 823

Hit Ratio = 1 - (Physical Reads / Logical Reads)
Hit Ratio = 1 - (Sum[P] / Logical Reads)
Hit Ratio = 1 - (182 / 823)
Hit Ratio = 1 - (0.22)
Hit Ratio = 0.78 or 78%

3. We want the number of fetches to be less than the number of rows, as this means less work has been done (array fetching). To see this we can compare [E] and [F].
    [E] = 6 = Number of Fetches
    [F] = 20 = Number of Rows
So we are doing 6 fetches to retrieve 20 rows - not too bad.
If array fetching were configured, the rows could be retrieved with fewer fetches. Remember that an extra fetch is done at the end to check that the end of the result set has been reached.
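In SQL*Plus, for example, the fetch array size is controlled by the ARRAYSIZE setting (100 is an illustrative value):

    SET ARRAYSIZE 100    -- fetch up to 100 rows per fetch call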

4. [G] shows the buffer gets performed while parsing the statements, most of which are reads of the data dictionary.
In this case we have done 553 reads during parse.


STEP 2 - Examine statements using high resource

update ...
where ...

call   count      cpu    elapsed     disk    query current    rows
---- -------  -------  --------- -------- -------- -------  ------
Parse      1        7       122        0        0       0        0
Execute    1       75       461        5  [H] 297  [I]  3   [J]  1
Fetch      0        0         0        0        0       0        0
------------------------------------------------------------------

[H] shows that this query visits 297 blocks to find the rows to update.
[I] shows that only 3 blocks are visited performing the update.
[J] shows that only 1 row is updated.

297 blocks to update 1 row is a lot. Possibly an index is missing?
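
To see which columns of the updated table are already indexed, the data dictionary can be queried (a sketch; MY_TABLE is a placeholder for the table in the UPDATE):

select index_name, column_name, column_position
  from user_ind_columns
 where table_name = 'MY_TABLE'   -- placeholder table name
 order by index_name, column_position;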


STEP 3 - Look for over parsing

SELECT .....

call   count      cpu    elapsed    disk   query  current    rows
---- -------  -------  --------- ------- -------- -------  ------
Parse   [M] 2  [N] 221       329        0       45        0       0
Execute [O] 3  [P]   9        17        0        0        0       0
Fetch       3        6         8        0   [L]  4        0  [K]  1
------------------------------------------------------------------
Misses in library cache during parse: 2 [Q]

[K] shows that the query returned 1 row.
[L] shows that we had to read 4 blocks to get this row back.
This is fine.

[M] shows that we parsed the statement twice, which is not desirable, especially as the CPU usage is high [N] in comparison to the execute figures [O] and [P]. [Q] shows that these parses were hard parses. If [Q] were 1, the statement would have had 1 hard parse followed by a soft parse (which just looks up the already parsed detail in the library cache).

This is not a particularly bad example, since the query has only been executed a few times. However, excessive parsing should be avoided as far as possible by:

o Ensuring that code is shared:
    - use bind variables (see the sketch below)
    - make the shared pool large enough to hold query definitions in memory long enough to be reused.
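
For example, in SQL*Plus the difference looks like this (a sketch using the standard EMP demo table):

-- literal value: each distinct literal is a new statement, hard parsed each time
select ename from emp where empno = 7369;

-- bind variable: one shared cursor, soft parses after the first execution
variable v_empno number
exec :v_empno := 7369
select ename from emp where empno = :v_empno;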


Useful queries

To make a quick assessment of whether poorly tuned SQL may be causing the problem, the following query will help:

select ss.sid, sn.name, ss.value
  from v$sesstat ss, v$statname sn
 where ss.statistic# = sn.statistic#
   and sn.name in ('db block gets', 'consistent gets', 'physical reads', 'table fetch continued row')
   and ss.sid = &SID
 order by 3 desc;


Also, it is important to identify whether the session is waiting or actually doing work. The following query will help assess this.

select substr(event, 1, 30) "Resource", time_waited / 100 "Time waited/used (in seconds)"
  from v$session_event
 where sid = &SID
union
select substr(name, 1, 30) "Resource", value / 100 "Time waited/used (in seconds)"
  from v$statname sn, v$sesstat ss
 where sn.statistic# = ss.statistic#
   and sn.name = 'CPU used by this session'
   and ss.sid = &SID
 order by 2 desc;

If you experience a lot of waits, it may help to query v$session_wait to find the parameters for the wait events. Note, however, that querying v$session_wait takes a while relative to the length of most waits.
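
A sketch of such a query (the p1/p2/p3 columns carry the event-specific parameters):

select event, p1text, p1, p2text, p2, p3text, p3, state
  from v$session_wait
 where sid = &SID;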


Tracing with Triggers
In some cases, you may want to start tracing when a user connects to the database and stop only when the user disconnects. On such occasions, we can make use of database event-level triggers to enable and disable tracing automatically.

create or replace trigger trace_trigger_scott_on
after logon on database
when (user = 'SCOTT')
declare
   stmt  varchar2(200);
   hname varchar2(64);
   uname varchar2(30);
begin
   select sys_context('USERENV','HOST'), sys_context('USERENV','SESSION_USER')
     into hname, uname
     from dual;
   -- tag the trace file with the host and user so it is easy to locate
   stmt := 'alter session set tracefile_identifier = ''' || hname || '_' || uname || '''';
   execute immediate stmt;
   execute immediate 'alter session set sql_trace = true';
end;
/

create or replace trigger trace_trigger_scott_off
before logoff on database
when (user = 'SCOTT')
begin
   execute immediate 'alter session set sql_trace=false';
end;
/


TRACE SESSION 10046

Note the bit about ‘extended SQL tracing’. It’s called ‘extended’ because it’s like regular SQL tracing (which you can turn on by issuing the command ALTER SESSION SET sql_trace = true), but it additionally captures diagnostic information on what are called ‘wait events’: behind-the-scenes events that delay your SQL statement, such as locks on tables, contention for disk blocks, and so on.
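
For your own session, extended tracing can be switched on with the 10046 diagnostic event; level 8 captures waits, level 4 captures binds, and level 12 captures both:

    ALTER SESSION SET events '10046 trace name context forever, level 12';
    /* code to be traced executes during this time window */
    ALTER SESSION SET events '10046 trace name context off';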

UPDATE TEST SET COL1 = COL1
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute     50    114.77     781.11          6     277901     128304      124500
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       51    114.77     781.11          6     277901     128304      124500


Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 59     (recursive depth: 1)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enqueue                                       305        3.12        605.91
  buffer busy waits                             146        0.08          1.38
  undo segment extension                        162        0.00          0.00
  db file sequential read                         6        0.06          0.12
********************************************************************************

Our original trace file showed a total elapsed time of 6.27 seconds and a CPU time of 6.13 seconds, which was acceptable. However, our current trace file shows a total elapsed time of 781.11 seconds and a CPU time of 114.77 seconds: our CPU time has risen dramatically, and our elapsed time is several times longer again.
However, note the section that follows the statement breakdown, headed by ‘Elapsed Times Include Waiting On The Following Events’. This is called the wait event analysis, and it tells us exactly what kept our statement waiting.
The key column is the ‘total waited’ column, and you can see in this instance that our statement was kept waiting for 605.91 seconds by the ‘enqueue’ wait event, which means locks. Looking back at our PL/SQL code, this problem is happening because each of the five copies of the program is trying to update the same columns in the same table, and more often than not the row is locked when a copy of the application needs to write to it.
Also, note the ‘buffer busy waits’ event. Buffer busy waits (http://www.dba-oracle.com/art_builder_bbw.htm) occur when an Oracle session needs to access a block in the buffer cache but cannot, because the buffer copy of the data block is locked by another session.
However, note that whilst the number of buffer busy waits is about half the number of enqueue waits, the total time waited is only a fraction of the wait time caused by enqueue events. Therefore, if you were tuning this application, you would focus the vast majority of your time on solving the enqueue problem, as this will have the greatest impact on your application (it will potentially remove 605.91 seconds from the time your session takes to run, rather than the 1.38 seconds that would be removed by dealing with the buffer busy waits issue).

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      512      0.00       0.00          0          0          0           0
Execute   3605    115.77     787.28          6     286518     132493      126066
Fetch      500      0.17       1.08          0       2499          0         500
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     4617    115.94     788.37          6     289017     132493      126566

Misses in library cache during parse: 0

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enqueue                                       308        3.12        606.08
  buffer busy waits                             258        0.08          2.85
  undo segment extension                        162        0.00          0.00
  db file sequential read                         6        0.06          0.12
  latch free                                     15        0.03          0.10
 
  513  user  SQL statements in session.
    0  internal SQL statements in session.
  513  SQL statements in session.
********************************************************************************
Trace file: markr9i_ora_2652_test2_binds_concurrent_processes.trc
Trace file compatibility: 9.02.00
Sort options: default
       1  session in tracefile.
     513  user  SQL statements in trace file.
       0  internal SQL statements in trace file.
     513  SQL statements in trace file.
       8  unique SQL statements in trace file.
    9491  lines in trace file.


Looking at the ‘Overall Totals For All Recursive Statements’ section, it shows that the total CPU time was 115.94 seconds and the total elapsed time was 788.37 seconds, with the difference between the two being the amount of time Oracle had to wait for resources or other wait events.
The ‘Elapsed times include waiting on following events’ section then summarises all of the wait events, with ‘enqueue’ again being by far the largest contributor to the total time taken up by the session.