The CBO makes its execution-plan decisions based on statistics, and tries to arrive at the plan that results in the lowest amount of physical I/O.
Statistics provide critical input for the CBO to work properly. They are generated for data-storing objects and include information such as the number of rows in a table, the number of distinct values in a column, the number of leaf blocks in an index, and so on. The more accurate the statistics, the more efficient the plans produced by the optimizer.
Statistics may be exact or estimated. Statistics generated with a COMPUTE clause analyze all of the data in the object. This gives the optimizer accurate information to work with, so it can arrive at a good execution plan.
Statistics generated with an ESTIMATE clause analyze only as much data as the specified sample size, given either as a number of rows or as a percentage of rows to be randomly sampled. Optionally, block sampling may be specified instead of row sampling. Estimation saves time when the system contains many huge tables, but the quality of the resulting execution plans depends on how close the estimated values are to the exact ones.
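As an illustration, the legacy ANALYZE command shows the difference between the two clauses (the table name here is hypothetical):

```sql
-- COMPUTE reads every row; ESTIMATE reads only the stated sample.
ANALYZE TABLE orders COMPUTE STATISTICS;
ANALYZE TABLE orders ESTIMATE STATISTICS SAMPLE 10 PERCENT;
```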
The CBO does not use the rules of the RBO, but it does use the object order in the FROM clause and/or the object_id to break ties. The number of execution-plan permutations the CBO may consider is capped by the init.ora parameter OPTIMIZER_MAX_PERMUTATIONS.
Statistics are stored in the data dictionary, in tables owned by the SYS user. The following views display the statistics collected for tables, columns and indexes (USER_ and ALL_ variants also exist):
For Tables: DBA_TABLES (NUM_ROWS, BLOCKS, AVG_ROW_LEN, etc.). Statistics for individual partitions of a table can be seen in DBA_TAB_PARTITIONS; cluster statistics are available in DBA_CLUSTERS.
For Columns: DBA_TAB_COLUMNS (NUM_DISTINCT, LOW_VALUE, HIGH_VALUE, etc.).
For Indexes: DBA_INDEXES (LEAF_BLOCKS, DISTINCT_KEYS, CLUSTERING_FACTOR, etc.). Statistics for individual partitions of indexes can be seen in DBA_IND_PARTITIONS and DBA_IND_SUBPARTITIONS.
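For instance, the gathered figures can be inspected straight from the dictionary (the schema name here is an assumption):

```sql
-- Table-level statistics for one schema; LAST_ANALYZED shows freshness.
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed
  FROM dba_tables
 WHERE owner = 'SCOTT';
```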
Dictionary
tables related to Histogram information are discussed later.
Available CBO Modes
The CBO has two modes in which to run: ALL_ROWS (the default) and FIRST_ROWS.
FIRST_ROWS aims at returning the first row(s) of a statement as soon as possible; it tells the optimizer to give response time prime importance, and it prefers nested-loop joins. FIRST_ROWS uses cost as well as some rules of thumb to process the first set of rows. For example, plans using indexes are preferred over plans using full table scans as the access path, an ORDER BY clause can induce index access, and so on. As of release 9i, the number of rows to be returned in the first hit can also be specified with FIRST_ROWS_n (n can be 1, 10, 100 or 1000), set according to the application's requirements.
ALL_ROWS processes all rows for a given query before returning any output. It tells the optimizer to aim for minimal use of resources and best throughput, and it prefers sort-merge joins.
For an OLTP system, FIRST_ROWS is usually the ideal option for fast response time, while ALL_ROWS suits batch-oriented applications. Note that a plan that produces the first n rows with the fastest response time may not be optimal when the entire result set is required, so decide according to the needs of the application.
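The mode can be switched per session without an instance restart; a minimal sketch:

```sql
-- Favor fast return of the first 10 rows for this session (9i syntax).
ALTER SESSION SET optimizer_mode = FIRST_ROWS_10;

-- Revert to throughput-oriented optimization.
ALTER SESSION SET optimizer_mode = ALL_ROWS;
```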
Basic CBO Terms
The
following terms will be used quite often when analyzing statements in
CBO.
Cost
The
COST computed in CBO is a unit of expense involved with each operation.
The
logic as to how the cost is actually derived is not documented or made
external. Moreover, this may change across releases.
Cardinality
Cardinality is the number of rows in a table, or the number of distinct keys in an index. The cardinality of a query is the number of rows it is expected to return.
Selectivity
Selectivity measures how distinct a column's values are: the ratio of distinct values to total rows. For example, if a table has 10000 rows and an index is created on a column having 4000 distinct values, then the selectivity of the index is (4000/10000) * 100 = 40%. A unique index on a NOT NULL column has a selectivity of 100%.
Transitivity
Transitivity is the process by which the CBO generates additional predicates for a query, enabling the optimizer to consider additional execution paths. For example, if a query contains predicates of the form A=B and B=C, the optimizer may add the predicate A=C.
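A sketch of what this looks like in practice, using hypothetical tables:

```sql
-- Query as written: only a.id = b.id and b.id = 100 are supplied.
SELECT a.*, b.*
  FROM t1 a, t2 b
 WHERE a.id = b.id
   AND b.id = 100;
-- The optimizer may internally add the transitive predicate a.id = 100,
-- opening up an index access path on t1 as well.
```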
Statistics
The information gathered for data-holding objects; it is vital input for the CBO when deciding on execution plans.
Join
Methods
Oracle uses join methods such as hash, sort-merge and nested-loop joins. A query may run faster with one join method than with the others, so this should be evaluated for individual queries.
FTS
FTS, or Full Table Scan, means the query sequentially scans a table from the first block to the last allocated block. This can be very expensive for big tables and should generally be avoided.
Index
scan
Relates to random access of a table by use of one or more indexes
on the table.
1)
OPTIMIZER_MODE
This decides the mode in which the optimizer engine runs. Valid values are RULE, CHOOSE, ALL_ROWS and FIRST_ROWS(_n).
Optionally, OPTIMIZER_MODE can be set to CHOOSE, a kind of intermediate option between RBO and CBO: the query is run under either optimizer depending on whether statistics are available. If the tables present in the query have statistics generated on them, the CBO (ALL_ROWS only) is used; otherwise the RBO is.
e.g.:
optimizer_mode = first_rows
2)
OPTIMIZER_FEATURES_ENABLE
This is set to a version number such as 8.1.5, 8.1.7 or 9.0.0. Since new features and functionality are added to the CBO in every release, its behavior may change and result in different execution plans. You can set this to the version number for which your application is tuned. Note that setting it to a lower version prevents the use of features introduced in later versions.
e.g.:
optimizer_features_enable = 8.1.7
3)
OPTIMIZER_MAX_PERMUTATIONS
This parameter specifies the maximum number of join-order permutations the optimizer considers when choosing an execution plan for a query with joins, so it influences parse time. It should be set to a reasonably low value; make sure the other parameters mentioned in this section are set properly so that the optimizer can find an optimal plan within the specified limit. It defaults to 80000 in Oracle 8, which effectively means no limit; in Oracle 9i the default is 2000.
e.g.:
optimizer_max_permutations = 2000
4)
OPTIMIZER_INDEX_COST_ADJ
OPTIMIZER_INDEX_COST_ADJ takes a value between 1 and 10000 (default 100). It makes index access look more or less expensive to the optimizer. For example, setting it to 50 tells the optimizer that an execution plan using indexes is half as expensive as one that does not. The lower the value (below 100), the fewer full table scans will be executed in the system.
Index use is crucial for OLTP response time, but sometimes full table scans provide better throughput. Nested-loop joins are also influenced by this parameter. Set it to a value that is optimal for your application; I recommend a value between 10 and 20 for OLTP systems and around 50 for DSS systems.
e.g.:
optimizer_index_cost_adj = 20
If you are experiencing slow performance because the CBO's FIRST_ROWS mode is favoring too many full-table scans, you can reset the optimizer_index_cost_adj parameter to immediately tune all of the SQL in your database to favor index scans over full-table scans. This is a "silver bullet" that can improve the performance of an entire database in cases where the database is OLTP and you have verified that the full-table scan costing is too low.
5)
OPTIMIZER_INDEX_CACHING
This parameter tells the optimizer to favor nested loops and IN-list iterators over sort-merge and hash joins. It takes a value between 0 and 100, indicating the percentage of index blocks the optimizer should assume are present in the buffer cache; a value of 100 makes nested loops and IN-list iterators look least expensive and most appealing to the optimizer. I recommend setting this parameter to 90.
e.g.:
optimizer_index_caching = 90
6)
OPTIMIZER_PERCENT_PARALLEL
OPTIMIZER_PERCENT_PARALLEL takes a value between 0 and 100. A low value favors indexes, while a higher value favors full table scans; the optimizer uses it when working out the cost of a full table scan. A value of 100 makes full use of the degree of parallelism set at the object level. I prefer setting it to zero, to favor index use and prevent parallel query from being considered in the costing.
This parameter is renamed to _OPTIMIZER_PERCENT_PARALLEL in Oracle 9i, and its value should not be altered unless recommended by Oracle Support.
e.g.:
optimizer_percent_parallel = 0
7)
COMPATIBLE
This parameter provides backward compatibility with earlier releases, and may also restrict the use of some new features. The CBO underwent a lot of changes in release 8, so it is advisable to set this parameter to 8.1.0 or higher. Only three digits are required, though you can specify more for record purposes.
e.g.:
compatible = 8.1.0
8)
DB_FILE_MULTIBLOCK_READ_COUNT
This parameter determines the number of database blocks read in one I/O operation during a full table scan; the value is used in computing the cost of full table scans. OLTP systems will not benefit from a high value, as it reduces the computed cost of full-table-scan execution plans and so makes them more attractive. The maximum size is operating-system dependent.
e.g.:
db_file_multiblock_read_count = 1 (may be set to a higher value)
9)
SORT_AREA_SIZE
This parameter defines the maximum memory allocated per user process for sorting data and for inserts and updates to bitmap indexes. The optimizer uses the value when determining the cost of sorting in a query. It defaults to 64K; a size between 64K and 1M is normally appropriate for OLTP systems.
Sort-merge joins are influenced by this parameter: the bigger the size, the more appealing sort-merge joins become compared with nested-loop and hash joins. The smaller the size, the more sorting spills into temporary segments. Assign a value that has been evaluated for your setup.
Oracle
recommends the use of PGA_AGGREGATE_TARGET instead of this parameter
from
Oracle 9i.
e.g:
sort_area_size = 1048576
10)
SORT_MULTIBLOCK_READ_COUNT
This
specifies the number of database blocks to be read each time a sort
performs a
read from a temporary segment. The default value is 2. General
recommendation
is to keep the default value. However, if you intend to hike up the
SORT_AREA_SIZE considerably, you may evaluate the improvement by
changing this
parameter.
e.g.:
sort_multiblock_read_count = 2
11)
HASH_JOIN_ENABLED
Hash joins are available only under the CBO. In a hash join, a hash table is built on the join key of the smaller table, and the other table is then probed against it for matches. Hash joins can be faster than other join types in some conditions, especially when an index is missing or the search criteria are not very selective. They require a large amount of memory, as the hash tables are held in memory; this may sometimes result in swapping.
Nested-loop joins return the first row faster than sort-merge and hash joins and are preferred for OLTP, but the other join types should not be ruled out for other parts of the application.
e.g.:
hash_join_enabled = true
12)
HASH_AREA_SIZE
This
specifies the maximum amount of memory in bytes to be used for a hash
join per
process. It is defaulted to 2 times SORT_AREA_SIZE.
Oracle
recommends the use of PGA_AGGREGATE_TARGET instead of this parameter
from
Oracle 9i.
e.g.:
hash_area_size = 2097152
Setting
this to a very low number may sometimes result in the following error.
ORA-6580:
Hash Join ran out of memory while keeping large rows in memory.
13)
HASH_MULTIBLOCK_IO_COUNT
This specifies how many sequential blocks a hash join reads and writes in one I/O operation. Oracle recommends not assigning a value to this parameter, letting Oracle decide the appropriate value for each individual query; in that case the parameter appears as 0 in the V$PARAMETER view.
This
parameter is renamed to _HASH_MULTIBLOCK_IO_COUNT as of Oracle 9i.
e.g.:
hash_multi_block_io_count = 0
14)
BITMAP_MERGE_AREA_SIZE
This
parameter is relevant for systems using bitmap indexes. It specifies
the amount
of memory Oracle uses to merge bitmaps retrieved from a range scan of a
bitmap
index. The default value is 1 MB, which is considered sufficient for
most
setups.
Oracle
recommends use of PGA_AGGREGATE_TARGET instead of this parameter from
Oracle 9i.
e.g.:
bitmap_merge_area_size = 1048576
15)
QUERY_REWRITE_ENABLED
This parameter is relevant for systems using materialized views, function-based indexes and stored outlines. Setting it enables query rewrite for the database. The materialized views themselves must also be query-rewrite enabled before the optimizer can redirect a query to them rather than to the actual tables listed in the FROM clause. Query rewriting is done internally by the optimizer, based on whether the expected results are already present in materialized-view form.
e.g.:
query_rewrite_enabled = true
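A minimal sketch of a rewrite-enabled materialized view (table and column names are hypothetical):

```sql
-- Queries aggregating sales by product may be redirected here
-- instead of scanning the detail table.
CREATE MATERIALIZED VIEW sales_by_prod_mv
  ENABLE QUERY REWRITE
  AS SELECT prod_id, SUM(amount) AS total_amount
       FROM sales
      GROUP BY prod_id;
```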
16)
QUERY_REWRITE_INTEGRITY
This parameter determines how strictly the optimizer enforces the freshness of materialized views when rewriting queries, and so determines the accuracy of the rewrite. It can be set to ENFORCED, TRUSTED or STALE_TOLERATED. ENFORCED uses materialized views only if they contain fresh data, guaranteeing the integrity and consistency of results. TRUSTED also uses relationships that have been declared in the system when rewriting the query. STALE_TOLERATED allows the use of unenforced relationships and may return stale data. For OLTP systems, ENFORCED is usually the ideal setting, as reports need to be up-to-date.
e.g.:
query_rewrite_integrity = enforced
17)
ALWAYS_ANTI_JOIN
This
parameter specifies the join method for anti-joins, for example when a
NOT IN
operator is present in your query. It can be set to NESTED_LOOPS, MERGE
or
HASH. It is defaulted to NESTED_LOOPS in Oracle 8i and CHOOSE
in Oracle
9i.
This
parameter is renamed to _ALWAYS_ANTI_JOIN as of Oracle 9i.
e.g.:
always_anti_join = nested_loops
18)
ALWAYS_SEMI_JOIN
This
parameter specifies the join method for semi-joins. These types of
joins are
carried out by Optimizer after transforming a query. In such joins,
duplicate
values from the inner table are removed and then the type of join
specified in
the parameter is used to perform a semi-join. It can be set to
NESTED_LOOPS,
MERGE or HASH. In Oracle 8i, it is defaulted to STANDARD and
in Oracle 9i
it is defaulted to CHOOSE, to pick up an appropriate join.
This
parameter is renamed to _ALWAYS_SEMI_JOIN as of Oracle 9i.
e.g.:
always_semi_join = nested_loops
19)
STAR_TRANSFORMATION_ENABLED
This
specifies whether query transformation will be applied to
star queries. It can be set to TRUE, FALSE or TEMP_DISABLE
(transformation will
take place but will not use temporary tables). I presently set it to
FALSE due
to some known issues of sub-optimal queries being generated. If you
intend to
use this, please upgrade your version to 8.1.7.4 and above.
e.g.:
star_transformation_enabled = false
20)
PARALLEL_BROADCAST_ENABLED
This
parameter refers to parallel executions in cluster databases.
It is meant for improving hash and sort-merge join operations where a
very
large result set is joined with a very small result set. When this
option is
enabled, the optimizer broadcasts a copy of all rows in the smaller
result set
to all cluster databases that are processing some rows of the larger
result
set.
It
is obsolete in release 9.2.0.
e.g.:
parallel_broadcast_enabled = false
21)
OPTIMIZER_DYNAMIC_SAMPLING
This parameter was introduced in release 9i for situations where tables are not analyzed. Since the CBO depends heavily on statistics, the parameter tells the optimizer to sample any unanalyzed tables used in a query. A level of 0 to 10 can be specified; the higher the value, the more time the optimizer spends sampling.
e.g.:
optimizer_dynamic_sampling = 1
22)
PARTITION_VIEW_ENABLED
This
parameter is meant for backward compatibility to support partition
views.
Oracle recommends use of partition tables rather than partition views.
If you
are migrating to CBO, chances are that you may not be using partition
views.
e.g.:
partition_view_enabled = false
23)
CURSOR_SHARING
This parameter determines what kinds of SQL statements can share the same cursor. It can be set to FORCE, SIMILAR or EXACT. FORCE tries to make statements that differ only in literal values share the same cursor. SIMILAR does much the same but preserves plan optimization for identical statements. EXACT shares a cursor only between statements whose text is identical.
Using FORCE may sometimes produce unexpected results.
e.g.:
cursor_sharing = exact
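For example, with FORCE these two statements (hypothetical table) collapse into one shared cursor, the literals being replaced by system-generated bind variables:

```sql
-- Differ only in the literal value:
SELECT * FROM emp WHERE empno = 7369;
SELECT * FROM emp WHERE empno = 7499;
-- Under cursor_sharing = force both are rewritten to something like:
--   SELECT * FROM emp WHERE empno = :"SYS_B_0"
```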
24)
PGA_AGGREGATE_TARGET
Introduced in Oracle 9i, this parameter specifies the aggregate PGA memory available to all server processes attached to an instance. It can be set for automatic sizing of SQL working areas, and replaces parameters such as SORT_AREA_SIZE, BITMAP_MERGE_AREA_SIZE and HASH_AREA_SIZE. It can be set to a value between 10 MB and 4000 GB, depending on the requirements of the setup.
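A sketch of setting it dynamically (the 200 MB figure is only an example):

```sql
-- Aggregate PGA target for all server processes of the instance.
ALTER SYSTEM SET pga_aggregate_target = 200M;
```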
Under the CBO, the FROM clause is read left to right, the leftmost table being the driving table, and the ORDERED hint makes the CBO process tables in that left-to-right order. Avoid RBO-style coding techniques: the CBO has advanced features such as function-based and bitmap indexes, and query processing can be controlled with proper WHERE clauses and hints.
For example, below is legacy code that uses a predicate with a hard-coded value, 1666. Though the value may never change in the lifetime of the code, best practice is to avoid such coding and define it as a variable.
Original
code:
begin
...
select sum(qty) qty into rec.qty
from
jncdm_ra_transactions a, ra_customer_trx_all b
where
a.calendar_month_id = pi_calendar_month_id
and
b.customer_trx_id = a.customer_trx_id
and
b.cust_trx_type_id != 1666
and b.type =
'INV';
...
end;
Change
it to use a variable:
declare
...
l_exclude_trx_type_id number := 1666;
l_mod_type varchar2(3) := 'INV';
begin
...
select sum(qty) qty into rec.qty
from jncdm_ra_transactions a, ra_customer_trx_all b
where a.calendar_month_id = pi_calendar_month_id
and b.customer_trx_id = a.customer_trx_id
and b.cust_trx_type_id != l_exclude_trx_type_id
and b.type = l_mod_type;
...
end;
Oracle provides more than one way of generating statistics.
NOTE: Oracle strongly recommends that you use the DBMS_STATS package rather than ANALYZE to collect optimizer statistics. That package lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine-tune your statistics collection in other ways. Furthermore, the cost-based optimizer will eventually use only statistics that have been collected by DBMS_STATS.
Oracle recommends setting the ESTIMATE_PERCENT parameter of the DBMS_STATS gathering procedures to DBMS_STATS.AUTO_SAMPLE_SIZE to maximize performance gains while achieving the necessary statistical accuracy. AUTO_SAMPLE_SIZE lets Oracle determine the best sample size for good statistics. For example, to collect table and column statistics for all tables in the OE schema with auto-sampling:
EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS('OE',DBMS_STATS.AUTO_SAMPLE_SIZE);
By default, DBMS_STATS generates statistics for the table but not its indexes (CASCADE => FALSE).
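To include the indexes in the same pass, set CASCADE explicitly (schema and table names here are hypothetical):

```sql
-- Gathers table, column and index statistics in one call.
EXECUTE DBMS_STATS.GATHER_TABLE_STATS('OE', 'ORDERS', cascade => TRUE);
```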
If you intend to ESTIMATE statistics because of time constraints, try to arrive at an optimal sample size that will yield excellent results for your database. In general, a sample size of 5% to 10% generates adequate statistics!
Gathering statistics on tables requires sorting, which takes up resources; gathering statistics on indexes does not. Given this, you may COMPUTE statistics on indexes for accurate data.
You
may also consider generating statistics in Parallel. Below
is an example of arriving at an appropriate ESTIMATE sample size.
The table INV.MTL_MATERIAL_TRX has around 4.5 million records. Below are the statistics gathered at various sample sizes.
Commands
used:
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX'); --compute
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 5);
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 10);
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 20);
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 51);
Final statistics:
MODE | NUM_ROWS| BLOCKS|AVG_ROW_LEN|SAMPLE_SIZE|LAST_ANAL|Time taken
_______|__________|__________|___________|___________|_________|__________
compute| 4591474| 193230| 251| 4591474|27-JUL-03|2 hr
at 5% | 4582460| 193230| 247| 229123|27-JUL-03|8 min
at 10% | 4587520| 193230| 249| 458752|27-JUL-03|17 min
at 20% | 4591635| 193230| 250| 918327|27-JUL-03|32 min
at 51% | 4590890.2| 193230| 250| 2341354|27-JUL-03|1 hr 56 min
When 10-20% of the data has changed, the general convention is to generate fresh statistics. You can start off with a general rule of estimating statistics on a weekly basis. If particular tables are suffering badly because of heavy activity, consider using the DML monitoring option to update their statistics every few hours. Statistics are not incremental; they are regenerated from scratch every time, so if the data has not changed appreciably there is no advantage in generating statistics too frequently.
A table that is in the process of being analyzed cannot undergo DDL changes, but DML activity can continue. Analyzing an index puts a shared lock on the related table, so neither DML nor DDL can be performed. Preferably avoid all activity during the statistics-generation phase.
Use DBMS_STATS to generate statistics on a periodic basis. You may also categorize objects into groups by how frequently their data changes: a daily transaction table changes constantly and may need statistics generated often, while master tables change subtly and can be done less often.
In the case of ESTIMATE statistics, arrive at an optimal sample size; Oracle recommends 5%. Consider using COMPUTE statistics for indexes and index-organized tables.
If queries on a particular set of tables always require very accurate statistics to behave properly, consider using hints to direct the optimizer and avoid depending on statistics generation.
Whenever a heavy upload of data is done, consider explicitly generating statistics on the tables concerned.
The general rule is not to generate statistics for the SYS schema if you are on Oracle 8i; in Oracle 9i, test this in your own setup before deciding. The dictionary will need to be analyzed in Oracle 10g, as the RBO will be unsupported then.
The DBMS_APPLICATION_INFO package allows you to register application-specific information in dynamic performance views. This is very handy for implementing certain logic and for analysis and tuning; I mention it here mainly for the latter. Using this feature, you can integrate your application with the database and find vital information at any point in time: what is presently running in the database, which part of the application a particular query is being executed from, and what the user sessions are doing.
Any time a performance issue arises, look at the application-specific information to identify the problem area instantly. The information is exposed in views such as V$SESSION (MODULE, ACTION and CLIENT_INFO columns).
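A minimal sketch of registering such information (module and action names here are hypothetical):

```sql
BEGIN
  -- Values become visible in V$SESSION.MODULE / V$SESSION.ACTION.
  DBMS_APPLICATION_INFO.SET_MODULE(
    module_name => 'ORDER_ENTRY',
    action_name => 'INSERT_ORDER');
  -- Free-form text, visible in V$SESSION.CLIENT_INFO.
  DBMS_APPLICATION_INFO.SET_CLIENT_INFO('nightly batch run');
END;
/
```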
The following lists the procedures in the DBMS_STATS package for gathering statistics:
Procedure | Collects
---|---
GATHER_INDEX_STATS | Index statistics
GATHER_TABLE_STATS | Table, column, and index statistics
GATHER_SCHEMA_STATS | Statistics for all objects in a schema
GATHER_DATABASE_STATS | Statistics for all objects in a database
GATHER_SYSTEM_STATS | CPU and I/O statistics for the system
System statistics enable the optimizer to consider a system's I/O and CPU performance and utilization. For each plan candidate, the optimizer computes estimates for I/O and CPU costs. It is important to know the system characteristics to pick the most efficient plan with optimal proportion between I/O and CPU cost.
System I/O characteristics depend on many factors and do not stay constant all the time. Using system statistics management routines, database administrators can capture statistics in the interval of time when the system has the most common workload. For example, database applications can process OLTP transactions during the day and run OLAP reports at night. Administrators can gather statistics for both states and activate appropriate OLTP or OLAP statistics when needed. This enables the optimizer to generate relevant costs with respect to available system resource plans.
When Oracle generates system statistics, it analyzes system activity in a specified period of time. Unlike table, index, or column statistics, Oracle does not invalidate already parsed SQL statements when system statistics get updated. All new SQL statements are parsed using new statistics. Oracle Corporation highly recommends that you gather system statistics.
The DBMS_STATS
.GATHER_SYSTEM_STATS
routine collects system statistics in a user-defined timeframe. You can
also set system statistics values explicitly using DBMS_STATS
.SET_SYSTEM_STATS
.
Use DBMS_STATS
.GET_SYSTEM_STATS
to verify
system statistics.
Gather statistics during the day. Gathering ends after 720 minutes and is stored in the mystats table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLTP');
END;
/
Gather statistics during the night. Gathering ends after
720 minutes and is stored in the mystats
table:
BEGIN
DBMS_STATS.GATHER_SYSTEM_STATS(
gathering_mode => 'interval',
interval => 720,
stattab => 'mystats',
statid => 'OLAP');
END;
/
If appropriate, you can switch between the statistics gathered. It is possible to automate this process by submitting a job to update the dictionary with appropriate statistics. During the day, the following jobs import the OLTP statistics for the daytime run:
VARIABLE jobno number;
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
    'DBMS_STATS.IMPORT_SYSTEM_STATS(''mystats'',''OLTP'');',
    SYSDATE, 'SYSDATE + 1');
  COMMIT;
END;
/
During the night, the following jobs import the OLAP statistics for the nighttime run:
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
    'DBMS_STATS.IMPORT_SYSTEM_STATS(''mystats'',''OLAP'');',
    SYSDATE + 0.5, 'SYSDATE + 1');
  COMMIT;
END;
/
DBMS_STATS.GATHER_DATABASE_STATS gathers statistics on all the tables and indexes in a database.
Some parameters :
- estimate_percent
o NULL means compute
o The valid range is [0.000001,100].
o DBMS_STATS.AUTO_SAMPLE_SIZE so Oracle determines the best sample size
for good statistics (Recommended value 9i).
- block_sample
o Only pertinent when estimating statistics.
o Determines whether random block sampling is used instead of random row sampling.
o Random block sampling is more efficient, but if the data is not randomly distributed on disk, the sampled values may be somewhat correlated.
- method_opt
o DEFAULT 'FOR ALL COLUMNS SIZE 1',
o FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
o FOR COLUMNS [size clause] column|attribute [size_clause]
[,column|attribute [size_clause]...],
o size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
. integer Number of histogram buckets. Valid Range [1,254]
. REPEAT Collects histograms only on the columns that already have
histograms (9i).
. AUTO Oracle determines the columns to collect histograms based on
data distribution and the workload of the columns (9i).
. SKEWONLY Oracle determines the columns to collect histograms based on
the data distribution of the columns (9i).
- degree
o Degree of parallelism.
o NULL means use the table default value
o DBMS_STATS.DEFAULT_DEGREE means value based on the init parameters
(9i).
- granularity
o Only pertinent if the table is partitioned.
o DEFAULT: Gather global- and partition-level statistics.
o GLOBAL: Gather global statistics.
o PARTITION: Gather partition-level statistics.
o SUBPARTITION: Gather subpartition-level statistics.
o ALL: Gather all (subpartition, partition, and global) statistics.
- cascade
o Default (CASCADE => FALSE). No Index Gathering.
o Gathers statistics on the indexes as well (TRUE).
o Index statistics gathering is not parallelized except in 9iR2
o Equivalent to gather_index_stats procedure on each of the indexes in
the database in addition to gathering table and column statistics.
- options
o GATHER: Default. Gathers statistics on all objects.
o GATHER AUTO: Gathers all necessary statistics automatically.
Oracle implicitly determines which objects need new statistics, and
determines how to gather those statistics (9i).
o GATHER STALE: Gathers statistics on stale objects as determined by
looking at the *_tab_modifications views.
o GATHER EMPTY: Gathers statistics on objects which currently have no
statistics.
- gather_sys
o Gathers statistics on the objects owned by the 'SYS' user (TRUE)
DECLARE
sql_stmt VARCHAR2(1024);
BEGIN
FOR tab_rec IN (SELECT owner,table_name
FROM all_tables WHERE owner like UPPER('&1') ) LOOP
sql_stmt := 'BEGIN
dbms_stats.gather_table_stats
(ownname => :1,
tabname => :2,
partname => null,
estimate_percent => 10,
degree => 3 ,
cascade => true); END;' ;
EXECUTE IMMEDIATE sql_stmt USING
tab_rec.owner, tab_rec.table_name ;
END LOOP;
END;
/
Oracle can gather some statistics automatically while creating or rebuilding a B-tree or bitmap index. The COMPUTE STATISTICS option of CREATE INDEX or ALTER INDEX ... REBUILD enables this gathering of statistics.
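For example (index and table names here are hypothetical):

```sql
-- Index statistics are gathered as a by-product of the build.
CREATE INDEX emp_deptno_idx ON emp(deptno) COMPUTE STATISTICS;
ALTER INDEX emp_deptno_idx REBUILD COMPUTE STATISTICS;
```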
The statistics that Oracle gathers for the COMPUTE STATISTICS option depend on whether the index is partitioned or nonpartitioned.
To ensure correctness of the statistics, Oracle always uses the base table when creating an index with the COMPUTE STATISTICS option, even if another index is available that could be used to create the index.
If you do not use the COMPUTE STATISTICS clause, or if you have made significant changes to the data, then use the DBMS_STATS.GATHER_INDEX_STATS procedure to collect index statistics.
You should analyze the table after creating a function-based index, to allow Oracle to collect column-statistics-equivalent information for the expression. Optionally, you can collect histograms for the index expressions by specifying FOR ALL HIDDEN COLUMNS SIZE number_of_buckets in the METHOD_OPT argument to the DBMS_STATS procedures.
Before gathering new statistics for a particular schema, use the DBMS_STATS.EXPORT_SCHEMA_STATS procedure to extract and save the existing statistics. Then use DBMS_STATS.GATHER_SCHEMA_STATS to gather the new statistics. You can do both with a single call to GATHER_SCHEMA_STATS by specifying additional parameters.
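A sketch of saving the old statistics first (the stats-table and schema names are assumptions):

```sql
BEGIN
  -- One-time setup: a user table to hold exported statistics.
  DBMS_STATS.CREATE_STAT_TABLE('SCOTT', 'SAVED_STATS');
  -- Save the current statistics before regathering.
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'SCOTT',
                                 stattab => 'SAVED_STATS');
END;
/
```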
Before you can use automated statistics gathering for a particular table, you must bring either the tables of a specific schema or the complete database into monitoring mode. Do this with the DBMS_STATS.ALTER_SCHEMA_TAB_MONITORING or DBMS_STATS.ALTER_DATABASE_TAB_MONITORING procedures. Alternatively, you can enable the monitoring attribute with the MONITORING keyword, which is part of the CREATE TABLE and ALTER TABLE statement syntax. Monitoring tracks the approximate number of INSERTs, UPDATEs, and DELETEs for the table since the last time statistics were gathered. Oracle uses this data to identify tables with stale statistics. You can then enable automated statistics gathering by setting up a recurring job (perhaps using job queues) that invokes DBMS_STATS.GATHER_TABLE_STATS with the GATHER STALE option at an interval appropriate for your application.
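A sketch of such a recurring job via DBMS_JOB (the schema name and daily interval are assumptions):

```sql
VARIABLE jobno NUMBER;
BEGIN
  -- Refresh statistics nightly for monitored objects that went stale.
  DBMS_JOB.SUBMIT(:jobno,
    'DBMS_STATS.GATHER_SCHEMA_STATS(ownname => ''SCOTT'', options => ''GATHER STALE'', cascade => TRUE);',
    SYSDATE, 'SYSDATE + 1');
  COMMIT;
END;
/
```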
Objects are considered stale when 10% of their total rows have been changed. When you issue GATHER_TABLE_STATS with GATHER STALE, the procedure checks the USER_TAB_MODIFICATIONS view; if a monitored table has been modified by more than 10%, its statistics are gathered again. The change information shown in USER_TAB_MODIFICATIONS can be flushed from the SGA into the data dictionary with the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure.
Note: There can be a few minutes' delay while Oracle propagates information to this view.
To disable monitoring, use the DBMS_STATS.ALTER_SCHEMA_TAB_MONITORING or DBMS_STATS.ALTER_DATABASE_TAB_MONITORING procedures with the monitoring parameter set to FALSE, or use the NOMONITORING keyword. Example:
SQL> exec dbms_stats.ALTER_SCHEMA_TAB_MONITORING('<owner>', FALSE);
SQL> exec dbms_stats.ALTER_DATABASE_TAB_MONITORING(FALSE);
==================
Oracle9i Examples:
==================
In Oracle9i, only one statement is required to gather statistics on objects which are new or stale, because the GATHER AUTO option is now available.

Oracle9i - Gathering statistics for objects in a schema:
--------------------------------------------------------
exec dbms_stats.gather_schema_stats(
ownname => NULL,
estimate_percent => 100, -- The estimate_percent can be automatically determined in 9i using: estimate_percent => dbms_stats.auto_sample_size,
granularity => 'ALL',
method_opt => 'FOR ALL COLUMNS SIZE AUTO',
degree => DBMS_STATS.DEFAULT_DEGREE,
options => 'GATHER AUTO',
cascade => TRUE
);

Oracle9i - Gathering statistics for objects in the database:
------------------------------------------------------------
exec dbms_stats.gather_database_stats(
estimate_percent => 100, -- The estimate_percent can be automatically determined in 9i using: estimate_percent => dbms_stats.auto_sample_size,
granularity => 'ALL',
method_opt => 'FOR ALL COLUMNS SIZE AUTO',
degree => DBMS_STATS.DEFAULT_DEGREE,
options => 'GATHER AUTO',
cascade => TRUE
);
By using the constant DBMS_STATS.AUTO_SAMPLE_SIZE, Oracle automatically determines the best sample size for good statistics.
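For example (the SH.SALES table is just an illustration), a single table can be analyzed with an automatically chosen sample size:

```sql
-- Let Oracle pick the sample size; cascade gathers index statistics too.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SH',
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE);
END;
/
```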
The GATHER STALE option gathers statistics only for tables that have stale statistics and for which you have enabled monitoring.
The GATHER STALE option maintains up-to-date statistics for the cost-based optimizer. Using this option at regular intervals also avoids the overhead associated with gathering statistics on all tables at one time. The GATHER option can incur much more overhead, because this option generally gathers statistics for a greater number of tables than GATHER STALE.
Use the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures to establish a frequency of statistics collection that is appropriate for the application. The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.

The following query displays the index statistics collected in DBA_INDEXES:

SELECT INDEX_NAME "NAME", NUM_ROWS, DISTINCT_KEYS "DISTINCT",
LEAF_BLOCKS, CLUSTERING_FACTOR "CF", BLEVEL "LEVEL",
AVG_LEAF_BLOCKS_PER_KEY "ALFBPKEY"
FROM DBA_INDEXES
WHERE OWNER = 'SH'
ORDER BY INDEX_NAME;
Typical output is:
NAME NUM_ROWS DISTINCT LEAF_BLOCKS CF LEVEL ALFBPKEY
-------------------------- -------- -------- ----------- ------- ------- ----------
CUSTOMERS_PK 50000 50000 454 4405 2 1
PRODUCTS_PK 10000 10000 90 1552 1 1
PRODUCTS_PROD_CAT_IX 10000 4 99 4422 1 24
PRODUCTS_PROD_SUBCAT_IX 10000 37 170 6148 2 4
SALES_PROD_BIX 6287 909 1480 6287 1 1
SALES_PROMO_BIX 4727 459 570 4727 1 1
SELECT COLUMN_NAME, NUM_DISTINCT, NUM_NULLS, NUM_BUCKETS, DENSITY
FROM DBA_TAB_COL_STATISTICS
WHERE TABLE_NAME = 'PA_EXPENDITURE_ITEMS_ALL'
ORDER BY COLUMN_NAME;
This returns the following data:
COLUMN_NAME NUM_DISTINCT NUM_NULLS NUM_BUCKETS DENSITY
------------------------------ ------------ ---------- ----------- ----------
BURDEN_COST 4300 71957 1 .000232558
BURDEN_COST_RATE 675 7376401 1 .001481481
CONVERTED_FLAG 1 16793903 1 1 poor
COST_BURDEN_DISTRIBUTED_FLAG 2 15796 1 .5 poor
COST_DISTRIBUTED_FLAG 2 0 1 .5 poor
COST_IND_COMPILED_SET_ID 87 6153143 1 .011494253
EXPENDITURE_ID 1171831 0 1 8.5337E-07 GOOD
TASK_ID 8648 0 1 .000115634
TRANSFERRED_FROM_EXP_ITEM_ID 1233787 15568891 1 8.1051E-07 GOOD
The cost-based optimizer can use data value histograms to get accurate estimates of the distribution of column data. A histogram partitions the values in the column into bands, so that all column values in a band fall within the same range. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.
One of the fundamental tasks of the cost-based optimizer is determining the selectivity of predicates that appear in queries. Selectivity estimates are used to decide when to use an index and the order in which to join tables. Some attribute domains (a table's columns) are not uniformly distributed.
The cost-based optimizer uses height-based histograms on
specified attributes to describe the distributions of nonuniform
domains. In a height-based histogram, the column values are divided
into bands so that each band contains approximately the same number of
values. The useful information that the histogram provides, then, is
where in the range of values the endpoints fall.
Histograms can affect performance and should be used only
when they substantially improve query plans. Histogram statistics data
is persistent, so the space required to save the data depends on the
sample size. In general, create histograms on columns that are used
frequently in WHERE
clauses of queries and have a highly
skewed data distribution. For uniformly distributed data, the
cost-based optimizer can make fairly accurate guesses about the cost of
executing a particular statement without the use of histograms.
Histograms, like all other optimizer statistics, are static. They are useful only when they reflect the current data distribution of a given column. (The data in the column can change as long as the distribution remains constant.) If the data distribution of a column changes frequently, you must recompute its histogram frequently.
Histograms are not useful for columns with the following characteristics:
- All predicates on the column use bind variables.
- The column data is uniformly distributed.
- The column is not used in WHERE clauses of queries.
- The column is unique and is used only with equality predicates.
You generate histograms by using the DBMS_STATS
package. You can generate histograms for columns of a table or
partition. For example, to create a 10-bucket histogram on the SAL
column of the emp
table, issue the following statement:
EXECUTE DBMS_STATS.GATHER_TABLE_STATS
('scott','emp', METHOD_OPT => 'FOR COLUMNS SIZE 10 sal');
The SIZE keyword declares the maximum number of buckets for the histogram. You would create a histogram on the SAL column if there were an unusually high number of employees with the same salary and few employees with other salaries. You can also collect histograms for a single partition of a table.
Oracle Corporation recommends using the DBMS_STATS package to have the database automatically decide which columns need histograms. This is done by specifying SIZE AUTO.
Consider the following update statement against a table with skewed data in column s6:

UPDATE so_lines l
SET open_flag = null,
    s6 = 10,
    s6_date = sysdate
WHERE l.line_type_code in ('REGULAR','DETAIL','RETURN') AND
      l.open_flag = 'Y' AND
      (NVL(l.shipped_quantity, 0) = 0 OR
       (NVL(l.shipped_quantity, 0) != 0 AND
        l.shipped_quantity + NVL(l.cancelled_quantity, 0) = l.ordered_quantity)) AND
      l.s6 = 18;
The counts below show the skewed distribution of data values for s6. In this case, there are two distinct non-null values: 10 and 18. The majority of the rows have s6 = 10 (1,589,464), while a small number of rows have s6 = 18 (13,091).
S6: COUNT(*)
======================
10 1,589,464
18 13,091
NULL 21,889
The selectivity of column s6, where s6 = 18:
S = 13,091 / (13,091 + 1,589,464) = 0.008
If No Histogram is Used: The selectivity of column s6
is assumed to be 50%, uniformly distributed across 10 and 18. This is
not selective; therefore, s6
is not an ideal choice for
use as an index.
If a Histogram is Used: The data distribution information
is stored in the dictionary. This allows the optimizer to use this
information and compute the correct selectivity based on the data
distribution. In this example,
the selectivity, based on the histogram data, is 0.008. This is a
relatively high, or good, selectivity, which leads the optimizer to use
an index on column s6
in the execution plan.
- DBMS_STATS.GATHER_TABLE_STATS
DBMS_STATS.GATHER_TABLE_STATS
gathers statistics for a table and its columns, and optionally the
associated
indexes. Syntax:
dbms_stats.gather_table_stats(ownname, tabname, partname,
    estimate_percent, block_sample, method_opt,
    degree, granularity, cascade,
    stattab, statid, statown);
The first two parameters are mandatory; the rest have default values.
e.g.:
Estimate statistics for a table and its columns
exec dbms_stats.gather_table_stats (ownname => 'INV', tabname =>
'MTL_MATERIAL_TRX', estimate_percent => 5);
e.g.:
Estimate statistics for a table, its columns and indexes.
exec dbms_stats.gather_table_stats(ownname => 'APPS', tabname =>
'AM21', estimate_percent
=> 5, cascade => true);
e.g.:
Estimate statistics in parallel, the following uses 8 threads to
complete the
task
Session
- A
exec dbms_stats.gather_table_stats(ownname => 'INV', tabname =>
'MTL_MATERIAL_TRX', estimate_percent => 5, degree => 8);
Session
- B (When the above process is running)
select * from v$px_process;
SERV|STATUS | PID|SPID | SID| SERIAL#
____|_________|__________|_________|__________|__________
P000|IN USE | 50|9684 | 7| 50586
P001|IN USE | 65|9686 | 60| 51561
P002|IN USE | 66|9688 | 17| 2694
P003|IN USE | 67|9690 | 30| 39243
P004|IN USE | 68|9692 | 74| 11017
P005|IN USE | 69|9694 | 48| 4253
P006|IN USE | 70|9696 | 76| 17
P007|IN USE | 71|9698 | 68| 1285

The example below generates column statistics in parallel: the degree of the table is initially set to 8, and "SIZE 1" makes use of this. Refer to the Histogram section for details of the SIZE keyword.
e.g.:
Estimate statistics for the columns of a table; this also generates statistics for the table itself.
exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'AM01', estimate_percent => 5, method_opt => 'FOR ALL COLUMNS');
e.g.:
No statistics are collected for global temporary tables; handle these with care. Make use of hints to drive queries on these tables. Provide explicit statistics using DBMS_STATS.SET_TABLE_STATS if your temporary table is used for processing huge amounts of data.
select num_rows, blocks, avg_row_len, temporary, user_stats
from dba_tables
where table_name = 'AM21';
NUM_ROWS BLOCKS AVG_ROW_LEN T USE
---------- ---------- ----------- - ---
Y NO
exec dbms_stats.gather_table_stats('DATASTAGE', 'TMP_GL_VALUE_DIFF');
*
ERROR at line 1:
ORA-20000: Gathering statistics for a temporary table is not supported
exec dbms_stats.set_table_stats(ownname => 'SYS', tabname => 'AM21', numrows => 3000, numblks => 300, avgrlen => 50);
select num_rows, blocks, avg_row_len, temporary, user_stats
from dba_tables
where table_name = 'AM21';
NUM_ROWS BLOCKS AVG_ROW_LEN T USE
---------- ---------- ----------- - ---
3000 300 50 Y YES
- DBMS_STATS.GATHER_INDEX_STATS
DBMS_STATS.GATHER_INDEX_STATS
gathers statistics for indexes. Index statistics cannot be generated in
parallel. In addition, the block sampling option available in tables is
not
available for indexes. Syntax:
dbms_stats.gather_index_stats(ownname, indname, partname,
estimate_percent, stattab, statid, statown);
e.g.:
exec dbms_stats.gather_index_stats (ownname
=> 'INV', indname => 'MTL_SYSTEM_ITEMS_JHN99');
- DBMS_STATS.GATHER_SCHEMA_STATS
DBMS_STATS.GATHER_SCHEMA_STATS
gathers statistics for a given schema. Syntax:
dbms_stats.gather_schema_stats(ownname, estimate_percent, block_sample,
method_opt, degree, granularity, cascade, stattab, statid, options,
objlist, statown);
e.g.:
List the objects in a schema that currently have no statistics, using the LIST EMPTY option.
declare
l_owner varchar2(30) := 'QP';
l_emptylst dbms_stats.objecttab;
begin
dbms_stats.gather_schema_stats(ownname => l_owner,
options => 'LIST EMPTY', objlist => l_emptylst);
for i in nvl(l_emptylst.first, 0) .. nvl(l_emptylst.last, 0) loop
dbms_output.put_line(l_emptylst(i).objtype || '/' || l_emptylst(i).objname);
end loop;
end;
/
- DBMS_STATS.GATHER_DATABASE_STATS
DBMS_STATS.GATHER_DATABASE_STATS
gathers statistics for the complete database. In 8i, this will generate
statistics for the SYS schema also. This has been rectified in Oracle
9i. For
Oracle 8i, an alternative is to generate statistics for individual
schemas or
delete SYS schema statistics after generating statistics at the
database level. Syntax:
dbms_stats.gather_database_stats(estimate_percent, block_sample,
method_opt, degree, granularity,
cascade, stattab, statid,
options, objlist, statown);
- DBMS_STATS.GET_TABLE_STATS
Get table statistics present in the dictionary. Syntax:
dbms_stats.get_table_stats(ownname, tabname, partname, stattab, statid,
    numrows, numblks, avgrlen, statown);
e.g.:
getting table statistics data.
declare
l_numrows number;
l_numblks number;
l_avgrlen number;
begin
dbms_stats.get_table_stats(ownname => 'SYS', tabname => 'AM01',
numrows => l_numrows, numblks => l_numblks, avgrlen => l_avgrlen);
dbms_output.put_line('No. of rows: ' || l_numrows);
dbms_output.put_line('No. of blks: ' || l_numblks);
dbms_output.put_line('Avg row length: ' || l_avgrlen);
end;
/
No. of rows: 4106860
No. of blks: 6219
Avg row length: 3
- DBMS_STATS.GET_COLUMN_STATS
Get column statistics present in the dictionary. Syntax:
dbms_stats.get_column_stats(ownname, tabname, colname, partname, stattab,
    statid, distcnt, density, nullcnt, srec, avgclen, statown);
e.g.:
getting statistics for a column.
declare
l_distcnt number;
l_density number;
l_nullcnt number;
l_srec dbms_stats.statrec;
l_avgclen number;
begin
dbms_stats.get_column_stats(ownname => 'SYS', tabname => 'AM01',
colname => 'COL1', distcnt => l_distcnt, density => l_density,
nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen);
dbms_output.put_line('No. of distinct values: ' || l_distcnt);
dbms_output.put_line('Density: ' || l_density);
dbms_output.put_line('Count of nulls: ' || l_nullcnt);
dbms_output.put_line('Avg. column length: ' || l_avgclen);
end;
/
No. of distinct values: 2
Density: .5
Count of nulls: 0
Avg. column length: 3
- DBMS_STATS.GET_INDEX_STATS
Get index statistics present in the dictionary. Syntax:
dbms_stats.get_index_stats(ownname, indname, partname, stattab, statid,
numrows, numlblks, numdist, avglblk, avgdblk,
clstfct, indlevel, statown);
e.g.: getting an index statistics.
declare
l_numrows number;
l_numlblks number;
l_numdist number;
l_avglblk number;
l_avgdblk number;
l_clstfct number;
l_indlevel number;
begin
dbms_stats.get_index_stats(ownname => 'SYS', indname => 'AM01_N1',
numrows => l_numrows, numlblks => l_numlblks,
numdist => l_numdist, avglblk => l_avglblk,
avgdblk => l_avgdblk, clstfct => l_clstfct,
indlevel => l_indlevel);
dbms_output.put_line('No. of rows: ' || l_numrows);
dbms_output.put_line('No. of blks: ' || l_numlblks);
dbms_output.put_line('No. of distinct values: ' || l_numdist);
dbms_output.put_line('Avg leaf blocks for distinct keys: ' || l_avglblk);
dbms_output.put_line('Avg data blocks pointed to in the table: ' || l_avgdblk);
dbms_output.put_line('Clustering factor: ' || l_clstfct);
dbms_output.put_line('Index height: ' || l_indlevel);
end;
/
No. of rows: 3819711
No. of blks: 11092
No. of distinct values: 1
Avg leaf blocks for distinct keys: 11092
Avg data blocks pointed to in the table: 14616
Clustering factor: 14616
Index height: 2
DBMS_STATS also includes routines for gathering statistics and storing them outside the dictionary; statistics stored this way do not influence the optimizer. Most of the procedures in this package, as mentioned above, have three common parameters - STATID, STATTAB and STATOWN - that are related to user processing of statistics.
Advantages of this feature:
1. Estimated statistics at different percentages could be stored and used for testing.
2. Statistics generated on one database could be transferred to another database.
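A sketch of the transfer workflow (the statistics table name STATS_COPY is arbitrary, and moving the table between databases is assumed to be done with a tool such as exp/imp):

```sql
-- On the source database: save schema statistics into a user table.
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'SCOTT', stattab => 'STATS_COPY');
EXEC DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'SCOTT', stattab => 'STATS_COPY');

-- Move SCOTT.STATS_COPY to the target database (e.g. with exp/imp),
-- then on the target database load the statistics into the dictionary:
EXEC DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'SCOTT', stattab => 'STATS_COPY');
```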
Automated Implementation Example
The following example uses the scott/tiger tables to demonstrate how to implement automated table monitoring and stale statistics gathering.
-- Perform a quick analyze to load in base statistics
-- (for illustrative purposes only!)
begin
DBMS_STATS.GATHER_SCHEMA_STATS(
ownname=>'scott',
estimate_percent=>null, -- This table is small, we'll compute
block_sample=>false,
method_opt=>'FOR ALL COLUMNS',
degree=>null, -- No parallelism used in this example
granularity=>'ALL',
cascade=>true,
options=>'GATHER'
);
end;
/
-- Examine the current statistics
select table_name, num_rows, blocks, avg_row_len
from user_tables where table_name='EMP';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
EMP                                    14          1          37
-- Turn monitoring on for the EMP table
alter table emp monitoring;
Table altered.
--
-- EXTRA NOTES:
--
-- I could also choose to turn on monitoring for all of the tables within
-- Scott's schema by simply calling:
--   begin
--     dbms_stats.ALTER_SCHEMA_TAB_MONITORING('scott',true);
--   end;
--
-- Or even at the database level with:
--   begin
--     dbms_stats.alter_database_tab_monitoring(
--       monitoring=>true,
--       sysobjs=>false);  -- Don't set to true, see note below!
--   end;
--
-- Note: Although the option to collect statistics for SYS objects is
-- available via ALTER_DATABASE_TAB_MONITORING, Oracle continues to
-- recommend against this practice until the next major release after
-- 9i Release 2.
-- Verify that monitoring is turned on
-- (noting the fact that we didn't actually run the other
-- two monitoring statements in the notes above (commented out))
select table_name, monitoring
from user_tables
order by 2;

TABLE_NAME                     MON
------------------------------ ---
BONUS                          NO
DEPT                           NO
DUMMY                          NO
SALGRADE                       NO
EMP                            YES

6 rows selected.
-- Add some rows
insert into emp select * from emp;
14 rows created.
commit;
Commit complete.
-- Wait until the monitoring data is flushed (after timeout or db shutdown)
-- or if you're really impatient, run the following (as a DBA-privileged account):
exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.
-- As scott/tiger, let's check USER_TAB_MODIFICATIONS to see what it has collected:
select * from user_tab_modifications;
TABLE_NAME PARTITION_NAME SUBPARTITION_NAME INSERTS UPDATES DELETES TIMESTAMP TRUNCATED
---------- -------------- ----------------- ------- ------- ------- --------- ---------
EMP 14 0 0 05-JUN-02 NO
-- Execute DBMS_STATS to gather stats on all stale tables
begin
DBMS_STATS.GATHER_SCHEMA_STATS(
ownname=>'scott',
estimate_percent=>null, -- This table is small, we'll compute
block_sample=>false,
method_opt=>'FOR ALL COLUMNS',
degree=>null, -- No parallelism used in this example
granularity=>'ALL',
cascade=>true,
options=>'GATHER STALE');
end;
/
-- Verify that the table is no longer listed in USER_TAB_MODIFICATIONS
select * from user_tab_modifications;
no rows selected.
-- Examine some of the new statistics collected
select table_name, num_rows, blocks, avg_row_len
from user_tables where table_name='EMP';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
EMP                                    28          1          37