Materialized

All about Materialized Views

Materialized views are a data warehousing/decision support system tool that can increase by many orders of magnitude the speed of queries that access a large number of records. In basic terms, they allow a user to query potentially terabytes of detail data in seconds. They accomplish this by transparently using pre-computed summarizations and joins of data. These pre-computed summaries would typically be very small compared to the original source data. When utilizing materialized views, some decisions need to be made up front. These decisions depend on the frequency of updates, how stale the data can get between updates, or whether the view is to be updated at all. With most Oracle features, the more current you want things, the more resources they’ll utilize. You can decide to populate the materialized view on creation or upon the first access of the materialized view.
Other decisions are about the refresh frequency.
There are four options to consider here: complete, fast, force, and never. The terms are rather descriptive as to the name of the option. The “complete” option rebuilds the materialized view data in its entirely. “Fast” requires a materialized view log (covered later in this section) that tracks the changes to the underlying tables. “Force” also requires a materialized view log, as this option will try to do a fast refresh; otherwise, it will do a complete refresh. The “never” option doesn’t do any rebuilds or refreshes at all. You also need to decide whether to perform the refresh when data is committed to the data source (using the “refresh on commit” option); otherwise, you’d get the “refresh on demand” option where you’d then have to periodically run the DBMS_MVIEW.REFRESH procedure.

In order to use materialized views, the Oracle DBA must set special initialization parameters and grant special authority to the users of materialized views. You start by setting these initialization parameters within Oracle to enable the mechanisms for materialized views and query rewrite
optimizer_mode = choose, first_rows, or all_rows
job_queue_interval = 3600
job_queue_processes = 1
query_rewrite_enabled = true
compatible = 8.1.5.0.0 (or greater)
query_rewrite_integrity = enforced

QUERY REWRITE INTEGRITY
This parameter controls how Oracle rewrites queries and may be set to one of three values:
ENFORCED - Queries will be rewritten using only constraints and rules that are enforced and guaranteed by Oracle. There are mechanisms by which we can tell Oracle about other inferred relationships, and this would allow for more queries to be rewritten, but since Oracle does not enforce those relationships, it would not make use of these facts at this level.
TRUSTED - Queries will be rewritten using the constraints that are enforced by Oracle, as well as any relationships existing in the data that we have told Oracle about, but are not enforced by the database.
STALE TOLERATED - Queries will be rewritten to use materialized views even if Oracle knows the data contained in the materialized view is ' stale ' (out-of-sync with the details). This might be useful in an environment where the summary tables are refreshed on a recurring basis, not on commit, and a slightly out-of-sync answer is acceptable.

The needed privileges are as follows:

CREATE SESSION

CREATE TABLE

CREATE MATERIALIZED VIEW

QUERY REWRITE

Finally, you must be using the Cost Based Optimizer CBO in order to make use of query rewrite. If you do not use the CBO, query rewrite will not take place.

The example will demonstrate what a materialized view entails. The concept is that of reducing the execution time of a long running query transparently, by summarizing data in the database. A query against a large table will be transparently rewritten into a query against a very small table, without any loss of accuracy in the answer. For the example we create our own big table based on the system view ALL_OBJECTS.

Prepare the large table BIGTAB:

sqlplus scott/tiger
set echo on
set termout off

drop table bigtab;

create table bigtab
nologging
as
select * from all_objects
union all
select * from all_objects
union all
select * from all_objects;

insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;

analyze table bigtab compute statistics;
select count(*) from bigtab;

COUNT(*)
----------
708456

Run query against this BIGTABLE

Initially this quewry will require a full scan of the large table.

set autotrace on
set timing on
select owner, count(*) from bigtab group by owner;

OWNER                            COUNT(*)
------------------------------ ----------
CTXSYS                               6264
ELAN                                 1272
HR                                    816
MDSYS                                5640
ODM                                  9768
ODM_MTR                               288
OE                                   2064
OLAPSYS                             10632
ORDPLUGINS                            696
ORDSYS                              23232
OUTLN                                 168
PM                                    216
PUBLIC                             278184
QS                                    984
QS_ADM                                168
QS_CBADM                              576
QS_CS                                 552
QS_ES                                 936
QS_OS                                 936
QS_WS                                 936
SCOTT                                 264
SH                                   4176
SYS                                324048
SYSTEM                              15096
TEST                                 4536
WKSYS                                6696
WMSYS                                3072
XDB                                  6240

28 rows selected.

Elapsed: 00:00:07.06

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2719 Card=28 Bytes=140)
   1    0   SORT (GROUP BY) (Cost=2719 Card=28 Bytes=140)
   2    1     TABLE ACCESS (FULL) OF 'BIGTAB'
             (Cost=1226 Card=708456 Bytes=3542280)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      19815 consistent gets
      18443 physical reads
          0 redo size
        973 bytes sent via SQL*Net to client
        510 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
         28 rows processed

In order to get the aggregate count, we must count 700'000+ records on over 19800 blocks. If you need this summary often per day, you can avoid counting the details each and every time by creating a materialized view of this summary data.

Create the Materialized View

sqlplus scott/tiger

grant query rewrite to scott;
alter session set query_rewrite_enabled=true;
alter session set query_rewrite_integrity=enforced;

create materialized view mv_bigtab
build immediate
refresh on commit
enable query rewrite
as
select owner, count(*)
from bigtab
group by owner;

analyze table mv_bigtab compute statistics;

Basically, what we've done is pre-calculate the object count, and define this summary information as a materialized view. We have asked that the view be immediately built and populated with data. You'll notice that we have also specified REFRESH ON COMMIT and ENABLE QUERY REWRITE. Also notice that we may have created a materialized view, but when we ANALYZE, we are analyzing a table. A materialized view creates a real table, and this table may be indexed, analyzed, and so on.

Now let's see the materialized view in action by issuing the same query again

set timing on
set autotrace traceonly
select owner, count(*)
from bigtab
group by owner;
set autotrace off
set timing off

28 rows selected.

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2 Card=28 Bytes=252)
   1    0   TABLE ACCESS (FULL) OF 'MV_BIGTAB'
           (Cost=2 Card=28 Bytes=252)

Statistics
----------------------------------------------------------
         11 recursive calls
          0 db block gets
         17 consistent gets
          0 physical reads
          0 redo size
        973 bytes sent via SQL*Net to client
        510 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          4 sorts (memory)
          0 sorts (disk)
         28 rows processed

No physical I/O this time around as the data was found in the cache. Our buffer cache will be much more efficient now as it has less to cache. W could not even begin to cache the previous query's working set, but now I can. Notice how our query plan shows we are now doing a full scan of the MV_BIGTAB table, even though we queried the detail table BIGTAB. When the SELECT OWNER, ... query is issued, the database automatically directs it to the materialized view.

Now, add a new row to the BIGTAB table and commit te change

insert into bigtab
(owner, object_name, object_type, object_id)
values ('Martin', 'Zahn', 'Akadia', 1111111);

commit;

set timing on
set autotrace traceonly
select owner, count(*)
from bigtab
where owner = 'Martin'
group by owner;
set autotrace off
set timing off

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
          (Cost=2 Card=1 Bytes=9)
   1    0   TABLE ACCESS (FULL) OF 'MV_BIGTAB'
           (Cost=2 Card=1 Bytes=9)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          4 consistent gets
          0 physical reads
          0 redo size
        439 bytes sent via SQL*Net to client
        499 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The analysis shows that we scanned the materialized view MV_BIGTAB and found the new row. By specifying REFRESH ON COMMIT in our original definition of the view, we requested that Oracle maintain synchronization between the view and the details, the summary will be maintained as well.

This is relatively straightforward and is answered in a single word - performance. By calculating the answers to the really hard questions up front (and once only), we will greatly reduce the load on the machine, We will experience:

Less physical reads - There is less data to scan through.

Less writes - We will not be sorting/aggregating as frequently.

Decreased CPU consumption - We will not be calculating aggregates and functions on the data, as we will have already done that.

Markedly faster response times - Our queries will return incredibly quickly when a summary is used, as opposed to the details. This will be a function of the amount of work we can avoid by using the materialized view, but many orders of magnitude is not out of the question.

Materialized views will increase your need for one resource - more permanently allocated disk. We need extra storage space to accommodate the materialized views, of course, but for the price of a little extra disk space, we can reap a lot of benefit.
Materialized views work best in a read-only, or read-intensive environment. They are not designed for use in a high-end OLTP environment. They will add overhead to modifications performed on the base tables in order to capture the changes.
There are concurrency issues with regards to rising the REFRESH ON COMMIT option. Consider the summary example from before. Any rows that are inserted or deleted from this table will have to update one of 28 rows in the summary table in order to maintain the count in real time. This does not preclude the use of materialized views in an OLTP environment. For example if you use full refreshes on a recurring basis (during off-peak time) there will be no overhead added to the modifications, and there would be no concurrency issues. This would allow you to report on yesterday's activities, for example, and not query the live OLTP data for reports.

Refreshing materialized views
In Oracle9i, if you specify REFRESH FAST for a single-table aggregate materialized view, you must have created a materialized view log for the underlying table, or the refresh command will fail. When creating a materialized view, you have the option of specifying whether the refresh occurs manually (ON DEMAND) or automatically (ON COMMIT, DBMS_JOB). To use the fast warehouse refresh facility, you must specify the ON DEMAND mode. To refresh the materialized view, call one of the procedures in DBMS_MVIEW.
The DBMS_MVIEW package provides three types of refresh operations:
    * DBMS_MVIEW.REFRESH—Refreshes one or more materialized views
    * DBMS_MVIEW.REFRESH_ALL_MVIEWS—Refreshes all materialized views
    * DBMS_MVIEW.REFRESH_DEPENDENT—Refreshes all table-based materialized views

Materialized views may appear to be hard to work with at first. There will be cases where you create a materialized view, and you know that the view holds the answer to a certain question but, for some reason, Oracle does not. The more meta data provided, the more pieces of information about the underlying data you can give to Oracle, the better.
So, now that we can create a materialized view and show that it works, what are the steps Oracle will undertake to rewrite our queries? Normally, when QUERY REWRITE ENABLED is set to FALSE, Oracle will take your SQL as is, parse it, and optimize it. With query rewrites enabled, Oracle will insert an extra step into this process. After parsing, Oracle will attempt to rewrite the query to access some materialized view, instead of the actual table that it references. If it can perform a query rewrite, the rewritten query (or queries) is parsed and then optimized along with the original query. The query plan with the lowest cost from this set is chosen for execution. If it cannot rewrite the query, the original parsed query is optimized and executed as normal.

Manual complete refresh
A complete refresh occurs when the materialized view is initially defined, unless it references a prebuilt table, and a complete refresh may be requested at any time during the life of the materialized view. Because the refresh involves reading the detail table to compute the results for the materialized view, this can be a very time-consuming process, especially if huge amounts of data need to be read and processed.

Manual fast (incremental) refresh
If you specify REFRESH FAST (which means that only deltas performed by UPDATE, INSERT, DELETE on the base tables will be refreshed), Oracle performs further verification of the query definition to ensure that fast refresh can always be performed if any of the detail tables change. These additional checks include the following:
    * A materialized view log must be present for each detail table.
    * The RowIDs of all the detail tables must appear in the SELECT list of the MVIEW query definition.
    * If there are outer joins, unique constraints must be placed on the join columns of the inner table.
You can use the DBMS_MVIEW package to manually invoke either a fast refresh or a complete refresh, where F equals Fast Refresh and C equals Complete Refresh:
    EXECUTE DBMS_MVIEW.REFRESH('emp_dept_sum','F');

Automatic fast refresh of materialized views
The automatic fast refresh feature is completely new in Oracle9i, so you can refresh a snapshot with DBMS_JOB in a short interval according to the snapshot log. With Oracle 9i, it's possible to refresh automatically on the next COMMIT performed at the master table. This ON COMMIT refreshing can be used with materialized views on single-table aggregates and materialized views containing joins only. ON COMMIT MVIEW logs must be built as ROWID logs, not as primary-key logs. For performance reasons, it's best to create indexes on the ROWIDs of the MVIEW. Note that the underlying table for the MVIEW can be prebuilt. Example of a materialized view with an ON COMMIT refresh:

CREATE MATERIALIZED VIEW empdep
ON PREBUILT TABLE
REFRESH FAST ON COMMIT
ENABLE QUERY REWRITE
AS SELECT empno, ename, dname, loc,
            e.rowid emp_rowid,
            d.rowid dep_rowid
    FROM emp e, dept d
   WHERE e.deptno = d.deptno;

Monitoring materialized views
Oracle provides information in the data dictionary to monitor the behavior of materialized views. When you’re monitoring materialized views, it’s critical that you check the refresh interval in the dba_jobs view. Here is a SQL statement to check the generated job status for materialized views:
Monitoring materialized views

Oracle provides information in the data dictionary to monitor the behavior of materialized views. When you’re monitoring materialized views, it’s critical that you check the refresh interval in the dba_jobs view. Here is a SQL statement to check the generated job status for materialized views:
SELECT
SUBSTR(job,1,4) "Job", SUBSTR(log_user,1,5)          "User",
SUBSTR(schema_user,1,5)                              "Schema",
SUBSTR(TO_CHAR(last_date,'DD.MM.YYYY HH24:MI'),1,16) "Last Date",
SUBSTR(TO_CHAR(next_date,'DD.MM.YYYY HH24:MI'),1,16) "Next Date",
SUBSTR(broken,1,2) "B", SUBSTR(failures,1,6)        "Failed",
SUBSTR(what,1,20)                                    "Command"
FROM dba_jobs;

Job User Schem Last Date         Next Date       B Fail Command
---- ----- ----- ---------------- ---------------- - ---- -------------
90   SCOTT SCOTT 28.01.2000 11:33 28.01.2000 13:33 N 0    dbms_refresh.refresh

Summary table management, another term for the materialized view, has actually been around for some time in tools such as Oracle Discoverer. If you ran a query in SQL*PLUS, or from your Java JDBC client, then the query rewrite would not (could not) take place. Furthermore, the synchronization between the details (original source data) and the summaries could not be performed or validated for you automatically, since the tool ran outside the database.
Furthermore, since version 7.0, the Oracle database itself has actually implemented a feature with many of the characteristics of summary tables - the Snapshot. This feature was initially designed to support replication, but many would use it to ' pre-answer ' large queries. So, we would have snapshots that did not use a database link to replicate data from database to database, but rather just summarized or pre-joined frequently accessed data. This was good, but without any query rewrite capability, it was still problematic. The application had to know to use the summary tables in the first place, and this made the application more complex to code and maintain. If we added a new summary then we would have to find the code that could make use of it, and rewrite that code.
In Oracle 8.1.5 Oracle took the query rewriting capabilities from tools like Discoverer, the automated refresh and scheduling mechanisms from snapshots (that makes the summary tables ' self maintaining ' ), and combined these with the optimizer's ability to find the best plan out of many alternatives. This produced the materialized view.
With all of this functionality centralized in the database, now every application can take advantage of the automated query rewrite facility, regardless of whether access to the database is via SQL*PLUS, JDBC, ODBC, Pro*C, OCI, or some third party tool. Every Oracle 8i enterprise database can have summary table management. Also, since everything takes place inside the database, the details can be easily synchronized with the summaries, or at least the database knows when they aren't synchronized, and might bypass stale summaries.