Types of Materialized Views
The SELECT clause in the materialized view creation statement defines the data that the materialized view is to contain. There are only a few restrictions on what may be specified. Any number of tables can be joined together, however, they cannot be remote tables if you wish to take advantage of query rewrite or the warehouse refresh facility (part of the DBMS_OLAP package). Besides tables, views, inline views, subqueries, and materialized views may all be joined or referenced in the SELECT clause.
The types of materialized views are:
- Materialized Views with Joins and Aggregates
- Single-Table Aggregate Materialized Views
- Materialized Views Containing Only Joins
Materialized Views with Joins and Aggregates
In data warehouses, materialized views would normally contain one of the aggregates shown in Example 2 below. For fast refresh to be possible, the SELECT list must contain all of the GROUP BY columns (if present), and may contain one or more aggregate functions. The aggregate function must be one of: SUM, COUNT(x), COUNT(*),COUNT(DISTINCT x), AVG, VARIANCE, STDDEV, MIN, and MAX, and the expression to be aggregated can be any SQL value expression.
If a materialized view contains joins and aggregates, then it cannot be fast refreshed using a materialized view log. Therefore, for a fast refresh to be possible, only new data can be added to the detail tables and it must be loaded using the direct path method.
Here are some examples of the type of materialized view that can be created.
Create Materialized View: Example 1
CREATE MATERIALIZED VIEW store_sales_mv PCTFREE 0 TABLESPACE mviews STORAGE (initial 16k next 16k pctincrease 0) BUILD DEFERRED REFRESH COMPLETE ON DEMAND ENABLE QUERY REWRITE AS SELECT s.store_name, SUM(dollar_sales) AS sum_dollar_sales FROM store s, fact f WHERE f.store_key = s.store_key GROUP BY s.store_name;
Example 1 creates a materialized view store_sales_mv that computes the sum of sales by store. It is derived by joining the tables store and fact on the column store_key. The materialized view does not initially contain any data because the build method is DEFERRED. A complete refresh is required for the first refresh of a build deferred materialized view. When it is refreshed, a complete refresh is performed and, once populated, this materialized view can be used by query rewrite.
Create Materialized View: Example 2
CREATE MATERIALIZED VIEW store_stdcnt_mv PCTFREE 0 TABLESPACE mviews STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0) BUILD IMMEDIATE REFRESH FAST ENABLE QUERY REWRITE AS SELECT store_name, t.time_key, STDDEV(unit_sales) AS stdcnt_unit_sales AVG(unit_sales) AS avgcnt_unit_sales COUNT(unit_sales) AS count_days SUM(unit_sales) AS sum_unit_sales FROM store s, fact f, time t WHERE s.store_key = f.store_key AND f.time_key = t.time_key GROUP BY store_name, t.time_key;
The statement above creates a materialized view store_stdcnt_mv that computes the standard deviation for the number of units sold by a store on a given date. It is derived by joining the tables store, time, and fact on the columns store_key and time_key. The materialized view is populated with data immediately because the build method is immediate and it is available for use by query rewrite. In this example, the default refresh method is FAST, which is allowed because the COUNT and SUM aggregates have been included to support fast refresh of the STDDEV aggregate.
Single-Table Aggregate Materialized Views
A materialized view that contains one or more aggregates (SUM, AVG, VARIANCE, STDDEV, COUNT) and a GROUP BY clause may be based on a single table. The aggregate function can involve an expression on the columns such as SUM(a*b). If this materialized view is to be incrementally refreshed, then a materialized view log must be created on the detail table with the INCLUDING NEW VALUES option, and the log must contain all columns referenced in the materialized view query definition.
CREATE MATERIALIZED VIEW log on fact with rowid (store_key, time_key, dollar_sales, unit_sales) including new values; CREATE MATERIALIZED VIEW sum_sales PARALLEL BUILD IMMEDIATE REFRESH FAST ON COMMIT AS SELECT f.store_key, f.time_key, COUNT(*) AS count_grp, SUM(f.dollar_sales) AS sum_dollar_sales, COUNT(f.dollar_sales) AS count_dollar_sales, SUM(f.unit_sales) AS sum_unit_sales, COUNT(f.unit_sales) AS count_unit_sales FROM fact f GROUP BY f.store_key, f.time_key;
In this example, a materialized view has been created which contains aggregates on a single table. Because the materialized view log has been created, the materialized view is fast refreshable. If DML is applied against the fact table, then, when the commit is issued, the changes will be reflected in the materialized view.
Table 8-1 illustrates the aggregate requirements for a single-table aggregate materialized view.
Table 8-1 Single-Table Aggregate Requirements
Note that COUNT(*) must always be present.
Incremental refresh for a single-table aggregate materialized view is possible after any type of DML to the base tables (direct load or conventional INSERT, UPDATE, or DELETE).
A single-table aggregate materialized view can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on one of the materialized view's detail tables.
After a refresh ON COMMIT, you are urged to check the alert log and trace files to see if any error occurred during the refresh.
Materialized Views Containing Only Joins
Materialized views may contain only joins and no aggregates, such as in the next example where a materialized view is created which joins the fact table to the store table. The advantage of creating this type of materialized view is that expensive joins will be precalculated.
Incremental refresh for a materialized view containing only joins is possible after any type of DML to the base tables (direct load or conventional INSERT, UPDATE, or DELETE).
A materialized view containing only joins can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on the materialized view's detail table.
If you specify REFRESH FAST, Oracle performs further verification of the query definition to ensure that fast refresh can be performed if any of the detail tables change. These additional checks include:
- A materialized view log must be present for each detail table.
- The rowids of all the detail tables must appear in the SELECT list of the materialized view query definition.
- If there are outer joins, unique constraints must exist on the join columns of the inner table.
For example, if you are joining the fact and a dimension table and the join is an outer join with the fact table being the outer table, there must exist unique constraints on the join columns of the dimension table.
If some of the above restrictions are not met, then the materialized view should be created as REFRESH FORCE to take advantage of incremental refresh when it is possible. If the materialized view is created as ON COMMIT, Oracle performs all of the fast refresh checks. If one of the tables did not meet all of the criteria, but the other tables did, the materialized view would still be incrementally refreshable with respect to the other tables for which all the criteria are met.
In a data warehouse star schema, if space is at a premium, you can include the rowid of the fact table only because this is the table that will be most frequently updated, and the user can specify the FORCE option when the materialized view is created.
A materialized view log should contain the rowid of the master table. It is not necessary to add other columns.
To speed up refresh, it is recommended that the user create indexes on the columns of the materialized view that stores the rowids of the fact table.
CREATE MATERIALIZED VIEW LOG ON fact WITH ROWID; CREATE MATERIALIZED VIEW LOG ON time WITH ROWID; CREATE MATERIALIZED VIEW LOG ON store WITH ROWID; CREATE MATERIALIZED VIEW detail_fact_mv PARALLEL BUILD IMMEDIATE REFRESH FAST AS SELECT f.rowid "fact_rid", t.rowid "time_rid", s.rowid "store_rid", s.store_key, s.store_name, f.dollar_sales, f.unit_sales, f.time_key FROM fact f, time t, store s WHERE f.store_key = s.store_key(+) AND f.time_key = t.time_key(+);
In this example, in order to perform a REFRESH FAST, unique constraints should exist on s.store_key and t.time_key. It is also recommended that indexes be created on the columns fact_rid, time_rid, and store_rid, as illustrated below, which will improve the performance of refresh.
CREATE INDEX mv_ix_factrid ON detail_fact_mv(fact_rid);
Alternatively, if the example shown above did not include the columns time_rid and store_rid, and if the refresh method was REFRESH FORCE, then this materialized view would be fast refreshable only if the fact table was updated but not if the tables time or store were updated.
CREATE MATERIALIZED VIEW detail_fact_mv PARALLEL BUILD IMMEDIATE REFRESH FORCE AS SELECT f.rowid "fact_rid", s.store_key, s.store_name, f.dollar_sales, f.unit_sales, f.time_key FROM fact f, time t, store s WHERE f.store_key = s.store_key(+) AND f.time_key = t.time_key(+);
Creating a Materialized View
A materialized view can be created with the CREATE MATERIALIZED VIEW statement or using Oracle Enterprise Manager. The following command creates the materialized view store_sales_mv.
CREATE MATERIALIZED VIEW store_sales_mv PCTFREE 0 TABLESPACE mviews STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0) PARALLEL BUILD IMMEDIATE REFRESH COMPLETE ENABLE QUERY REWRITE AS SELECT s.store_name, SUM(dollar_sales) AS sum_dollar_sales FROM store s, fact f WHERE f.store_key = s.store_key GROUP BY s.store_name;
It is not uncommon in a data warehouse to have already created summary or aggregation tables, and the DBA may not wish to repeat this work by building a new materialized view. In this instance, the table that already exists in the database can be registered as a prebuilt materialized view. This technique is described in "Registration of an Existing Materialized View".
Once you have selected the materialized views you want to create, follow the steps below for each materialized view.
- Do the physical design of the materialized view (existing user-defined materialized views do not require this step). If the materialized view contains many rows, then, if appropriate, the materialized view should be partitioned by a time attribute (if possible) and should match the partitioning of the largest or most frequently updated detail or fact table (if possible). Refresh performance generally benefits from a large number of partitions because it can take advantage of the parallel DML capabilities in Oracle.
- Use the CREATE MATERIALIZED VIEW statement to create and, optionally, populate the materialized view. If a user-defined materialized view already exists, then use the PREBUILT clause in the CREATE MATERIALIZED VIEW statement. Otherwise, use the BUILD IMMEDIATE clause to populate the materialized view immediately, or the BUILD DEFERRED clause to populate the materialized view at a more convenient time. The materialized view is disabled for use by query rewrite until the first REFRESH, after which it will be automatically enabled, provided the ENABLE QUERY REWRITE clause has been specified.
See Also: See Oracle8i SQL Reference for descriptions of the SQL statements CREATE MATERIALIZED VIEW, ALTER MATERIALIZED VIEW, ORDER BY, and DROP MATERIALIZED VIEW.
Naming
The name of a materialized view must conform to standard Oracle naming conventions. However, if the materialized view is based on a user-defined prebuilt table, then the name of the materialized view must exactly match that table name.
If you already have a naming convention for tables and indexes, you may consider extending this naming scheme to the materialized views so that they are easily identifiable. For example, instead of naming the materialized view sum_of_sales, it could be called sum_of_sales_mv to denote that this is a materialized view and not a table or view, for instance.
Storage Characteristics
Unless the materialized view is based on a user-defined prebuilt table, it requires and occupies storage space inside the database. Therefore, the storage needs for the materialized view should be specified in terms of the tablespace where it is to reside and the size of the extents.
If you do not know how much space the materialized view will require, then the DBMS_OLAP.ESTIMATE_SIZE package, which is described in Chapter 15, "Summary Advisor", can provide an estimate on the number of bytes required to store this materialized view. This information can then assist the design team in determining the tablespace in which the materialized view should reside.
Build Methods
Two build methods are available for creating the materialized view, as shown in the following table. If you select BUILD IMMEDIATE, the materialized view definition is added to the schema objects in the data dictionary, and then the fact or detail tables are scanned according to the SELECT expression and the results are stored in the materialized view. Depending on the size of the tables to be scanned, this build process can take a considerable amount of time.
An alternative approach is to use the BUILD DEFERRED clause, which creates the materialized view without data, thereby enabling it to be populated at a later date using the DBMS_MVIEW.REFRESH package described in Chapter 14, "Loading and Refreshing".
Build Method | Description |
---|---|
BUILD DEFERRED |
Create the materialized view definition but do not populate it with data. |
BUILD IMMEDIATE |
Create the materialized view and then populate it with data. |
Used for Query Rewrite
Even though a materialized view is defined, it will not automatically be used by the query rewrite facility. The clause ENABLE QUERY REWRITE must be specified if the materialized view is to be considered available for rewriting queries.
If this clause is omitted or specified as DISABLE QUERY REWRITE when the materialized view is initially created, the materialized view can subsequently be enabled for query rewrite with the ALTER MATERIALIZED VIEW statement.
If you define a materialized view as BUILD DEFERRED, it is also not eligible for query rewrite until it is populated with data.
Query Rewrite Restrictions
Query rewrite is not possible with all materialized views. If query rewrite is not occurring when expected, check to see if your materialized view satisfies all of the following conditions.
Materialized View Restrictions
- There cannot be non-repeatable expressions (ROWNUM, SYSDATE, non-repeatable PL/SQL functions, and so on) anywhere in the defining query.
- There cannot be references to RAW or LONG RAW datatypes or object REFs.
- The query must be a single-block query, that is, it cannot contain set functions (UNION, MINUS, and so on). However, a materialized view can have multiple query blocks (for example, inline views in the FROM clause and subselects in the WHERE or HAVING clauses).
- If the materialized view was registered as PREBUILT, the precision of the columns must agree with the precision of the corresponding SELECT expressions unless overridden by WITH REDUCED PRECISION.
Query Rewrite Restrictions
- If a query has both local and remote tables, only local tables will be considered for potential rewrite.
- None of the detail tables can be owned by SYS, and the materialized view cannot be owned by SYS.
Non-SQL Text Rewrite Restrictions
- SELECT and GROUP BY lists, if present, must be the same in the query and the materialized view and must contain straight columns, that is, no expressions are allowed in the columns.
- Aggregate operators must occur only as the outermost part of the expression; that is, aggregates such as AVG(AVG(x)) or AVG(x)+AVG(x) are not allowed.
- The WHERE clause must contain only inner or outer equijoins, and they can be connected only by ANDs. No ORs and no selections on single tables are allowed in the WHERE clause.
- HAVING or CONNECT BY clauses are not allowed.
Refresh Options
When you define a materialized view, you can specify its two refresh options: how to refresh and what type of refresh. If unspecified, the defaults are assumed as ON DEMAND and FORCE.
The two refresh execution modes are: ON COMMIT and ON DEMAND. The method you select will affect the type of materialized view that can be defined.
If you think the materialized view did not refresh, check the alert log or trace file.
If a materialized view fails during refresh at COMMIT time, the user has to explicitly invoke the refresh procedure using the DBMS_MVIEW package after addressing the errors specified in the trace files. Until this is done, the view will no longer be refreshed automatically at commit time.
You can specify how you want your materialized views to be refreshed from the detail tables by selecting one of four options: FORCE, COMPLETE, FAST, and NEVER.
Whether the fast refresh option is available will depend upon the type of materialized view. Fast refresh is available for three general classes of materialized views: materialized views with joins only, materialized views with joins and aggregates, and materialized views with aggregates on a single table.
General Restrictions on Fast Refresh
The materialized view's defining query is restricted as follows:
- The FROM list must contain base tables only (that is, no views).
- It cannot contain references to non-repeating expressions like SYSDATE and ROWNUM.
- It cannot contain references to RAW or LONG RAW data types.
- It cannot contain HAVING or CONNECT BY clauses.
- The WHERE clause can contain only joins and they must be equi-joins (inner or outer) and all join predicates must be connected with ANDs. No selection predicates on individual tables are allowed.
- It cannot have subqueries, inline views, or set functions like UNION or MINUS.
Restrictions on Fast Refresh on Materialized Views with Joins Only
Defining queries for materialized views with joins only and no aggregates have these restrictions on fast refresh:
- All restrictions from "General Restrictions on Fast Refresh".
- They cannot have GROUP BY clauses or aggregates.
- If the WHERE clause of the query contains outer joins, then unique constraints must exist on the join columns of the inner join table.
- Rowids of all the tables in the FROM list must appear in the SELECT list of the query.
- Materialized view logs must exist with rowids for all the base tables in the FROM list of the query.
- Materialized views from this category are FAST refreshable after DML or Direct Load to the base tables.
Restrictions on Fast Refresh on Materialized Views with Single-Table Aggregates
Defining queries for materialized views with single-table aggregates have these restrictions on fast refresh:
- All restrictions from "General Restrictions on Fast Refresh".
- They can only have a single table.
- The SELECT list must contain all GROUP BY columns.
- Expressions are allowed in the GROUP BY and SELECT clauses provided they are the same.
- They cannot have a WHERE clause.
- COUNT(*) must be present.
- They cannot have a MIN or MAX function.
- For a materialized view with an aggregate with a single table, a materialized view log must exist on the table and must contain all columns referenced in the materialized view. The log must have been created with the INCLUDING NEW VALUES clause.
- If AVG(expr) or SUM(expr) is specified, you must have COUNT(expr).
- If VARIANCE(expr) or STDDEV(expr) is specified, you must have COUNT(expr) and SUM(expr).
Restrictions on Fast Refresh on Materialized Views with Joins and Aggregates:
Defining queries for materialized views with joins and aggregates have these restrictions on fast refresh:
- All restrictions from "General Restrictions on Fast Refresh".
- The WHERE clause can contain inner equi-joins only (that is, no outer joins)
- Materialized views from this category are FAST refreshable after Direct Load to the base tables; they are not FAST refreshable after conventional DML to the base tables.
- Materialized views from this category can have only the ON DEMAND option (that is, no ON COMMIT refresh option is allowed).
ORDER BY
An ORDER BY clause is allowed in the CREATE MATERIALIZED VIEW statement. It is only used during the initial creation of the materialized view. It is not used during a full refresh or an incremental refresh.
To improve the performance of queries against large materialized views, store the rows in the materialized view in the order specified in the ORDER BY clause. This initial ordering provides physical clustering of the data. If indexes are built on the columns by which the materialized view is ordered, accessing the rows of the materialized view using the index will significantly reduce the time for disk I/O due to the physical clustering.
The ORDER BY clause is not considered part of the materialized view definition. As a result, there is no difference in the manner in which Oracle detects the various types of materialized views (for example, materialized join views with no aggregates). For the same reason, query rewrite is not affected by the ORDER BY clause. This feature is similar to the CREATE TABLE ... ORDER BY ... capability that exists in Oracle. For example:
CREATE MATERIALIZED VIEW sum_sales REFRESH FAST ON DEMAND AS SELECT cityid, COUNT(*) count_all, SUM(sales) sum_sales, COUNT(sales) cnt_sales FROM city_sales ORDER BY cityid;
In the above example, we would use the "ORDER BY cityid" clause only during the creation of the materialized view. The materialized view definition is not affected by the ORDER BY clause. The definition is:
SELECT cityid, COUNT(*) count_all, SUM(sales) sum_sales, COUNT(sales) cnt_sales FROM city_sales
Using Oracle Enterprise Manager
A materialized view can also be created using Oracle Enterprise Manager by selecting the materialized view object type. There is no difference in the information required if this approach is used. However, there are three property sheets which must be completed and you need to ensure that the option "Enable Query Rewrite" on the General sheet is selected.
Nested Materialized Views
A nested materialized view is a materialized view whose definition is based on another materialized view. A nested materialized view may reference other relations in the database in addition to materialized views.
Why Use Nested Materialized Views?
In a data warehouse, you typically create many aggregate views on a single join (for example, rollups along different dimensions). Incrementally maintaining these distinct materialized aggregate views can take a long time because the underlying join has to be performed many times. By using nested materialized views, the join is performed just once (while maintaining the materialized view containing joins only) and incremental maintenance of single-table aggregate materialized views is very fast due to the self-maintenance refresh operations on this class of views. Using nested materialized views also overcomes the limitation posed by materialized aggregate views, where incremental maintenance can only be done with direct-load insert.
Rules for Using Nested Materialized Views
You should keep a couple of points in mind when deciding whether to use nested materialized views.
- If you do not need the FAST REFRESH clause, then you can define a nested materialized view.
- Materialized views with joins only and single-table aggregate materialized views can be REFRESH FAST and nested if all the materialized views that they depend on are either materialized join views or single-table aggregate materialized views.
Restrictions when Using Nested Materialized Views
Only nested materialized join views and nested single-table aggregate materialized views can use incremental refresh. If you want complete refresh for all of your materialized views, then you can still nest these materialized views.
Materialized join views and single-table aggregate materialized views can be incrementally refreshed in the presence of arbitrary DML. In addition, the ON COMMIT refresh mode can be used with these types of materialized views. To maximize performance with materialized join views and single-table aggregate materialized views, you should first combine the two together. That is, define a single-table aggregate materialized view on top of a materialized join view. Such a composition yields a materialized aggregate view with respect to the base tables. Thus, logically:
single-table aggregate materialized view (materialized join view (<tables>))
is equivalent to:
materialized view with joins and aggregates(<tables>)
Figure 8-3 Nested Materialized View Equivalents
Figure 8-3 is just one of many possible ways to nest materialized views, but it is likely to be the most frequent and practical. Cyclic dependencies (a materialized view that indirectly references itself) are caught at creation time and an error is generated. Some restrictions are placed on the way you can nest materialized views. Oracle allows nesting a materialized view only when all the immediate dependencies of the materialized view do not have any dependencies amongst themselves. Thus, in the dependency tree, a materialized view can never be a parent as well as a grandparent of an object. For example, Figure 8-4, shows an impermissible materialized view because it is both a parent and grandparent of the same object.
Figure 8-4 Nested Materialized View Restriction
Limitations of Nested Materialized Views
Nested materialized views incur the space overhead of materializing the join and having a materialized view log. This is in contrast to materialized aggregate views where the space requirements of the materialized join view and its log are not demanding, but have relatively long refresh times due to multiple computations of the same join.
Nested materialized views are incrementally refreshable under any type of DML while materialized aggregate views are incrementally refreshable under direct-load insert only.
Example of a Nested Materialized View
You can create a materialized join view or a single-table aggregate materialized view on a single-table on top of another materialized join view, single-table aggregate materialized view, complex materialized view (a materialized view Oracle cannot perform incremental refresh on) or base table. All the underlying objects (be they materialized views or tables) on which the materialized view is defined on must have a materialized view log. All the underlying objects are treated as if they were tables. All the existing options for materialized join views and single-table aggregate materialized views can be used. Thus, ON COMMIT refresh is supported for these types of nested materialized views.
The following presents a retail database with an example schema and some materialized views to illustrate how nested materialized views can be created.
STORE (store_key, store_name, store_city, store_state, store_country) PRODUCT (prod_key, prod_name, prod_brand) TIME (time_key, time_day, time_week, time_month) FACT (store_key, prod_key, time_key, dollar_sales) /* create the materialized view logs */ CREATE MATERIALIZED VIEW LOG ON fact WITH ROWID; CREATE MATERIALIZED VIEW LOG ON store WITH ROWID; CREATE MATERIALIZED VIEW LOG ON time WITH ROWID; CREATE MATERIALIZED VIEW LOG ON product WITH ROWID; /*create materialized join view join_fact_store_time as incrementally refreshable at COMMIT time */ CREATE MATERIALIZED VIEW join_fact_store_time REFRESH FAST ON COMMIT AS SELECT s.store_key, s.store_name, f.dollar_sales, t.time_key, t.time_day, f.prod_key, f.rowid frid, t.rowid trid, s.rowid srid FROM fact f, store s, time t WHERE f.time_key = t.time_key AND f.store_key = s.store_key;
To create a nested materialized view on the table join_fact_store_time, you would have to create a materialized view log on the table. Because this will be a single-table aggregate materialized view on join_fact_store_time, you need to log all the necessary columns and use the INCLUDING NEW VALUES clause.
/* create materialized view log on join_fact_store_time */ CREATE MATERIALIZED VIEW log on join_fact_store_time WITH rowid (store_name, time_day, dollar_sales) INCLUDING new values; /* create the single-table aggregate materialized view sum_sales_store_time on join_fact_store_time as incrementally refreshable at COMMIT time. */ CREATE MATERIALIZED VIEW sum_sales_store_time REFRESH FAST ON COMMIT AS SELECT COUNT(*) cnt_all, SUM(dollar_sales) sum_sales, COUNT(dollar_sales) cnt_sales, store_name, time_day FROM join_fact_store_time GROUP BY store_name, time_day;
Note that the above single-table aggregate materialized view sum_sales_store_time is logically equivalent to a multi-table aggregate on the tables fact, time, and store whose definition is
SELECT COUNT(*) cnt_all, SUM(f.dollar_sales) sum_sales, COUNT(f.dollar_sales) cnt_sales, s.store_name, t.time_day FROM fact f, time t , store s WHERE f.time_key = t.time_key AND f.store_key = s.store_key GROUP BY store_name, time_day;
You can now define a materialized join view join_fact_store_time_prod as a join between join_fact_store_time and table product.
CREATE MATERIALIZED VIEW join_fact_store_time_prod REFRESH FAST ON COMMIT AS SELECT j.rowid jrid, p.rowid prid, j.store_name, j.prod_key, j.prod_name, j.dollar_sales FROM join_fact_store_time j, product p WHERE j.prod_key = p.prod_key;
The above schema can be diagrammatically represented as in Figure 8-5.
Figure 8-5 Nested Materialized View Schema
Nesting Materialized Views with Joins and Aggregates
Materialized views with joins and aggregates can be nested if they are refreshed as COMPLETE REFRESH. Thus, a user can arbitrarily nest materialized views having joins and aggregates. No incremental maintenance is possible for these materialized views.
Note that the ON COMMIT refresh option is not available for complex materialized views. Because you have to invoke the refresh functions manually, ordering has to be taken into account. This is because the refresh for a materialized view that is built on other materialized views will use the current state of the other materialized views, whether they are fresh or not. You can find the dependent materialized views for a particular object using the PL/SQL function GET_MV_DEPENDENCIES() in the DBMS_MVIEW package.
Nested Materialized View Usage Guidelines
Here are some guidelines on how to use nested materialized views:
- If incremental refresh is desired when a materialized view contains joins and aggregates and standard fast refresh cannot be used because DML is occurring on the tables, consider creating a single-table aggregate materialized view nested on a materialized join view.
- If incremental refresh is desired, you should incrementally refresh all the materialized views along any chain. It makes little sense to define an incrementally refreshable materialized view on top of a materialized view that must be refreshed with a complete refresh.
- When using materialized join views and single-table aggregate materialized views, you can define them to be ON COMMIT or ON DEMAND. The choice would depend on the application using the materialized views. If one expects the materialized views to always remain fresh, then all the materialized views should have the ON COMMIT refresh option. If the time window for refresh does not permit refreshing all the materialized views at commit time, then the appropriate materialized views could be created with (or altered to have) the ON DEMAND refresh option.
Registration of an Existing Materialized View
Some data warehouses have implemented materialized views in ordinary user tables. Although this solution provides the performance benefits of materialized views, it does not:
- provide query rewrite to all SQL applications
- enable materialized views defined in one application to be transparently accessed in another application
- generally support fast parallel or fast incremental materialized view refresh
Because of these problems, and because existing materialized views may be extremely large and expensive to rebuild, you should register your existing materialized view tables with Oracle whenever possible. You can register a user-defined materialized view with the CREATE MATERIALIZED VIEW ... ON PREBUILT TABLE statement. Once registered, the materialized view can be used for query rewrites or maintained by one of the refresh methods, or both.
In some cases, user-defined materialized views are refreshed on a schedule that is longer than the update cycle; for example, a monthly materialized view may be updated only at the end of each month, and the materialized view values always refer to complete time periods. Reports written directly against these materialized views implicitly select only data that is not in the current (incomplete) time period. If a user-defined materialized view already contains a time dimension:
- It should be registered and then incrementally refreshed each update cycle.
- A view should be created that selects the complete time period of interest.
For example, if a materialized view was formerly refreshed monthly at the end of each month, then the view would contain the selection WHERE time.month < CURRENT_MONTH().
- The reports should be modified to refer to the view instead of referring directly to the user-defined materialized view.
If the user-defined materialized view does not contain a time dimension, then:
- A new materialized view should be created that does include the time dimension (if possible).
- The view should aggregate over the time column in the new materialized view.
The contents of the table must reflect the materialization of the defining query at the time you register it as a materialized view, and each column in the defining query must correspond to a column in the table that has a matching datatype. However, you can specify WITH REDUCED PRECISION to allow the precision of columns in the defining query to be different from that of the table columns.
The table and the materialized view must have the same name, but the table retains its identity as a table and can contain columns that are not referenced in the defining query of the materialized view. These extra columns are known as unmanaged columns. If rows are inserted during a refresh operation, each unmanaged column of the row is set to its default value. Therefore, the unmanaged columns cannot have NOT NULL constraints unless they also have default values.
Unmanaged columns are not supported by single-table aggregate materialized views or materialized views containing joins only.
Materialized views based on prebuilt tables are eligible for selection by query rewrite provided the parameter QUERY_REWRITE_INTEGRITY is set to at least the level of TRUSTED. See Chapter 19, "Query Rewrite", for details about integrity levels.
When you drop a materialized view that was created on a prebuilt table, the table still exists--only the materialized view is dropped.
When a prebuilt table is registered as a materialized view and query rewrite is desired, the parameter QUERY_REWRITE_INTEGRITY must be set to at least STALE_TOLERATED because, when it is created, the materialized view is marked as unknown. Therefore, only stale integrity modes can be used.
CREATE TABLE sum_sales_tab PCTFREE 0 TABLESPACE mviews STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0) AS SELECT f.store_key SUM(dollar_sales) AS dollar_sales, SUM(unit_sales) AS unit_sales, SUM(dollar_cost) AS dollar_cost FROM fact f GROUP BY f.store_key; CREATE MATERIALIZED VIEW sum_sales_tab ON PREBUILT TABLE WITHOUT REDUCED PRECISION ENABLE QUERY REWRITE AS SELECT f.store_key, SUM(dollar_sales) AS dollar_sales, SUM(unit_sales) AS unit_sales, SUM(dollar_cost) AS dollar_cost FROM fact f GROUP BY f.store_key;
This example illustrates the two steps required to register a user-defined table. First, the table is created, then the materialized view is defined using exactly the same name as the table. This materialized view sum_sales_tab is eligible for use in query rewrite.
Partitioning a Materialized View
Because of the large volume of data held in a data warehouse, partitioning is an extremely useful option that can be used by the database designer.
Partitioning the fact tables improves scalability, simplifies system administration, and makes it possible to define local indexes that can be efficiently rebuilt. See Chapter 5, "Parallelism and Partitioning", for further details about partitioning.
Partitioning a materialized view also has benefits for refresh, since the refresh procedure can use parallel DML to maintain the materialized view. To realize these benefits, the materialized view has to be defined as PARALLEL and parallel DML must be enabled in the session.
When the data warehouse or data mart contains a time dimension, it is often desirable to archive the oldest information, and then reuse the storage for new information, the rolling window scenario. If the fact tables or materialized views include a time dimension and are horizontally partitioned by the time attribute, then management of rolling materialized views can be reduced to a few fast partition maintenance operations provided that the unit of data that is rolled out equals, or is at least aligned with, the range partitions.
If you plan to have rolling materialized views in your warehouse, then you should determine how frequently you plan to perform partition maintenance operations, and you should plan to partition fact tables and materialized views to reduce the amount of system administration overhead required when old data is aged out.
With the introduction of new partitioning options in Oracle8i, you are not restricted to using range partitions. For example, a composite partition using both a time value and, say, a key value could result in an ideal partition solution for your data.
An ideal case for using partitions is when a materialized view contains a subset of the data. For example, this can be achieved by defining an expression of the form WHERE time_key < '1-OCT-1998' in the SELECT expression for the materialized view. However, if a WHERE clause of this type is included, then query rewrite will be restricted to the exact match case, which severely restricts when the materialized view is used. To overcome this problem, use a partitioned materialized view with no WHERE clause and then query rewrite will be able to use the materialized view and it will only search the appropriate partition, thus improving query performance.
There are two approaches to partitioning a materialized view:
Partitioning the Materialized View
Partitioning a materialized view involves defining the materialized view with the standard Oracle partitioning clauses as illustrated in the example below. This example creates a materialized view called part_sales_mv which uses three partitions, is, by default, fast refreshed, and is eligible for query rewrite.
CREATE MATERIALIZED VIEW part_sales_mv PARALLEL PARTITION by RANGE (time_key) ( PARTITION time_key VALUES LESS THAN (TO_DATE('31-12-1997', 'DD-MM-YYYY')) PCTFREE 0 PCTUSED STORAGE (INITIAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf1, PARTITION month2 VALUES LESS THAN (TO_DATE('31-01-1998', 'DD-MM-YYYY')) PCTFREE 0 PCTUSED STORAGE INITIAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf2, PARTITION month3 VALUES LESS THAN (TO_DATE('31-01-1998', 'DD-MM-YYYY')) PCTFREE 0 PCTUSED STORAGE (INITIAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf3) BUILD DEFERRED REFRESH FAST ENABLE QUERY REWRITE AS SELECT f.store_key, f.time_key, SUM(f.dollar_sales) AS sum_dol_sales, SUM(f.unit_sales) AS sum_unit_sales FROM fact f GROUP BY f.time_key, f.store_key;
Partitioning a Prebuilt Table
Alternatively, a materialized view can be registered to a partitioned prebuilt table as illustrated in the following example:
CREATE TABLE part_fact_tab( time_key, store_key, sum_dollar_sales, sum_unit_sale) PARALLEL PARTITION by RANGE (time_key) ( PARTITION month1 VALUES LESS THAN (TO_DATE('31-12-1997', 'DD-MM-YYYY')) PCTFREE 0 PCTUSED 99 STORAGE (INITITAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf1, PARTITIION month2 VALUES LESS THAN (TO_DATE('31-01-1998', 'DD-MM-YYYY')) PCTFREE 0 PCTUSED 99 STORAGE (INITIAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf2, PARTITION month3 VALUES LESS THAN (TO_DATE('31-01-1998', DD-MM-YYYY')) PCTFREE 0 PCTUSED 99 STORAGE INITIAL 64k NEXT 16k PCTINCREASE 0) TABLESPACE sf3) AS SELECT f.time_key, f.store_key, SUM(f.dollar_sales) AS sum_dollar_sales, SUM(f.unit_sales) AS sum_unit_sales FROM fact f GROUP BY f.time_key, f.store_key; CREATE MATERIALIZED VIEW part_fact_tab ON PREBUILT TABLE ENABLE QUERY REWRITE AS SELECT f.time_key, f.store_key, SUM(f.dollar_sales) AS sum_dollar_sales, SUM(f.unit_sales) AS sum_unit_sales FROM fact f GROUP BY f.time_key , f.store_key;
In this example, the table part_fact_tab has been partitioned over three months and then the materialized view was registered to use the prebuilt table. This materialized view is eligible for query rewrite because the ENABLE QUERY REWRITE clause has been included.
Indexing Selection for Materialized Views
The two main operations on a materialized view are query execution and incremental refresh, and each operation has different performance requirements. Query execution may need to access any subset of the materialized view key columns, and may need to join and aggregate over a subset of those columns. Consequently, query execution usually performs best if there is a single-column bitmap index defined on each materialized view key column.
In the case of materialized views containing only joins using the fast refresh option, it is highly recommended that indexes be created on the columns that contain the rowids to improve the performance of the refresh operation.
If a materialized view using joins and aggregates is fast refreshable, then an index is automatically created and cannot be disabled.
See Chapter 18, "Tuning Parallel Execution", for further details.
Invalidating a Materialized View
Dependencies related to materialized views are automatically maintained to ensure correct operation. When a materialized view is created, the materialized view depends on the detail tables referenced in its definition. Any DDL operation, such as a DROP or ALTER, on any dependency in the materialized view will cause it to become invalid.
A materialized view is automatically revalidated when it is referenced. In many cases, the materialized view will be successfully and transparently revalidated. However, if a column has been dropped in a table referenced by a materialized view or the owner of the materialized view did not have one of the query rewrite privileges and that has now been granted to the owner, the command:
ALTER MATERIALIZED VIEW mview_name ENABLE QUERY REWRITE
should be used to revalidate the materialized view. If there are any problems, an error will be returned.
The state of a materialized view can be checked by querying the tables USER_MVIEWS or ALL_MVIEWS. The column STALENESS will show one of the values FRESH, STALE, UNUSABLE, UNKNOWN, or UNDEFINED to indicate whether the materialized view can be used.
Security Issues
To create a materialized view, the privilege CREATE MATERIALIZED VIEW is required, and to create a materialized view that references a table in another schema, you must have SELECT privileges on that table. Moreover, if you enable query rewrite, you must have the QUERY REWRITE or GLOBAL QUERY REWRITE privilege to reference tables in your own schema. To enable query rewrite on a materialized view that references tables outside your schema, you must have the GLOBAL QUERY REWRITE privilege.
If you continue to get a privilege error while trying to create a materialized view and you believe that all the required privileges have been granted, then the problem is most likely due to a privilege not being granted explicitly and trying to inherit the privilege from a role instead. The owner of the materialized view must have explicitly been granted SELECT access to the referenced tables if they are in a different schema.
Guidelines for Using Materialized Views in a Data Warehouse
Determining what materialized views would be most beneficial for performance gains is aided by the analysis tools of the DBMS_OLAP package. Specifically, you can call the DBMS_OLAP.RECOMMEND_MV procedure to see a list of materialized views that Oracle recommends based on the statistics and the usage of the target database. See Chapter 15, "Summary Advisor", for further details.
If you are going to write your own materialized views without the aid of Oracle analysis tools, then use these guidelines to achieve maximum performance:
- Instead of defining multiple materialized views on the same tables with the same GROUP BY columns but with different measures, define a single materialized view including all of the different measures.
- If your materialized view includes the aggregated measure AVG(x), also include COUNT(x) to support incremental refresh. Similarly, if VARIANCE(x) or STDDEV(x) is present, then always include COUNT(x) and SUM(x) to support incremental refresh.
Altering a Materialized View
There are five amendments that can be made to a materialized view:
- change its refresh option (FAST/FORCE/COMPLETE/NEVER)
- change its refresh mode (ON COMMIT/ ON DEMAND)
- recompile
- enable/disable its use for query rewrite
- consider fresh
All other changes are achieved by dropping and then recreating the materialized view.
The COMPILE clause of the ALTER MATERIALIZED VIEW statement can be used when the materialized view has been invalidated as described in "Invalidating a Materialized View". This compile process is quick, and allows the materialized view to be used by query rewrite again.
For further information about ALTER MATERIALIZED VIEW, see Oracle8i SQL Reference.
Dropping a Materialized View
Use the DROP MATERIALIZED VIEW statement to drop a materialized view. For example:
DROP MATERIALIZED VIEW sales_sum_mv;
This command drops the materialized view sales_sum_mv
. If the materialized view was prebuilt on a table, then the table is not dropped but it can no longer be maintained with the refresh mechanism. Alternatively, you can drop a materialized view using Oracle Enterprise Manager.
Overview of Materialized View Management Tasks
The motivation for using materialized views is to improve performance, but the overhead associated with materialized view management can become a significant system management problem. Materialized view management activities include:
- Identifying what materialized views to create initially
- Indexing the materialized views
- Ensuring that all materialized views and materialized view indexes are refreshed properly each time the database is updated
- Checking which materialized views have been used
- Determining how effective each materialized view has been on workload performance
- Measuring the space being used by materialized views
- Determining which new materialized views should be created
- Determining which existing materialized views should be dropped
- Archiving old detail and materialized view data that is no longer useful
After the initial effort of creating and populating the data warehouse or data mart, the major administration overhead is the update process, which involves the periodic extraction of incremental changes from the operational systems; transforming the data; verification that the incremental changes are correct, consistent, and complete; bulk-loading the data into the warehouse; and refreshing indexes and materialized views so that they are consistent with the detail data.
The update process must generally be performed within a limited period of time known as the update window. The update window depends on the update frequency (such as daily or weekly) and the nature of the business. For a daily update frequency, an update window of two to six hours might be typical.
The update window usually displays the time for the following activities:
- Loading the detail data.
- Updating or rebuilding the indexes on the detail data.
- Performing quality assurance tests on the data.
- Refreshing the materialized views.
- Updating the indexes on the materialized views.
A popular and efficient way to load data into a warehouse or data mart is to use SQL*Loader with the DIRECT or PARALLEL option or to use another loader tool that uses the Oracle direct path API.
See Also:
See Oracle8i Utilities for the restrictions and considerations when using SQL*Loader with the DIRECT or PARALLEL keywords. |
Loading strategies can be classified as one-phase or two-phase. In one-phase loading, data is loaded directly into the target table, quality assurance tests are performed, and errors are resolved by performing DML operations prior to refreshing materialized views. If a large number of deletions are possible, then storage utilization may be adversely affected, but temporary space requirements and load time are minimized. The DML that may be required after one-phase loading causes multi-table aggregate materialized views to become unusable in the safest rewrite integrity level.
In a two-phase loading process:
- Data is first loaded into a temporary table in the warehouse.
- Quality assurance procedures are applied to the data.
- Referential integrity constraints on the target table are disabled, and the local index in the target partition is marked unusable.
- The data is copied from the temporary area into the appropriate partition of the target table using INSERT AS SELECT with the PARALLEL or APPEND hint.
- The temporary table is dropped.
- The constraints are enabled, usually with the NOVALIDATE option.
Immediately after loading the detail data and updating the indexes on the detail data, the database can be opened for operation, if desired. Query rewrite can be disabled by default (with ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE) until all the materialized views are refreshed, but enabled at the session level for any users who do not require the materialized views to reflect the data from the latest load (with ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE). However, as long as QUERY_REWRITE_INTEGRITY = ENFORCED or TRUSTED, this is not necessary because the system ensures that only materialized views with updated data participate in a query rewrite.
------------------Some sample script-------------------------------------------------------------------
DROP MATERIALIZED VIEW TMS.TEST1MV;
CREATE MATERIALIZED VIEW TMS.TEST1MV
TABLESPACE TMSDATA
PCTUSED 0
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 104K
NEXT 104K
MINEXTENTS 1
MAXEXTENTS UNLIMITED
PCTINCREASE 0
BUFFER_POOL DEFAULT
)
NOCACHE
LOGGING
NOCOMPRESS
NOPARALLEL
BUILD IMMEDIATE
USING INDEX
TABLESPACE TMSDATA
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 104K
NEXT 104K
MINEXTENTS 1
MAXEXTENTS UNLIMITED
PCTINCREASE 0
BUFFER_POOL DEFAULT
)
REFRESH FAST
START WITH TO_DATE('17-Jun-2011 00:44:02','dd-mon-yyyy hh24:mi:ss')
NEXT SYSDATE + 1/24/60
WITH ROWID
AS
/* Formatted on 06/17/2011 1:43:56 PM (QP5 v5.163.1008.3004) */
SELECT "TEST1"."SUPERID" "SUPERID", "TEST1"."ID" "ID"
FROM "TEST1" "TEST1";
COMMENT ON MATERIALIZED VIEW TMS.TEST1MV IS 'snapshot table for snapshot TMS.TEST1MV';
-- Note: Index I_SNAP$_TEST1MV will be created automatically
-- by Oracle with the associated materialized view.
----------------------------Sample 2-----------------------------------------------------------------
DROP MATERIALIZED VIEW TMS.TEST1MV2;
CREATE MATERIALIZED VIEW TMS.TEST1MV2
TABLESPACE TMSDATA
PCTUSED 0
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 104K
NEXT 104K
MINEXTENTS 1
MAXEXTENTS UNLIMITED
PCTINCREASE 0
BUFFER_POOL DEFAULT
)
NOCACHE
LOGGING
NOCOMPRESS
NOPARALLEL
BUILD IMMEDIATE
USING INDEX
TABLESPACE TMSDATA
PCTFREE 10
INITRANS 2
MAXTRANS 255
STORAGE (
INITIAL 104K
NEXT 104K
MINEXTENTS 1
MAXEXTENTS UNLIMITED
PCTINCREASE 0
BUFFER_POOL DEFAULT
)
REFRESH FAST ON COMMIT
WITH ROWID
AS
/* Formatted on 06/17/2011 1:46:35 PM (QP5 v5.163.1008.3004) */
SELECT "TEST1"."SUPERID" "SUPERID", "TEST1"."ID" "ID"
FROM "TEST1" "TEST1";
COMMENT ON MATERIALIZED VIEW TMS.TEST1MV2 IS 'snapshot table for snapshot TMS.TEST1MV2';
-- Note: Index I_SNAP$_TEST1MV2 will be created automatically
-- by Oracle with the associated materialized view.