After talking about theSQL Analyzer (Sql Replay) in part I of this series andDatabase Replay in part II I will introduce you this time to some 11g New Features for the query optimizer.
1. As of Oracle 10g the server collects optimizer statistics on all objects which do not have any object statistics in the data dictionary at all and for those object for which the statistics have become stale.
This is implemented by an automatically created scheduler job called GATHER_STATS_JOB which runs every night and every weekend by default and belongs to the AUTO_TASKS_JOB_CLASS. This mechanism has been changed in 11g by the way and we now use a different auto-task framework.Oracle uses table monitoring for this feature which is enabled by DEFAULT for all tables in 10g. This means that Oracle monitors all tables and for those where more than 10% of the values have changed since the last job run the statistics will get a STALE-flag in the data dictionary and the job run would compute new statistics for them. So far so good. A lot of manual work was removed the DBA shoulders in 10g but we had to find out that for some objects the threshold of 10% was suboptimal.This issue was addressed in Oracle 11gR1. Now we can set individual thresholds on table level with DBMS_STATS and there are new data dictionary views for these thresholds.
Let’s see how it works:
SYSTEM @ orcl11 SQL> desc dba_tab_stat_prefs
Name Null? Type
—————————– ————– ——————-
OWNER NOT NULL VARCHAR2(30)
TABLE_NAME NOT NULL VARCHAR2(30)
SYSTEM @ orcl11 SQL> exec dbms_stats.set_table_prefs(‘SYSTEM’,'T’, ‘STALE_PERCENT’, ’15′)
PL/SQL procedure successfully completed. SYSTEM @ orcl11 SQL> select *
2 from dba_tab_stat_prefs
3 where owner=’SYSTEM’
4 and table_name=’T'; OWNER TABLE_NAME PREFERENCE_NAME REFERENCE_VALUE
——— —————— ————————— ————————-
SYSTEM T STALE_PERCENT 15
2. Another very nice new feature is deferred publication of newly gathered statistics. Now a DBA can control when new statistics are visible to the optimzer. This can be very handy for examply if you want to make sure that all statistics for all dependent objects have been collected and that not the new table statistics are used before the collection for the dependent indexes and partitions and subpartitions has been completed. They will be published all at the same time.Furthermore there is a procedure DBMS_STATS.RESUME_GATHER_STATS which can be used to catch up again, in case for some reason the statistics collection fails from comleting. This procedure continues exactly where the previouse run stopped.These new still unpublished statistics could be shipped to a testing system to test the impact on the workload before they would be published to the production system. The unpublished new statistics are stored in special tables and are called PENDING STATISTICS in contrast to CURRENT STATISTICS which are the ones the optimzer can use.There are a lot of *_PENDING_STATS views for the data dictionary.
After testing you can use DBMS_STATS.PUBLISH_PENDING_STATS to make the PENDING STATISTICS current.
3. The next highlight in 11g is MULTI COLUMN STATISTICS: now Oracle can gather statistics on groups of correlated columns and store them. This addresses a number issues which were around in 10g where Oracle had introduced the possibilty for the optimzer to take into account correlations for columns in a few limited cases. Jonathan Lewis had an entire chapter in his COST BASED ORACLE FUNDAMENTALS book. So let’s wait what the master of the optimizer will find out with his test cases about this!
4. Incrementally updated global statistics on partitioned tables are the next feature I want to discuss here:Oracle 10g collects statistics at global- (table), partition- and sub-partition-level for partitioned tables. In other words table monitoring works on partitions as well and is enabled by default in 10g. Very good!The problem in 10g is that the collection of the global level statistics can cause a dramatic overhead because a full table scan is necessary while the partition level statistics are computed per partition and and are only gathered fpr those partitions in which more than 10% of the values have changed. In most cases only a few partitions undergo changes normally when we have a range base partitioning . The others are stable.In Oracle database 11g a new technology called synopses is used for certain global stsistics such as the number of distinct values. Oracle can now only scan those partitions which have been significantly modified and for the other partitions make use of the old stastistics in order to calculate the global stastistics on table level. These synopses need additional diskspace and are stored in the SYSAUX tablespace. This is the drawback! We have additional I/O in advance of faster statistics collection.Histograms are not incrementally maintained. For more information see the online Performance Tuning Guide for 11g.
5. Last but not least let me mention EXTENDED STATISTICS on expressions:if you have a WHERE CLAUSE like this:
WHERE UPPER(last_name) = ‘King’
the optimizer uses 1% as the value for SELECTIVITY if there is no function based index available, which in many cases leads to the wrong descision and we end up with a bad execution plan. In 10g the optimzer could already handle some built in functions like TO_NUMBER.
As of 11g we can do this:
exec DBMS_STATS.GATHER_TABLE_STATS(‘HR’, ‘EMPLOYEES’, METHOD_OPT => ‘for all columns size 1 columns (upper(last_name)) size 5‘)