About Oracle Case Studies
Oracle Case Studies are intended as learning tools and for sharing information or knowledge related to a complex event, process, procedure, or to a series of related events. Each case study is written based upon the experience that the writer/s encountered.
Each Case Study contains a skill level rating. The rating provides an indication of what skill level the reader should have as it relates to the information in the case study.
- Expert: significant experience with the subject matter.
- Intermediate: some experience with the subject matter.
- Beginner: little experience with the subject matter.
Case Study Abstract
This case study uses a tracefile from an actual service request to illustrate a methodology for analyzing 10053 trace files. Please note that event 10053 trace files were designed to assist Oracle developers and support personnel to diagnose optimizer problems and are subject to change with every new patchset or release. Our intent with this case study is not to provide a comprehensive reference to the 10053 trace file, but rather to show how this trace file is used by Oracle engineers. Along the way, we will also gain insight into how the Cost Based Optimizer (CBO) arrives at the cost for a query and, ultimately, how it arrives at the execution plan. It is also important to point out that the algorithms used by the CBO to estimate the cost of a query may change from version to version as well.
Our task here is to analyze the bad plan and determine how the CBO computed the costs that led to the cheaper, but flawed plan. We will compare both 10053 traces at some points, but mainly focus on how the costs were computed for the bad plan. The good plan is basically a shorter version of the bad plan (one in which no indexes were considered).
The reason for examining a 10053 is typically to understand why the CBO made its decisions. The 10053 will help answer the question "Why was my index not used?" or the related, "Why did the CBO choose an Full Table Scan (FTS)?". The 10053 is typically not the best place to start looking for tuning opportunities - the execution plan and TKProf have better information in this regard; the 10053 is used for deeper cause determination.
The situation is that an unhinted SQL statement (SELECT involving a 3-way join) will take over 9 hours to execute versus a statement with a "NO_INDEX" hint that will complete in less than 4 minutes. The tables are partitioned and parallel query is being used. In addition, the customer has set "OPTIMIZER_INDEX_CACHING" to 70 (for unknown reasons, but we can bet it is probably because they weren't getting plans they liked). This parameter has the effect of decreasing the cost of single-block index I/O by 70 percent.
10053 traces were obtained for both the unhinted ("bad") plan and the hinted ("good") plan. The main difference between these two plans is that the bad plan uses all nested loop joins, with the inner-most join having an INDEX FULL SCAN (not index fast full scan) operation as the inner rowsource. The good plan uses a hash join with an INDEX FAST FULL SCAN (IFF) operation as the inner rowsource
The original trace file for the bad plan is
. Please note that this case study is very sensitive to changes that may occur to the 10053 trace file in the future. Generally, later versions of the trace file should be easier to read and require less effort in the "Pre-Analysis Work" section.
Before jumping to the analysis of the trace files, we must make some observations and derive some factors that the CBO will use to compute its costs. Sometimes, the value of some of these parameters and factors will provide a good initial indication about why a particular plan was chosen over competing plans.
To collect the event 10053 trace file, the following syntax was used in SQLPlus:
The output of "oradebug tracefile_name" will point to the 10053 trace file.
A) Verify the query being traced
This is an important step because we want to be sure we are examining the trace for the associated query. Find the SQL within the section of the trace file called "
" and make sure it is the correct one. In version 10g, the QUERY section is at the END of the trace when no bind variables are present in the query, otherwise, it will be at the beginning of the trace. Be careful to note which trace is associated with the QUERY section being considered. Sometimes one can easily mistake a QUERY at the end of a trace as belonging to a trace that follows (likely to happen in 10g with no binds in the SQL).
OPTIMIZER_FEATURES_ENABLE = 9.2.0
_OPTIMIZER_PERCENT_PARALLEL = 101
OPTIMIZER_INDEX_CACHING = 70
This parameter will affect the cost of an index access operation by multiplying the original cost of using the index by (100 - optimizer_index_caching)/100. So, in this case, it will reduce index read costs using this factor: (100 - 70)/100 = 0.3 or about 1/3. This means that index costs will be multiplied by 0.3 making the cost 1/3 of what they would otherwise be. Please note that index I/O costs are computed from the values of "BLEVEL", "LEAF_BLOCKS", and "CLUF" (clustering factor); this parameter only affects the portion of the costs contributed by BLEVEL and LEAF_BLOCKS. CLUF affects the cost of accessing the table and is not discounted by OPTIMIZER_INDEX_CACHING.
OPTIMIZER_INDEX_COST_ADJ = 99
This parameter will be used to scale the cost of index access operation by the fraction:
optmizer_index_cost_adj / 100. In this case, it will be 99/100 or 0.99. This parameter will affect all index costs, even those used in joins.
OPTIMIZER_DYNAMIC_SAMPLING = 1
This parameter controls how aggressively the CBO relies on dynamic sampling to give it information about the cardinalities and selectivities needed to cost access paths. At a setting of 1, it will basically only sample if statistics are missing from tables in the query.
_OPTIMIZER_COST_MODEL = CHOOSE
If set to CHOOSE, the CBO will use the new CPU model if system statistics were collected. If set to I/O, it will use the old cost model that ignores CPU cost.
DB_FILE_MULTIBLOCK_READ_COUNT = 64
This parameter may control the costs of performing full table or index scans. Higher values of this parameter will cause the CBO to cost full table or index scans cheaper. The value is scaled down by the CBO using either a fixed formula (when OPTIMIZER_COST_MODEL = io) or scaling factors computed from actual system statistics that were gathered.
_CPU_TO_IO = 0 (default)
This parameter is used to scale CPU cycles to an I/O cost for the purpose of calculating an overall cost using CPU and I/O cost components. If set to 0, the default value, the CBO will use either an internal fixed value, or a value derived from data collected with the system statistics (which includes CPU speed, single block I/O time, multiblock I/O time, and average number of blocks read during multiblock I/O). In order to verify costs when the CBO is taking CPU costs into account, it is important to determine what the CBO is using for this value.
C) Calculate the CPU to I/O ratio used by the CBO
To determine the value for _CPU_TO_IO being used by the CBO, one must find an entry in the 10053 that shows each of the values: CPU cost, I/O cost, combined cost. We can compute the factor used internally by the CBO by knowing these three values and then applying them to the formula that relates them.
The value of _CPU_TO_IO stays constant throughout the 10053 and is the same for any calculation, so any entry that shows all three components is usable for finding the value. The important thing to remember is to pick an entry with large values for CPU and I/O costs; the higher the values, the more accurate will be the derived factor.
- Find an index fast full (IFF) scan or something where CPU, IO and total are shown.
Combined Resc Cost = (RSC_CPU cost / _cpu_to_io) + RSC_IO Cost
Solve for _cpu_to_io:
_cpu_to_io = RSC_CPU Cost / (Combined Cost - RSC_IO Cost)
= 2865229980 / (55630 - 52693)
975563.49 CPU cycles per IO
D) Calculate the multiblock read divisor
When the CBO estimates the cost for a full table scan or an index fast full scan, it will divide the total number of blocks in a table or index by a divisor that is an estimate of how many blocks will be read for each physical read from the disk (called here the
multiblock read divisor or MBDivisor
). In the past, the parameter "DB_FILE_MULTIBLOCK_READ_COUNT was used as a basis for estimating the value of the MBDivisor (its value was reduced by a formula to compensate for real-world limitations). In version 9.2 and later, the value of the MBDivisor is estimated differently if system statistics are collected.
Before analyzing the 10053, it is useful to derive the divisor value that the CBO is using so we can quickly see if the value is realistic. Very low values will cause the CBO to cost FTS and IFF more expensive than index access paths; high values will tend to cost FTS / IFF much cheaper.
To derive the divisor, find a single table access path entry and obtain the resource cost ("Resc") for a table scan ("tsc"). Then find the total number of blocks in the table. Calculate the factor as follows:
tsc cost = Num Blocks / MBDivisor
Solve for the MBdivisor: MBdivisor = Num Blocks / tsc cost
From the "Base Statistical Information" section:
From the "Single Table Access Path" section:
Mdivisor = Nblks / tsc
Mdivisor = 125272 / 116982 = 1.07
Note: This factor seems very low. It will be very helpful to see what the system stats are and how this value was calculated. The low value will make FTS and IFF scans expensive relative to index scans. Its possible that these values are realistic, but it is also possible the system stats were obtained during a time that was not representative of actual load. Considering how expensive FTS will be, it is interesting that the customer felt they had to bring down index costs further using the index_cost_adjustment parameter. Its easy to see that this database will prefer almost any kind of index access over FTS in many cases.
It seems unusual that the customer set OPTIMIZER_INDEX_COST_ADJ so high when the multiblock divisor was already low and probably discouraged the CBO to choose non-index paths anyway. It would be helpful to know the historical reasons why the customer set that value.
Typical problems include:
- Missing statistics on tables and indexes
If stats have not been gathered for an object, you will see messages like this:
Unfortunately, for indexes there is no message explicitly stating that they were not analyzed. Instead, you will have to read the stats for the index and look for the default stats. The default stats for LEAF_BLOCKS is 25 and for CLUSTERING_FACTOR is 800.
For partitioned objects, determine whether global only statistics or partition level statistics have been gathered.
(no partition level statistics gathered), look for UNANALYZED partitions. Example (not from the current trace):
Unfortunately, there is no way to tell if the composite stats were obtained by a global sampling or aggregating individual partitions. Global sampling is preferred, so if in doubt, check if DBA_TABLES.GLOBAL_STATS equals YES to confirm that global statistics were gathered.
Look for statements like "No Histogram" for particular columns. Example (not from the current trace):
The procedure used to analyze a 10053 is basically to work from the bottom of the trace file up to the areas of interest. An area is of interest when it pertains to a part of the plan that appears to be flawed.
In summary, here are the steps to analyze a 10053 trace file:
1. Start with the end
Verify the SQL you are interested in is the one belonging to the trace you are about to analyze. If an EXPLAIN PLAN is in the trace, indent it to make it readable.
2. Look at the final cost of the chosen plan
3. Find the join order that produced the final cost
4. Find the part of the join order where the costing is of interest
5. Find the particular join type that produced the cost for the join order
6. Examine the costs for the join type found in step 5.
a. determine which access paths were used
b. examine other access paths that were rejected (applies only to nested loop joins where more than one access path to the inner rowsource is costed)
1. Start with the end, the Explain plan
Its important to always obtain an accurate execution plan output along with a 10053. Some information is simply missing in the 10053 and we'll need to use the plan to draw a conclusion about how the costing was performed. Other times we'll want to use the plan to navigate the 10053 or sanity check our own analysis.
Make sure to indent the plan steps according to
to make the hierarchy of the plan readable.
Please note the explain plan output embedded in the trace file differs between versions and is often not present. In 10g, you can obtain a nicely formatted plan by turning on the 10053 trace and then executing an EXPLAIN PLAN FOR.... command. However, the presence of bind values may affect the actual plan that is produced and the EXPLAIN PLAN command will not catch it due to a different code path used to perform the bind peeking.
2. Look at final cost of the plan:
The better plan (in a another trace file) which used the NO_INDEX hint, looked like this:
3. Find the Join Order that produced the final cost
Use the cost (20762) as a key to find the join order that evaluated to this cost. We find this:
This belongs to join order number 2.... scroll up to the beginning of the join order section to see this:
: PREMIUM_PLAN_CODE [A13] CERT_INSURED_PLAN_BENEFIT_FACT [A11] CERTIFICATE [A12]
Finding the join order of the chosen execution plan in 9206 and 10g is easier since we have following trace in 10053 :
In this case, the chosen "JOIN ORDER" is 2.
4. Find the part of the join order that differs between the "good" plan and the "bad" plan.
In this case, they differ in the cost of the second table in the join order:
5. Search for the join within the join order section that produced the cost identified in step 4.
You will find this line as you search UP the file for the cost of 20266:
Keep searching to find which join produced this (lowest) cost:
We knew from the plan that it was going to be an NL join, so this confirms it too. Search within the NL join section for the calculation that lead to this cost:
At this point we see that the best cost was found for the "resp" cost. This is the "response time" cost; i.e., the cost for a plan that uses PX to obtain the answer in the shortest possible time. The "resc" cost is the "resource cost". This is the resource consumption cost when executing the query serially. The full execution plan output would show whether PX was being used or not.
We need to figure out how this cost was calculated. To do this, we need to see how the CBO costed the join. Here are the basic formulae with the actual values substituted:
Note: In the formulae below, "RESC(outer)" refers to the resource cost of accessing the inner table. "RESP(outer)" refers to the response cost (using PX) of the outer table.
This join permutation uses a parallel operation for the outer table and parallel slaves for the inner table, but each slave uses a full index access path for the inner table. Since its an NL join, its probably using "broadcast" PX row distribution to do this. Without the execution plan, its tough to know if this is what the CBO chose to do.
So, instead of dividing the cost by 4, it divides the cost by 3.6.
The degree of parallelism for the outer table's full table scan is seen
. The join's degree of parallelism is set to the highest degree of parallelism for any particular table. In this case it is set to the degree of parallelism for the PREMIUM_PLAN_CODE table. A full execution plan output would be very helpful to verify this.
Other Costs in the NL Join for the table "CERT_INSURED_PLAN_BENEFIT_FACT"
Using an FTS:
Resc = Resc(outer) + [Card(outer) * Resc(inner) ]
+ [1 *
] (Note: Resc(Inner) is close to the value in single table access path, but not exact)
= 230879 (exact)
Resp = Resp(outer) + [Card(outer) * Resp(inner) ]
= 57118 vs. 57618 (close, not exact)
Using an Index Fast Full Scan:
Resc = Resc(outer) + [ Card(outer) * Resc(Inner) ]
Resp = Resp(outer) + [Card(outer) * Resp(inner) / (degree of join parallelism) ]
= 29716.5 =
Costs for other join types (SMJ and HJ):
Resc cost = Resc(outer) + Resc(inner) + Sort_Cost(outer) + Sort_Cost(inner
= Resc(outer) + Resc(inner) + [ (CPU_Cost(outer) + IO_Cost(outer) ) + (CPU_Cost(inner) + IO_Cost(inner) ]
Resp cost = Resp(outer) + Resp(inner) + Par_Sort_Cost(outer) + Par_Sort_Cost(inner)
= Resp(outer) + Resp(inner) + [ (Par_CPU_Cost(outer) + Par_IO_Cost(outer) ) + ( (Par_CPU_Cost(inner) + Par_CPU_Cost(inner)) ]
Resc cost = Resc(outer) + Resc(inner) + HJ_Cost_Ser
= 63646 + 55630 + (
= 121624 ~
Resp cost = Resp(outer) + Resp(inner) + HJ_Cost_Par
= 15809 + 27815 + 587
= 44211 ~
Note: The RESC and RESP costs used for the SMJ and HA are from the single table access cost section for each table. I.e.,
RESC(inner) is the IFF resc cost and RESP(inner) is the IFF resp cost.
The end result here is that with the unhinted ("bad") plan, the CBO chooses to do a full index scan with an NL join. This was influenced by:
- the high cost of FTS (due to the very low value for the "multiblock read divisor")
- the low cost of index access due to the OPTIMIZER_INDEX_CACHING parameter. This parameter had the effect of significantly lowering index access costs, beyond what was reasonable for the system.
Since we know the better plan performed full table scans and had much better performance, we can see the "multiblock read divisor" appears to be inaccurate in this case. A more accurate divisor would have made the cost of FTS/IFF much cheaper and a plan using a hash join would have been more attractive. An even better performing plan might be a nested loops plan that uses an IFF for the inner rowsource (similar to the hash join plan, but avoids the hashing costs). The IFF was not chosen for the nested loop inner rowsource in this case because its cost calculation did not receive the cost reduction benefit of the OPTIMIZER_INDEX_CACHING factor (only applies to indexes read with single block read operations). Furthermore, the relative cost of IFF was also raised by the low multiblock read divisor.
The next steps to investigate for this service request include:
- Find out why the multiblock read divisor is so low (check system stats in aux_stats$)
- Gather system stats again during a realistic interval into a named "stattab" for comparison to the exisiting values in aux_stats$
- Find out why the customer set OPTIMIZER_INDEX_CACHING so high
Various articles of interest can be found on Oracle Metalink. Access the articles using the following steps:
- Go to
MetaLink and login
- Click on the "Knowledge" tab
- Click on "Database" under the "Support Categories" menu
- Click on "SQL Tuning and Application Access - Including CBO" under "Performance and Scalability"
- View the variety of documents by clicking the links in the "Browse Product" menu
For additional information, we suggest looking at the following (please note this document is from sources outside of Oracle Corporation):
Cost-Based Oracle, Fundamentals
, Jonathan Lewis, Apress, 2006
来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/29487349/viewspace-2684176/，如需转载，请注明出处，否则将追究法律责任。