I just upgraded to Oracle 10g and I'm seeing very bad SQL performance. I had to set optimizer_features_enable=9.0.5. What can I do to fix Oracle10g upgrade & migration performance tuning problems?
Oracle has improved the cost-based Oracle optimizer in 9.0.5 and again in 10g, so you need to take a close look at your environmental parameter settings (init.ora parms) and your optimizer statistics. I have complete directions in my book "Oracle Tuning - The Definitive Reference", but here are some notes. See also Oracle tips for 10g migration.
First, make sure to see these important notes on _optimizer_cost_based_transformation issues when doing 10g upgrades. Also note that 9i has many optimizer bugs, notably issues that are corrected with _optimizer_transitivity_retain when using query re-write.
Properly configured, Oracle 10g should always faster than earlier releases, both for PL/SQL and SQL, so it is likely that any slow performance after an Oracle 10g upgrade is due to initialization parameter settings or incomplete CBO statistics.
Oracle 10g is the world's most flexible and complex database, and upgrading to Oracle 10g is very tricky. Prior to putting your Oracle 10g upgrade into production, it's a best practice to obtain an independent Oracle health check to identify sub-optimal configuration settings.
For complete insurance against bad performance after a 10g upgrade, see my book "Oracle Tuning: The Definitive Reference".
- Check for 10g optimization bugs - See Metalink Note 469972.1, note 240764.1, note 466181.1 and note 337096.1.
- Gather workload statistics - The 10g CBO requires workload information with dbms_stats.gather_system_stats
- Selectively disable dynamic sampling - Dynamic sampling is not for every database. Dynamic sampling default levels change between releases, and you may want to turn-off dynamic sampling, depending on your database load.
- Re-set optimizer costing - Consider unsetting your CPU-based optimizer costing (the 10g default, a change from 9i). CPU costing is best of you see CPU in your top-5 timed events in your STATSPACK/AWR report, and the 10g default of _optimizer_cost_model=cpu will include CPU costs, sometimes invoking more full scans, especially in tablespaces with large blocksizes. To return to your 9i CBO I/O-based costing, set the hidden parameter "_optimizer_cost_model"=io
- Verify deprecated parameters - you need to set optimizer_features_enable = 10.2.0.2 and optimizer_mode = FIRST_ROWS_n (or ALL_ROWS for a warehouse, but remove the 9i CHOOSE default).
- Re-check your db_file_multiblock_read_count parameter - The Oracle 10g release 2 Performance Tuning Guide (page 14.4) notes:
"The value of db_file_multiblock_read_count is set to the maximum allowed by the operating system by default. However, the optimizer uses mbrc=8 for costing.
10gr2 Note: Starting in Oracle 10g release2, Oracle recommends not setting the db_file_multiblock_read_count parameter, allowing Oracle to empirically determine the optimal setting. For more details, see my notes on 10gR2 automatically tuned multi-block reads.
- Verify quality of CBO statistics - Oracle 10g does automatic statistics collection and your original customized dbms_stats job (with your customized parameters) will be overlaid. You may also see a statistics deficiency (i.e. not enough histograms) causing performance issues. Re-analyze object statistics using dbms_stats and make sure that you collect system statistics. Also see 10g changes to dbms_stats for more details.
-- wait an hour or so
- Check optimizer parameters - Ensure that you are using the proper optimizer_mode (the new default is all_rows instead of choose) and check optimal settings for optimizer_index_cost_adj (lower from the default of 100) and optimizer_index_caching (set to a higher value than the default).
- Check I/O timings - This guy has noted that in Linux 2.6 using ASM, "db file scattered reads" (full scan I/O) can become slower than "db file scattered reads" (usually single block gets) because of non-contiguous data block placement on disk. He notes that full-scan access speed is aggravated by Oracle willy-nilly block placement in Automated Storage Management (ASM) and the use of bitmap freelists (Automated Segment Storage Management).
To find the exact root cause of the changed SQL performance, start by collecting the execution plans and TKPROF output for the 10g default and again after setting optimizer_features_enable). Compare the plans and then see how you might adjust initialization parameters and CBO statistics (using dbms_stats) to replicate the optimal SQL execution plan.
Tip - If you have a SQL statement which has poor performance after a 10g upgrade, try temporarily adding a /*+ rule */ hint to see if the problem is related to sub-optimal optimizer statistics.
If the RULE hint optimizes the SQL, remove the RULE hint and adjust your statistics until it replicates the execution plan.
This document shows some parameters which relieved slow SQL performance after a 10g upgrade by George Johnson:
After our upgrade from 9206 to 10201, we ended up with these parameters making the biggest difference to our slow query performance. The optimizer_index_cost_adj figure was arrived at after about 2 days of testing various troublesome
optimizer_secure_view_merging = false
_gby_hash_aggregation_enabled = FALSE
optimizer_index_cost_adj = 50
optimizer_index_caching = 0
_optimizer_cost_based_transformation = OFF
We were told by one Oracle guy that if your DB is not a warehouse and it's used batch and OLTP, the bottom four parameters should be set in 10g, without question to ensure the Warehouse components do not affect OLTP type activity!
This document notes other 10g upgrade issues with advanced SQL optimization in subquery un-nesting and outer hash joins:
"I have noticed after upgrading from 9i to 10g, the biggest impacts for query performance were -
(1) subquery unnesting (mostly, unnesting EXISTS to HASH JOIN SEMI), and
(2)right outer hash join (smaller table in an outer join *is* used as the hash table, unlike what was happening in pre-10g versions).
In some cases (till 10.1.0.4) we would have to hint the sub-query with NO_UNNEST to avoid the hash join semi. . .
The only ways to get rid of this problem (ora-979 is omitted if it is raised in inline view) is to use the hint NO_MERGE in 9i, and depending on 10g version and testmerge table columns - setting the '_OPTIMIZER_COST_BASED_TRANSFORMATION' to off on 10g."
This blog by Shervin Sheidaei notes a similar performance issues after a 10g upgrade and a change to "_optimizer_cost_based_transformation" =off; and "_gby_hash_aggregation_enabled"=false:
Some queries after upgrade from 9i to 10g may have performance issue which means they may run slower in 10g.
For figuring out whether or not performance issue is because of new enhanced features in 10g please turn off the following options and run query in 10g.
alter session set "_optimizer_cost_based_transformation" =off; (Disable subquery unnesting and view merging -- New 10g optimizer feature).
alter session set "_gby_hash_aggregation_enabled" = FALSE; (Disable Hash group by aggregation -- New 10g optimizer feature).
David Aldridge has concerns about disk read performance when using ASM and ASSM, noting that placement on the disk and competing I/O requests can impede scattered read access times (as the read-write head thrashes):
"Does the issue of the slowing of one read matter when other process are being equally served at the same time? That’s a question at the very heart of i/o scheduling (and queuing theory, which is what all this is about) — what is meant by “equally served”?
If it means that the scheduler finishes with one read request and then immediately moves the disk heads to satisfy another, then moves the heads back to satisfy another request from the first process then that might seem equitable, but it’s exactly analogous to a single check-in handling multiple check-ins at the airport at the same time.
Consider passenger A and passenger B, both waiting to be served. To check in each passenger takes five minutes, so passenger A is checked in in five minutes and passenger B waits for five minutes then gets checked in and is gone after a total wait of ten minutes.
If, in an effort to be equitable to both parties, the check-in agent flits between the two then the total time to check them both in is now eleven minutes (taking into account a total latency of one minute due to walking between the desks), and they both wait the full eleven minutes to be finished. Not equitable at all!"
来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/35489/viewspace-605635/，如需转载，请注明出处，否则将追究法律责任。