Dontcheff

A friend in need is a friend to avoid: should I capture sql plan baselines?

In Database tuning, DBA, Oracle database on June 13, 2011 at 23:32

A friend in need is a friend indeed. I am talking about SPM, SQL Plan Management. Or should I rather say that a friend in need is a friend to avoid. Let us see.

Maxym Kharchenko wrote a very interesting note entitled How to screw up your neighbor. He says: Unbelievable! and considers his findings as a major bug.

So, the direct question is: Should we set optimizer_capture_sql_plan_baselines=true?

My answer is NO. Unless you run Oracle 12.1 and above where you could consider it. Or at least heavily patched 11.2.0.2 Oracle on Microsoft OS (Oracle on Windows is another absurd but that is another story). And I would like to explain why by giving you two reasons on why not to touch the default value.

Reason One: There is an Oracle Bug 9910484 called: SQL Plan Management Capture uses excessive space in SYSAUX. And excessive here means gigabytes! I have seen SYSAUX growing really a lot.

If you would like to recall what the parameter optimizer_capture_sql_plan_baselines is all about, please check the following article from Pythian called Wouldn’t “Automatic SQL Plan Management” be the 11g “killer app”. In short, let me quote the documentation: “SQL plan management is a preventative mechanism that records and evaluates the execution plans of SQL statements over time, and builds SQL plan baselines composed of a set of existing plans known to be efficient. The SQL plan baselines are then used to preserve performance of corresponding SQL statements, regardless of changes occurring in the system.”

In order to understand in detail how SPM works, the best link is this white paper from Oracle called SQL Plan Management in Oracle Database 11g.

The following “nice” table is from MOS: Reference Note [ID 567104.1]:

As of now, June 2011, by just looking at the bugs above (if running below 12g), I would say that the words locks, slowness and excessive space would be a very, very good indicators to back off and not to play the role of a cowboy DBA.

If the information above is still not enough convincing why you should stay with the default of optimizer_capture_sql_plan_baselines=false, then go on reading.

Reason Two: When doing manual SQL tuning, not all people might be aware that optimizer_capture_sql_plan_baselines has been set to true. In fact, some may be but not totally comprehend the behavior of Oracle in this special case. Their frustration can escalate rather quickly when they notice that recalculating statistics, adding an index, etc. does not change the execution plan and has no effect whatsoever. Not until when Oracle eventually evolves the execution plans and accepts them. Oracle calls this a “conservative plan selection strategy” as the CBO preferentially uses a tried and tested execution plan even if the new plan looks like it might perform much better.

This might have quite of an impact on the people doing manual tuning across the organization. If you have not yet read Maxym Kharchenko’s article How to screw up your neighbor, please have a look.

Note here that if a SQL profile recommendation made by the automatic SQL tuning task is implemented, the execution plan found by the SQL Tuning Task will be added as an accepted SQL plan baseline.

Interesting enough, only when the newer plan is proved to perform well will it be accepted for use. However, is cost the only criteria for an accepted plan? Have a look, this is output from

select * from DBA_SQL_PLAN_BASELINES where signature = '12258705699202968790':

Why wasn’t the one with lowest cost 31 evolved to an accepted one? Note that this is after running the Automatic SQL Tuning task. Obviously, the plan did not pass the performance criterion. Oracle needs a compound improvement ratio bigger than 1.5 as a new plan must be at least 1.5 times better than the current baseline to be evolved with verification.

Now, how is this compound improvement ratio calculated? I do not know the exact formula but what is taken into account are the ratios of Elapsed Time, CPU Time, Buffer Gets, Disk Reads, Direct Writes and Fetches. In his paper SQL Plan Management and Your Oracle 11g Upgrade, Mike Tefft supports SQL Plan Management. But looking at his slide 17, I wonder why Oracle fails the performance criterion:

A few more nails in the SQL plan basline’s coffin:

Sql Plan Baseline Not always created: 788853.1
ORA-600 when optimizer_capture_sql_plan_baselines=true Bug: 12319228
ORA-1403 when optimizer_capture_sql_plan_baselines=true Bug: 9157046
Automatic capture of sql plan baselines generates excessive redo: Bug 10386337
Zero plan loaded into spm from sql tuning set: Bug 10091254
Sql not being registered in the sql plan baseline: Bug 11077029

For me this is not a stable and most of all not a simple mechanism. Fixed plan, evolved plan, accepted plan, enabled plan, disabled plan… If it is up to me I will make it more simple and clear. And what is the story behind the way the compound improvement ratio is calculated? I have looked into 100s of “Evolve SQL Plan Baseline Report”s and all looks so ambiguous to me.

If you are still with me and need more fun click on Jed Walker’s article Making plans with Oracle 11g and not leaving them behind. Very good writing Jed! Going on a trip? Take the baselines with you 🙂

So, if select count(*) from dba_sql_plan_baselines; returns nothing you are doing OK on this front. If not, here is how do wipe away all the baselines:

Option 1: Bulk delete SQL plan baselines
Option 2: How to drop plans from SPM repository

But if you are still not convinced about the usage of optimizer_capture_sql_plan_baselines (true or false), then use the SQL Performance Analyzer, just like this:

And view the report:

The Overall Impact is 60% plus. Uups.. It is very much up to what you have in the SQL Tuning Set! This is a very important thing to remember.

One last thing. Oracle suggests to migrate stored outlines to SQL plan baselines! Oracle enables you to safely migrate from stored outlines to SQL plan baselines. After the migration, you can maintain the same plan stability you had using stored outlines while being able to utilize the more advanced features provided by the SQL Plan Management framework. For more information, see “Migrating Stored Outlines to SQL Plan Baselines”.

Advertisements
  1. Hi, Julian,

    Just visting your blog. It is really great and it has so much great contents. Look forward to seeing you in OOW.

    Thanks
    Kai

  2. Entertaining and informative as always, and probably just gave the hint to solving a tricky problem here 🙂 Keep up the good work!

  3. We saw this bug hit in 11.1.0.7 databases first. Huge SYSAUX tablespace, when
    database itself was small.

    So, keeping an eye on size of objects can also inform of possible bugs.

  4. Thanks for the interesting and useful post. I’ve shied away from SPM simply because I try to avoid using version one of anything, and also because manual sql profiles as per Osborne/Geist/Sierra usually are sufficient for stabilization in my shop. Re “I wonder why Oracle fails the performance criterion” — The number of buffer gets for the test plan was about the same as the baseline, the number of disk reads for the test plan was less than 1% of the baseline, which likely explains the difference in elapsed time. “due to caching” seems the right call in this case. I wonder how well SPM works with long-running db-link queries that spend most of their time waiting on the remote database.

  5. This appears to have fixed our reporting issue. Not sure the relation! https://community.oracle.com/message/13650412

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: