首页 > Linux操作系统 > Linux操作系统 > 什么时候应该重建索引?(转载)


原创 Linux操作系统 作者:必有我师 时间:2009-01-12 16:28:30 0 删除 编辑

When Should You Rebuild an Index?

by Jonathan Lewis

In an earlier article about B-tree indexes, I ended with the comment, “Resist the argument that you need to rebuild Indexes regularly because ‘they become unbalanced.’ It isn’t a valid argument.” Some time later, I received an email message pointing out that there were other, valid, reasons for rebuilding indexes. This is true, and rebuilding B-tree indexes does sometimes have benefits, so I thought I’d write a short article investigating the reasons for rebuilding B-tree indexes.

Why Rebuild an Index?

There are only two reasons for rebuilding indexes. One is driven by the question, Will the performance gain be worth the cost of rebuilding this index? The other is driven by a similar question: Is the administrative benefit worth the cost?

One convenient aspect of phrasing the questions like this is that if the cost is effectively zero, then even in the case in which rebuilding an index is a pointless exercise, Im not going to insist that you to stop doing it. Of course, the time and effort may not really be zero, and just occasionally, an index rebuild can cause subsequent short-term performance problems, so do carry on reading the rest of the article.

The standard argument for rebuilding indexes is based on the fact that B-tree indexes tend to “degenerate” (i.e., become less space-efficient) as they are populated over time. And the argument is true — but only because the standard theory predicts that B-tree indexes will tend to operate at 75 percent packing, and Oracle creates them, by default, at 90 percent packing. (In fact, Oracle does not coalesce adjacent, low-usage, leaf blocks automatically, so its packing will often be a few percent lower than the theoretical value). So the question of the rebuild generally comes down to the following implicit question:

If I think the index is running at X percent packing, should I rebuild it to 90 percent?

Let’s start by considering the significance of the change from X percent, and before we worry about values for the magic number X.

Potential Benefits

When you rebuild an index, the physical effect is to reduce the size of the index (especially the leaf block count). This may result in three possible benefits:

§  The optimiser calculations may produce a smaller value for the cost of using the index, so the optimiser may use the index in more execution plans.

§  Typical queries that use the index may have to visit fewer index blocks, and therefore run more efficiently.

§  Because the index is smaller, it may become a better survivor in the buffer cache LRU chain, so that blocks from the index are found in the buffer more frequently, and fewer blocks from other objects are flushed to allow index blocks to be reloaded. If this occurs, system-level I/O is reduced, resulting in a possible performance gain for everyone.

So let’s check each effect in turn, and see what the different benefits look like in different circumstances. We’ll start with the optimiser cost calculations and a simple worked example.

Optimiser Costing

If you’ve read the article, “” you will be familiar with the formula originally published by Wolfgang Breitling at IOUG-2002 which gives the general cost of accessing a table through an index as:

blevel +

selectivity * leaf_blocks +

selectivitity * clustering_factor.

Consider an index on 10,000,000 rows, with an average index entry size of 40 bytes. This would give us about 200 entries per leaf block if we use an 8K block size. Assume that the index (according to some statistics in the viewindex_stats) is running at 50 percent efficiency and is therefore, according to common dictate, a good candidate for rebuilding (to the default 90 percent — but we will use 100 percent in our example). In other words, the index is currently running with about 100 entries per block.

Let’s do the arithmetic, first before the rebuild:

10,000,000 rows at 100 rows per block

=> 100,000 leaf blocks

100,000 leaf blocks (at 50 percent packing)

=> 1,000 level 1 branch blocks

1,000 level 1 branch blocks (at 50 percent packing)

=> 10 level 2 branch blocks

10 level 2 branch blocks (at 50 percent packing)

=> 1 level 3 branch block

Now after the rebuild (with pctfree = 0):

10,000,000 rows at 200 rows per block

=> 50,000 leaf blocks

50,000 leaf blocks (at 100 percent packing)

=> 250 level 1 branch blocks

250 level 1 branch blocks (at 100 percent packing)

=> 2 level 2 branch blocks

2 level 2 branch blocks (at 100 percent packing)

=> 1 level 3 branch block

So the index has blevel = 3 before and after rebuilding, but 100,000 leaf_blocks before the rebuild, dropping to 50,000 afterwards. So does this tell us anything about the new cost estimate? Not really, although it is an important little detail that the blevel, which is one component of the cost calculation, rarely changes on rebuilding an index. The exponential relationship between the blevel and the maximum row count makes this almost inevitable.

We need to examine not just the leaf_blocks, but the entire cost component: selectivity * leaf_blocks. How much does this change? Well, that depends on how many rows you have for each index value. Let’s try a couple of numbers, one for a high-precision index (five rows per value), and one for a lower quality index (50 per value).

Rows per value = 5           => selectivity = 1/2,000,000

Selectivity * leaf_blocks (old) = 100,000 / 2,000,000 = 0.05

Selectivity * leaf_blocks (new) = 50,000 / 2,000,000 = 0.025

Because of rounding (I believe Oracle rounds up at this point in the formula), this component of the cost estimate does not change.

Rows per key = 50            => selectivity = 1/200,000

Selectivity * leaf_blocks (old) = 100,000 / 200,000 = 0.5

Selectivity * leaf_blocks (new) = 50,000 / 200,000 = 0.25

Again, this component of the formula shows no change. In fact, it is not until each key value represents 100 rows in the table that the cost estimate is reduced as a result of our rebuilding the index to pack it from 50 percent to 100 percent efficiency

So, there are cases where rebuilding an index will change the optimiserââ¬â¢s opinion of the usefulness of that index — but those cases may be fairly rare in your system.

Typical Queries

So what if the optimiser isn’t convinced that your index has improved; maybe the end-user queries that use that index already will run better. As ever, it is important to know the data and know the application. When you start to think about rebuilding an index “because index_stats says it is running at 50 percent,” the first thing you should do is to ask yourself if (for example) that 50 percent means that half the index is nearly packed and the other half is nearly empty — one magic number for the whole index may not be giving you any useful information. (In the absence of knowledge, you might try a rather expensive treedump to check the details).

Then, when you have decided what the number means, you have to decide if rebuilding would actually have any perceptible effect for your users. Depending on the way the table and index are used, the answer may be no.

Consider this example. The table is addressed by an “Object Oriented” system, which always uses meaningless numeric keys. Every query uses a key value to collect data. One trip down the index fetches one rowid from a leaf block, and then one row from the table. Rebuilding the index to pack 200 rowids per leaf_block doesn’t make any difference if you want only one row.

How about an example closer to the other end of the spectrum: a ââ¬Åchildâ⬠table in which one foreign key value picks up 100 rows? Packing the index means that you only visit one leaf_block instead of two for those rowids. Terrific ââ¬â€ except maybe those 100 table rows come from 100 table blocks, so the effort of rebuilding the index makes a difference of slightly less than one percent to the end-user. You have to check out the figures when trying to work out whether or not itââ¬â¢s going to be worth rebuilding an index, especially if thereââ¬â¢s some reason that make it awkward to do the rebuild.

Buffering Benefits

Clearly, if you have a well-packed index, it is likely to cause less “pollution” in the buffer — there are fewer index blocks, so the whole index may get into the buffer, and because there are fewer blocks to knock other data blocks out of the buffer, there may be a general reduction in I/O. On the down side, a well-packed index may be subject to extra contention on inserts, updates, and deletes. It’s harder to measure a case like this, but it’s probably a good enough reason to rebuild an index (partition) that’s about to be made read-only.

Three thoughts to consider, though. If it’s a really popular index, then the LRU and touch count could be keeping it buffered all the time anyway, so whilst it may be “wasting space” in the buffer, it may not be causing any extra I/O. Secondly, for many queries the significant cost of the query is the cost of visiting table blocks which have a low revisit rate and cause much more aggressive buffer flushing, so worrying about indexes may be distracting you from the problem you should be addressing. Finally, it is possible that by rebuilding indexes too frequently, you may be exaggerating the problem rather than fixing the problem — you might like to try the following (fairly long-running, expect it to take a few minutes) test on a system with an 8K block size and not in an ASSM tablespace if you are using Oracle9:

drop table t1;

create table t1(n1 number(38));

create index i1 on t1(n1);


execute dbms_random.seed(0)



     for i in 1..400000 loop

          insert into t1 values(

               trunc(power(10,14) * dbms_random.value)



     end loop;




-- default pctfree for rebuild

alter index i1 rebuild pctfree 10;



     for i in 1..100000 loop

来自 “ ITPUB博客 ” ,链接:,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录


  • 博文量
  • 访问量