Ask The Real Tom

June 22, 2009



Impact of Table Blocksize on Index Clustering Factor

Filed under: oracle — Tom Turbo @ 4:36 pm

In this post we want to determine what impact the blocksize on the Index clustering factor has.

What exactly is the clustering factor?
The clustering factor is computed as follows:
Every index entry in a B*Tree is compared to its preceding index entry. If the both rowids in the leaf block are in different blocks in the table, then the clustering factor is incremented by one.

Clustering Factor

The picture shows how the Clustering Factor for an index is computed

The Clustering Factor is important for the evaluation of the cost of e.g. an index range scan.
For more Information on Clustering Factor please refer to the following Metalink Note:
Metalink-Note: 39836.1

Assumption

If the values of a column are not orderly inserted, then the clustering factor is high, no matter how big the blocksize of the table is (maybe the clustering factor is a bit lower with bigger blocksize, but that’s negligible). If the values are orderly inserted, then the clustering factor must be a lot lower with bigger blocksize.

So let us prove this Assumption.

Testcase

Before we create the two tablespaces with blocksize 2k and 32k we have to define the cache size for the two different blocksizes:
ALTER SYSTEM SET db_2k_cache_size=100M scope=both;
ALTER SYSTEM SET db_32k_cache_size=100M scope=both;

Create two tablespaces:
CREATE TABLESPACE blks2k datafile '/u02/oradata/TRDB01/blks2k01TRDB01.dbf' SIZE 1G autoextend off blocksize 2K;
CREATE TABLESPACE blks32k datafile '/u02/oradata/TRDB01/blks32k01TRDB01.dbf' SIZE 1G autoextend off blocksize 32K;

Test with unordered values
Now we create the same table in both tablespaces. We want that the values in the column are not inserted in order, so we use the dbms_random package to generate values.

Create tables with random column values:
CREATE TABLE blksize2 TABLESPACE blks2k AS SELECT TRUNC(dbms_random.VALUE(1, 10000000)) num FROM dual CONNECT BY LEVEL <= 1000000;
CREATE TABLE blksize32 TABLESPACE blks32k AS SELECT * FROM blksize2;

Create indexes:
CREATE INDEX blksize2_num ON blksize2 (num);
CREATE INDEX blksize32_num ON blksize32 (num);

Gather stats for the indexes:
EXEC dbms_stats.gather_index_stats('CFTEST', 'BLKSIZE2_NUM',NULL,100);
EXEC dbms_stats.gather_index_stats('CFTEST', 'BLKSIZE32_NUM',NULL,100);

Compare clustering factor of the two tables:
SELECT index_name, clustering_factor FROM dba_indexes WHERE index_name in ('BLKSIZE2_NUM', 'BLKSIZE32_NUM') order by clustering_factor;

INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
BLKSIZE32_NUM 997973
BLKSIZE2_NUM 999893

As we see, the clustering factor is not much lower with higher blocksize, if the column values are orderless.

Test with ordered Data

Create tables with ordered column values:
CREATE TABLE blksize2_2 TABLESPACE blks2k AS SELECT LEVEL num FROM dual CONNECT BY LEVEL <= 1000000;
CREATE TABLE blksize32_2 TABLESPACE blks32k AS SELECT LEVEL num FROM dual CONNECT BY LEVEL <= 1000000;

Create index:
CREATE INDEX blksize2_2_num ON blksize2_2 (num);
CREATE INDEX blksize32_2_num ON blksize32_2 (num);

Gather Statistics:
EXEC dbms_stats.gather_index_stats('CFTEST', 'BLKSIZE2_2_NUM',NULL,100);
EXEC dbms_stats.gather_index_stats('CFTEST', 'BLKSIZE32_2_NUM',NULL,100);

Compare Clustering Factor:
SELECT index_name, clustering_factor FROM dba_indexes WHERE index_name in ('BLKSIZE2_2_NUM', 'BLKSIZE32_2_NUM');

INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
BLKSIZE32_2_NUM 375
BLKSIZE2_2_NUM 6494

With ordered values the clustering factor is much lower with higher blocksize (in this example the index clustering factor of the table with 32k blocksize it is about 5% of the clustering factor with blocksize 2k)

Remember that the Index Blocksize has no impact on the clustering factor. It’s only the table blocksize that affects the index clustering factor.

Powered by WordPress