Observation: Ascending Key Column Cardinalities

The cardinalities of the key columns defined for a composite index are in ascending order. Index cardinality is the number of distinct values that exist for an index key column. The DB2 optimizer uses the cardinality to determine whether, and in what manner, to use an index.

The optimizer uses the following catalog statistics when calculating the filter factor, which is an estimate of the percentage of rows that will satisfy your predicates:

 

Column

Catalog Table and Description

FIRSTKEYCARD,

FIRSTKEYCARF (if version of DB2 converted value to float)

SYSINDEXES: Contains the number of distinct values in the first column of an index. The optimizer uses FIRSTKEYCARD or FIRSTKEYCARDF when calculating the filter factor for a single column index.

FULLKEYCARD,

FULLKEYCARDF (if version of DB2 converted value to float)

SYSINDEXES: Contains the number of distinct values for all keys in a composite index. The optimizer uses FULLKEYCARD or FULLKEYCARDF when you code an equal predicate for each key in a composite index.

COLCARD,

COLCARDF (if version of DB2 converted value to float)

SYSCOLUMNS: Contains an estimate of the number of distinct values in the non-leading columns of a composite index or a non-indexed column. DB2 uses COLCARD or COLCARDF when calculating the filter factor if it cannot use a matching index on all columns of a composite index.

For partitioned indexes, the COLCARD or COLCARDF column from the catalog table SYSINDEXSTATS is used.

DB2 multiplies the filter factor by the number of rows and pages used by a table and adjusts that value base on the current cluster ratio of an index. DB2 uses the result to estimate the number of I/Os required to locate the required rows: the lower the filter factor, the smaller the potential set of table rows shall satisfy selection criteria. Therefore, there is a greater the probability the optimizer will chose an index during access path selection.

If the cardinalities of the key columns defined for an index ascend, and you do not reference a sufficient number of keys in your predicates to accurately represent the true cardinality of a composite key, you reduce the probability that Db2 will properly use an index.

For example, you define a composite, clustering index I1 on columns C1 and C2 of table T1; the cardinality of C1 is 5, the cardinality of C2 is 1000. In addition, table T1 contains 20,000 rows and the cluster ratio of I1 is 100 percent. This results in the following filter factors:

 

Column

Filter Factor

C1

1/5 = 0.2

C2

1/1000 = 0.001

 

If you coded a predicate referencing only column C1, the optimizer might estimate that the query requires 4000 I/Os to retrieve the table pages (20,000 rows * 0.2). Depending on the number of pages occupied by the table, DB2 might calculate that 4000 I/Os does not justify its use of index I1, so it invokes a tablespace scan for table T1. Conversely, if you defined C2 as the first key in I1, DB2 might estimate the query requires only 20 I/Os, and, therefore, chooses an access path that uses I1.