DB2 Performance
Submitted by Matt Fleming on Fri, 2005-11-25 10:35.
Most of this stuff is kind of "no duh" but I always seem to forget something.. so now I am writing it down.
Client Side Performance
- Make sure predicates[1] only reference indexed elements or primary keys unless the data has low cardinality[2]. If a low cardinality column is necessary a data page scan after using an index might not be so bad and is probably better than indexing the column.
- Provide only the exact columns that you need to retrieve in the SELECT-list.. no extra ones, and no columns that exist in the predicates.
- Watch out for db functions on predicates.. they may invalidate the index.
- Use block fetching to minimize the number of round trip requests to the db. I am still a bit unclear how JDBC batching relate to block fetching.
- Use JDBC batch inserts to reduce network chatter and activate multi-row inserts.
Index Performance
- Compound indexes (sometimes called multicolumn or composite indexes) are generally recommended if queries often contain the same columns joined with AND. You can safely use up to five columns in a compound index.
- It is usually better to add a column to an index than to create a new index because INSERT and DELETE performance is affected with each new index on a table.
- The left-most column in a compound index should be the one that occurs most often in the queries. The order of the columns in the WHERE clause once had to match the order in the compound index; however, modern optimizers are supposed to make that rule obsolete.
- An index should rarely contain a column with low cardinality[2].
Subsystem Performance
- Synchronous IO is much worse than Asynchronous IO. Asynchronous IO is caused by DB2 prefetching information, which is a performance booster. SyncIOs are actual reads from disk. Asynch IOs happen when DB2 detects data being fetched in disk order, so creating a clustering index[3] on the most frequently fetched columns could help this happen.
- REORGs (reorganization of the resources on the disk) can be performed on indexes and data independently.
- Keep the statistics updated regularly.
- Enable thread pooling via CMTSTAT=INACTIVE setting. CMTSTAT defaults to ACTIVE which means that each DDF (remote) connection binds and locks a DB2 "worker" thread[4].
Resources
- Craig Mullen's entire site.. the man knows about DB2!
- DB2 Architecture - an introduction to DB2 architecture (different version)
- Error Messages
- Tracing via the DB2 v8 type IV driver.
- Effective Indexes - All about creating DB2 indexes.
- DB2 Indexes - An introduction to DB2 indexes.
- SQL Under the Covers - How DB2 handles queries.
- Improve Insert Performance - tips for bulk insert improvement.
#1Elements following the WHERE clause. e.g. NULL, IN, LIKE, BETWEEN, ON, EXISTS
#2The number of unique data elements in a set. e.g. low cardinality- GENDER; high cardinality- SSN
#3Clustering indexes are used by DB2 to control how table space data is physically stored on disk. Clustering causes inserted rows to be stored contiguously (in sequence) whenever possible.
#4See Distributed Functions of DB2 for z/OS and OS/390
»
- Printer-friendly version
- Login or register to post comments
- 6055 reads
