The ANALYZE command computes and records
statistical data about database indexes. If available, this data
is used by the query optimizer to compute the most efficient
query plan.
If no parameters are given, statistics will be computed for all indexes in all attached databases. You can also limit analysis to just those indexes in a specific database, or just those indexes associated with a specific table.
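For example, all three forms can be issued directly; the database and table names here (main, mytable) are placeholders:

    ANALYZE;                  -- all indexes in all attached databases
    ANALYZE main;             -- only the indexes in the "main" database
    ANALYZE main.mytable;     -- only the indexes associated with mytable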
The statistical data is not automatically updated as the index values change. If the contents or distribution of an index changes significantly, it would be wise to reanalyze the appropriate database or table. Another option would be to simply delete the statistical data, as no data is usually better than incorrect data.
Data generated by ANALYZE is stored in one or more
tables named sqlite_stat#,
starting with sqlite_stat1.
These tables cannot be manually dropped, but the data inside can
be altered with standard SQL commands. Generally, this is not
recommended, except to delete any ANALYZE data that is no longer valid or
desired.
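For example, the statistics for a single table, or for everything at once, can be discarded with ordinary DELETE statements (mytable is a placeholder name):

    DELETE FROM sqlite_stat1 WHERE tbl = 'mytable';  -- one table's stats
    DELETE FROM sqlite_stat1;                        -- all stats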
By default, the ANALYZE command generates data
on the number of entries in an index, as well as the average
number of entries that share each unique value. This figure is
computed by dividing the total number of entries by the number
of unique values, rounding up to the nearest integer. This data
is used to compute the cost difference between a full-table scan
and an indexed lookup.
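The recorded values can be inspected directly. The stat column of sqlite_stat1 is a space-separated list that begins with the entry count, followed by this rounded figure; the table and index names below are placeholders:

    SELECT tbl, idx, stat FROM sqlite_stat1;
    -- A row such as:
    --     mytable | idx_mytable_col | 10000 4
    -- says the index holds 10,000 entries and that, on average, each
    -- unique value is shared by about four of them (for example,
    -- 10,000 entries over 3,000 unique values rounds up to 4).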
If SQLite is compiled with the
SQLITE_ENABLE_STAT2
directive, then ANALYZE will
also generate an sqlite_stat2
table that contains a histogram of the index's value
distribution. This is used to estimate the cost of queries that
target a range of values.
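In such builds, the collected samples can be examined the same way. As a rough sketch (the exact layout of sqlite_stat2 is an internal detail of STAT2-enabled builds, which historically stored a small, fixed number of evenly spaced samples per index):

    SELECT * FROM sqlite_stat2;
    -- Each row holds one value sampled from the sorted index, giving
    -- the optimizer an approximate picture of the value distribution.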
There is one known issue with the
ANALYZE command. When
generating the sqlite_stat1
table, ANALYZE must calculate
the number of unique values in an index. To accomplish this, the
ANALYZE command uses
the standard SQL test for equivalence between index values. This
means that if a single-column index contains multiple NULL
entries, they will each be considered a nonequivalent, unique
value (since NULL != NULL).
As a result, if an index contains a large number of NULL values,
the ANALYZE data will overstate how unique the index actually
is. This can mislead the optimizer into picking an index-based
lookup when a full-table scan would be less expensive. Due to
this behavior, if an index contains a noticeable percentage of
NULL entries (say, 10 to 15% or more) and it is common to ask
for all of the NULL (or non-NULL) rows, it is recommended that
the ANALYZE data for that index be discarded.
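As a sketch of both the check and the cleanup, using placeholder table, column, and index names (mytable, col, idx_mytable_col):

    -- Estimate what fraction of the indexed column is NULL.
    SELECT 100.0 * SUM(col IS NULL) / COUNT(*) AS pct_null
      FROM mytable;

    -- If the fraction is high and NULL/non-NULL lookups are common,
    -- discard the statistics for just that index.
    DELETE FROM sqlite_stat1 WHERE idx = 'idx_mytable_col';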