Create histogram column statistics in Mariadb

Gather the column statistics of a table histogram in Mariadb

Check histogram is enabled or disabled (o means disabled)


SHOW VARIABLES LIKE 'histogram_size';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| histogram_size | 254   |
+----------------+-------+
1 row in set (0.001 sec)

Check the histogram for the table is present

MariaDB [classicmodels]> select * from mysql.column_stats where table_name = 'offices';
Empty set (0.001 sec)

Gather the statistics for the column or Create a histogram for the table

Syntax:

ANALYZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE tbl_name [,tbl_name ...] 
  [PERSISTENT FOR [ALL|COLUMNS ([col_name [,col_name ...]])] 
    [INDEXES ([index_name [,index_name ...]])]] 

Example of syntax:

-- update all statistics for all columns and indexes
ANALYZE TABLE table_name PERSISTENT FOR ALL;

-- update specific columns and indexes:
ANALYZE TABLE table_name PERSISTENT FOR COLUMNS (co11,col2.....) INDEXES (idx1,idx2,...);

-- empty lists are allowed:
ANALYZE TABLE table_name PERSISTENT FOR COLUMNS (col1,col2,...) INDEXES ();
ANALYZE TABLE table_name PERSISTENT FOR COLUMNS () INDEXES (idx1,idx2,...);

-- the following will only update mysql.table_stats fields:
ANALYZE TABLE table_name PERSISTENT FOR COLUMNS () INDEXES ();

Example of creating histogram:

-- For all indexex and columns
ANALYZE TABLE offices PERSISTENT FOR ALL;
-- For only columns 
analyze table offices PERSISTENT for COLUMNS(city) indexes ();

MariaDB [classicmodels]> ANALYZE TABLE offices PERSISTENT FOR ALL \G;
*************************** 1. row ***************************
   Table: classicmodels.offices
      Op: analyze
Msg_type: status
Msg_text: Engine-independent statistics collected
*************************** 2. row ***************************
   Table: classicmodels.offices
      Op: analyze
Msg_type: status
Msg_text: OK
2 rows in set (0.093 sec)

Verify again histogram is created:

MariaDB [classicmodels]> explain select * from offices use index (city_count_idx) where city = 'a'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: offices
         type: ref
possible_keys: city_count_idx
          key: city_count_idx
      key_len: 52
          ref: const
         rows: 1
        Extra: Using index condition
1 row in set (0.000 sec)

Setting for Histogram use:

HISTOGRAM_SIZE
The histogram_size variable determines the size, in bytes, from 0 to 255, used for a histogram. This is effectively the number of bins for histogram_type=SINGLE_PREC_HB or number of bins/2 for histogram_type=DOUBLE_PREC_HB. If it is set to 0 (the default for MariaDB 10.4.2 and below), no histograms are created when running an ANALYZE TABLE.

SHOW [GLOBAL | SESSION] VARIABLES [LIKE ‘pattern’ | WHERE expr]

SHOW [GLOBAL | SESSION] VARIABLES [LIKE 'pattern' | WHERE expr]

SHOW VARIABLES LIKE 'histogram_size';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| histogram_size | 254   |
+----------------+-------+
1 row in set (0.001 sec)

HISTOGRAM_TYPE
Its determines whether single precision (SINGLE_PREC_HB) or double precision (DOUBLE_PREC_HB) height-balanced histograms are created. From MariaDB 10.4.3, double precision is the default. For MariaDB 10.4.2 and below, single precision is the default.

SHOW VARIABLES LIKE 'histogram_type';
+----------------+----------------+
| Variable_name  | Value          |
+----------------+----------------+
| histogram_type | DOUBLE_PREC_HB |
+----------------+----------------+
1 row in set (0.001 sec)

OPTIMIZER_USE_CONDITION_SELECTIVITY
optimizer_use_condition_selectivity controls which statistics can be used by the optimizer when looking for the best query execution plan.
1 Use selectivity of predicates.
2 Use selectivity of all range predicates supported by indexes.
3 Use selectivity of all range predicates estimated without histogram.
4 Use selectivity of all range predicates estimated with histogram.
5 Additionally use selectivity of certain non-range predicates calculated on record sample.

SHOW VARIABLES LIKE 'optimizer_use_condition_selectivity';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| optimizer_use_condition_selectivity | 4     |
+-------------------------------------+-------+
1 row in set (0.001 sec)

Refer link: https://mariadb.com/kb/en/histogram-based-statistics/

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.