MySQL优化器特性(六)表扫描成本计算
全表扫描成本
使用optimizer_trace,或者使用explain format=tree, 或者explain format=json,可以查看查询的cost
mysql> explain format=tree select * from test_ror; +---------------------------------------------------+ | EXPLAIN | +---------------------------------------------------+ | -> Table scan on test_ror (cost=10.75 rows=105) | +---------------------------------------------------+ 1 row in set (0.00 sec)
mysql> explain format=json select * from test_ror; { "query_block": { "select_id": 1, "cost_info": { "query_cost": "10.75" }, "table": { "table_name": "test_ror", "access_type": "ALL", "rows_examined_per_scan": 105, "rows_produced_per_join": 105, "filtered": "100.00", "cost_info": { "read_cost": "0.25", "eval_cost": "10.50", "prefix_cost": "10.75", "data_read_per_join": "2K" }, "used_columns": [ "id", "a", "b", "c" ] } } }
从上面的例子可以看到,查询总cost为10.75,其中包括read_cost 0.25, eval_cost 10.50。
我们来分析下这里的成本分别是怎么计算的。
全表扫描成本
全表扫描成本计算代码位于sql/sql_planner.cc文件,函数Optimize_table_order::best_access_path中计算。
double scan_read_cost = calculate_scan_cost( tab, idx, best_ref, prefix_rowcount, found_condition, disable_jbuf, &rows_after_filtering, &trace_access_scan); /* We estimate the cost of evaluating WHERE clause for found records as row_evaluate_cost(prefix_rowcount * rows_after_filtering). This cost plus scan_cost gives us total cost of using TABLE/INDEX/RANGE SCAN. */ const double scan_total_cost = scan_read_cost + cost_model->row_evaluate_cost(prefix_rowcount * rows_after_filtering);
成本分为两部分:
扫描表的IO成本(table_scan_cost)
读取记录、分析记录的成本(row_evaluate_cost)
需要注意,这里的全表扫描成本,是在没有其他访问路径可选的情况下计算得到的成本。优化器在评估索引范围扫描的成本时,也会先评估一下全表扫描的成本,这时计算全表扫描的成本公式和本文的描述会有一些差异。
table_scan_cost
Cost_estimate handler::table_scan_cost() { /* This function returns a Cost_estimate object. The function should be implemented in a way that allows the compiler to use "return value optimization" to avoid creating the temporary object for the return value and use of the copy constructor. */ const double io_cost = scan_time() * table->cost_model()->page_read_cost(1.0); Cost_estimate cost; cost.add_io(io_cost); return cost; }
scan_time
对于innodb引擎,scan_time取primary key(cluster index)的大小(单位是page)。
double ha_innobase::scan_time() { ulint stat_clustered_index_size; stat_clustered_index_size = m_prebuilt->table->stat_clustered_index_size; return ((double)stat_clustered_index_size); }
这个信息最终从统计信息中获取:
mysql> select * from mysql.innodb_table_stats where database_name='test' and table_name='test_ror'; +---------------+------------+---------------------+--------+----------------------+--------------------------+ | database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes | +---------------+------------+---------------------+--------+----------------------+--------------------------+ | test | test_ror | 2023-05-06 15:07:16 | 105 | 1 | 3 | +---------------+------------+---------------------+--------+----------------------+--------------------------+
page_read_cost
page_read_cost会基于表在内存中的缓存比例来计算读取数据块的成本。mysql会评估表的缓存比例,分别计算内存中的块的数量以及磁盘中的块的数据量,
pages_in_mem = pages * in_mem
pages_on_disk = pages - pages_in_mem
cost = pages_in_mem * memory_block_read_cost + pages_on_disk * io_block_read_cost
其中memory_block_read_cost和io_block_read_cost分别代表从内存和磁盘上读取一个页面的成本,这两个值可以在mysql.engine_cost中配置。8.0版本中,memory_block_read_cost默认为0.25,io_block_read_cost默认为1。
page_read_cost代码:
double Cost_model_table::page_read_cost(double pages) const { assert(m_initialized); assert(pages >= 0.0); // 表cache在内存中的比例,范围是0 - 1,0表示所有数据都不在内存中。1表示所有数据都缓存在内存中 const double in_mem = m_table->file->table_in_memory_estimate(); const double pages_in_mem = pages * in_mem; const double pages_on_disk = pages - pages_in_mem; assert(pages_on_disk >= 0.0); // buffer_blaock_read_cost:从内存中读取page的成本 // io_block_read_cost:从磁盘上读取page的成本 const double cost = buffer_block_read_cost(pages_in_mem) + io_block_read_cost(pages_on_disk); return cost; }
table_in_memory_estimate:计算表和索引数据缓存率
对于InnoDB,MySQL会维护每个索引缓存在内存中的Page数,可以基于这个数据计算出数据的缓存率。如果存储引擎没用提供缓存率数据(如MyISAM存储引擎),mysql会使用一个经验公式来预估数据的缓存率,具体逻辑在函数estimate_in_memory_buffer中实现。
double handler::table_in_memory_estimate() const { assert(stats.table_in_mem_estimate == IN_MEMORY_ESTIMATE_UNKNOWN || (stats.table_in_mem_estimate >= 0.0 && stats.table_in_mem_estimate <= 1.0)); /* If the storage engine has supplied information about how much of the table that is currently in a memory buffer, then use this estimate. */ if (stats.table_in_mem_estimate != IN_MEMORY_ESTIMATE_UNKNOWN) return stats.table_in_mem_estimate; /* The storage engine has not provided any information about how much of this index is in memory, use an heuristic to produce an estimate. */ return estimate_in_memory_buffer(stats.data_file_length); }
estimate_in_memory_buffer
根据表的大小和内存大小计算表的缓存率。对于InnoDB,内存大小是指InnoDB Buffer Pool的大小。
经验公式:如果表占用的空间不到内存的20%,则认为所有数据都缓存在内存中(缓存率为1)。如果表占用的空间超过内存的大小,则认为缓存率为0。介于两者之间的表,缓存率为 ( 1 - x ) / 0.8, 0.2 < x < 1,x为表的大小和内存大小的比值。
double handler::estimate_in_memory_buffer(ulonglong table_index_size) const { /* The storage engine has not provided any information about how much of the table/index is in memory. In this case we use a heuristic: - if the size of the table/index is less than 20 percent (pick any number) of the memory buffer, then the entire table/index is likely in memory. - if the size of the table/index is larger than the memory buffer, then assume nothing of the table/index is in memory. - if the size of the table/index is larger than 20 percent but less than the memory buffer size, then use a linear function of the table/index size that goes from 1.0 to 0.0. */ longlong memory_buf_size = get_memory_buffer_size(); if (memory_buf_size <= 0) memory_buf_size = 100 * 1024 * 1024; // 100 MB const double table_index_in_memory_limit = 0.2; const double percent_of_mem = static_cast<double>(table_index_size) / memory_buf_size; double in_mem_est; if (percent_of_mem < table_index_in_memory_limit) // Less than 20 percent in_mem_est = 1.0; else if (percent_of_mem > 1.0) // Larger than buffer in_mem_est = 0.0; else { in_mem_est = 1.0 - (percent_of_mem - table_index_in_memory_limit) / (1.0 - table_index_in_memory_limit); } return in_mem_est; }
InnoDB表和索引缓存率的计算
InnoDB维护了每个索引在buffer pool中缓存的页面数,基于这个统计数据,可以计算InnoDB索引的缓存率。
InnoDB表的缓存率为主键的缓存率。(如果InnoDB没有建立显式主键,则好像不会记录表的缓存率,这种情况下会使用经验公式来预估缓存率)。
变量buf_stat_per_index记录了每个索引缓存在buffer pool中的页面数。
/** Estimate what percentage of an index's pages are cached in the buffer pool @param[in] index index whose pages to look up @return a real number in [0.0, 1.0] designating the percentage of cached pages */ inline double index_pct_cached(const dict_index_t *index) { const ulint n_leaf = index->stat_n_leaf_pages; if (n_leaf == 0) { return (0.0); } const uint64_t n_in_mem = buf_stat_per_index->get(index_id_t(index->space, index->id)); const double ratio = static_cast<double>(n_in_mem) / n_leaf; return (std::max(std::min(ratio, 1.0), 0.0)); }
row_evaluate_cost
row_evaluate_cost计算访问行的成本,计算公式为 rows * row_evaluate_cost,row_evaluate_cost的值可以在mysql.server_cost表配置,8.0版本中,这个值默认为0.1,也就是访问一行数据的成本是0.1。
double row_evaluate_cost(double rows) const { return rows * m_server_cost_constants->row_evaluate_cost(); }
回到本文开始的例子,这个表总共有105行记录,cluster index占用了1个页面,数据全部缓存在内存中,总体成本计算:
io_cost = cluster_index_size * 0.25
row_eval_cost = rows * 0.1 = 10.5
cost = io_cost + row_eval_cost = 0.25 + 10.5 = 10.75