Postgresql源码(106)Generic Plan与Custom Plan的区别(以分区表为例)

发布于:2023-09-14 ⋅ 阅读:(82) ⋅ 点赞:(0)

相关:
《Postgresql源码(105)分区表剪枝代码分析》
《Postgresql源码(106)Generic Plan与Custom Plan的区别(以分区表为例)》

实例

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m01 PARTITION OF measurement FOR VALUES FROM ('2006-01-01') TO ('2006-02-01');
CREATE TABLE measurement_y2006m02 PARTITION OF measurement FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
CREATE TABLE measurement_y2006m04 PARTITION OF measurement FOR VALUES FROM ('2006-04-01') TO ('2006-05-01');
CREATE TABLE measurement_y2006m05 PARTITION OF measurement FOR VALUES FROM ('2006-05-01') TO ('2006-06-01');

insert into measurement values (1, '2006-01-03', floor(random() * 100), floor(random() * 100));
insert into measurement values (2, '2006-02-04', floor(random() * 100), floor(random() * 100));
insert into measurement values (3, '2006-03-05', floor(random() * 100), floor(random() * 100));
insert into measurement values (4, '2006-03-06', floor(random() * 100), floor(random() * 100));
insert into measurement values (5, '2006-03-07', floor(random() * 100), floor(random() * 100));
insert into measurement values (6, '2006-03-08', floor(random() * 100), floor(random() * 100));
insert into measurement values (7, '2006-04-09', floor(random() * 100), floor(random() * 100));
insert into measurement values (8, '2006-04-10', floor(random() * 100), floor(random() * 100));
insert into measurement values (9, '2006-05-11', floor(random() * 100), floor(random() * 100));
insert into measurement values (10, '2006-05-12', floor(random() * 100), floor(random() * 100));
prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;

explain analyze execute p1('2006-04-05', '2006-04-20');

首先看Custom Plan(强制choose_custom_plan返回true)

postgres=# explain analyze execute p1('2006-04-05', '2006-04-20');

                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on measurement_y2006m04 measurement  (cost=0.00..37.75 rows=9 width=16) (actual time=0.017..0.020 rows=6 loops=1)
   Filter: ((logdate >= '2006-04-05'::date) AND (logdate <= '2006-04-20'::date))
 Planning Time: 1620.157 ms
 Execution Time: 0.041 ms

再看Generic Plan(强制choose_custom_plan返回false)

postgres=# explain analyze execute p1('2006-04-05', '2006-04-20');
                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..188.97 rows=45 width=16) (actual time=0.027..0.034 rows=6 loops=1)
   Subplans Removed: 4
   ->  Seq Scan on measurement_y2006m04 measurement_1  (cost=0.00..37.75 rows=9 width=16) (actual time=0.024..0.028 rows=6 loops=1)
         Filter: ((logdate >= $1) AND (logdate <= $2))
 Planning Time: 28898.081 ms
 Execution Time: 0.102 ms
(6 rows)

问题一:看起来都剪枝了?

是的,从计划上来看,只能看到4月表,确实都剪枝了。

但仔细看还是有区别的,Generic Plan多了一行Subplans Removed: 4。这表示Generic计划实际是没剪枝的,是运行时做的prune。

证明:查看GetCachedPlan生成plan,可以看到下面的计划树把每个月分片都戴上了,未剪枝:
在这里插入图片描述
计划中是没剪枝的,但在计划中记录了裁剪的信息:make_partition_pruneinfo。

然后在执行器启动阶段:

#0  ExecInitAppend (node=0x17435a8, estate=0x173b3b0, eflags=16) at nodeAppend.c:111
#1  0x00000000007566d8 in ExecInitNode (node=0x17435a8, estate=0x173b3b0, eflags=16) at execProcnode.c:182
#2  0x000000000074c12c in InitPlan (queryDesc=0x1737b48, eflags=16) at execMain.c:938
#3  0x000000000074b112 in standard_ExecutorStart (queryDesc=0x1737b48, eflags=16) at execMain.c:265
#4  0x000000000074ae78 in ExecutorStart (queryDesc=0x1737b48, eflags=0) at execMain.c:144
#5  0x00000000006a69fc in ExplainOnePlan (plannedstmt=0x1743510, into=0x0, es=0x1744b70,  queryString=0x17287d0 "prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;", params=0x1744e58, queryEnv=0x0, planduration=0x7ffc0dc5c470,  bufusage=0x0) at explain.c:579
#6  0x00000000006d3a35 in ExplainExecuteQuery (execstmt=0x16370a0, into=0x0, es=0x1744b70, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');",  params=0x0, queryEnv=0x0) at prepare.c:642
#7  0x00000000006a6807 in ExplainOneUtility (utilityStmt=0x16370a0, into=0x0, es=0x1744b70, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');",  params=0x0, queryEnv=0x0) at explain.c:486
#8  0x00000000006a62da in ExplainOneQuery (query=0x1637368, cursorOptions=2048, into=0x0, es=0x1744b70,  queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", params=0x0, queryEnv=0x0) at explain.c:375
#9  0x00000000006a5f8b in ExplainQuery (pstate=0x1658dc0, stmt=0x16370d8, params=0x0, dest=0x1658d28) at explain.c:281
#10 0x00000000009935b5 in standard_ProcessUtility (pstmt=0x16375f0, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", readOnlyTree=false,  context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at utility.c:870
#11 0x0000000000992d3e in ProcessUtility (pstmt=0x16375f0, queryString=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');", readOnlyTree=false,  context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at utility.c:530
#12 0x0000000000991947 in PortalRunUtility (portal=0x16d6250, pstmt=0x16375f0, isTopLevel=true, setHoldSnapshot=true, dest=0x1658d28, qc=0x7ffc0dc5c9b0) at pquery.c:1158
#13 0x00000000009916b3 in FillPortalStore (portal=0x16d6250, isTopLevel=true) at pquery.c:1031
#14 0x000000000099101c in PortalRun (portal=0x16d6250, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x16376e0, altdest=0x16376e0, qc=0x7ffc0dc5cb80) at pquery.c:763
#15 0x000000000098ac9f in exec_simple_query (query_string=0x1636510 "explain analyze execute p1('2006-04-05', '2006-04-20');") at postgres.c:1250
#16 0x000000000098f1e0 in PostgresMain (dbname=0x16620d8 "postgres", username=0x16620b8 "mingjie") at postgres.c:4593
#17 0x00000000008cc052 in BackendRun (port=0x1658840) at postmaster.c:4511
#18 0x00000000008cb9eb in BackendStartup (port=0x1658840) at postmaster.c:4239
#19 0x00000000008c7fe8 in ServerLoop () at postmaster.c:1806
#20 0x00000000008c78ba in PostmasterMain (argc=1, argv=0x1630110) at postmaster.c:1478
#21 0x00000000007cdaff in main (argc=1, argv=0x1630110) at main.c:202

ExecInitAppend函数会根据优化器记录的信息,node->part_prune_info:
在这里插入图片描述
来对执行计划进行运行时剪枝:ExecInitPartitionPruning

从上图part_prune_info的结构来看,运行是剪枝和优化器剪枝用的是一套逻辑,都是先走prune step,最后combine,详见上一篇。

问题二:为什么Generic Plan不能剪枝?

Generic Plan不带参数,也就是只能靠

prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;

生成计划,无法拿到参数也就无法剪枝了。

代码上有什么区别呢?

对比下prune_append_rel_partitions的执行流程区别:

prepare p1 (date, date) as select * from measurement where logdate between $1 and $2;
explain analyze execute p1('2006-04-05', '2006-04-20');

在这里插入图片描述