共计 30439 个字符,预计需要花费 77 分钟才能阅读完成。
本篇内容主要讲解“PostgreSQL 源码中 NOT IN 的作用是什么”,感兴趣的朋友不妨来看看。本文介绍的方法操作简单快捷,实用性强。下面就让丸趣 TV 小编来带大家学习“PostgreSQL 源码中 NOT IN 的作用是什么”吧!
一、数据结构
SubPlanState
子计划运行期状态
/* ----------------
* SubPlanState node
* ----------------
*/
typedef struct SubPlanState
NodeTag type;
SubPlan *subplan; /* expression plan node */
struct PlanState *planstate; /* subselect plan s state tree */
struct PlanState *parent; /* parent plan node s state tree */
ExprState *testexpr; /* 组合表达式状态;state of combining expression */
List *args; /* 参数表达式状态;states of argument expression(s) */
HeapTuple curTuple; /* subplan 最近的元组;copy of most recent tuple from subplan */
Datum curArray; /* most recent array from ARRAY() subplan */
/* these are used when hashing the subselect s output: */
TupleDesc descRight; /* 投影后的子查询描述符;subselect desc after projection */
ProjectionInfo *projLeft; /* for projecting lefthand exprs */
ProjectionInfo *projRight; /* for projecting subselect output */
TupleHashTable hashtable; /* hash table for no-nulls subselect rows */
TupleHashTable hashnulls; /* hash table for rows with null(s) */
bool havehashrows; /* true if hashtable is not empty */
bool havenullrows; /* true if hashnulls is not empty */
MemoryContext hashtablecxt; /* memory context containing hash tables */
MemoryContext hashtempcxt; /* temp memory context for hash tables */
ExprContext *innerecontext; /* econtext for computing inner tuples */
AttrNumber *keyColIdx; /* control data for hash tables */
Oid *tab_eq_funcoids; /* equality func oids for table
* datatype(s) */
Oid *tab_collations; /* collations for hash and comparison */
FmgrInfo *tab_hash_funcs; /* hash functions for table datatype(s) */
FmgrInfo *tab_eq_funcs; /* equality functions for table datatype(s) */
FmgrInfo *lhs_hash_funcs; /* hash functions for lefthand datatype(s) */
FmgrInfo *cur_eq_funcs; /* equality functions for LHS vs. table */
ExprState *cur_eq_comp; /* equality comparator for LHS vs. table */
} SubPlanState;
SubPlan
子查询计划
/*
* SubPlan - executable expression node for a subplan (sub-SELECT)
*
* The planner replaces SubLink nodes in expression trees with SubPlan
* nodes after it has finished planning the subquery. SubPlan references
* a sub-plantree stored in the subplans list of the toplevel PlannedStmt.
* (We avoid a direct link to make it easier to copy expression trees
* without causing multiple processing of the subplan.)
* 查询规划器在完成子查询的规划后使用 SubPlan 节点替换表达式树中的 SubLink 节点。 * SubPlan 引用了存储在高层 PlannedStmt 中的 subplans 链表中的 sub-plantree。 * (避免使用直接链接,从而使得拷贝表达式树相对比较简单) *
* In an ordinary subplan, testexpr points to an executable expression
* (OpExpr, an AND/OR tree of OpExprs, or RowCompareExpr) for the combining
* operator(s); the left-hand arguments are the original lefthand expressions,
* and the right-hand arguments are PARAM_EXEC Param nodes representing the
* outputs of the sub-select. (NOTE: runtime coercion functions may be
* inserted as well.) This is just the same expression tree as testexpr in
* the original SubLink node, but the PARAM_SUBLINK nodes are replaced by
* suitably numbered PARAM_EXEC nodes.
* 常规情况下,testexpr 指向用于组合操作的可执行表达式(OpExpr、OpExprs 的 AND/OR 树或者 RowCompareExpr); * 左参数是原始的左表达式,右参数是 PARAM_EXEC 参数节点用以表示子查询的输出。 * 与原始 SubLink 节点的 testexpr 具有相同的表达式树,但 PARAM_SUBLINK 节点则使用合适的已编号 PARAM_EXEC 节点替代。 *
* If the sub-select becomes an initplan rather than a subplan, the executable
* expression is part of the outer plan s expression tree (and the SubPlan
* node itself is not, but rather is found in the outer plan s initPlan
* list). In this case testexpr is NULL to avoid duplication.
* 如果子查询成了 initplan 而不是 subplan,可执行的表达式是外层 plan 表达式树的一部分。 * 这种情况下,testexpr 为 NULL 以避免重复。 *
* The planner also derives lists of the values that need to be passed into
* and out of the subplan. Input values are represented as a list args of
* expressions to be evaluated in the outer-query context (currently these
* args are always just Vars, but in principle they could be any expression).
* The values are assigned to the global PARAM_EXEC params indexed by parParam
* (the parParam and args lists must have the same ordering). setParam is a
* list of the PARAM_EXEC params that are computed by the sub-select, if it
* is an initplan; they are listed in order by sub-select output column
* position. (parParam and setParam are integer Lists, not Bitmapsets,
* because their ordering is significant.)
* 规划器还派生了需要传入和传出子计划的值的链表。 * 输入值标识位表达式的“args”链表,在外层查询上下文中进行解析。 * (这些 args 通常是 Vars,但原则上它们可以是任意表达式) * 这些值以 parParam 为索引给全局 PARAM_EXEC 参数赋值。 * setParam 是 PARAM_EXEC 参数链表,通过子查询(如为 initplan)计算所得。 * 它们按子查询输出列的位置进行排序组织为链表形式。 * (parParam 和 setParam 是整型链表,而不是 Bitmapsets 链表) *
* Also, the planner computes startup and per-call costs for use of the
* SubPlan. Note that these include the cost of the subquery proper,
* evaluation of the testexpr if any, and any hashtable management overhead.
* 同时,规划器计算 SubPlan 启动和每次调用的成本。注意:包括子查询正常解析 testexpr 的成本以及哈希表管理成本。 */
typedef struct SubPlan
Expr xpr;// 表达式
/* Fields copied from original SubLink: */
// 从 SubLink 中拷贝而来
SubLinkType subLinkType; /* see above */
/* The combining operators, transformed to an executable expression: */
// 组合操作符, 转换为可执行的表达式
Node *testexpr; /* OpExpr or RowCompareExpr expression tree */
List *paramIds; /* 参数 IDs;IDs of Params embedded in the above */
/* Identification of the Plan tree to use: */
//Plan tree 标识
int plan_id; /* Index (from 1) in PlannedStmt.subplans */
/* Identification of the SubPlan for EXPLAIN and debugging purposes: */
//EXPLAIN 和 debug 目的的 SubPlan 标识
char *plan_name; /* A name assigned during planning */
/* Extra data useful for determining subplan s output type: */
// 用于确定 subplan 输出类型的额外信息
Oid firstColType; /* subplan 结果的第一个列类型;Type of first column of subplan result */
int32 firstColTypmod; /* 第一列的 Typmod;Typmod of first column of subplan result */
Oid firstColCollation; /* 第一列的 Collation;Collation of first column of subplan
* result */
/* Information about execution strategy: */
// 执行阶段的相关信息
bool useHashTable; /* 是否使用哈希表存储子查询输出;true to store subselect output in a hash
* table (implies we are doing IN) */
bool unknownEqFalse; /* 如 OK 为 T, 如为未知则为 F; 快速处理 null 值;true if it s okay to return FALSE when the
* spec result is UNKNOWN; this allows much
* simpler handling of null values */
bool parallel_safe; /* 是否并行安全?is the subplan parallel-safe? */
/* Note: parallel_safe does not consider contents of testexpr or args */
/* Information for passing params into and out of the subselect: */
// 用于给子查询传入和传出参数的信息
/* setParam and parParam are lists of integers (param IDs) */
//setParam 和 parParam 是整型链表(param IDs)
List *setParam; /* initplan subqueries have to set these
* Params for parent plan */
List *parParam; /* indices of input Params from parent plan */
List *args; /* 以 parParam 值进行传递的表达式;exprs to pass as parParam values */
/* Estimated execution costs: */
// 估算执行成本
Cost startup_cost; /* one-time setup cost */
Cost per_call_cost; /* cost for each subplan evaluation */
} SubPlan;
SubLinkType
SubLink 类型
/*
* SubLink
*
* A SubLink represents a subselect appearing in an expression, and in some
* cases also the combining operator(s) just above it. The subLinkType
* indicates the form of the expression represented:
* EXISTS_SUBLINK EXISTS(SELECT ...)
* ALL_SUBLINK (lefthand) op ALL (SELECT ...)
* ANY_SUBLINK (lefthand) op ANY (SELECT ...)
* ROWCOMPARE_SUBLINK (lefthand) op (SELECT ...)
* EXPR_SUBLINK (SELECT with single targetlist item ...)
* MULTIEXPR_SUBLINK (SELECT with multiple targetlist items ...)
* ARRAY_SUBLINK ARRAY(SELECT with single targetlist item ...)
* CTE_SUBLINK WITH query (never actually part of an expression)
* For ALL, ANY, and ROWCOMPARE, the lefthand is a list of expressions of the
* same length as the subselect s targetlist. ROWCOMPARE will *always* have
* a list with more than one entry; if the subselect has just one target
* then the parser will create an EXPR_SUBLINK instead (and any operator
* above the subselect will be represented separately).
* ROWCOMPARE, EXPR, and MULTIEXPR require the subselect to deliver at most
* one row (if it returns no rows, the result is NULL).
* ALL, ANY, and ROWCOMPARE require the combining operators to deliver boolean
* results. ALL and ANY combine the per-row results using AND and OR
* semantics respectively.
* ARRAY requires just one target column, and creates an array of the target
* column s type using any number of rows resulting from the subselect.
*
* SubLink is classed as an Expr node, but it is not actually executable;
* it must be replaced in the expression tree by a SubPlan node during
* planning.
*
* NOTE: in the raw output of gram.y, testexpr contains just the raw form
* of the lefthand expression (if any), and operName is the String name of
* the combining operator. Also, subselect is a raw parsetree. During parse
* analysis, the parser transforms testexpr into a complete boolean expression
* that compares the lefthand value(s) to PARAM_SUBLINK nodes representing the
* output columns of the subselect. And subselect is transformed to a Query.
* This is the representation seen in saved rules and in the rewriter.
*
* In EXISTS, EXPR, MULTIEXPR, and ARRAY SubLinks, testexpr and operName
* are unused and are always null.
*
* subLinkId is currently used only for MULTIEXPR SubLinks, and is zero in
* other SubLinks. This number identifies different multiple-assignment
* subqueries within an UPDATE statement s SET list. It is unique only
* within a particular targetlist. The output column(s) of the MULTIEXPR
* are referenced by PARAM_MULTIEXPR Params appearing elsewhere in the tlist.
*
* The CTE_SUBLINK case never occurs in actual SubLink nodes, but it is used
* in SubPlans generated for WITH subqueries.
*/
typedef enum SubLinkType
EXISTS_SUBLINK,
ALL_SUBLINK,
ANY_SUBLINK,
ROWCOMPARE_SUBLINK,
EXPR_SUBLINK,
MULTIEXPR_SUBLINK,
ARRAY_SUBLINK,
CTE_SUBLINK /* for SubPlans only */
} SubLinkType;
SubLink
SubLink 结构体
typedef struct SubLink
Expr xpr;
SubLinkType subLinkType; /* see above */
int subLinkId; /* ID (1..n); 0 if not MULTIEXPR */
Node *testexpr; /* outer-query test for ALL/ANY/ROWCOMPARE */
List *operName; /* originally specified operator name */
Node *subselect; /* subselect as Query* or raw parsetree */
int location; /* token location, or -1 if unknown */
} SubLink;
二、源码解读
ExecScanSubPlan
/*
* ExecScanSubPlan: default case where we have to rescan subplan each time
* 默认情况下每次都不得不重新扫描 subplan
*/
static Datum
ExecScanSubPlan(SubPlanState *node,
ExprContext *econtext,
bool *isNull)
SubPlan *subplan = node- subplan;// 子计划
PlanState *planstate = node- planstate;// 计划运行期状态
SubLinkType subLinkType = subplan- subLinkType;// 子链接类型
MemoryContext oldcontext;// 原内存上下文
TupleTableSlot *slot;// 元组 slot
Datum result;// 结果指针
bool found = false; /* 如找到至少一个元组, 则返回 T;true if got at least one subplan tuple */
ListCell *pvar;// 临时变量
ListCell *l;// 临时变量
ArrayBuildStateAny *astate = NULL;//
/*
* MULTIEXPR subplans, when executed , just return NULL; but first we
* mark the subplan s output parameters as needing recalculation. (This
* is a bit of a hack: it relies on the subplan appearing later in its
* targetlist than any of the referencing Params, so that all the Params
* have been evaluated before we re-mark them for the next evaluation
* cycle. But in general resjunk tlist items appear after non-resjunk
* ones, so this should be safe.) Unlike ExecReScanSetParamPlan, we do
* *not* set bits in the parent plan node s chgParam, because we don t
* want to cause a rescan of the parent.
*
* MULTIEXPR 处理逻辑
*/
if (subLinkType == MULTIEXPR_SUBLINK)
{
EState *estate = node- parent- state;
foreach(l, subplan- setParam)
{ int paramid = lfirst_int(l);
ParamExecData *prm = (estate- es_param_exec_vals[paramid]);
prm- execPlan = node;
}
*isNull = true;
return (Datum) 0;
}
/* Initialize ArrayBuildStateAny in caller s context, if needed */
// 数组
if (subLinkType == ARRAY_SUBLINK)
astate = initArrayResultAny(subplan- firstColType,
CurrentMemoryContext, true);
/*
* We are probably in a short-lived expression-evaluation context. Switch
* to the per-query context for manipulating the child plan s chgParam,
* calling ExecProcNode on it, etc.
*/
// 切换上下文
oldcontext = MemoryContextSwitchTo(econtext- ecxt_per_query_memory);
/*
* Set Params of this plan from parent plan correlation values. (Any
* calculation we have to do is done in the parent econtext, since the
* Param values don t need to have per-query lifetime.)
*/
// 通过父计划相关值中设置子计划参数
Assert(list_length(subplan- parParam) == list_length(node- args));
forboth(l, subplan- parParam, pvar, node- args)
{ int paramid = lfirst_int(l);
ParamExecData *prm = (econtext- ecxt_param_exec_vals[paramid]);
prm- value = ExecEvalExprSwitchContext((ExprState *) lfirst(pvar),
econtext,
(prm- isnull));
planstate- chgParam = bms_add_member(planstate- chgParam, paramid);
}
/*
* Now that we ve set up its parameters, we can reset the subplan.
*/
// 执行 ReScan
//Reset a plan node so that its output can be re-scanned.
ExecReScan(planstate);
/*
* For all sublink types except EXPR_SUBLINK and ARRAY_SUBLINK, the result
* is boolean as are the results of the combining operators. We combine
* results across tuples (if the subplan produces more than one) using OR
* semantics for ANY_SUBLINK or AND semantics for ALL_SUBLINK.
* (ROWCOMPARE_SUBLINK doesn t allow multiple tuples from the subplan.)
* NULL results from the combining operators are handled according to the
* usual SQL semantics for OR and AND. The result for no input tuples is
* FALSE for ANY_SUBLINK, TRUE for ALL_SUBLINK, NULL for
* ROWCOMPARE_SUBLINK.
* 除 EXPR_SUBLINK 和 ARRAY_SUBLINK 外的所有 sublink, 结果是布尔值(组合运算符的结果).
* PG 通过跨元组 (如子计划产生多个元组) 合并结果, 对于 ANY_SUBLINK 使用 OR 语义,ALL_SUBLINK 则使用 AND 语义.
* (ROWCOMPARE_SUBLINK 不允许子计划返回多个元组)
* 从组合操作符中返回的 NULL 遵循 SQL 中的 OR 和 AND 语义.
* 如没有输入元组,ANY_SUBLINK 为 FALSE,ALL_SUBLINK 为 TRUE,ROWCOMPARE_SUBLINK 为 NULL.
*
* For EXPR_SUBLINK we require the subplan to produce no more than one
* tuple, else an error is raised. If zero tuples are produced, we return
* NULL. Assuming we get a tuple, we just use its first column (there can
* be only one non-junk column in this case).
* 对于 EXPR_SUBLINK, 需要 subplan 产生不超过一个元组, 否则报错. 如果没有元组产生, 返回 NULL.
* 假定获取到一个元组, 则使用第一个列(这种情况下只有一个 non-junk 列).
*
* For ARRAY_SUBLINK we allow the subplan to produce any number of tuples,
* and form an array of the first column s values. Note in particular
* that we produce a zero-element array if no tuples are produced (this is
* a change from pre-8.3 behavior of returning NULL).
* 对于 ARRAY_SUBLINK, 允许 subplan 产生任意数目的元组, 使用第一个列值组成数组.
* 特别注意的是如没有元组产生则产生 0 个元素的数组(8.3 以前是返回 NULL).
*/
result = BoolGetDatum(subLinkType == ALL_SUBLINK);//ALL 为 T, 否则为 F
*isNull = false;
for (slot = ExecProcNode(planstate);
!TupIsNull(slot);
slot = ExecProcNode(planstate))// 循环获取元组, 直至没有元组为 NULL(即已完成)
{
// 元组描述符
TupleDesc tdesc = slot- tts_tupleDescriptor;
Datum rowresult;// 结果
bool rownull;// 是否为空?
int col;// 列计数器
ListCell *plst;// 临时变量
if (subLinkType == EXISTS_SUBLINK)//EXISTS
{
found = true;
result = BoolGetDatum(true);
break;
}
if (subLinkType == EXPR_SUBLINK)//EXPR 表达式
{
/* cannot allow multiple input tuples for EXPR sublink */
if (found)
ereport(ERROR,
(errcode(ERRCODE_CARDINALITY_VIOLATION),
errmsg(more than one row returned by a subquery used as an expression)));
found = true;
/*
* We need to copy the subplan s tuple in case the result is of
* pass-by-ref type --- our return value will point into this
* copied tuple! Can t use the subplan s instance of the tuple
* since it won t still be valid after next ExecProcNode() call.
* node- curTuple keeps track of the copied tuple for eventual
* freeing.
*/
if (node- curTuple)
heap_freetuple(node- curTuple);
node- curTuple = ExecCopySlotHeapTuple(slot);
result = heap_getattr(node- curTuple, 1, tdesc, isNull);
/* keep scanning subplan to make sure there s only one tuple */
continue;
}
if (subLinkType == ARRAY_SUBLINK)// 数组
{
Datum dvalue;
bool disnull;
found = true;
/* stash away current value */
Assert(subplan- firstColType == TupleDescAttr(tdesc, 0)- atttypid);
dvalue = slot_getattr(slot, 1, disnull);
astate = accumArrayResultAny(astate, dvalue, disnull,
subplan- firstColType, oldcontext);
/* keep scanning subplan to collect all values */
continue;
}
/* cannot allow multiple input tuples for ROWCOMPARE sublink either */
if (subLinkType == ROWCOMPARE_SUBLINK found)// 行比较
ereport(ERROR,
(errcode(ERRCODE_CARDINALITY_VIOLATION),
errmsg(more than one row returned by a subquery used as an expression)));
found = true;// 初始为 T
/*
* For ALL, ANY, and ROWCOMPARE sublinks, load up the Params
* representing the columns of the sub-select, and then evaluate the
* combining expression.
* 对于 ALL,ANY 和 ROWCOMPARE 子链接, 加载表示子查询列的 Params, 并解析组合表达式
*/
col = 1;// 列从 1 计数
foreach(plst, subplan- paramIds)// 循环遍历子查询参数
{ int paramid = lfirst_int(plst);
ParamExecData *prmdata;
prmdata = (econtext- ecxt_param_exec_vals[paramid]);
Assert(prmdata- execPlan == NULL);
// 获取参数值
prmdata- value = slot_getattr(slot, col, (prmdata- isnull));
// 下一个列
col++;
}
// 解析表达式
rowresult = ExecEvalExprSwitchContext(node- testexpr, econtext,
rownull);
if (subLinkType == ANY_SUBLINK)
{
//ANY : 使用 OR 语义组合
/* combine across rows per OR semantics */
if (rownull)
*isNull = true;
else if (DatumGetBool(rowresult))
{ result = BoolGetDatum(true);
*isNull = false;
break; /* needn t look at any more rows */
}
}
else if (subLinkType == ALL_SUBLINK)
{
//ALL : 使用 AND 语义
/* combine across rows per AND semantics */
if (rownull)
*isNull = true;
else if (!DatumGetBool(rowresult))
{ result = BoolGetDatum(false);
*isNull = false;
break; /* needn t look at any more rows */
}
}
else
{
/* must be ROWCOMPARE_SUBLINK */
// 这里一定是 ROWCOMPARE
result = rowresult;
*isNull = rownull;
}
}
MemoryContextSwitchTo(oldcontext);
if (subLinkType == ARRAY_SUBLINK)
{
/* We return the result in the caller s context */
// 在调用者上下文中返回结果
result = makeArrayResultAny(astate, oldcontext, true);
}
else if (!found)
{
/*
* deal with empty subplan result. result/isNull were previously
* initialized correctly for all sublink types except EXPR and
* ROWCOMPARE; for those, return NULL.
* subplan 没有结果返回, 设置 result isNull 值
*/
if (subLinkType == EXPR_SUBLINK ||
subLinkType == ROWCOMPARE_SUBLINK)
{ result = (Datum) 0;
*isNull = true;
}
}
// 返回结果
return result;
}
ExecProcNode
执行输入参数 node 的 ExecProcNode 方法。
/* ----------------------------------------------------------------
* ExecProcNode
*
* Execute the given node to return a(nother) tuple.
* ----------------------------------------------------------------
*/
#ifndef FRONTEND
static inline TupleTableSlot *
ExecProcNode(PlanState *node)
if (node- chgParam != NULL) /* something changed? */
ExecReScan(node); /* let ReScan handle this */
return node- ExecProcNode(node);
#endif
三、跟踪分析
执行 SQL:
[pg12@localhost ~]$ psql -d testdb
Timing is on.
Expanded display is used automatically.
psql (12.0)
Type help for help.
[local]:5432 pg12@testdb=#
[local]:5432 pg12@testdb=# select * from tbl;
id | value
----+-------
1 | 2
(1 row)
Time: 2.678 ms
[local]:5432 pg12@testdb=# select count(*) from t_big_null;
count
----------
10000001
(1 row)
Time: 679.972 ms
[local]:5432 pg12@testdb=# analyze tbl;
ANALYZE
Time: 64.442 ms
[local]:5432 pg12@testdb=# analyze t_big_null;
ANALYZE
Time: 434.702 ms
[local]:5432 pg12@testdb=#
[local]:5432 pg12@testdb=# select pg_backend_pid();
pg_backend_pid
----------------
18758
(1 row)
Time: 1.990 ms
[local]:5432 pg12@testdb=# select * from tbl a where a.id not in (select b.id from t_big_null b);
启动 gdb 跟踪
(gdb) b ExecScanSubPlan
Breakpoint 1 at 0x73014b: file nodeSubplan.c, line 228.
(gdb) c
Continuing.
Breakpoint 1, ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)
at nodeSubplan.c:228
228 SubPlan *subplan = node- subplan;
(gdb)
(gdb) n
229 PlanState *planstate = node- planstate;
(gdb)
230 SubLinkType subLinkType = subplan- subLinkType;
(gdb)
234 bool found = false; /* true if got at least one subplan tuple */
subplan 变量值(结构体 SubPlan)
其中 testexpr 是测试表达式,操作符是整型的等值比较,
左操作符是 Var,数据表的第一个列,类型为 23-int4(select * from pg_type where oid=23;)
有操作符是 Param,执行期设置,对应参数索引编号为 0,类型为 23-int4
paramIds 是链表,第一项 ID 值为 0(int_value = 0)
子查询结果的第一个列类型为 23-int4
(gdb) p *subplan
$1 = {xpr = {type = T_SubPlan}, subLinkType = ANY_SUBLINK, testexpr = 0x1cb8790,
paramIds = 0x1cb8758, plan_id = 1, plan_name = 0x1cb8a80 SubPlan 1 ,
firstColType = 23, firstColTypmod = -1, firstColCollation = 0, useHashTable = false,
unknownEqFalse = false, parallel_safe = true, setParam = 0x0, parParam = 0x0,
args = 0x0, startup_cost = 0, per_call_cost = 129155.31875000001}
(gdb) p *subplan- testexpr
$2 = {type = T_OpExpr}
(gdb) p *(OpExpr *)subplan- testexpr
$3 = {xpr = {type = T_OpExpr}, opno = 96, opfuncid = 65, opresulttype = 16,
opretset = false, opcollid = 0, inputcollid = 0, args = 0x1cb8868, location = 31}
(gdb) set $expr=(OpExpr *)subplan- testexpr
(gdb) p *$expr- args
$4 = {type = T_List, length = 2, head = 0x1cb8840, tail = 0x1cb88d8}
(gdb) p *$expr- args- head
$5 = {data = {ptr_value = 0x1cb87e8, int_value = 30115816, oid_value = 30115816},
next = 0x1cb88d8}
(gdb) p *(Node *)$expr- args- head- data.ptr_value
$6 = {type = T_Var}
(gdb) p *(Var *)$expr- args- head- data.ptr_value
$7 = {xpr = {type = T_Var}, varno = 1, varattno = 1, vartype = 23, vartypmod = -1,
varcollid = 0, varlevelsup = 0, varnoold = 1, varoattno = 1, location = 26}
(gdb) p *(Var *)$expr- args- head- next- data.ptr_value
$8 = {xpr = {type = T_Param}, varno = 1, varattno = 0, vartype = 23, vartypmod = -1,
varcollid = 0, varlevelsup = 4294967295, varnoold = 2139062142, varoattno = 16,
location = 0}
(gdb) p *(Param *)$expr- args- head- next- data.ptr_value
$9 = {xpr = {type = T_Param}, paramkind = PARAM_EXEC, paramid = 0, paramtype = 23,
paramtypmod = -1, paramcollid = 0, location = -1}
(gdb)
[local]:5432 pg12@testdb=# select * from pg_proc where oid = 65; -- opfuncid = 65,函数
-[ RECORD 1 ]---+-------
oid | 65
proname | int4eq
pronamespace | 11
proowner | 10
prolang | 12
procost | 1
prorows | 0
provariadic | 0
prosupport | -
prokind | f
prosecdef | f
proleakproof | t
proisstrict | t
proretset | f
provolatile | i
proparallel | s
pronargs | 2
pronargdefaults | 0
prorettype | 16
proargtypes | 23 23
proallargtypes |
proargmodes |
proargnames |
proargdefaults |
protrftypes |
prosrc | int4eq
probin |
proconfig |
proacl |
Time: 6.253 ms
(gdb) p *subplan- paramIds
$11 = {type = T_IntList, length = 1, head = 0x1cb8730, tail = 0x1cb8730}
(gdb) p *subplan- paramIds- head
$12 = {data = {ptr_value = 0x7f7f7f7f00000000, int_value = 0, oid_value = 0}, next = 0x0}
(gdb)
planstate 变量值,实际的执行节点是 ExecMaterial
(gdb) p *planstate
$15 = {type = T_MaterialState, plan = 0x1cb8900, state = 0x1c87da8,
ExecProcNode = 0x6f802a ExecProcNodeFirst ,
ExecProcNodeReal = 0x720ecf ExecMaterial , instrument = 0x0, worker_instrument = 0x0,
worker_jit_instrument = 0x0, qual = 0x0, lefttree = 0x1c88160, righttree = 0x0,
initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, ps_ResultTupleDesc = 0x1c88580,
ps_ResultTupleSlot = 0x1c88698, ps_ExprContext = 0x0, ps_ProjInfo = 0x0,
scandesc = 0x1c88468, scanops = 0xc3e720 TTSOpsMinimalTuple , outerops = 0x0,
innerops = 0x0, resultops = 0xc3e720 TTSOpsMinimalTuple , scanopsfixed = true,
outeropsfixed = false, inneropsfixed = false, resultopsfixed = true, scanopsset = true,
outeropsset = false, inneropsset = false, resultopsset = true}
(gdb)
subLinkType 变量,类型为 ANY_SUBLINK,把 NOT IN 转为 ANY_SUBLINK
(gdb) p subLinkType
$16 = ANY_SUBLINK
设置变量,执行相关检查等
(gdb) n
237 ArrayBuildStateAny *astate = NULL;
(gdb)
250 if (subLinkType == MULTIEXPR_SUBLINK)
(gdb)
266 if (subLinkType == ARRAY_SUBLINK)
(gdb)
275 oldcontext = MemoryContextSwitchTo(econtext- ecxt_per_query_memory);
(gdb)
282 Assert(list_length(subplan- parParam) == list_length(node- args));
(gdb)
284 forboth(l, subplan- parParam, pvar, node- args)
(gdb) p subplan- parParam
$17 = (List *) 0x0
执行 ExecReScan,planstate- ExecProcNode 实际为 ExecMaterial
(gdb) n
298 ExecReScan(planstate);
(gdb) p *planstate
$20 = {type = T_MaterialState, plan = 0x1cb8900, state = 0x1c87da8,
ExecProcNode = 0x720ecf ExecMaterial , ExecProcNodeReal = 0x720ecf ExecMaterial ,
instrument = 0x0, worker_instrument = 0x0, worker_jit_instrument = 0x0, qual = 0x0,
lefttree = 0x1c88160, righttree = 0x0, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0,
ps_ResultTupleDesc = 0x1c88580, ps_ResultTupleSlot = 0x1c88698, ps_ExprContext = 0x0,
ps_ProjInfo = 0x0, scandesc = 0x1c88468, scanops = 0xc3e720 TTSOpsMinimalTuple ,
outerops = 0x0, innerops = 0x0, resultops = 0xc3e720 TTSOpsMinimalTuple ,
scanopsfixed = true, outeropsfixed = false, inneropsfixed = false,
resultopsfixed = true, scanopsset = true, outeropsset = false, inneropsset = false,
resultopsset = true}
(gdb)
初始化 result isNull 变量
(gdb) n
321 result = BoolGetDatum(subLinkType == ALL_SUBLINK);
(gdb)
322 *isNull = false;
(gdb) p result
$18 = 0
执行循环,slot 从 planstate- ExecProcNode(即 ExecMaterial)中获取
(gdb) n
324 for (slot = ExecProcNode(planstate);
(gdb)
325 !TupIsNull(slot);
(gdb)
(gdb) p *slot
$22 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 0,
tts_ops = 0xc3e720 TTSOpsMinimalTuple , tts_tupleDescriptor = 0x1c88580,
tts_values = 0x1c88708, tts_isnull = 0x1c88710, tts_mcxt = 0x1c87c90, tts_tid = { ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
(gdb) p slot- tts_values[0]
$23 = 0
(gdb)
获取 slot 的描述符
(gdb)
328 TupleDesc tdesc = slot- tts_tupleDescriptor;
(gdb) p tdesc
$21 = (TupleDesc) 0x206e6f6900000000
判断 subLink 类型执行相关逻辑,循环 paramIds,填充参数值
(gdb) n
334 if (subLinkType == EXISTS_SUBLINK)
(gdb)
341 if (subLinkType == EXPR_SUBLINK)
(gdb)
367 if (subLinkType == ARRAY_SUBLINK)
(gdb)
383 if (subLinkType == ROWCOMPARE_SUBLINK found)
(gdb)
388 found = true;
(gdb)
395 col = 1;
(gdb)
396 foreach(plst, subplan- paramIds)
(gdb) p *slot
$22 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 0,
tts_ops = 0xc3e720 TTSOpsMinimalTuple , tts_tupleDescriptor = 0x1c88580,
tts_values = 0x1c88708, tts_isnull = 0x1c88710, tts_mcxt = 0x1c87c90, tts_tid = { ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}
(gdb) p *slot- tts_values[0]
Cannot access memory at address 0x0
(gdb) p slot- tts_values[0]
$23 = 0
(gdb) n
398 int paramid = lfirst_int(plst);
(gdb)
401 prmdata = (econtext- ecxt_param_exec_vals[paramid]);
(gdb) p paramid
$24 = 0
(gdb) n
402 Assert(prmdata- execPlan == NULL);
(gdb) p *prmdata
$25 = {execPlan = 0x0, value = 0, isnull = false}
(gdb) n
403 prmdata- value = slot_getattr(slot, col, (prmdata- isnull));
(gdb) n
404 col++;
(gdb) p *prmdata
$26 = {execPlan = 0x0, value = 2, isnull = false}
(gdb) n
396 foreach(plst, subplan- paramIds)
(gdb)
解析表达式,亦即解析表达式 tbl.id = 2,获取结果
(gdb)
407 rowresult = ExecEvalExprSwitchContext(node- testexpr, econtext,
(gdb) p *(Var *)$expr- args- head- data.ptr_value
$28 = {xpr = {type = T_Var}, varno = 1, varattno = 1, vartype = 23, vartypmod = -1,
varcollid = 0, varlevelsup = 0, varnoold = 1, varoattno = 1, location = 26}
(gdb) p *(Param *)$expr- args- head- next- data.ptr_value
$29 = {xpr = {type = T_Param}, paramkind = PARAM_EXEC, paramid = 0, paramtype = 23,
paramtypmod = -1, paramcollid = 0, location = -1}
(gdb) p econtext- ecxt_param_exec_vals[0]
$30 = {execPlan = 0x0, value = 2, isnull = false}
由于 tbl.id = 1,因此表达式解析结果为 F
(gdb) n
410 if (subLinkType == ANY_SUBLINK)
(gdb) p rowresult
$31 = 0
(gdb)
判断 subLink 类型,如为 ANY 则进入相应逻辑
(gdb) n
410 if (subLinkType == ANY_SUBLINK)
(gdb) p rowresult
$31 = 0
(gdb)
n
413 if (rownull)
(gdb) p rownull
$32 = false
(gdb) n
415 else if (DatumGetBool(rowresult))
(gdb)
结果不为 T,因此再次循环执行,获取 slot,这次的值应为 3,因为 3 1,因此再次循环直至 t_big_null 中出现值 1 或完成扫描
326 slot = ExecProcNode(planstate))
(gdb)
(gdb)
403 prmdata- value = slot_getattr(slot, col, (prmdata- isnull));
(gdb)
404 col++;
(gdb) p prmdata- value
$36 = 3
(gdb) p slot- tts_values[0]
$37 = 3
(gdb)
...
完成执行
(gdb) c
Continuing.
[local]:5432 pg12@testdb=# select * from tbl a where a.id not in (select b.id from t_big_null b);
id | value
----+-------
(0 rows)
Time: 3089703.031 ms (51:29.703)
[local]:5432 pg12@testdb=#
[local]:5432 pg12@testdb=#
###
调用栈
Breakpoint 1, ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)
at nodeSubplan.c:228
228 SubPlan *subplan = node- subplan;
(gdb) bt
#0 ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)
at nodeSubplan.c:228
#1 0x000000000072fe5e in ExecSubPlan (node=0x1c89158, econtext=0x1c88990,
isNull=0x1c88cad) at nodeSubplan.c:90
#2 0x00000000006e90a8 in ExecEvalSubPlan (state=0x1c88ca8, op=0x1c88d80,
econtext=0x1c88990) at execExprInterp.c:3783
#3 0x00000000006e48b2 in ExecInterpExpr (state=0x1c88ca8, econtext=0x1c88990,
isnull=0x7fff7b9514f7) at execExprInterp.c:1484
#4 0x00000000006e50b8 in ExecInterpExprStillValid (state=0x1c88ca8, econtext=0x1c88990,
isNull=0x7fff7b9514f7) at execExprInterp.c:1769
#5 0x00000000006f9ec0 in ExecEvalExprSwitchContext (state=0x1c88ca8, econtext=0x1c88990,
isNull=0x7fff7b9514f7) at ../../../src/include/executor/executor.h:307
#6 0x00000000006f9fb8 in ExecQual (state=0x1c88ca8, econtext=0x1c88990)
at ../../../src/include/executor/executor.h:376
#7 0x00000000006fa37d in ExecScan (node=0x1c88878, accessMtd=0x72b84c SeqNext ,
recheckMtd=0x72b8f1 SeqRecheck ) at execScan.c:228
#8 0x000000000072b93b in ExecSeqScan (pstate=0x1c88878) at nodeSeqscan.c:112
#9 0x00000000006f8077 in ExecProcNodeFirst (node=0x1c88878) at execProcnode.c:445
#10 0x00000000006ed3cd in ExecProcNode (node=0x1c88878)
at ../../../src/include/executor/executor.h:239
#11 0x00000000006efbb3 in ExecutePlan (estate=0x1c87da8, planstate=0x1c88878,
use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0,
direction=ForwardScanDirection, dest=0x1cbc7c0, execute_once=true) at execMain.c:1646
#12 0x00000000006ed9df in standard_ExecutorRun (queryDesc=0x1be78b8,
direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364
#13 0x00000000006ed815 in ExecutorRun (queryDesc=0x1be78b8,
direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:308
#14 0x00000000008f1010 in PortalRunSelect (portal=0x1c27d78, forward=true, count=0,
dest=0x1cbc7c0) at pquery.c:929
#15 0x00000000008f0cae in PortalRun (portal=0x1c27d78, count=9223372036854775807,
isTopLevel=true, run_once=true, dest=0x1cbc7c0, altdest=0x1cbc7c0,
completionTag=0x7fff7b951890 ) at pquery.c:770
#16 0x00000000008ead35 in exec_simple_query ( query_string=0x1bc1d88 select * from tbl a where a.id not in (select b.id from t_big_null b); ) at postgres.c:1215
#17 0x00000000008eefa5 in PostgresMain (argc=1, argv=0x1bedf18,
dbname=0x1bedd60 testdb , username=0x1bbeaa8 pg12 ) at postgres.c:4236
#18 0x0000000000845915 in BackendRun (port=0x1be3d30) at postmaster.c:4431
#19 0x00000000008450f3 in BackendStartup (port=0x1be3d30) at postmaster.c:4122
#20 0x000000000084132f in ServerLoop () at postmaster.c:1704
#21 0x0000000000840be5 in PostmasterMain (argc=1, argv=0x1bbca60) at postmaster.c:1377
#22 0x0000000000761469 in main (argc=1, argv=0x1bbca60) at main.c:228
(gdb)
到此,相信大家对“PostgreSQL 源码中 NOT IN 的作用是什么”有了更深的了解,不妨来实际操作一番吧!这里是丸趣 TV 网站,更多相关内容可以进入相关频道进行查询,关注我们,继续学习!