上一节我们介绍了PostgreSQL的子查询优化,子查询优化把一部分可以优化的子查询上拉到主查询成为join。
preprocess_expression
将表达式(目标列,where,join,having)简化表达式
static Node *
preprocess_expression(PlannerInfo *root, Node *expr, int kind)
{
/*
* If the query has any join RTEs, replace join alias variables with
* base-relation variables. We must do this before sublink processing,
* else sublinks expanded out from join aliases would not get processed.
* We can skip it in non-lateral RTE functions, VALUES lists, and
* TABLESAMPLE clauses, however, since they can't contain any Vars of the
* current query level.
*/
if (root->hasJoinRTEs &&
!(kind == EXPRKIND_RTFUNC ||
kind == EXPRKIND_VALUES ||
kind == EXPRKIND_TABLESAMPLE))
expr = flatten_join_alias_vars(root, expr);
//简化常量表达式,特别处理and和or
expr = eval_const_expressions(root, expr);
//标准化条件表达式(where和having)
if (kind == EXPRKIND_QUAL)
{
expr = (Node *) canonicalize_qual((Expr *) expr);
}
/* 求解子链接的表达式,对于不能上拉的子链接会递归调用subquery_planner */
if (root->parse->hasSubLinks)
expr = SS_process_sublinks(root, expr, (kind == EXPRKIND_QUAL));
/* 给相关子查询替换变量SubPlan */
if (root->query_level > 1)
expr = SS_replace_correlation_vars(root, expr);
/*把where和having中的and表达式转换为隐含形式*/
if (kind == EXPRKIND_QUAL)
expr = (Node *) make_ands_implicit((Expr *) expr);
return expr;
}
eval_const_expressions
canonicalize_qual
使用OR分配律简化逻辑表达式,得到顶层全是AND的表达式。例如:
((A AND B) OR (A AND C)) -> (A AND (B OR C))
reduce_outer_joins
void
reduce_outer_joins(PlannerInfo *root)
{
reduce_outer_joins_state *state;
/*
* To avoid doing strictness checks on more quals than necessary, we want
* to stop descending the jointree as soon as there are no outer joins
* below our current point. This consideration forces a two-pass process.
* The first pass gathers information about which base rels appear below
* each side of each join clause, and about whether there are outer
* join(s) below each side of each join clause. The second pass examines
* qual clauses and changes join types as it descends the tree.
*/
state = reduce_outer_joins_pass1((Node *) root->parse->jointree);
/* planner.c shouldn't have called me if no outer joins */
if (state == NULL || !state->contains_outer)
elog(ERROR, "so where are the outer joins?");
reduce_outer_joins_pass2((Node *) root->parse->jointree,
state, root, NULL, NIL, NIL);
}
static void
reduce_outer_joins_pass2(Node *jtnode,
reduce_outer_joins_state *state,
PlannerInfo *root,
Relids nonnullable_rels,
List *nonnullable_vars,
List *forced_null_vars)
{
else if (IsA(jtnode, JoinExpr))
{
switch (jointype)
{
case JOIN_INNER:
break;
case JOIN_LEFT://优化为内连接
if (bms_overlap(nonnullable_rels, right_state->relids))
jointype = JOIN_INNER;
break;
case JOIN_RIGHT://优化为内连接
if (bms_overlap(nonnullable_rels, left_state->relids))
jointype = JOIN_INNER;
break;
case JOIN_FULL:
if (bms_overlap(nonnullable_rels, left_state->relids))
{
if (bms_overlap(nonnullable_rels, right_state->relids))
jointype = JOIN_INNER;//优化为内连接
else
jointype = JOIN_LEFT;//优化为左外连接
}
else
{
if (bms_overlap(nonnullable_rels, right_state->relids))
jointype = JOIN_RIGHT;//优化为右外连接
}
break;
case JOIN_SEMI:
case JOIN_ANTI:
break;
default:
elog(ERROR, "unrecognized join type: %d",
(int) jointype);
break;
}
/*剩余的右外连接优化为左外连接*/
if (jointype == JOIN_RIGHT)
{
}
/*左外连接转换为反半连接*/
if (jointype == JOIN_LEFT)
{
}
/* 递归优化之树 */
if (left_state->contains_outer || right_state->contains_outer)
{
if (left_state->contains_outer)
{
reduce_outer_joins_pass2(j->larg, left_state, root,
pass_nonnullable_rels,
pass_nonnullable_vars,
pass_forced_null_vars);
}
if (right_state->contains_outer)
{
reduce_outer_joins_pass2(j->rarg, right_state, root,
pass_nonnullable_rels,
pass_nonnullable_vars,
pass_forced_null_vars);
}
bms_free(local_nonnullable_rels);
}
}
else
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(jtnode));
}
消除外连接,不是所有的外连接都能转换为内连接,只有基于"空值拒绝"的才可以。其实SQL还是外连接,只是这样可以通过调整多表连接顺序优化。
例如:left join,join后右边的列被非空条件排除
select * from x left join y on (x.x = y.y) where y.y is not null;
优化后只有FULL JOIN