• openGauss内核分析:执行计划生成


    摘要:SQL语句解析完成后被解析成Query结构,在进行优化时是以Query为单位进行的,Query的优化分为基于规则的逻辑优化(查询重写)和基于代价的物理优化(计划生成),主入口函数为subquery_planner。subquery_planner函数接收Query(查询树),返回一个Plan(计划树)。

    本文分享自华为云社区《openGauss内核分析(六) 执行计划生成》,作者:Gauss松鼠会。

    SQL语句解析完成后被解析成Query结构,在进行优化时是以Query为单位进行的,Query的优化分为基于规则的逻辑优化(查询重写)和基于代价的物理优化(计划生成),主入口函数为subquery_planner。subquery_planner函数接收Query(查询树),返回一个Plan(计划树)。

    Plan* subquery_planner(PlannerGlobal* glob, Query* parse, PlannerInfo* parent_root, bool hasRecursion,
        double tuple_fraction, PlannerInfo** subroot, int options, ItstDisKey* diskeys, List* subqueryRestrictInfo)
    {
        PlannerInfo* root = NULL;
    Plan* plan = NULL; //返回结果
    …
    preprocess_const_params(root, (Node*)parse->jointree); // 常数替换等式
    if (parse->hasSubLinks) {
            pull_up_sublinks(root); //提升子链接
            DEBUG_QRW("After sublink pullup");
        }
     /* Reduce orderby clause in subquery for join */
        reduce_orderby(parse, false); //减少orderby
        DEBUG_QRW("After order by reduce");
        if (u_sess->attr.attr_sql.enable_constraint_optimization) {
            removeNotNullTest(root); //删除NotNullTest
            DEBUG_QRW("After soft constraint removal");
        }
    …
        if ((LAZY_AGG & u_sess->attr.attr_sql.rewrite_rule) && permit_from_rewrite_hint(root, LAZY_AGG)) {
            lazyagg_main(parse); // lazyagg重写
            DEBUG_QRW("After lazyagg");
    }
    …
    parse->jointree = (FromExpr*)pull_up_subqueries(root, (Node*)parse->jointree); //提升子查询
    if (parse->setOperations) {
            flatten_simple_union_all(root); //UNIONALL优化
            DEBUG_QRW("After simple union all flatten");
    }
    …
    expand_inherited_tables(root); //展开继承表
    …
    parse->targetList = (List*)preprocess_expression(root, (Node*)parse->targetList, EXPRKIND_TARGET); //预处理表达式
    …
    parse->havingQual = (Node *) newHaving; //处理HAVING子句
    …
            reduce_outer_joins(root); //外连接消除
    …
                    reduce_inequality_fulljoins(root); //全连接重写
        …
    plan = grouping_planner(root, tuple_fraction); //主要的计划过程
    return plan;
    }

    subquery_planner函数由函数standard_planner调用,standard_planner函数由exec_simple_query->pg_plan_queries->pg_plan_query->planner函数调用。standard_planner将Query(查询树)生成规划好的语句,可用于执行器实际执行。

    PlannedStmt* standard_planner(Query* parse, int cursorOptions, ParamListInfo boundParams)
    {
        PlannedStmt* result = NULL;  //返回结果
    PlannerGlobal* glob = NULL;  
    double tuple_fraction;
        PlannerInfo* root = NULL;
    Plan* top_plan = NULL;
    …
        glob = makeNode(PlannerGlobal); 
     /* primary planning entry point (may recurse for subqueries) */
    top_plan = subquery_planner(glob, parse, NULL, false, tuple_fraction, &root);  //主规划过程入口
    /* build the PlannedStmt result */
    result = makeNode(PlannedStmt);  //构造PlannedStmt
    result->commandType = parse->commandType;
    result->queryId = parse->queryId;
    result->uniqueSQLId = parse->uniqueSQLId;
    result->hasReturning = (parse->returningList != NIL);
    result->hasModifyingCTE = parse->hasModifyingCTE;
    result->canSetTag = parse->canSetTag;
    result->transientPlan = glob->transientPlan;
    result->dependsOnRole = glob->dependsOnRole;
    result->planTree = top_plan; //执行计划
    result->rtable = glob->finalrtable;
    result->resultRelations = glob->resultRelations;
    …
        return result;
    }

    仍然以前文的join列子来说明

    SELECT * FROM t1 inner JOIN t2 ON t1.c1 = t2.c1;复制

    在planner函数打断点,用gdb查看standard_planner返回的PlannedStmt

    (gdb) bt
    #0  planner (parse=0x7fd93a410288, cursorOptions=0, boundParams=0x0) at planner.cpp:389
    #1  0x0000000001936fbd in pg_plan_query (querytree=0x7fd93a410288, cursorOptions=0, boundParams=0x0, underExplain=false) at postgres.cpp:1197
    #2  0x0000000001937381 in pg_plan_queries (querytrees=0x7fd939b81090, cursorOptions=0, boundParams=0x0) at postgres.cpp:1315
    #3  0x000000000193a6b8 in exec_simple_query (query_string=0x7fd966ad2060 "SELECT * FROM t1 inner JOIN t2 ON t1.c1 = t2.c1;", messageType=QUERY_MESSAGE, msg=0x7fd931056210)
        at postgres.cpp:2560
    #4  0x0000000001947104 in PostgresMain (argc=1, argv=0x7fd93a2cf1c0, dbname=0x7fd93a2ce1f8 "postgres", username=0x7fd93a2ce1b0 "test") at postgres.cpp:8403
    #5  0x0000000001890740 in BackendRun (port=0x7fd931056720) at postmaster.cpp:8053
    #6  0x00000000018a00b1 in GaussDbThreadMain<(knl_thread_role)1> (arg=0x7fd97c55c5f0) at postmaster.cpp:12181
    #7  0x000000000189c0de in InternalThreadFunc (args=0x7fd97c55c5f0) at postmaster.cpp:12755
    #8  0x00000000024bf7d8 in ThreadStarterFunc (arg=0x7fd97c55c5e0) at gs_thread.cpp:382
    #9  0x00007fd9a60cfdd5 in start_thread () from /lib64/libpthread.so.0
    #10 0x00007fd9a5df8ead in clone () from /lib64/libc.so.6
    (gdb) p *result
    $14 = {type = T_PlannedStmt, commandType = CMD_SELECT, queryId = 0, hasReturning = false, hasModifyingCTE = false, canSetTag = true, transientPlan = false, dependsOnRole = false,
      planTree = 0x7fd93a409d58, rtable = 0x7fd939b81660, …}
    (gdb) p *result->planTree->lefttree
    $46 = {type = T_SeqScan, plan_node_id = 2, parent_node_id = 1, exec_type = EXEC_ON_DATANODES, startup_cost = 0, total_cost = 1.03, plan_rows = 3, multiple = 1, plan_width = 8,…}

    将Query规划后得到PlannedStmt

    可以看到,Plannedstmt 与explain执行计划是一致的

     

    点击关注,第一时间了解华为云新鲜技术~

  • 相关阅读:
    Thinking in Java Reading Note(9.接口)
    Thinking in java Reading Note(8.多态)
    Thinking in Java Reading Note(7.复用类)
    SQL必知必会
    Thinking in Java Reading Note(5.初始化与清理)
    Thinking in Java Reading Note(2.一切都是对象)
    鸟哥的Linux私房菜笔记(1.基础)
    Thinking in Java Reading Note(1.对象导论)
    CoreJava2 Reading Note(2:I/O)
    CoreJava2 Reading Note(1:Stream)
  • 原文地址:https://www.cnblogs.com/huaweiyun/p/16635198.html
Copyright © 2020-2023  润新知