• postgresql/lightdb prepare、bind、execute与fetch cursor的内核与jdbc驱动实现详解


      preparedstatement将PREPARE,BIND,EXECUTE分开,其好处是避免了重复的语法分析、语义分析与重写,对于复杂SQL来说,其效果更加明显。

    In the extended-query protocol, execution of SQL commands is divided into multiple steps. The state retained between steps is represented by two types of objects: prepared statements and portals. A prepared statement represents the result of parsing and semantic analysis of a textual query string. A prepared statement is not in itself ready to execute, because it might lack specific values for parameters. A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in. (For SELECT statements, a portal is equivalent to an open cursor, but we choose to use a different term since cursors don't handle non-SELECT statements.)
    
    The overall execution cycle consists of a parse step, which creates a prepared statement from a textual query string; a bind step, which creates a portal given a prepared statement and values for any needed parameters; and an execute step that runs a portal's query. In the case of a query that returns rows (SELECT, SHOW, etc), the execute step can be told to fetch only a limited number of rows, so that multiple execute steps might be needed to complete the operation.
    
    The backend can keep track of multiple prepared statements and portals (but note that these exist only within a session, and are never shared across sessions). Existing prepared statements and portals are referenced by names assigned when they were created. In addition, an “unnamed” prepared statement and portal exist. Although these behave largely the same as named objects, operations on them are optimized for the case of executing a query only once and then discarding it, whereas operations on named objects are optimized on the expectation of multiple uses.

    不同于SIMPLE QUERY,P B E的入口不太一样:

            switch (firstchar)
            {
                case 'Q':            /* simple query */
                    {
                        const char *query_string;
    
                        /* Set statement_timestamp() */
                        SetCurrentStatementStartTimestamp();
    
                        query_string = pq_getmsgstring(&input_message);
                        pq_getmsgend(&input_message);
    
                        if (am_walsender)
                        {
                            if (!exec_replication_command(query_string))
                                exec_simple_query(query_string);
                        }
                        else
                            exec_simple_query(query_string);
    
                        send_ready_for_query = true;
                    }
                    break;
    
                case 'P':            /* parse */
                    {
                        const char *stmt_name;
                        const char *query_string;
                        int            numParams;
                        Oid           *paramTypes = NULL;
    
                        forbidden_in_wal_sender(firstchar);
    
                        /* Set statement_timestamp() */
                        SetCurrentStatementStartTimestamp();
    
                        stmt_name = pq_getmsgstring(&input_message);
                        query_string = pq_getmsgstring(&input_message);
                        numParams = pq_getmsgint(&input_message, 2);
                        if (numParams > 0)
                        {
                            paramTypes = (Oid *) palloc(numParams * sizeof(Oid));
                            for (int i = 0; i < numParams; i++)
                                paramTypes[i] = pq_getmsgint(&input_message, 4);
                        }
                        pq_getmsgend(&input_message);
    
                        exec_parse_message(query_string, stmt_name,
                                           paramTypes, numParams);
                    }
                    break;
    
                case 'B':            /* bind */
                    forbidden_in_wal_sender(firstchar);
    
                    /* Set statement_timestamp() */
                    SetCurrentStatementStartTimestamp();
    
                    /*
                     * this message is complex enough that it seems best to put
                     * the field extraction out-of-line
                     */
                    exec_bind_message(&input_message);
                    break;
    
                case 'E':            /* execute */
                    {
                        const char *portal_name;
                        int            max_rows;
    
                        forbidden_in_wal_sender(firstchar);
    
                        /* Set statement_timestamp() */
                        SetCurrentStatementStartTimestamp();
    
                        portal_name = pq_getmsgstring(&input_message);
                        max_rows = pq_getmsgint(&input_message, 4);
                        pq_getmsgend(&input_message);
    
                        exec_execute_message(portal_name, max_rows);
                    }
                    break;

    prepare处理流程:

    StorePreparedStatement prepare.c:432
    PrepareQuery prepare.c:173
    standard_ProcessUtility utility.c:737
    pgss_ProcessUtility pg_stat_statements.c:1201
    pgaudit_ProcessUtility_hook pgaudit.c:1412
    ProcessUtility utility.c:521
    PortalRunUtility pquery.c:1157
    PortalRunMulti pquery.c:1303
    PortalRun pquery.c:779
    exec_simple_query postgres.c:1326
    PostgresMain postgres.c:4445
    BackendRun postmaster.c:4883
    BackendStartup postmaster.c:4567
    ServerLoop postmaster.c:1854
    PostmasterMain postmaster.c:1487
    main main.c:231
    __libc_start_main 0x00007f32f566f555
    _start 0x0000000000484799
    

      PREPARE时,语句将被parsed, analyzed, and rewritten。BIND的时候被plan,EXECUTE的时候被执行。

      最后通过调用StorePreparedStatement保存在per-backend prepared_queries哈希中(实际上把它放在全局变量的话,是可以做到backend无关的)。

    FetchPreparedStatement prepare.c:477
    UtilityReturnsTuples utility.c:2020
    ChoosePortalStrategy pquery.c:258
    PortalStart pquery.c:464
    exec_simple_query postgres.c:1287
    PostgresMain postgres.c:4445
    BackendRun postmaster.c:4883
    BackendStartup postmaster.c:4567
    ServerLoop postmaster.c:1854
    PostmasterMain postmaster.c:1487
    main main.c:231
    __libc_start_main 0x00007f32f566f555
    _start 0x0000000000484799
    GetCachedPlan plancache.c:1157
    ExecuteQuery prepare.c:233
    standard_ProcessUtility utility.c:742
    pgss_ProcessUtility pg_stat_statements.c:1201
    pgaudit_ProcessUtility_hook pgaudit.c:1412
    ProcessUtility utility.c:521
    PortalRunUtility pquery.c:1157
    PortalRunMulti pquery.c:1303
    PortalRun pquery.c:779
    exec_simple_query postgres.c:1326
    PostgresMain postgres.c:4445
    BackendRun postmaster.c:4883
    BackendStartup postmaster.c:4567
    ServerLoop postmaster.c:1854
    PostmasterMain postmaster.c:1487
    main main.c:231
    __libc_start_main 0x00007f32f566f555
    _start 0x0000000000484799

    BIND:

      执行计划生成发生在BIND环节,因为CBO生成执行计划需要依赖于参数,选择custom还是generic执行计划也是在这一步。通过FetchPreparedStatement获取CachedPlanSource(未plan,所以叫plansource,至于已经plan的,那叫做generic plan,由choose_custom_plan和参数plan_cache_mode决定),如下:

    /*
     * Lookup an existing query in the hash table. If the query does not
     * actually exist, throw ereport(ERROR) or return NULL per second parameter.
     *
     * Note: this does not force the referenced plancache entry to be valid,
     * since not all callers care.
     */
    PreparedStatement *
    FetchPreparedStatement(const char *stmt_name, bool throwError)
    {
        PreparedStatement *entry;
    
        /*
         * If the hash table hasn't been initialized, it can't be storing
         * anything, therefore it couldn't possibly store our plan.
         */
        if (prepared_queries)
            entry = (PreparedStatement *) hash_search(prepared_queries,
                                                      stmt_name,
                                                      HASH_FIND,
                                                      NULL);
        else
            entry = NULL;
    
        if (!entry && throwError)
            ereport(ERROR,
                    (errcode(ERRCODE_UNDEFINED_PSTATEMENT),
                     errmsg("prepared statement \"%s\" does not exist",
                            stmt_name)));
    
        return entry;
    }

      决定新生成plan还是复用generic_plan的逻辑如下:

    CachedPlan *
    GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
                  bool useResOwner, QueryEnvironment *queryEnv)
    {
    ......
        /* Make sure the querytree list is valid and we have parse-time locks */
        qlist = RevalidateCachedQuery(plansource, queryEnv);
    
        /* Decide whether to use a custom plan */
        customplan = choose_custom_plan(plansource, boundParams);
    
        if (!customplan)
        {
            if (CheckCachedPlan(plansource))
            {
                /* We want a generic plan, and we already have a valid one */
                plan = plansource->gplan;
                Assert(plan->magic == CACHEDPLAN_MAGIC);
            }
            else
            {
                /* Build a new generic plan */
                plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
                /* Just make real sure plansource->gplan is clear */
                ReleaseGenericPlan(plansource);
                /* Link the new generic plan into the plansource */
                plansource->gplan = plan;
                plan->refcount++;
                /* Immediately reparent into appropriate context */
                if (plansource->is_saved)
                {
                    /* saved plans all live under CacheMemoryContext */
                    MemoryContextSetParent(plan->context, CacheMemoryContext);
                    plan->is_saved = true;
                }
                else
                {
                    /* otherwise, it should be a sibling of the plansource */
                    MemoryContextSetParent(plan->context,
                                           MemoryContextGetParent(plansource->context));
                }
                /* Update generic_cost whenever we make a new generic plan */
                plansource->generic_cost = cached_plan_cost(plan, false);
    
                /*
                 * If, based on the now-known value of generic_cost, we'd not have
                 * chosen to use a generic plan, then forget it and make a custom
                 * plan.  This is a bit of a wart but is necessary to avoid a
                 * glitch in behavior when the custom plans are consistently big
                 * winners; at some point we'll experiment with a generic plan and
                 * find it's a loser, but we don't want to actually execute that
                 * plan.
                 */
                customplan = choose_custom_plan(plansource, boundParams);
    
                /*
                 * If we choose to plan again, we need to re-copy the query_list,
                 * since the planner probably scribbled on it.  We can force
                 * BuildCachedPlan to do that by passing NIL.
                 */
                qlist = NIL;
            }
        }
    
        if (customplan)
        {
            /* Build a custom plan */
            plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
            /* Accumulate total costs of custom plans, but 'ware overflow */
            if (plansource->num_custom_plans < INT_MAX)
            {
                plansource->total_custom_cost += cached_plan_cost(plan, true);
                plansource->num_custom_plans++;
            }
        }
    ......

      需要注意的是,BIND中区分了语句和portal(虽然portal是运行时表示,portal是execute的前提条件)。命名portal和未命名portal的区别在于:命名portal会持续到事务结束或被显示销毁,未命名会在下一次BIND执行时自动销毁,命名PORTAL必须在下次BIND之前显示被销毁。

    https://jdbc.postgresql.org/documentation/head/server-prepare.html

     

      注:上图也说明了,PG服务端是支持一次接收多个可连续的请求命令的,如上面的PARSE和BIND。因为协议是从解析第一个字符开始、然后读取长度,最后消息体。pgcomm.c pq_startmsgread pq_getbyte、及postgres.c中的SocketBackend已经处理的很干净了。pg协议也明确说明了不能根据顺序要求接收服务端应答。所以可以发现PARSE/BIND/DESCRIBE/EXECUTE以及对应的Complete分别都只有一个tcp消息。极大的减少了通信开销。

       第二次请求不包含PARSE,如下:

      这是在客户端进行处理的。

      https://www.postgresql.org/docs/current/sql-prepare.html

      https://www.postgresql.org/docs/current/view-pg-prepared-statements.html

      显示执行的PREPARE会在pg_prepared_statements中实时显示预编译的语句,协议级的PREPARE则不会在此体现。因为postgresql的plancache是per backend的,所以要验证的话,就得在java中查询pg_prepared_statements。

    zjh@postgres=# PREPARE fooplan (int, text, bool, numeric) AS
    zjh@postgres-#     INSERT INTO foo VALUES($1, $2, $3, $4);
    PREPARE
    zjh@postgres=# select * from pg_prepared_statements ;
      name   |                   statement                   |         prepare_time         |        parameter_types         | from_sql 
    ---------+-----------------------------------------------+------------------------------+--------------------------------+----------
     fooplan | PREPARE fooplan (int, text, bool, numeric) AS+| 2022-01-26 10:04:10.81974+00 | {integer,text,boolean,numeric} | t
             |     INSERT INTO foo VALUES($1, $2, $3, $4);   |                              |                                | 
    (1 row)

      按照https://www.postgresql.org/message-id/CAL454F2yiTPqnTAVw78teOCnHvYxMSjzSekH8wjOPxVNTLFejw%40mail.gmail.com的说法,JDBC只要设置setPrepareThreshold(1)即可。javadoc也确实是如此说的,如下:

      /**
       * Turn on the use of prepared statements in the server (server side prepared statements are
       * unrelated to jdbc PreparedStatements) As of build 302, this method is equivalent to
       * <code>setPrepareThreshold(1)</code>.
       *
       * @param flag use server prepare
       * @throws SQLException if something goes wrong
       * @since 7.3
       * @deprecated As of build 302, replaced by {@link #setPrepareThreshold(int)}
       */
      @Deprecated
      void setUseServerPrepare(boolean flag) throws SQLException;

       代码中判断是否oneshotquery是通过org.postgresql.jdbc.PgStatement#executeInternal中调用isOneShotQuery(cachedQuery),而它这事通过mPrepareThreshold==0来判断,pgjdbc客户端好几个鸡肋。

  • 相关阅读:
    IE hasLayout详解
    seajs引入jquery
    jquery实现轮播插件
    CSS视觉格式化模型
    js事件冒泡和事件捕获详解
    你尽力了么===BY cloudsky
    前向否定界定符 python正则表达式不匹配某个字符串 以及无捕获组和命名组(转)
    php safe mode bypass all <转>
    WAF指纹探测及识别技术<freebuf>
    linux集群管理<转>
  • 原文地址:https://www.cnblogs.com/zhjh256/p/15844512.html
Copyright © 2020-2023  润新知