• 【PostgreSQL】pg 窗口函数总结


    窗口函数基于结果集进行计算,将计算出的结果合并到输出的结果集上,并返回多行。使用窗口函数能大幅度简化SQL代码。
    gaussdb提供内置的窗口函数,例如row_num()、rank()、lag()等,除了内置的窗口函数外,聚合函数、自定义函数后接OVER属性也可以作为窗口函数。
    1,创建测试表并插入数据。

    postgres=# DROP TABLE IF EXISTS scores;
    NOTICE:  table "scores" does not exist, skipping
    DROP TABLE
    postgres=# CREATE TABLE scores(id serial PRIMARY KEY,subject varchar(32),stu_name varchar(32),score numeric(3,0));
    CREATE TABLE
    postgres=# INSERT INTO scores(subject,stu_name,score) VALUES('Chinese','user1',80),('Chinese','user2',90),('Chinese','user3',90),('math','user1',90),('math','user2',80),('math','user3',100),('English','user1',80),('English','user2',90),('English','user3',70);
    INSERT 0 9
    postgres=# SELECT * FROM scores;
     id | subject | stu_name | score 
    ----+---------+----------+-------
      1 | Chinese | user1    |    80
      2 | Chinese | user2    |    90
      3 | Chinese | user3    |    90
      4 | math    | user1    |    90
      5 | math    | user2    |    80
      6 | math    | user3    |   100
      7 | English | user1    |    80
      8 | English | user2    |    90
      9 | English | user3    |    70
    (9 rows)

    2,avg() OVER()计算分组后数据的平均值。

    postgres=# SELECT subject,stu_name,score,avg(score) OVER(PARTITION BY subject) FROM scores;
     subject | stu_name | score |         avg         
    ---------+----------+-------+---------------------
     Chinese | user1    |    80 | 86.6666666666666667
     Chinese | user2    |    90 | 86.6666666666666667
     Chinese | user3    |    90 | 86.6666666666666667
     English | user3    |    70 | 80.0000000000000000
     English | user1    |    80 | 80.0000000000000000
     English | user2    |    90 | 80.0000000000000000
     math    | user1    |    90 | 90.0000000000000000
     math    | user2    |    80 | 90.0000000000000000
     math    | user3    |   100 | 90.0000000000000000
    (9 rows)

    3,row_number() OVER()对分组后的数据标注行号,从1开始。

    postgres=# SELECT row_number() OVER(PARTITION BY subject ORDER BY score DESC),* FROM scores;
     row_number | id | subject | stu_name | score 
    ------------+----+---------+----------+-------
              1 |  2 | Chinese | user2    |    90
              2 |  3 | Chinese | user3    |    90
              3 |  1 | Chinese | user1    |    80
              1 |  8 | English | user2    |    90
              2 |  7 | English | user1    |    80
              3 |  9 | English | user3    |    70
              1 |  6 | math    | user3    |   100
              2 |  4 | math    | user1    |    90
              3 |  5 | math    | user2    |    80
    (9 rows)

    4,rank() OVER()与row_number() OVER()类似主要区别是当组内某行字段值相同时,行号重复并且行号产生间隙。

    postgres=# SELECT rank() OVER(PARTITION BY subject ORDER BY score DESC),* FROM scores;
     rank | id | subject | stu_name | score 
    ------+----+---------+----------+-------
        1 |  2 | Chinese | user2    |    90
        1 |  3 | Chinese | user3    |    90
        3 |  1 | Chinese | user1    |    80
        1 |  8 | English | user2    |    90
        2 |  7 | English | user1    |    80
        3 |  9 | English | user3    |    70
        1 |  6 | math    | user3    |   100
        2 |  4 | math    | user1    |    90
        3 |  5 | math    | user2    |    80
    (9 rows)

    5,dense_rank() OVER()与rank() 类似,主要区别为当组内某行字段值相同时,虽然重复行号,但行号不产生间隙。

    postgres=# SELECT dense_rank() OVER(PARTITION BY subject ORDER BY score DESC),* FROM scores;
     dense_rank | id | subject | stu_name | score 
    ------------+----+---------+----------+-------
              1 |  2 | Chinese | user2    |    90
              1 |  3 | Chinese | user3    |    90
              2 |  1 | Chinese | user1    |    80
              1 |  8 | English | user2    |    90
              2 |  7 | English | user1    |    80
              3 |  9 | English | user3    |    70
              1 |  6 | math    | user3    |   100
              2 |  4 | math    | user1    |    90
              3 |  5 | math    | user2    |    80
    (9 rows)

    6,lag() OVER()可以获取行偏移offset那行字段的数据。

    postgres=# SELECT LAG(id,-1) OVER(),* FROM scores;
     lag | id | subject | stu_name | score 
    -----+----+---------+----------+-------
       2 |  1 | Chinese | user1    |    80
       3 |  2 | Chinese | user2    |    90
       4 |  3 | Chinese | user3    |    90
       5 |  4 | math    | user1    |    90
       6 |  5 | math    | user2    |    80
       7 |  6 | math    | user3    |   100
       8 |  7 | English | user1    |    80
       9 |  8 | English | user2    |    90
         |  9 | English | user3    |    70
    (9 rows)
    
    postgres=# SELECT LAG(id,1,100) OVER(),* FROM scores;--不存在时指定默认值
     lag | id | subject | stu_name | score 
    -----+----+---------+----------+-------
     100 |  1 | Chinese | user1    |    80
       1 |  2 | Chinese | user2    |    90
       2 |  3 | Chinese | user3    |    90
       3 |  4 | math    | user1    |    90
       4 |  5 | math    | user2    |    80
       5 |  6 | math    | user3    |   100
       6 |  7 | English | user1    |    80
       7 |  8 | English | user2    |    90
       8 |  9 | English | user3    |    70
    (9 rows)

    7,first_value() OVER()用来取结果集每一个分组的第一行数据的字段值。

    postgres=# SELECT first_value(score) OVER(PARTITION BY subject ORDER BY score DESC),* FROM scores;
     first_value | id | subject | stu_name | score 
    -------------+----+---------+----------+-------
              90 |  2 | Chinese | user2    |    90
              90 |  3 | Chinese | user3    |    90
              90 |  1 | Chinese | user1    |    80
              90 |  8 | English | user2    |    90
              90 |  7 | English | user1    |    80
              90 |  9 | English | user3    |    70
             100 |  6 | math    | user3    |   100
             100 |  4 | math    | user1    |    90
             100 |  5 | math    | user2    |    80
    (9 rows)

    8,last_value() OVER()用来取结果集每一个分组的最后一行数据的字段值。

    postgres=# SELECT last_value(score) OVER(PARTITION BY subject),* FROM scores;
     last_value | id | subject | stu_name | score 
    ------------+----+---------+----------+-------
             90 |  1 | Chinese | user1    |    80
             90 |  2 | Chinese | user2    |    90
             90 |  3 | Chinese | user3    |    90
             90 |  9 | English | user3    |    70
             90 |  7 | English | user1    |    80
             90 |  8 | English | user2    |    90
            100 |  4 | math    | user1    |    90
            100 |  5 | math    | user2    |    80
            100 |  6 | math    | user3    |   100
    (9 rows)

    9,nth_value() OVER()用来取结果集每一个分组的指定行数据的字段值。

    postgres=# SELECT nth_value(score,2) OVER(PARTITION BY subject),* FROM scores;
     nth_value | id | subject | stu_name | score 
    -----------+----+---------+----------+-------
            90 |  1 | Chinese | user1    |    80
            90 |  2 | Chinese | user2    |    90
            90 |  3 | Chinese | user3    |    90
            80 |  9 | English | user3    |    70
            80 |  7 | English | user1    |    80
            80 |  8 | English | user2    |    90
            80 |  4 | math    | user1    |    90
            80 |  5 | math    | user2    |    80
            80 |  6 | math    | user3    |   100
    (9 rows)

    10,如果窗口函数需要多次使用,可以使用窗口函数别名。

    postgres=# SELECT avg(score) OVER(r),sum(score) OVER(r),* FROM scores WINDOW r AS (PARTITION BY subject);
             avg         | sum | id | subject | stu_name | score 
    ---------------------+-----+----+---------+----------+-------
     86.6666666666666667 | 260 |  1 | Chinese | user1    |    80
     86.6666666666666667 | 260 |  2 | Chinese | user2    |    90
     86.6666666666666667 | 260 |  3 | Chinese | user3    |    90
     80.0000000000000000 | 240 |  9 | English | user3    |    70
     80.0000000000000000 | 240 |  7 | English | user1    |    80
     80.0000000000000000 | 240 |  8 | English | user2    |    90
     90.0000000000000000 | 270 |  4 | math    | user1    |    90
     90.0000000000000000 | 270 |  5 | math    | user2    |    80
     90.0000000000000000 | 270 |  6 | math    | user3    |   100
    (9 rows)
  • 相关阅读:
    批量替换文本的工具
    wcf异常显示错误到客户端
    文件以二进制存入数据库和从数据库读取二进制文件
    关于关系数据库的范式
    对于挑战书上的很久之前都看不懂的DP看懂的突破
    操作系统概念
    关于P,V操作理解的突破,关于并发设计与并行
    关于快速沃尔什变换
    我觉得我应该养成经常翻收藏夹的习惯
    目前我的思考模式
  • 原文地址:https://www.cnblogs.com/jelly12345/p/16718175.html
Copyright © 2020-2023  润新知