• 【累积型快照事实表】订单表


    1、concat

    concat函数在连接字符串的时候,只要其中一个是NULL,那么将返回NULL

    hive> select concat('a','b');
    ab
    
    hive> select concat('a','b',null);
    NULL

    2、concat_ws

    concat_ws函数在连接字符串的时候,只要有一个字符串不是NULL,就不会返回NULL。

    concat_ws函数需要指定分隔符。

    hive> select concat_ws('-','a','b');
    a-b
    
    hive> select concat_ws('-','a','b',null);
    a-b
    
    hive> select concat_ws('','a','b',null);
    ab

    3、STR_TO_MAP

    • 语法

    STR_TO_MAP(VARCHAR text, VARCHAR listDelimiter, VARCHAR keyValueDelimiter)

    • 功能
    1. 使用listDelimiter将text分隔成K-V对,
    2. 然后使用keyValueDelimiter分隔每个K-V对,组装成MAP返回。
    3. 默认listDelimiter为( ,)keyValueDelimiter为(=)
    • 案例
    str_to_map('1001=2020-03-10,1002=2020-03-10',  ','  ,  '=')
    输出
    {"1001":"2020-03-10","1002":"2020-03-10"}

    4、案例

    第一步:

    hive> select order_id, concat(order_status,'=', operate_time) from order_status_log where dt='2020-03-10'; 
    
    3210    1001=2020-03-10 00:00:00.0
    3211    1001=2020-03-10 00:00:00.0
    3212    1001=2020-03-10 00:00:00.0
    
    3210    1002=2020-03-10 00:00:00.0
    3211    1002=2020-03-10 00:00:00.0
    3212    1002=2020-03-10 00:00:00.0
    
    3210    1005=2020-03-10 00:00:00.0
    3211    1004=2020-03-10 00:00:00.0
    3212    1004=2020-03-10 00:00:00.0

    第二步:

    hive > select order_id, collect_set(concat(order_status,'=',operate_time)) from order_status_log where dt='2020-03-10' group by order_id;
    
    3210    ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1005=2020-03-10 00:00:00.0"]
    3211    ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1004=2020-03-10 00:00:00.0"]
    3212    ["1001=2020-03-10 00:00:00.0","1002=2020-03-10 00:00:00.0","1004=2020-03-10 00:00:00.0"]

    第三步:

    hive> 
    select order_id, concat_ws(',', collect_set(concat(order_status,'=',operate_time))) from order_status_log where dt='2020-03-10' group by order_id;
    
    3210    1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1005=2020-03-10 00:00:00.0
    3211    1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1004=2020-03-10 00:00:00.0
    3212    1001=2020-03-10 00:00:00.0,1002=2020-03-10 00:00:00.0,1004=2020-03-10 00:00:00.0

    第四步:

    hive >
    select order_id, str_to_map(concat_ws(',',collect_set(concat(order_status,'=',operate_time))), ','  ,  '=') tms  from order_status_log where dt='2020-03-10' group by order_id;
    
    3210    {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1005":"2020-03-10 00:00:00.0"}
    3211    {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1004":"2020-03-10 00:00:00.0"}
    3212    {"1001":"2020-03-10 00:00:00.0","1002":"2020-03-10 00:00:00.0","1004":"2020-03-10 00:00:00.0"}

    第五步:取值

    • tms['1001']:创建时间(未支付状态)
    • tms['1002']:支付时间(支付状态)
    • tms['1003']:取消时间(已取消状态)
    • tms['1004']:完成时间(已完成状态)
    • tms['1005']:退款时间(退款状态)
    • tms['1006']:退款完成时间(退款完成状态)
  • 相关阅读:
    python全栈开发从入门到放弃之内置函数
    python全栈开发从入门到放弃之递归函数的调用
    python全栈开发从入门到放弃之字典的应用
    python全栈开发从入门到放弃之元组的内置应用
    python全栈开发从入门到放弃之装饰器函数
    [LeetCode-JAVA] Remove Duplicates from Sorted Array II
    [LeetCode-JAVA] Simplify Path
    [LeetCode-JAVA] Permutations
    tensorboard在windows系统浏览器显示空白的解决writer =tf.summary.FileWriter("logs/", sess.graph)
    Windows64位安装CPU版TensorFlow
  • 原文地址:https://www.cnblogs.com/hyunbar/p/13602107.html
Copyright © 2020-2023  润新知