• 2018-10-29#regexp_extract+get_json_object


    Hive/LanguageManual+UDF


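    Regular-expression extraction function: regexp_extract

    Syntax: regexp_extract(string subject, string pattern, int index)

    Return type: string

    Description: matches the string subject against the regular expression pattern and returns the capture group selected by index; index 0 returns the entire match. Note that a backslash in the pattern usually has to be doubled inside a Hive string literal (for example, '\\s' rather than '\s' to match whitespace).
    Examples (dual is a one-row helper table that must already exist; Hive has no built-in dual):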

    hive> select regexp_extract('foothebar', 'foo(.*?)(bar)', 1) from dual;
    the
    
    hive> select regexp_extract('foothebar', 'foo(.*?)(bar)', 2) from dual;
    bar
    
    hive> select regexp_extract('foothebar', 'foo(.*?)(bar)', 0) from dual;
    foothebar
    
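    -- A recent real-world example. Index 1 returns only the captured value (2`bikerjacket`1);
    -- index 0 would return the whole match, including the result_content key and the trailing comma.
    select regexp_extract('{"search_content": "bikerjacket","result_content": "2`bikerjacket`1","abtest": ""}', 'result_content": "(.*?)",', 1);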
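The remark about escape characters can be made concrete. The following is a minimal sketch, assuming a hypothetical input string 'order_149' that does not appear in the original post. In a Hive string literal the backslash is itself an escape character, so the regex class \d has to be written as '\\d'.

```sql
-- Hypothetical input; the doubled backslash means the regex engine receives _(\d+)$
-- Group 1 captures the trailing digits, so this should return 149.
SELECT regexp_extract('order_149', '_(\\d+)$', 1);
```

With a single backslash the pattern would reach the regex engine as _(d+)$, which looks for the letter d instead of digits.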
    
    

    Parsing a JSON string: get_json_object

    select get_json_object('{"search_content": "bikerjacket","result_content": "2`bikerjacket`1","abtest": ""}', '$.result_content');
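get_json_object paths are not limited to top-level keys. The sketch below uses a made-up JSON document (modelled on the store/fruit sample in the Hive manual, not on data from this post) to show a nested path with an array index and a path that does not exist.

```sql
-- Illustrative JSON only; '$' is the root, '.' selects a child key, '[n]' indexes into an array.
SELECT get_json_object('{"store": {"fruit": [{"weight": 8, "type": "apple"}, {"weight": 9, "type": "pear"}]}, "owner": "amy"}', '$.store.fruit[1].type');
-- pear

-- A path that matches nothing yields NULL rather than an error.
SELECT get_json_object('{"store": {"fruit": []}, "owner": "amy"}', '$.non_exist_key');
-- NULL
```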
    

    Parsing a JSON array string

    Data to parse

    [{"orderPromotionId":"order_149","orderPromotionTag":"日亚美妆专题-2件8折","orderPromotionType":"10","orderPromotionValue":"110.60"}]
    

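    Extracting a field from the JSON array string (the enclosing brackets are stripped first, so this assumes the array holds a single object)

    select get_json_object(regexp_extract('[{"orderPromotionId":"order_149","orderPromotionTag":"日亚美妆专题-2件8折","orderPromotionType":"10","orderPromotionValue":"110.60"}]','^\\[(.+)\\]$',1), '$.orderPromotionId');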
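When several keys are needed from the same object, calling get_json_object once per key parses the string repeatedly. One alternative is Hive's json_tuple UDTF used with LATERAL VIEW. This is only a sketch under the same single-object assumption; the aliases s, t, order_promotion_id and order_promotion_tag are illustrative names, not from the original post.

```sql
-- Strip the enclosing [ ] once, then read two keys from the remaining object in one pass.
SELECT t.order_promotion_id, t.order_promotion_tag
FROM (
    SELECT regexp_extract('[{"orderPromotionId":"order_149","orderPromotionTag":"日亚美妆专题-2件8折","orderPromotionType":"10","orderPromotionValue":"110.60"}]', '^\\[(.+)\\]$', 1) AS single_json
) s
LATERAL VIEW json_tuple(s.single_json, 'orderPromotionId', 'orderPromotionTag') t AS order_promotion_id, order_promotion_tag;
```

json_tuple only reads top-level keys, so get_json_object remains the tool for nested paths.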
    

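    Expanding array elements into rows (explode)

    Reference example (json_array_col must be an array column whose elements are JSON strings)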

    SELECT get_json_object(single_json_table.single_json, '$.ts') AS ts,
    get_json_object(single_json_table.single_json, '$.id') AS id,
    get_json_object(single_json_table.single_json, '$.log') AS log
    FROM (
        SELECT explode(json_array_col) as single_json FROM jt
        ) single_json_table ;
    
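    SELECT get_json_object(single_json_table.single_json, '$.orderPromotionId') AS order_promotion_id,
    get_json_object(single_json_table.single_json, '$.orderPromotionTag') AS order_promotion_tag,
    get_json_object(single_json_table.single_json, '$.orderPromotionType') AS order_promotion_type
    FROM (
        -- explode() accepts only an array or a map, not a string, so the JSON array string is first
        -- stripped of its brackets and split into one string per object (assumes objects are separated
        -- by exactly },{ and that no value contains a semicolon)
        SELECT explode(split(regexp_replace(regexp_extract('[{"orderPromotionId":"order_149","orderPromotionTag":"日亚美妆专题-2件8折","orderPromotionType":"10","orderPromotionValue":"110.60"}]', '^\\[(.+)\\]$', 1), '\\},\\{', '};{'), ';')) as single_json 
        -- FROM jt
        ) single_json_table ;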
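In practice the JSON array string usually sits in a table column rather than in a literal. A minimal sketch of that case follows, assuming a hypothetical table orders(order_id STRING, promotions_json STRING). Neither the table nor its columns come from the original post. It also assumes that objects in the array are separated by exactly },{ with no whitespace and that no value contains a semicolon.

```sql
-- Hypothetical table: orders(order_id STRING, promotions_json STRING)
-- promotions_json holds a JSON array string shaped like the sample data above.
SELECT o.order_id,
       get_json_object(p.single_json, '$.orderPromotionId')    AS order_promotion_id,
       get_json_object(p.single_json, '$.orderPromotionValue') AS order_promotion_value
FROM orders o
LATERAL VIEW explode(
    split(
        regexp_replace(
            regexp_extract(o.promotions_json, '^\\[(.+)\\]$', 1),  -- drop the enclosing [ ]
            '\\},\\{', '};{'),                                      -- mark object boundaries with ;
        ';')                                                        -- one array element per object
) p AS single_json;
```

LATERAL VIEW keeps the other columns of each source row (order_id here) next to every exploded element, which a bare explode() in the select list does not allow.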
    

    Reference: CSDN, "Parsing JSON arrays in Hive"

  • Original post: https://www.cnblogs.com/myblog1900/p/10031834.html