• Hive 11: Embedding Python in Hive


    Embedding Python in Hive

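    Both the input to the Python script and its output use the tab character (\t) as the field delimiter; any other delimiter causes errors. The script reads rows from standard input and prints rows in the required format to standard output.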

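    To use a script, first register it with add file, then call it with the syntax TRANSFORM (name, items) USING 'python test.py' AS (name string, item1 string, item2 string, item3 string). The fields listed after AS correspond, in order, to the columns that the Python script prints.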
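As a minimal sketch of this contract: Hive's TRANSFORM hands each row to the script as one line on standard input with the columns joined by tabs, and reads tab-separated lines back from standard output. By default Hive also writes NULL values as the literal string \N and treats an output cell containing only \N as NULL. The helper names parse_row, format_row, and transform below are hypothetical, chosen for illustration; they are not part of the original script.

```python
HIVE_NULL = r'\N'  # how Hive's default TRANSFORM serialization spells NULL


def parse_row(line):
    # Split one tab-separated input line into cells; the NULL marker becomes None.
    cells = line.rstrip('\n').split('\t')
    return [None if cell == HIVE_NULL else cell for cell in cells]


def format_row(cells):
    # Join cells with tabs; None is written back as the NULL marker.
    return '\t'.join(HIVE_NULL if cell is None else str(cell) for cell in cells)


def transform(lines):
    # Identity transform: a real script would modify the cells here.
    for line in lines:
        yield format_row(parse_row(line))


# In a script run by Hive this loop would iterate over sys.stdin instead.
sample = ['tom\tshu fa,wei qi,chang ge\n', 'ann\t\\N\n']  # second row is made up
for out in transform(sample):
    print(out)
```

Keeping parsing and formatting in separate functions makes it easy to test the row logic without a Hive session.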

    The following small example splits one column into several columns:

    create table test (name string, items string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';

    LOAD DATA local INPATH '/opt/data/tt.txt' OVERWRITE INTO TABLE test;

    Contents of tt.txt (the name and the items are separated by a tab):

    tom	shu fa,wei qi,chang ge
    jack	game,kan shu,shang wang
    lusi	lv you,guang jie,gou wu
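Because tab characters are easily turned into spaces when text is copied, one option is to generate the sample file rather than type it. The sketch below writes the three rows shown above with explicit \t separators. The output location (the system temporary directory) and the names ROWS and write_sample are assumptions for illustration; the file would still need to be copied to the path used by LOAD DATA.

```python
import os
import tempfile

# The three sample rows: (name, comma-separated items)
ROWS = [
    ('tom', 'shu fa,wei qi,chang ge'),
    ('jack', 'game,kan shu,shang wang'),
    ('lusi', 'lv you,guang jie,gou wu'),
]


def write_sample(path):
    # Write one row per line, with a real tab between name and items.
    with open(path, 'w') as f:
        for name, items in ROWS:
            f.write('%s\t%s\n' % (name, items))


# Assumed location for illustration only.
sample_path = os.path.join(tempfile.gettempdir(), 'tt.txt')
write_sample(sample_path)
```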

    Table 2:

    create table test2 (name string, item1 string, item2 string, item3 string)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';

    -- Register the Python script with Hive
    hive> add file /root/test.py;
    -- Write the transformed rows into test2
    INSERT OVERWRITE TABLE test2
    SELECT TRANSFORM (name, items)
    USING 'python test.py'
    AS (name string, item1 string, item2 string, item3 string)
    FROM test;

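    Contents of test.py:

    #!/usr/bin/python
    import sys

    for line in sys.stdin:
        line = line.strip()
        # Hive passes the selected columns as one tab-separated line
        name, it = line.split('\t')
        # Pad with the literal string 'NULL' when there are fewer than 3 items
        count = it.count(',') + 1
        for i in range(0, 3 - count):
            it = it + ',NULL'
        # Keep at most the first 3 items
        result = it.split(',')[0:3]
        # The parenthesized form works in both Python 2 and Python 3
        print('%s\t%s' % (name, '\t'.join(result)))
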
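Before running the job in Hive, the script's logic can be checked locally. The sketch below wraps the same split-and-pad logic in a function (the name split_items and its width and filler parameters are hypothetical) and applies it to one sample row plus two made-up rows that have fewer and more than three items. As in the script, the padding value is the literal string 'NULL'.

```python
def split_items(line, width=3, filler='NULL'):
    # One input line is "name<TAB>item,item,..."
    name, items = line.rstrip('\n').split('\t')
    parts = items.split(',')
    # Pad short rows; multiplying by a negative number adds nothing.
    parts += [filler] * (width - len(parts))
    # Truncate long rows and emit tab-separated columns.
    return '\t'.join([name] + parts[:width])


rows = [
    'tom\tshu fa,wei qi,chang ge\n',  # from tt.txt
    'ann\tdu shu\n',                  # made-up row with one item
    'bob\ta,b,c,d\n',                 # made-up row with four items
]
for row in rows:
    print(split_items(row))
```

The script itself can be exercised the same way from a shell with cat tt.txt | python test.py, which feeds it the rows exactly as Hive would.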
    Results:

    -- Table 1
    hive> select * from test;
    OK
    tom    shu fa,wei qi,chang ge
    jack    game,kan shu,shang wang
    lusi    lv you,guang jie,gou wu
    Time taken: 0.07 seconds, Fetched: 3 row(s)

     hive> desc test2;
     OK
     name string
     item1 string
     item2 string
     item3 string
     Time taken: 0.141 seconds, Fetched: 4 row(s)

    -- Table 2
    hive> select * from test2;
    OK
    tom    shu fa    wei qi    chang ge
    jack    game    kan shu    shang wang
    lusi    lv you    guang jie    gou wu
    Time taken: 1.368 seconds, Fetched: 3 row(s)
  • Original article: https://www.cnblogs.com/raphael5200/p/5221927.html