• Hive学习笔记——保存select结果,Join,多重插入


    1. 保存select查询结果的几种方式:

    1、将查询结果保存到一张新的hive表中

    create table t_tmp
    as
    select * from t_p;

    2、将查询结果保存到一张已经存在的hive表中(用load的时候,可以是into或者overwrite into,这里是into或者overwrite)

    insert into/overwrite table t_tmp
    select * from t_p;

    3、将查询结果保存到指定的文件目录(可以是本地,也可以是hdfs)

    insert overwrite local directory '/home/hadoop/test'
    select * from t_p;
    insert overwrite directory '/aaa/test'
    select * from t_p;

    2. 关于hive中的各种join

    准备数据
    1,a
    2,b
    3,c
    4,d
    7,y
    8,u

    2,bb
    3,cc
    7,yy
    9,pp

    建表:

    create table a(id int,name string)
    row format delimited fields terminated by ',';
    
    create table b(id int,name string)
    row format delimited fields terminated by ',';

    导入数据:

    load data local inpath '/home/hadoop/a.txt' into table a;
    load data local inpath '/home/hadoop/b.txt' into table b;

    实验:
    ** inner join

    select * from a join b on a.id=b.id;

    +-------+---------+-------+---------+--+
    | a.id | a.name | b.id | b.name |
    +-------+---------+-------+---------+--+
    | 2 | b | 2 | bb |
    | 3 | c | 3 | cc |
    | 7 | y | 7 | yy |
    +-------+---------+-------+---------+--+

    **left join

    select * from a left outer join b on a.id=b.id;

    +-------+---------+-------+---------+--+
    | a.id | a.name | b.id | b.name |
    +-------+---------+-------+---------+--+
    | 1 | a | NULL | NULL |
    | 2 | b | 2 | bb |
    | 3 | c | 3 | cc |
    | 4 | d | NULL | NULL |
    | 7 | y | 7 | yy |
    | 8 | u | NULL | NULL |
    +-------+---------+-------+---------+--+

    **right join

    select * from a right outer join b on a.id=b.id;

    同上效果,只不过这次b的全部显示,a的会有NULL。

    **full join

    select * from a full outer join b on a.id=b.id;

    +-------+---------+-------+---------+--+
    | a.id | a.name | b.id | b.name |
    +-------+---------+-------+---------+--+
    | 1 | a | NULL | NULL |
    | 2 | b | 2 | bb |
    | 3 | c | 3 | cc |
    | 4 | d | NULL | NULL |
    | 7 | y | 7 | yy |
    | 8 | u | NULL | NULL |
    | NULL | NULL | 9 | pp |
    +-------+---------+-------+---------+--+

    **left semi join

    select * from a left semi join b on a.id = b.id;

    +-------+---------+--+
    | a.id | a.name |
    +-------+---------+--+
    | 2 | b |
    | 3 | c |
    | 7 | y |
    +-------+---------+--+

    3. 多重插入

    from student
    insert into table student_p partition(part='a')
    select * where id<95011;
    insert into table student_p partition(part='a')
    select * where id<95011;
  • 相关阅读:
    python note 19 异常处理
    python note 18 序列化模块
    python note 17 random、time、sys、os模块
    python note 16 re模块的使用
    python note 15 正则表达式
    python note 13 内置函数
    python note 12 生成器、推导式
    C++ int型负数除法取余问题
    Leetcode162. 寻找峰值
    Leetcode450. 删除二叉搜索树中的节点
  • 原文地址:https://www.cnblogs.com/DarrenChan/p/6786182.html
Copyright © 2020-2023  润新知