• hive DML操作


    1、数据导入

    1)向表中装载数据(load)

    语法
    hive> load data [local] inpath '/opt/module/datas/student.txt' [overwrite] into table student [partition (partcol1=val1,…)];
    

    (1)load data:表示加载数据

    (2)local:表示从本地加载数据到hive表;否则从HDFS加载数据到hive表

    (3)inpath:表示加载数据的路径

    (4)overwrite:表示覆盖表中已有数据,否则表示追加

    (5)into table:表示加载到哪张表

    (6)student:表示具体的表

    (7)partition:表示上传到指定分区

    实例操作

    创建一张表

    create table student(id string,name string)
    row format delimited fields terminated by '	';
    

    (1)加载本地文件到hive

    load data local inpath '/opt/student.txt' into table student;
    

    (2)加载hdfs文件到hive

    load data inpath '/user/hive/warehouse/stu.txt' into table student;
    

    (3)加载数据覆盖表中已有的数据

    load data inpath '/user/hive/warehouse/stu.txt' overwrite into table student;
    

    2)通过查询语句向表中插入数据(insert)

    创建一张分区表

    create table student(id int, name string) partitioned by (month string) row format delimited fields terminated by '	';
    

    (1)基本插入

    insert into table student partition(month='2019') values(1,"wang"),(2,"zhang");
    
    insert  overwrite table stu partition(month='2019') select id,name from  student where month='2019';
    

    insert into :以追加数据的方式插入到表或分区,原有数据不会删除

    insert overwrite:会覆盖表或分区中已经存在的数据

    insert 不支持插入部分字段

    (2)多表(多分区)插入

    from student
    insert overwrite table student partition(month='201706')
    select id,name where month='201706'
    insert overwrite table student partition(month='201707')
    select id,name where month='201707';
    

    3)查询语句中创建表并加载数据(as select)

    create table if not exists stu
    as
    select id,name from student;
    

    4)创建表时通过location指定加载数据路径

    create external table if not exists student(
    id int ,name string
    )
    row format delimited fields terminated by '	'
    location '/student';
    

    5)Import数据到指定hive表中

    先用export导出后,再将数据导入

    import table stu partition(month='201901')
    from
    '/user/hive/warehouse/export/student';
    

    2、数据导出

    1)insert导出

    (1)将查询的结果导出到本地(各列数据紧挨着,没有分隔符)

    insert overwrite local directory '/opt/export/student' select * from student;
    

    (2)将查询的结果格式化导出到本地

    insert overwrite local directory '/opt/export/student' row format delimited fileds terminated by '	' select * from student;
    

    (3)将查询结果导出到hdfs上(没有local)

    insert overwrite local directory '/user/hive/warehouse/export/student' row format delimited fileds terminated by '	' select * from student;
    

    2)hadoop命令导出到本地

    dfs -get /user/hive/warehouse/student/month=201709/000000_0 /opt/module/datas/export/student.txt;
    

    3)hive shell命令导出

    hive -f/-e 执行语句或者脚本 > file

    hive -e 'select * from student' > /opt/datas/student.txt;
    

    4)export导出到hdfs上

    export table student to  '/user/hive/warehouse/export/student';
    

    export 和import主要用于两个hadoop平台集群之间hive表迁移

  • 相关阅读:
    poj 3255
    (DP) bzoj 2091
    (最短路) bzoj 2118
    (点双联通分量) poj 2942
    (树直径) bzoj 1509
    (离线处理+BFS) poi Tales of seafaring
    (并查集+DFS) poi guilds
    (记忆话搜索)POI Fibonacci Representation
    (DP) POI Bytecomputer
    (DP) bzoj 1296
  • 原文地址:https://www.cnblogs.com/hyunbar/p/11732252.html
Copyright © 2020-2023  润新知