• 2.7-2.8 导入、导出数据(进/出)hive表的方式


    一、导入数据进hive表

    1、语法

    LOAD DATA [LOCAL] INPATH 'filepath'
    [OVERWRITE] INTO TABLE tablename
    [PARTITION (partcol1=val1,partcol2=val2...)]
    
    
    ##
    * 原始文件的存储位置
        *在本地要写local
        *在HDFS不用写local
    
    * ‘filepath’ 文件路径要加引号
    
    * 对表的数据是否覆盖
        * 覆盖要写overwrite
        * 追加不写overwrite
    
    * 分区表要写partition


    2、加载本地数据进hive表

    load data local inpath '/opt/datas/emp. txt' into table default. emp;


    3、加载hdfs文件到hive中

    #hdfs上的数据
    hive (default)> dfs -ls -R /user/root/hive/datas;
    -rw-r--r--   1 root supergroup        659 2019-04-23 11:23 /user/root/hive/datas/emp.txt
    
    
    #emp表中现在有14条数据
    hive (default)> select * from emp;
    OK
    emp.empno    emp.ename    emp.job    emp.mgr    emp.hiredate    emp.sal    emp.comm    emp.deptno
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10
    Time taken: 0.606 seconds, Fetched: 14 row(s)
    
    
    #将hdfs上的数据加载到emp表中,加载到hive表后,hdfs上的数据会被删除
    hive (default)> load data inpath '/user/root/hive/datas/emp.txt' into table default.emp;
    Loading data to table default.emp
    Table default.emp stats: [numFiles=2, numRows=0, totalSize=1318, rawDataSize=0]
    OK
    Time taken: 0.228 seconds
    
    
    #emp表中的数据增加到了28条
    hive (default)> select * from emp;                                                      
    OK
    emp.empno    emp.ename    emp.job    emp.mgr    emp.hiredate    emp.sal    emp.comm    emp.deptno
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10
    Time taken: 0.033 seconds, Fetched: 28 row(s)


    4、加载数据覆盖表中已有的数据

    #重新加载到hdfs,因为刚才的数据加载到hdfs后已被删除
    hive (default)> dfs -put /opt/datas/emp.txt /user/root/hive/datas;   
    
    
    #覆盖加载金hive表                             
    hive (default)> load data inpath '/user/root/hive/datas/emp.txt' overwrite into table default.emp;
    Loading data to table default.emp
    rmr: DEPRECATED: Please use 'rm -r' instead.
    Moved: 'hdfs://hadoop-senior.ibeifeng.com:8020/user/hive/warehouse/emp' to trash at: hdfs://hadoop-senior.ibeifeng.com:8020/user/root/.Trash/Current
    Table default.emp stats: [numFiles=1, numRows=0, totalSize=659, rawDataSize=0]
    OK
    Time taken: 0.192 seconds
    
    
    #此时emp中只有14条数据,刚才此表中有28条数据
    hive (default)> select * from emp;                                                                
    OK
    emp.empno    emp.ename    emp.job    emp.mgr    emp.hiredate    emp.sal    emp.comm    emp.deptno
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10
    Time taken: 0.036 seconds, Fetched: 14 row(s)


    5、创建表是通过insert加载

    ##
    hive (default)> create table default.emp_ci like emp; 
    OK
    Time taken: 0.092 seconds
    
    hive (default)> select * from emp_ci;
    OK
    emp_ci.empno    emp_ci.ename    emp_ci.job    emp_ci.mgr    emp_ci.hiredate    emp_ci.sal    emp_ci.comm    emp_ci.deptno
    Time taken: 0.034 seconds
    
    
    ##
    hive (default)> insert into table default.emp_ci select * from default.emp;
    
    hive (default)> select * from emp_ci;                                      
    OK
    emp_ci.empno    emp_ci.ename    emp_ci.job    emp_ci.mgr    emp_ci.hiredate    emp_ci.sal    emp_ci.comm    emp_ci.deptno
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10
    Time taken: 0.028 seconds, Fetched: 14 row(s)


    6、创建表的时候通过location指定加载


    二、导出hive表数据

    1、导出到本地

    #将查询结果插入到本地,本地目录会自动创建
    hive (default)> insert overwrite local directory '/opt/datas/hive_exp_emp' 
                  > ROW FORMAT DELIMITED FIELDS TERMINATED BY '	' COLLECTION ITEMS TERMINATED BY '/n'
                  > select * from default.emp;
    
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '	'    #列之间的分隔符
    COLLECTION ITEMS TERMINATED BY '/n'        #行之间的分隔符
    
    
    [root@hadoop-senior modules]# cat /opt/datas/hive_exp_emp/000000_0 
    7369    SMITH    CLERK    7902    1980-12-17    800.0    N    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    N    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    N    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    N    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    N    20
    7839    KING    PRESIDENT    N    1981-11-17    5000.0    N    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    N    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    N    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    N    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    N    10


    2、查询输出到文件

    ##
    [root@hadoop-senior hive-0.13.1]# bin/hive -e "select * from default.emp;" >/opt/datas/exp_res.txt
    
    Logging initialized using configuration in file:/opt/modules/hive-0.13.1/conf/hive-log4j.properties
    OK
    Time taken: 0.837 seconds, Fetched: 14 row(s)
    
    
    [root@hadoop-senior hive-0.13.1]# cat /opt/datas/exp_res.txt 
    emp.empno    emp.ename    emp.job    emp.mgr    emp.hiredate    emp.sal    emp.comm    emp.deptno
    7369    SMITH    CLERK    7902    1980-12-17    800.0    NULL    20
    7499    ALLEN    SALESMAN    7698    1981-2-20    1600.0    300.0    30
    7521    WARD    SALESMAN    7698    1981-2-22    1250.0    500.0    30
    7566    JONES    MANAGER    7839    1981-4-2    2975.0    NULL    20
    7654    MARTIN    SALESMAN    7698    1981-9-28    1250.0    1400.0    30
    7698    BLAKE    MANAGER    7839    1981-5-1    2850.0    NULL    30
    7782    CLARK    MANAGER    7839    1981-6-9    2450.0    NULL    10
    7788    SCOTT    ANALYST    7566    1987-4-19    3000.0    NULL    20
    7839    KING    PRESIDENT    NULL    1981-11-17    5000.0    NULL    10
    7844    TURNER    SALESMAN    7698    1981-9-8    1500.0    0.0    30
    7876    ADAMS    CLERK    7788    1987-5-23    1100.0    NULL    20
    7900    JAMES    CLERK    7698    1981-12-3    950.0    NULL    30
    7902    FORD    ANALYST    7566    1981-12-3    3000.0    NULL    20
    7934    MILLER    CLERK    7782    1982-1-23    1300.0    NULL    10


    3、导出到HDFS上

    ##
    hive (default)> insert overwrite directory '/user/root/hive/hive_exp_emp'
                  > select * from default.emp;
    
    ##
    hive (default)> dfs -ls -R /user/root/hive/hive_exp_emp;
    -rw-r--r--   1 root supergroup        661 2019-04-23 13:33 /user/root/hive/hive_exp_emp/000000_0
    
    hive (default)> dfs -text /user/root/hive/hive_exp_emp/000000_0; 
    7369SMITHCLERK79021980-12-17800.0N20
    7499ALLENSALESMAN76981981-2-201600.0300.030
    7521WARDSALESMAN76981981-2-221250.0500.030
    7566JONESMANAGER78391981-4-22975.0N20
    7654MARTINSALESMAN76981981-9-281250.01400.030
    7698BLAKEMANAGER78391981-5-12850.0N30
    7782CLARKMANAGER78391981-6-92450.0N10
    7788SCOTTANALYST75661987-4-193000.0N20
    7839KINGPRESIDENTN1981-11-175000.0N10
    7844TURNERSALESMAN76981981-9-81500.00.030
    7876ADAMSCLERK77881987-5-231100.0N20
    7900JAMESCLERK76981981-12-3950.0N30
    7902FORDANALYST75661981-12-33000.0N20
    7934MILLERCLERK77821982-1-231300.0N10
    
    
    #也可以从hdfs上get到本地查看


    4、sqoop

    hive—>rdbms

    rdbms—>hive

  • 相关阅读:
    《Centos服务器版安装教程》
    从CentOS官网下载系统镜像详细教程
    一键LNMP文件
    Centos 7 ip地址
    cmd常用命令
    bat命令
    JAVA学习资源整理
    DevOps 高效 shell 命令
    编程范式:命令式编程(Imperative)、声明式编程(Declarative)和函数式编程(Functional)
    Java 中的函数式编程(Functional Programming):Lambda 初识
  • 原文地址:https://www.cnblogs.com/weiyiming007/p/10755946.html
Copyright © 2020-2023  润新知