• Hive internal (managed) and external tables: creating them, loading data, and how they differ


    Creating tables

    Creating an internal table

    create table customer(
        customerId int,
        firstName string,
        lastName STRING,
        birstDay timestamp
    ) row format delimited fields terminated by ',';
    
    

    Creating an external table

    CREATE EXTERNAL table salaries(
        gender string,
        age int ,
        salary DOUBLE,
        zip int 
    )row format delimited fields  terminated by ',' LOCATION '/user/train/salaries/';
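
    An external table can also sit on top of files that are already on HDFS, in which case no load step is needed. A minimal sketch, assuming the table above has just been created (so the LOCATION directory exists) and that the local file is the same salaries.txt used later in this post:

    ```sql
    -- Copy the local file straight into the external table's directory
    -- using the dfs command available at the Hive prompt.
    dfs -put /root/user/salaries.txt /user/train/salaries/;
    -- The rows are visible right away, because the table simply reads
    -- whatever files are under its LOCATION.
    select * from salaries;
    ```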
    

    Loading data

    load DATA LOCAL inpath '/root/user/customer.txt' overwrite into table customer;
    load DATA LOCAL inpath '/root/user/salaries.txt' overwrite into table salaries;
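
    A note on the keywords used above: LOCAL tells Hive to read the file from the local file system and copy it into the table's directory, and OVERWRITE replaces whatever the table already holds. The variant below drops both keywords. It is only a sketch, and /tmp/salaries_more.txt is a hypothetical HDFS path made up for illustration:

    ```sql
    -- Hypothetical HDFS path, not part of the steps in this post.
    -- Without LOCAL the path is an HDFS path and the file is moved
    -- into the table's directory rather than copied.
    -- Without OVERWRITE the rows are added to what the table already holds.
    load DATA inpath '/tmp/salaries_more.txt' into table salaries;
    ```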
    

    Viewing the text files

    [root@centos172 user]# cat /root/user/customer.txt
    1,f,jack,,
    2,f,luccy,,
    [root@centos172 user]# cat /root/user/salaries.txt
    male,21,10000,1
    female,22,12000,2
    

    Viewing the data in Hive

    hive> desc customer;
    OK
    customerid              int
    firstname               string
    lastname                string
    birstday                timestamp
    Time taken: 0.053 seconds, Fetched: 4 row(s)
    hive> desc salaries;
    OK
    gender                  string
    age                     int
    salary                  double
    zip                     int
    Time taken: 0.041 seconds, Fetched: 4 row(s)
    hive> select * from customer;
    OK
    1       f       jack    NULL
    2       f       luccy   NULL
    Time taken: 0.067 seconds, Fetched: 2 row(s)
    hive> select * from salaries;
    OK
    male    21      10000.0 1
    female  22      12000.0 2
    Time taken: 0.066 seconds, Fetched: 2 row(s)
    hive>
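
    The birstDay column shows NULL because the fourth field of every line in customer.txt is empty. A small sketch to check this, assuming a Hive version that allows select without a from clause; the date in the second statement is made up:

    ```sql
    -- Both rows loaded above have an empty fourth field, so both match here.
    select customerId, lastName from customer where birstDay is null;
    -- A string in the yyyy-MM-dd HH:mm:ss layout can be cast to a timestamp.
    select cast('1990-05-17 08:30:00' as timestamp);
    ```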
    

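    Differences

    I have only just started learning Hive, so I will cover just part of the picture.
    1. An internal table is stored under Hive's default directory on HDFS, while an external table is stored at the path given in its LOCATION clause.
    2. Dropping an internal table deletes its data along with its metadata, so recreating an identical internal table gives an empty table.
    Dropping an external table removes only the metadata and leaves the files at its LOCATION, so recreating an identical external table makes the data visible again without reloading.
    Dropping and recreating the external table looks like this: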

    hive> drop table salaries;
    OK
    Time taken: 0.092 seconds
    hive> select * from salaries;
    FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'salaries'
    hive> show tables;
    OK
    customer
    Time taken: 0.035 seconds, Fetched: 1 row(s)
    hive> CREATE EXTERNAL table salaries(
        >     gender string,
        >     age int ,
        >     salary DOUBLE,
        >     zip int
        > )row format delimited fields  terminated by ',' LOCATION '/user/train/salaries/';
    OK
    Time taken: 0.058 seconds
    hive> show tables;
    OK
    customer
    salaries
    Time taken: 0.025 seconds, Fetched: 2 row(s)
    hive> select * from salaries;
    OK
    male    21      10000.0 1
    female  22      12000.0 2
    Time taken: 0.058 seconds, Fetched: 2 row(s)
    hive>
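
    For comparison, the same steps on the internal table can be sketched as follows. This is not a recorded session, so no output is shown; going by point 2 above, the final select should return no rows until customer.txt is loaded again.

    ```sql
    -- Dropping an internal table removes its metadata and its data directory.
    drop table customer;
    -- Recreate it with the same definition as before.
    create table customer(
        customerId int,
        firstName string,
        lastName STRING,
        birstDay timestamp
    ) row format delimited fields terminated by ',';
    -- The table exists again, but its directory starts out empty.
    select * from customer;
    ```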
    
    

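    On difference 1:
    The default path of an internal table is a subdirectory named after the table under the Hive warehouse directory (the hive.metastore.warehouse.dir setting, /user/hive/warehouse by default).

    The path of the external table is the one given in its LOCATION clause, here /user/train/salaries/.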
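
    The location of each table can also be read from Hive itself. A sketch, assuming the default database and an unchanged warehouse setting; the /user/hive/warehouse/customer path is my assumption based on Hive's defaults:

    ```sql
    -- Print the warehouse directory that internal tables are created under.
    set hive.metastore.warehouse.dir;
    -- The output includes Location and Table Type
    -- (MANAGED_TABLE or EXTERNAL_TABLE).
    describe formatted customer;
    describe formatted salaries;
    -- List the files behind each table from the Hive prompt. The first path
    -- assumes the default warehouse directory and the default database.
    dfs -ls /user/hive/warehouse/customer;
    dfs -ls /user/train/salaries/;
    ```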

    What is Hive

    From what I have seen so far:
    1. It maps files on HDFS to tables.
    2. It is used for querying, with statements similar to SQL.
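
    To give the second point a concrete shape, here is a small query against the salaries table from earlier. It is only an illustration of the SQL-like syntax:

    ```sql
    -- Average salary per gender, computed over the files behind the table.
    select gender, avg(salary) as avg_salary
    from salaries
    group by gender;
    ```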

  • Original article: https://www.cnblogs.com/JuncaiF/p/12336563.html