• Creating a bucketed table in Hive


    • Create the bucketed table
    hive> create table house_1(id int,name string,age int) clustered by (id) into 3 buckets row format delimited fields terminated by ',';
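    If the cluster runs a Hive release older than 2.0, bucketing is only enforced on insert when the property below is set (an assumption about older environments; the session in this post appears to be on Hive 2.x, where bucketing is always enforced and this step is unnecessary):
    hive> -- only needed on Hive 1.x; the property was removed in Hive 2.0
    hive> set hive.enforce.bucketing=true;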
    • View the table's details
    hive> desc formatted house_1;
    OK
    # col_name              data_type               comment
    
    id                      int
    name                    string
    age                     int
    
    # Detailed Table Information
    Database:               cr
    Owner:                  xiaoqiu
    CreateTime:             Mon Aug 13 10:59:08 EDT 2018
    LastAccessTime:         UNKNOWN
    Retention:              0
    Location:               hdfs://mycluster/user/hive/warehouse/cr.db/house_1
    Table Type:             MANAGED_TABLE
    Table Parameters:
            COLUMN_STATS_ACCURATE   {"BASIC_STATS":"true"}
            numFiles                0
            numRows                 0
            rawDataSize             0
            totalSize               0
            transient_lastDdlTime   1534172348
    
    # Storage Information
    SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
    InputFormat:            org.apache.hadoop.mapred.TextInputFormat
    OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
    Compressed:             No
    Num Buckets:            3  # number of buckets
    Bucket Columns:         [id]  # bucketed by the id column
    Sort Columns:           []
    Storage Desc Params:
            field.delim             ,
            serialization.format    ,
    Time taken: 2.636 seconds, Fetched: 33 row(s)
    • Load data into the bucketed table using insert into ... select ...
    hive> load data local inpath '/home/xiaoqiu/teacher.txt' into table house_1;
    FAILED: SemanticException Please load into an intermediate table and use 'insert... select' to allow Hive to enforce bucketing. Load into bucketed tables are disabled for safety reasons. If you know what you are doing, please sethive.strict.checks.bucketing to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
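    As the error message says, the data goes into an ordinary (non-bucketed) staging table first and is then copied over with insert ... select. The post does not show how the staging table house was built; a minimal sketch, assuming the same schema and the same local file as above:
    hive> create table house(id int,name string,age int) row format delimited fields terminated by ',';
    hive> load data local inpath '/home/xiaoqiu/teacher.txt' into table house;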
    hive> insert into house_1 select * from house;
    hive> select * from house;
    OK
    2       'bom1'  24
    3       'rom1'  25
    4       'som1'  26
    5       'tom10' 23
    6       'tom9'  20
    7       'tom8'  11
    8       'tom7'  12
    9       'tom6'  13
    10      'tom5'  33
    12      'tom4'  45
    22      'tom3'  20
    32      'tom2'  23
    NULL    NULL    NULL
    hive> select * from house_1;  # rows have been distributed into buckets by id
    OK
    3       'rom1'  25
    12      'tom4'  45
    9       'tom6'  13
    NULL    NULL    NULL
    6       'tom9'  20
    22      'tom3'  20
    4       'som1'  26
    10      'tom5'  33
    7       'tom8'  11
    8       'tom7'  12
    32      'tom2'  23
    5       'tom10' 23
    2       'bom1'  24
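    The NULL row most likely comes from a line in teacher.txt that could not be parsed into the three declared columns (an assumption; the raw file is not shown). If such rows should stay out of the bucketed table, they can be filtered while copying, for example:
    hive> insert into house_1 select * from house where id is not null;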
    
    

    View the HDFS directory

    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -lsr /
    drwxr-xr-x   - xiaoqiu   supergroup          0 2018-08-13 11:45 /user/hive/warehouse/cr.db/house_1
    -rwxr-xr-x   3 xiaoqiu   supergroup         58 2018-08-13 11:43 /user/hive/warehouse/cr.db/house_1/000000_0  # bucket file 1
    -rwxr-xr-x   3 xiaoqiu   supergroup         50 2018-08-13 11:44 /user/hive/warehouse/cr.db/house_1/000001_0  # bucket file 2
    -rwxr-xr-x   3 xiaoqiu   supergroup         50 2018-08-13 11:43 /user/hive/warehouse/cr.db/house_1/000002_0  # bucket file 3
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000000_0  # bucket file 1
    3,'rom1',25
    12,'tom4',45
    9,'tom6',13
    \N,\N,\N
    6,'tom9',20
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000001_0  # bucket file 2
    22,'tom3',20
    4,'som1',26
    10,'tom5',33
    7,'tom8',11
    [xiaoqiu@s150 /home/xiaoqiu]$ hadoop fs -cat /user/hive/warehouse/cr.db/house_1/000002_0  # bucket file 3
    8,'tom7',12
    32,'tom2',23
    5,'tom10',23
    2,'bom1',24
    [xiaoqiu@s150 /home/xiaoqiu]$
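    The file contents match Hive's bucketing rule: a row is written to bucket pmod(hash(id), 3), and for an int column the hash is simply the value, so ids divisible by 3 (3, 12, 9, 6, plus the NULL row) land in 000000_0, ids with remainder 1 (22, 4, 10, 7) in 000001_0, and ids with remainder 2 (8, 32, 5, 2) in 000002_0. The mapping can be checked directly from the staging table with the built-in hash and pmod functions (a quick sketch):
    hive> select id, pmod(hash(id), 3) as bucket from house;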
    
    