• 026. Copying a Hive table via HDFS file copy


    https://www.cnblogs.com/airnew/p/9788122.html
    
    A (partitioned) table cannot be copied with a plain CREATE TABLE ... AS SELECT — the resulting table's structure changes and its comments are lost:
    create table dwd_cstm_points_record_mi_back__20220311as
      as select * from dwd_cstm_points_record_mi limit 1;
    
    After this CTAS, the partitioning and comments are gone, and the storage format and field delimiter have changed:
    hive> show create table dwd_cstm_points_record_mi_back__20220311as;
    OK
    CREATE TABLE `dwd_cstm_points_record_mi_back__20220311as`(
      `record_id` bigint, 
      `instance_id` bigint, 
      `tenant_id` bigint, 
      `member_model_id` bigint, 
      `member_id` bigint, 
      `points_id` bigint, 
      `trade_id` bigint, 
      `points_type` string, 
      `points_code` string, 
      `points` int, 
      `change_type` string, 
      `order_no` string, 
      `trade_type` int, 
      `channel` string, 
      `channel_no` string, 
      `current_points` int, 
      `available_points` int, 
      `effect_time` timestamp, 
      `expire_time` timestamp, 
      `is_expired` int, 
      `remark` string, 
      `status` int, 
      `extension` string, 
      `dr` tinyint, 
      `create_time` timestamp, 
      `create_person` string, 
      `update_time` timestamp, 
      `update_person` string, 
      `sync_type` int, 
      `migrated_time` timestamp, 
      `dm` int)
    ROW FORMAT SERDE 
      'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.mapred.TextInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION
      'hdfs://xxxive/warehouse/saas.db/dwd_cstm_points_record_mi_back__20220311as'
    TBLPROPERTIES (
      'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
      'numFiles'='1', 
      'numRows'='1', 
      'rawDataSize'='348', 
      'totalSize'='349', 
      'transient_lastDdlTime'='1646993827')
    
    
    drop table dwd_cstm_points_record_mi_back__20220311as;
    Method 1: create the table with LIKE, then insert the data.
    Method 2: copy the table by copying its HDFS files.
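
    Method 1 can be sketched as follows (a sketch using the back-up table name from this post; with a dynamic partition insert, the partition column dm is written automatically — in Hive, SELECT * on a partitioned table returns partition columns last, so the column order lines up):

    ```sql
    -- LIKE preserves the partition spec, column comments, ORC storage format and delimiter
    CREATE TABLE dwd_cstm_points_record_mi_back_20220311
      LIKE dwd_cstm_points_record_mi;

    -- allow dynamic partitioning for the insert
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- dm is the partition column; SELECT * places it last, as PARTITION (dm) expects
    INSERT OVERWRITE TABLE dwd_cstm_points_record_mi_back_20220311 PARTITION (dm)
    SELECT * FROM dwd_cstm_points_record_mi;
    ```

    This is simpler than the HDFS copy but runs a full MapReduce/Tez job over the data; method 2 below avoids that by copying files directly.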
    
    Method 2, step by step:
    
    1. Create the table structure:
    CREATE TABLE dwd_cstm_points_record_mi_back_20220311 like dwd_cstm_points_record_mi;
    2. Find the table's HDFS location via the NameNode web UI (port 50070):
    http://XXXXXXX:50070/explorer.html#/apps/hive/warehouse/ads_biz_order_period_mi
    
    Or inspect the table definition:
    hive> show create table dwd_cstm_points_record_mi;
    OK
    CREATE TABLE `dwd_cstm_points_record_mi`(
      `record_id` bigint COMMENT '????????', 
      `instance_id` bigint COMMENT '????????', 
      `tenant_id` bigint COMMENT '????????', 
      `member_model_id` bigint COMMENT '????????', 
      `member_id` bigint COMMENT '????????', 
      `points_id` bigint COMMENT '????????', 
      `trade_id` bigint COMMENT '????????', 
      `points_type` string COMMENT '????', 
      `points_code` string COMMENT '????', 
      `points` int COMMENT '????', 
      `change_type` string COMMENT '????', 
      `order_no` string COMMENT '????', 
      `trade_type` int COMMENT '?????1.???? 2.???? 3.???? 4.???? 5.???? 6.?????', 
      `channel` string COMMENT '????', 
      `channel_no` string COMMENT '????', 
      `current_points` int COMMENT '???????????', 
      `available_points` int COMMENT '????????????????????', 
      `effect_time` timestamp COMMENT '????', 
      `expire_time` timestamp COMMENT '??????????????????????????????????', 
      `is_expired` int COMMENT '????', 
      `remark` string COMMENT '??', 
      `status` int COMMENT '?????1????2????3????', 
      `extension` string COMMENT '????', 
      `dr` tinyint COMMENT '?????0?????1?????', 
      `create_time` timestamp COMMENT '????', 
      `create_person` string COMMENT '???', 
      `update_time` timestamp COMMENT '????', 
      `update_person` string COMMENT '???', 
      `sync_type` int COMMENT '?????0?????1?????', 
      `migrated_time` timestamp COMMENT '????')
    PARTITIONED BY ( 
      `dm` int)
    ROW FORMAT DELIMITED 
      FIELDS TERMINATED BY '\u0001' 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
    LOCATION
      'hdfs://xxxxx/hive/warehouse/xxs.db/dwd_cstm_points_record_mi'
    TBLPROPERTIES (
      'transient_lastDdlTime'='1622021164')
    3. Compare the partitions of the source table and the new table (in hive):
    
    show partitions dwd_cstm_points_record_mi;
    show partitions dwd_cstm_points_record_mi_back_20220311;
    4. Inspect the source table's HDFS files and copy them into the new table's path (run as the hdfs user: su hdfs):
    
    hadoop fs -ls  /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi/
    hadoop fs -cp  /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi/*  /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi_back_20220311/
    
    Verify the copied files:
    hadoop fs -ls /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi_back_20220311/
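
    For a large table, hadoop distcp may be preferable to hadoop fs -cp: distcp copies in parallel as a MapReduce job, while -cp streams everything through a single client. A sketch using the same paths as above (distcp accepts glob patterns in the source):

    ```shell
    # parallel copy; runs as a MapReduce job instead of streaming through one client
    hadoop distcp \
      /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi/* \
      /apps/hive/warehouse/saas.db/dwd_cstm_points_record_mi_back_20220311/
    ```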
    5. Check the new table's partitions — the metastore still shows none, because the files were copied behind its back:
    show partitions dwd_cstm_points_record_mi_back_20220311;
    6. Repair the partition metadata (in hive):
    
    MSCK REPAIR TABLE dwd_cstm_points_record_mi_back_20220311;
    7. Confirm the partitions now show up:
    show partitions dwd_cstm_points_record_mi_back_20220311;
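    
    MSCK REPAIR TABLE works by scanning the table's location and registering any partition directories missing from the metastore. If only a few partitions were copied, they can also be registered explicitly instead of scanning the whole location (the dm value below is illustrative):
    
    ```sql
    -- register a single copied partition by hand
    ALTER TABLE dwd_cstm_points_record_mi_back_20220311
      ADD IF NOT EXISTS PARTITION (dm=202203);
    ```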
  • Original post: https://www.cnblogs.com/star521/p/15994930.html