• HiveQL DML 常用QL示例资料


     hive 2.1.1  DML操作

    将文件加载到hive表
    //官方指导
    LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
    //示例
    LOAD DATA LOCAL INPATH '/root/a.txt' OVERWRITE INTO TABLE testdb.test partition(dd='a',ed='2018');

    说明:local 如果没有指定local关键字,则filepath引用表所在的文件系统中的文件。

    查询数据到hive表
    //官方指导
    Standard syntax:
    INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
    INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
     
    Hive extension (multiple inserts):
    FROM from_statement
    INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
    [INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
    [INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;
    FROM from_statement
    INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
    [INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2]
    [INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...;
     
    Hive extension (dynamic partition inserts):
    INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
    INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
    //示例
    insert into table test2  partition(area,year) select id,ad,bd,cd,dd,ed  from test where ed='2019';

    说明:后面的字段对应的是分区字段

      into是拼接新的值

      overwrite是新值覆盖掉全部旧的值,又分区会覆盖指定分区内的值

    表中写入到文件系统
    //官方指导
    Standard syntax:
    INSERT OVERWRITE [LOCAL] DIRECTORY directory1
      [ROW FORMAT row_format] [STORED AS file_format] (Note: Only available starting with Hive 0.11.0)
      SELECT ... FROM ...
     
    Hive extension (multiple inserts):
    FROM from_statement
    INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
    [INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...
     
     
    row_format
      : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
            [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
            [NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
    //示例
    insert  overwrite local directory  '/root' row format delimited fields terminated by '	 ' collection items terminated by ',' map keys terminated by ':' lines terminated by '
    '  select * from test;

    说明:如果不指定字段分割方式 将以  ^ A   ^ B  ^C 分割

    修改表数据
    //官方指导
    更新只支持在ACID表上执行。 
    客户端
    hive.support.concurrency – true
    hive.enforce.bucketing – true (Not required as of Hive 2.0) 
    hive.exec.dynamic.partition.mode – nonstrict
    hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
    服务端
    hive.compactor.initiator.on – true (See table below for more details)
    hive.compactor.worker.threads – a positive number on at least one instance of the Thrift metastore service
    Standard Syntax:
    UPDATE tablename SET column = value [, column = value ...] [WHERE expression]
    //示例
    //服务端
    set hive.compactor.initiator.on = true ;
    set hive.compactor.worker.threads = 1;
    //客户端
    set hive.support.concurrency = true;
    set hive.enforce.bucketing = true;
    set hive.exec.dynamic.partition.mode = nonstrict;
    set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
    update testdb.test set ad = 'xiaoming' where id =1 ;

    说明:创建的表要为分桶表,还要设置参数为 'transactional'='true'

      不支持子查询

      当where条件匹配上了才能更新

      分区,分桶的列不能更新

      设置hive.optimize.sort.dynamic.partition=false可以提高效率

      属性都可以在hive-site.xml文件中配置 

    删除表数据
    //官方指导
    只支持在ACID表上执行。 
    客户端
    hive.support.concurrency – true
    hive.enforce.bucketing – true (Not required as of Hive 2.0) 
    hive.exec.dynamic.partition.mode – nonstrict
    hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
    服务端
    hive.compactor.initiator.on – true (See table below for more details)
    hive.compactor.worker.threads – a positive number on at least one instance of the Thrift metastore service
    
    Standard Syntax:
    DELETE FROM tablename [WHERE expression]
    //示例
    //服务端
    set hive.compactor.initiator.on = true ;
    set hive.compactor.worker.threads = 1;
    set hive.optimize.sort.dynamic.partition = false;
    //客户端
    set hive.support.concurrency = true;
    set hive.enforce.bucketing = true;
    set hive.exec.dynamic.partition.mode = nonstrict;
    set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

    delete from testdb.test where id = 1;

    说明:见修改表数据说明

    从hive2.2开始支持表合并

     
  • 相关阅读:
    Redis提供的持久化机制(RDB和AOF)
    linux创建子进程--fork()方法
    数据库-锁的实践
    nginx中,$request_uri和$uri的区别
    journal size
    目的:将两个三T的硬盘做成LVM(sdc,sdd)
    安装 rbbitMQ redis mongo的三个扩展
    MySQL server has gone away
    mysql创建utf-8字符集数据库
    Linux下杀毒软件clamav的安装和使用
  • 原文地址:https://www.cnblogs.com/jottings/p/7727571.html
Copyright © 2020-2023  润新知