2. Basic Hive operations

1. Creating databases and tables

1) Creating a database

hive> CREATE DATABASE IF NOT EXISTS userdb;
OK
Time taken: 0.252 seconds
hive> CREATE SCHEMA userdb_2;
OK
Time taken: 0.041 seconds
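
To check that the databases were created and to start working inside one of them, statements like the following can be used. This is a minimal sketch that reuses the userdb name from the example above.

```sql
-- List all databases; userdb and userdb_2 should appear after the statements above.
SHOW DATABASES;

-- Switch the current session to userdb so unqualified table names resolve there.
USE userdb;
```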

2) Creating a table

hive> CREATE TABLE userTables(id INT, name STRING);

Or:

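hive> CREATE TABLE userTables(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

Here FIELDS TERMINATED BY ' ' sets the field delimiter to a single space, and LINES TERMINATED BY '\n' sets the row delimiter to a newline.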

Create a new table with the same structure as an existing table:

hive> create table new_table like testUser;


2. Creating a partitioned table

hive> create table logs(ts bigint, line string) partitioned by (dt String, country String) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

Load data into a partition of the table:

hive> load data local inpath '/home/test.txt' into table logs partition (dt='2017-07-20',country='GB');

List the partitions of the table:

hive> show partitions logs;
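
Partition columns can be used in queries like ordinary columns, even though their values come from the partition directory rather than the data file. A minimal sketch, assuming the logs table and the partition loaded above:

```sql
-- Filtering on the partition columns lets Hive read only the matching
-- partition (dt=2017-07-20/country=GB) instead of scanning the whole table.
SELECT ts, line
FROM logs
WHERE dt = '2017-07-20' AND country = 'GB';
```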

3. Listing all tables:

hive> SHOW TABLES;

hive> SHOW TABLES '.*s';

4. Showing a table's structure

hive> DESCRIBE test;
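
When the column list alone is not enough, DESCRIBE FORMATTED also reports table metadata such as the storage location, the table type, and the input/output formats. A short sketch against the same test table:

```sql
-- Show columns plus detailed table metadata (location, table type, SerDe, and so on).
DESCRIBE FORMATTED test;
```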

5. Renaming a table:

hive> ALTER TABLE table_name RENAME TO new_table_name;

6. Adding a new column:

hive> ALTER TABLE test ADD COLUMNS (new_col2 INT);

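7. Dropping a table:

hive> DROP TABLE table_name;

To delete the data in a table while keeping the table's structure definition, remove its files from the warehouse directory: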

hive> dfs -rmr /user/hive/warehouse/records;
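
As an alternative to deleting the files by hand, Hive also provides a TRUNCATE TABLE statement. The sketch below assumes that records is a managed (non-external) table and that the Hive version is 0.11 or later, where this statement is available:

```sql
-- Remove all rows from the table while keeping its definition in the metastore.
TRUNCATE TABLE records;
```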

8. Loading data from a local file:

hive> LOAD DATA LOCAL INPATH '/home/sample.txt' OVERWRITE INTO TABLE test_table;
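
Two keywords in this statement change its behavior: LOCAL reads the file from the local filesystem (and copies it), while omitting it reads from HDFS (and moves the file); OVERWRITE replaces the table's existing data, while omitting it appends. A minimal sketch of the HDFS, append variant follows; the path /user/hive/input/sample.txt is a hypothetical example, not one from the original post:

```sql
-- No LOCAL: the path refers to HDFS, and the file is moved into the table's directory.
-- No OVERWRITE: existing data in test_table is kept and the new file is added to it.
LOAD DATA INPATH '/user/hive/input/sample.txt' INTO TABLE test_table;
```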

9. Listing all functions and viewing a function's usage

hive> show functions;

hive> describe function substr;
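
For more detail than the one-line summary, the EXTENDED keyword can be added; it typically prints a longer description and usage examples for the function:

```sql
-- Show the extended documentation for substr, including examples where available.
DESCRIBE FUNCTION EXTENDED substr;
```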

10. Accessing array, map, and struct fields

hive> select col1[0],col2['b'],col3.c from test_table;
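
The original post does not show the schema behind this query. One table definition that would make it valid is sketched below; the element types and the struct field names a and b are assumptions chosen for illustration:

```sql
-- Assumed schema: col1 is indexed by position, col2 by key, col3 by field name.
CREATE TABLE test_table (
  col1 ARRAY<INT>,
  col2 MAP<STRING, INT>,
  col3 STRUCT<a:STRING, b:INT, c:DOUBLE>
);
```

With such a schema, col1[0] returns the first array element, col2['b'] the value stored under key 'b', and col3.c the struct field c.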

11. Inner join

hive> SELECT test.*,test_2.* FROM test JOIN test_2 ON(test.id = test_2.id);

To see how many MapReduce jobs Hive will use for a query, prefix it with EXPLAIN:

hive> EXPLAIN SELECT test.*,test_2.* FROM test JOIN test_2 ON(test.id = test_2.id);

12. Outer joins

hive> SELECT test.*, test_2.* FROM test LEFT OUTER JOIN test_2 ON (test.id = test_2.id);
hive> SELECT test.*, test_2.* FROM test RIGHT OUTER JOIN test_2 ON (test.id = test_2.id);
hive> SELECT test.*, test_2.* FROM test FULL OUTER JOIN test_2 ON (test.id = test_2.id);


Original article: https://www.cnblogs.com/royfans/p/7212646.html