Hive Partitioned Tables
1. Partitioned tables

What a partition really is: a subdirectory created inside the table's directory.

Partitions are defined with Partitioned By when the table is created. After the table exists, Alter Table statements can add or drop partitions.
```
-- ts and line are stored in the data files; dt and country are partition columns
create table logs (ts bigint, line string)
 partitioned by (dt string, country string)
 row format delimited fields terminated by '\t';
```
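The Alter Table statements mentioned above can be sketched as follows. This is a minimal illustration against the logs table; the partition values dt='2001-01-03' and country='GB' are made up for the example.

```sql
-- Add an empty partition; "if not exists" avoids an error if it is already there.
alter table logs add if not exists partition (dt='2001-01-03', country='GB');

-- Drop the partition again. For a managed (non-external) table,
-- this also deletes the data files under the partition directory.
alter table logs drop if exists partition (dt='2001-01-03', country='GB');
```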
When loading data, the partition values must be specified explicitly:
```
load data local inpath '/root/hive/file2'
 into table logs
 partition (dt='2001-01-01',country='GB');
```
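Loading into a different partition works the same way. The sketch below assumes a second local file, /root/hive/file3, which is a hypothetical path not taken from the original. If the target partition does not exist yet, Hive creates its directory and registers it in the metastore as part of the load.

```sql
-- Hypothetical local path; file3 is assumed to exist on the client machine.
load data local inpath '/root/hive/file3'
 into table logs
 partition (dt='2001-01-01', country='US');
```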
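After more data files have been loaded into the logs table, the directory structure looks like the listing below.

The logs table has two date partitions, each with two country subpartitions. The data files sit in the leaf directories.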
```
/user/hive/warehouse/logs
├── dt=2001-01-01/
│   ├── country=GB/
│   │   ├── file1
│   │   └── file2
│   └── country=US/
│       └── file3
└── dt=2001-01-02/
    ├── country=GB/
    │   └── file4
    └── country=US/
        ├── file5
        └── file6
```
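- Note: you can create something that looks like a partition directory by hand in HDFS, for example with hadoop fs -mkdir /book/nation=china (this example uses a table named book rather than logs), and then upload a data file a.c into it. Hive queries will not see the data in that directory, because the Metastore has no record of the partition.
- The fix is to register the partition: alter table book add partition (nation='china') location "/book/nation=china". This updates the metadata, which in this setup is stored in MySQL.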
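A minimal sketch of that fix, assuming (as in the note above) a table named book whose data lives under /book. The msck repair table statement is an alternative that the original note does not mention: it asks Hive to scan the table's directory and register any key=value partition directories that are missing from the metastore.

```sql
-- Register the hand-made directory as a partition of book.
alter table book add partition (nation='china') location '/book/nation=china';

-- Alternative (not part of the original note): let Hive discover
-- partition directories that exist in HDFS but not in the metastore.
-- This only finds directories located under the table's own location.
msck repair table book;

-- Either way, the partition should now be listed and queryable.
show partitions book;
select * from book where nation = 'china';
```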
Use the show partitions logs command to list the partitions of the logs table:
```
dt=2001-01-01/country=GB
dt=2001-01-01/country=US
dt=2001-01-02/country=GB
dt=2001-01-02/country=US
```
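When a table has many partitions, the listing can be narrowed with a partial partition spec. A short sketch against the logs table:

```sql
-- List only the partitions that belong to one date.
show partitions logs partition (dt='2001-01-01');

-- List only the partitions for one country, across all dates.
show partitions logs partition (country='GB');
```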
To display the table's structure, run describe logs;
```
ts                      bigint
line                    string
dt                      string
country                 string

# Partition Information
# col_name              data_type               comment

dt                      string
country                 string
```
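For more detail than the plain describe output, describe formatted can be used on the table or on a single partition. The sketch below assumes the dt='2001-01-01', country='GB' partition loaded earlier; among other things, the output includes the partition's storage location.

```sql
-- Table-level details: location, table type, partition columns, storage format.
describe formatted logs;

-- Details for one partition, including the directory that backs it.
describe formatted logs partition (dt='2001-01-01', country='GB');
```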
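- Note that the columns defined in the Partitioned By clause are full-fledged columns of the table, known as partition columns.
- The data files, however, do not contain values for these columns, because the values are derived from the directory names.
- Partition columns can be used in select statements like any other column. Hive prunes the input so that only the relevant partitions are scanned.
- Note also that the following query returns the values of the dt partition column, which Hive reads from the directory names.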
```
select ts,dt,line
 from logs
 where country='GB';
```
This query scans only file1, file2, and file4.
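Filtering on both partition columns narrows the scan further. The sketch below assumes the directory layout listed earlier, in which dt=2001-01-02/country=US holds file5 and file6. The explain dependency statement is one way to check which partitions a query would read without running it.

```sql
-- Only the dt=2001-01-02/country=US directory needs to be read.
select ts, line
 from logs
 where dt = '2001-01-02' and country = 'US';

-- Report the tables and partitions the query depends on, without executing it.
explain dependency
select ts, line
 from logs
 where dt = '2001-01-02' and country = 'US';
```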
Original article: https://www.cnblogs.com/skyl/p/4736283.html