1. Problem
During development, I added a column to a partitioned Hive table and found that queries returned NULL for the new column.
2. Reproducing the problem
1. Create a partitioned table and insert data
create table student(id int,name string) partitioned by (dt string);
insert into table student partition(dt = '2019-11-13') select 1,'zhangsan';
insert into table student partition(dt = '2019-11-14') select 2,'lisi';
select * from student where dt = '2019-11-13';
select * from student where dt = '2019-11-14';
2. Add a column and insert data
alter table student add columns(sex string);
insert into table student partition(dt = '2019-11-13') select 1,'zhangsan','male';
insert into table student partition(dt = '2019-11-14') select 2,'lisi','female';
insert into table student partition(dt = '2019-11-15') select 3,'wangwu','female';
3. Verify
select * from student where dt = '2019-11-13';
select * from student where dt = '2019-11-14';
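In Hive, the two queries above return NULL for the new sex column, whereas the same queries in Impala return the correct values.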
select * from student where dt = '2019-11-15';
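4. Conclusion
- If a partition existed before the column was added, queries return NULL for the new column in that partition
- If a partition was created after the column was added, queries return the correct values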
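
One way to check where the difference comes from is to compare the table-level column list with the column list stored for each partition. The statements below are a minimal sketch that assumes the student table from the steps above. It is an assumption here, consistent with the behavior observed above, that Hive reads each partition using the partition's own column metadata; the exact output layout depends on the Hive version.

```sql
-- Table-level schema: after the ALTER TABLE it should list id, name and sex.
DESCRIBE student;

-- Schema stored for a partition created before the ALTER TABLE.
-- If sex is missing from this list, Hive is still reading the
-- partition with the old two-column schema.
DESCRIBE student PARTITION (dt = '2019-11-14');

-- Schema stored for a partition created after the ALTER TABLE, for comparison.
DESCRIBE student PARTITION (dt = '2019-11-15');
```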
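3. Solution
For each partition that already existed before the column was added, you must also add the column at the partition level:
alter table student partition(dt = '2019-11-14') add columns(sex string);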
select * from student where dt = '2019-11-14';
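
As an alternative to altering partitions one by one, Hive 1.1.0 and later accept a CASCADE clause on ADD COLUMNS, which applies the change to the metadata of all existing partitions as well as the table. The sketch below uses a hypothetical table named student_cascade, created only for this illustration, because CASCADE has to be given when the column is first added (re-adding sex to student would fail, since the column already exists at the table level).

```sql
-- student_cascade is a hypothetical table used only for this illustration.
create table student_cascade(id int,name string) partitioned by (dt string);
insert into table student_cascade partition(dt = '2019-11-13') select 1,'zhangsan';

-- CASCADE propagates the new column to the metadata of existing partitions.
-- The default, RESTRICT, changes only the table-level metadata.
alter table student_cascade add columns(sex string) cascade;

-- Write a row that carries the new column into the pre-existing partition.
insert into table student_cascade partition(dt = '2019-11-13') select 1,'zhangsan','male';

-- The newly inserted row is expected to show its sex value. The row written
-- before the column existed has no value stored for sex, so it stays NULL.
select * from student_cascade where dt = '2019-11-13';
```

With CASCADE there is no need to track which partitions existed before the schema change, which helps when a table has many partitions.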