Importing Hive Data into HBase

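Option 1: Map a Hive table onto an HBase table

Applicable scenario: modest data volumes, under about 4 TB (the data is written through the HBase API).

I. When the HBase table does not exist

Create a Hive table hive_hbase_table that maps to the HBase table hbase_table. The HBase table hbase_table is created automatically and is dropped when the Hive table is dropped. The mapping from the Hive schema to the HBase schema must be specified here:

1. Create the table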
```
CREATE TABLE hive_hbase_table(key int, name String,age String) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:age") 
TBLPROPERTIES ("hbase.table.name" = "hbase_table", 
"hbase.mapred.output.outputtable" = "hbase_table");
```
2. Create a plain Hive source table and load some sample data
```
create table hive_data (key int,name String,age string);
insert into hive_data values(1,"za","13");
insert into hive_data values(2,"ff","44");
```
3. Load the rows of the source table hive_data into the HBase table hbase_table by inserting them through the Hive table hive_hbase_table
```
insert into table hive_hbase_table select * from hive_data;
```
4. Check that the HBase table hbase_table now contains the data
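
One way to run this check without leaving Hive is to query the mapped table, since reads on hive_hbase_table go through the storage handler to hbase_table. A minimal sketch, assuming the two sample rows above were inserted (running `scan 'hbase_table'` in the HBase shell is the more direct check):

```sql
-- Reads on the mapped table are served from the HBase table hbase_table.
SELECT * FROM hive_hbase_table;

-- The two counts should match as long as every key in hive_data is unique;
-- rows that share a key end up in the same HBase row.
SELECT count(*) FROM hive_hbase_table;
SELECT count(*) FROM hive_data;
```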

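II. When the HBase table already exists

Create a Hive external table mapped to the existing HBase table, again paying attention to the mapping from the Hive schema to the HBase schema. Dropping the external table does not delete the underlying HBase table.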
```
CREATE EXTERNAL TABLE hive_hbase_external_table(key String, name string,sex String,age String,department String) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:sex,info:age,info:department") 
TBLPROPERTIES ("hbase.table.name" = "filtertest", 
"hbase.mapred.output.outputtable" = "filtertest");
```
The remaining steps are the same as above.
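
For reference, those steps might look like the sketch below. The source table hive_user_data and its sample row are hypothetical (they do not appear in the original); the only real requirement is that the selected columns line up with the external table's five columns.

```sql
-- Hypothetical source table; columns mirror hive_hbase_external_table.
CREATE TABLE hive_user_data (key string, name string, sex string, age string, department string);
INSERT INTO hive_user_data VALUES ('1001', 'za', 'm', '13', 'dev');  -- illustrative values only

-- Write through the external table into the existing HBase table filtertest.
INSERT INTO TABLE hive_hbase_external_table
SELECT key, name, sex, age, department FROM hive_user_data;

-- Read back through the same mapping to confirm the rows arrived.
SELECT * FROM hive_hbase_external_table LIMIT 10;
```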

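Option 2: Generate HFiles from the Hive table and bulk-load them into HBase

1. Applicable scenario: large data volumes (above about 4 TB)

2. Convert the Hive data into HFiles

3. Start Hive and add the required HBase jars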

add jar /mnt/hive/lib/hive-hbase-handler-2.1.1.jar;
add jar /mnt/hive/lib/hbase-common-1.1.1.jar;
add jar /mnt/hive/lib/hbase-client-1.1.1.jar;
add jar /mnt/hive/lib/hbase-protocol-1.1.1.jar;
add jar /mnt/hive/lib/hbase-server-1.1.1.jar;

4. Create a Hive table whose output format is HiveHFileOutputFormat

Here /tmp/hbase_table_hfile/cf_0 is the HDFS path where the HFiles are written, and cf_0 is the name of the HBase column family.
```
create table hbase_hfile_table(key int, name string,age String) 
stored as
INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.hbase.HiveHFileOutputFormat'
TBLPROPERTIES ('hfile.family.path' = '/tmp/hbase_table_hfile/cf_0');
```
5. Write the source table's data out as HFiles by inserting it into hbase_hfile_table
```
insert into table hbase_hfile_table select * from hive_data;
```
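
HFiles store rows in row-key order, so beyond a tiny sample the rows generally need to reach the HFile writer already sorted. As an alternative to the plain insert above, here is a hedged sketch that follows the pattern in the Hive bulk-load documentation, assuming a single reducer is acceptable for the data size:

```sql
-- Assumption: one reducer, so the job produces a single totally ordered output.
SET mapred.reduce.tasks=1;

-- CLUSTER BY sorts rows by key inside the reducer before they are written.
INSERT OVERWRITE TABLE hbase_hfile_table
SELECT key, name, age
FROM hive_data
CLUSTER BY key;
```

One caveat worth verifying on your own cluster: key is an int here, so this sort is numeric, while HBase compares row keys as bytes. If the keys have different digit counts the two orders may disagree, in which case a string or zero-padded key is likely the safer choice. Spreading the work over several reducers needs a total-order partitioner setup, which is not shown here.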
6. Check that the HFiles were generated under the corresponding HDFS path
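
From the same Hive CLI session, the listing can be done with the built-in `dfs` command (running `hdfs dfs -ls` from an ordinary shell is equivalent). A small sketch using the path configured in hfile.family.path:

```sql
-- dfs runs an HDFS filesystem command from inside the Hive session.
dfs -ls /tmp/hbase_table_hfile/cf_0;
```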

7. Bulk-load the data into the HBase table

Create the table: use the HBase client to create an HBase table with the column family used above
```
create 'hbase_hfile_load_table','cf_0'
```
Download the HBase client, configure hbase-site.xml, and copy hdfs-site.xml and core-site.xml into the hbase/conf directory.

Load:
```
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  hdfs://master:9000/tmp/hbase_table_hfile/ hbase_hfile_load_table
```
8. Check the loaded data
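
One possible way to do this check from Hive is to map an external table onto the bulk-loaded HBase table and query it. This is only a sketch: the table name hive_hfile_load_check is hypothetical, and the mapping assumes the HFiles were written with the Hive column names (name, age) as qualifiers under the family cf_0. Running `scan 'hbase_hfile_load_table'` in the HBase shell is the more direct alternative.

```sql
-- Hypothetical external table over the bulk-loaded HBase table.
-- Assumes qualifiers cf_0:name and cf_0:age; adjust if a scan shows otherwise.
CREATE EXTERNAL TABLE hive_hfile_load_check(key int, name string, age string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf_0:name,cf_0:age")
TBLPROPERTIES ("hbase.table.name" = "hbase_hfile_load_table");

-- Read a few rows back through the mapping.
SELECT * FROM hive_hfile_load_check LIMIT 10;
```

Dropping this external table afterwards leaves the HBase table untouched.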
Original article: https://www.cnblogs.com/yfb918/p/10882323.html