Integrating HBase with Hive

1. To create a new HBase table managed by Hive, use CREATE TABLE

CREATE TABLE hbase_table_1(key int, value string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz", "hbase.mapred.output.outputtable" = "xyz");

The hbase.columns.mapping property is required and will be explained in the next section.
The hbase.table.name property is optional;
- it controls the name of the table as known by HBase, and allows the Hive table to have a different name.
- In this example, the table is known as hbase_table_1 within Hive, and as xyz within HBase.
- If not specified, then the Hive and HBase table names will be identical.
The hbase.mapred.output.outputtable property is optional;
- it is needed if you plan to insert data into the table (the property is used by hbase.mapreduce.TableOutputFormat).
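As a minimal sketch of why hbase.mapred.output.outputtable matters, the statement below writes into the table defined above from Hive. The source table `pokes` and its columns `foo` and `bar` are assumptions for illustration, not part of the original setup; any Hive table with an int column and a string column would do.

```sql
-- Assumption: `pokes` is an existing Hive table with columns (foo INT, bar STRING).
-- The rows are written to the HBase table "xyz", since that is the name given in
-- hbase.table.name and hbase.mapred.output.outputtable.
INSERT OVERWRITE TABLE hbase_table_1
SELECT foo, bar
FROM pokes
WHERE foo = 98;
```

After this runs, the new row should appear when scanning xyz in the HBase shell, with the value stored under cf1:val.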

2. Column mapping

There are two SERDEPROPERTIES that control the mapping of HBase columns to Hive:

hbase.columns.mapping
hbase.table.default.storage.type: either string (the default) or binary. This option is available as of Hive 0.9; earlier versions support only the string behavior.
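To illustrate the second property, here is a sketch of a table that stores values in binary form by default. The table name `hbase_table_bin` and the column family `cf` are hypothetical. The `#s` suffix is the per-column storage override described in the Hive HBase integration documentation; it forces string storage for that one column.

```sql
-- Hypothetical table: binary is the default storage type for all columns,
-- but cf:val is overridden to string storage with the #s suffix.
CREATE TABLE hbase_table_bin(key int, value string, foobar double)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,cf:val#s,cf:foo",
"hbase.table.default.storage.type" = "binary"
);
```

With binary storage, values written from Hive are not human-readable in an HBase shell scan, so the string default is usually easier to work with when both sides write to the table.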

Multiple columns and multiple column families

Create the table in Hive

CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,a:b,a:c,d:e"
);

Insert data

hive> insert into table hbase_table_1 values(100,'val_100',101,102);
hive> insert into table hbase_table_1 values(98,'val_98',99,100);

View the table structure in HBase

View the data in HBase

hbase(main):004:0> scan "hbase_table_1"
ROW                       COLUMN+CELL                                                            
 100                      column=a:b, timestamp=1595817016732, value=val_100                     
 100                      column=a:c, timestamp=1595817016732, value=101                         
 100                      column=d:e, timestamp=1595817016732, value=102                         
 98                       column=a:b, timestamp=1595817050488, value=val_98                      
 98                       column=a:c, timestamp=1595817050488, value=99                          
 98                       column=d:e, timestamp=1595817050488, value=100                         
2 row(s) in 0.0410 seconds

Summary

(1) The Hive key column is the HBase row key.

(2) In "hbase.columns.mapping" = ":key,a:b,a:c,d:e", :key refers to the row key.
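The mapping can be read positionally: each entry in hbase.columns.mapping pairs with the Hive column at the same position in the table definition. As a sketch, the query below reads back one of the rows inserted earlier, with the correspondence spelled out in comments.

```sql
-- Hive column  ->  HBase location (matched by position in hbase.columns.mapping)
-- key          ->  :key  (row key)
-- value1       ->  a:b   (column family a, qualifier b)
-- value2       ->  a:c   (column family a, qualifier c)
-- value3       ->  d:e   (column family d, qualifier e)
SELECT key, value1, value2, value3
FROM hbase_table_1
WHERE key = 100;
```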

3. Column mapping

Insert data in HBase

hbase(main):006:0> put "hbase_table_1",102,"a:b","val_102"
hbase(main):008:0> put "hbase_table_1",102,"a:c","101"
hbase(main):009:0> put "hbase_table_1",102,"d:e","102"

Scan the data

View the data in Hive
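A query along the following lines could be used to check the row from the Hive side. This is a sketch, not the original author's command, and the result is not reproduced here.

```sql
-- Row 102 was written with `put` in the HBase shell, not through Hive.
-- With the default string storage type, the text values "101" and "102"
-- should be parsed into the int columns value2 and value3.
SELECT key, value1, value2, value3
FROM hbase_table_1
WHERE key = 102;
```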

Original article: https://www.cnblogs.com/hyunbar/p/13384490.html