• Basic HBase operations from Python via Thrift


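    Thrift is a binary RPC middleware developed and open-sourced by Facebook. With Thrift we can play to the strengths of each language and write efficient code.

    The Thrift paper: http://pan.baidu.com/share/link?shareid=234128&uk=3238841275

    Installing Thrift: http://thrift.apache.org/docs/install/ubuntu/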

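    Once Thrift is installed, go to the HBase directory and locate Hbase.thrift. The file is under

    hbase-0.94.4/src/main/resources/org/apache/hadoop/hbase/thrift

    Running thrift --gen py Hbase.thrift generates a gen-py folder; rename it to hbase (the scripts below expect the generated modules to be importable as the hbase package).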

    Install the Python thrift library:

    sudo pip install thrift

    Start the HBase Thrift service with bin/hbase-daemon.sh start thrift. The default port is 9090.

    Creating an HBase table:

        from thrift.transport import TSocket
        from thrift.transport import TTransport
        from thrift.protocol import TBinaryProtocol

        from hbase import Hbase
        from hbase.ttypes import ColumnDescriptor

        # Connect to the HBase Thrift server (default port 9090).
        socket = TSocket.TSocket('localhost', 9090)
        transport = TTransport.TBufferedTransport(socket)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)

        client = Hbase.Client(protocol)
        transport.open()

        # Create table 'test' with a single column family 'cf', one version per cell.
        contents = ColumnDescriptor(name='cf:', maxVersions=1)
        client.createTable('test', [contents])

        # List all tables to confirm that 'test' now exists.
        print(client.getTableNames())

        transport.close()

    Run the script. Once it succeeds, open the HBase shell and run the list command to see that the test table has been created.

    Inserting data:
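    Running the creation script a second time fails, because createTable raises an error when the table already exists. A minimal sketch of a guard is shown here; create_table_if_missing is a hypothetical helper name, not part of the generated API, and it assumes that getTableNames returns names in the same string type that is passed in.

```python
def create_table_if_missing(client, table_name, column_families):
    """Create table_name unless the Thrift server already lists it.

    client is assumed to expose getTableNames() and createTable(), as the
    generated Hbase.Client in this post does. Returns True if the table
    was created and False if it was already there.
    """
    if table_name in client.getTableNames():
        return False
    client.createTable(table_name, column_families)
    return True
```

    With the client from the script above, the call would look like create_table_if_missing(client, 'test', [contents]).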

        from thrift.transport import TSocket
        from thrift.transport import TTransport
        from thrift.protocol import TBinaryProtocol

        from hbase import Hbase
        from hbase.ttypes import Mutation

        # Connect to the HBase Thrift server (default port 9090).
        socket = TSocket.TSocket('localhost', 9090)
        transport = TTransport.TBufferedTransport(socket)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)

        client = Hbase.Client(protocol)
        transport.open()

        row = 'row-key1'

        # Write the value "1" to column cf:a of this row; the last argument
        # (attributes) is not used here.
        mutations = [Mutation(column="cf:a", value="1")]
        client.mutateRow('test', row, mutations, None)

        transport.close()

    Once the insert succeeds, you can check the result with the scan command in the HBase shell.

    Fetching a single row:
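    Every script in this post repeats the same connection setup, and a failed call can leave the transport open. As a hedged sketch, a small context manager can guarantee the close; opened is a hypothetical name, and the only assumption is that the transport has open() and close() methods, as TBufferedTransport does.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a Thrift transport and close it when the with-block exits.

    The transport is closed even if the body of the with-block raises.
    """
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


# Intended usage with the objects from the scripts in this post:
#
#     with opened(transport):
#         client.mutateRow('test', 'row-key1', mutations, None)
```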

        from thrift.transport import TSocket
        from thrift.transport import TTransport
        from thrift.protocol import TBinaryProtocol

        from hbase import Hbase

        # Connect to the HBase Thrift server (default port 9090).
        socket = TSocket.TSocket('localhost', 9090)
        transport = TTransport.TBufferedTransport(socket)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)

        client = Hbase.Client(protocol)
        transport.open()

        table_name = 'test'
        row_key = 'row-key1'

        # getRow returns a list; it is empty if the row does not exist.
        result = client.getRow(table_name, row_key, None)
        print(result)
        for r in result:
            print('the row is %s' % r.row)
            print('the value is %s' % r.columns.get('cf:a').value)

        transport.close()

    getRow returns a list of TRowResult objects.

    To return multiple rows, use a scan:
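    The row objects are awkward to work with directly, and looking up a column that is missing returns None, so calling .value on it fails. The sketch below flattens results into plain dictionaries; rows_to_dicts and cell_value are hypothetical helper names, and the code assumes only the .row, .columns and .value attributes used in the script above.

```python
def rows_to_dicts(results):
    """Flatten row results into {row_key: {column_name: value}}.

    Each item is assumed to have a .row attribute and a .columns mapping
    from column name to a cell object that carries .value.
    """
    table = {}
    for r in results:
        table[r.row] = {col: cell.value for col, cell in r.columns.items()}
    return table


def cell_value(row_result, column, default=None):
    """Return one cell's value, or default when the column is absent."""
    cell = row_result.columns.get(column)
    return default if cell is None else cell.value
```

    For example, cell_value(r, 'cf:a') replaces r.columns.get('cf:a').value without raising when the column is missing.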

        from thrift.transport import TSocket
        from thrift.transport import TTransport
        from thrift.protocol import TBinaryProtocol

        from hbase import Hbase
        from hbase.ttypes import TScan

        # Connect to the HBase Thrift server (default port 9090).
        socket = TSocket.TSocket('localhost', 9090)
        transport = TTransport.TBufferedTransport(socket)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)

        client = Hbase.Client(protocol)
        transport.open()

        # An empty TScan scans the whole table.
        scan = TScan()
        table_name = 'test'
        scanner_id = client.scannerOpenWithScan(table_name, scan, None)

        # Fetch up to 10 rows in one call.
        rows = client.scannerGetList(scanner_id, 10)
        print(rows)

        # Release the scanner on the server, then close the connection.
        client.scannerClose(scanner_id)
        transport.close()

    Here scannerGetList fetches up to 10 rows, and the script prints them.

    scannerGet, by contrast, fetches only one row per call:

        from thrift.transport import TSocket
        from thrift.transport import TTransport
        from thrift.protocol import TBinaryProtocol

        from hbase import Hbase
        from hbase.ttypes import TScan

        # Connect to the HBase Thrift server (default port 9090).
        socket = TSocket.TSocket('localhost', 9090)
        transport = TTransport.TBufferedTransport(socket)
        protocol = TBinaryProtocol.TBinaryProtocol(transport)

        client = Hbase.Client(protocol)
        transport.open()

        scan = TScan()
        table_name = 'test'
        scanner_id = client.scannerOpenWithScan(table_name, scan, None)

        # scannerGet returns a list holding the next row, or an empty list
        # once the scanner is exhausted.
        result = client.scannerGet(scanner_id)
        while result:
            print(result)
            result = client.scannerGet(scanner_id)

        # Release the scanner on the server, then close the connection.
        client.scannerClose(scanner_id)
        transport.close()

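    Each iteration prints the row returned by scannerGet; the loop stops when scannerGet returns an empty list.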
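    The two scanning styles can be combined: fetch rows in batches to cut down on round trips, but hand them to the caller one at a time. The generator below is a sketch under that idea; iter_scanner is a hypothetical name, and it assumes a client with the scannerGetList and scannerClose methods of the generated Hbase.Client and a scanner id that has already been opened.

```python
def iter_scanner(client, scanner_id, batch_size=10):
    """Yield rows from an open scanner one by one, fetching in batches.

    The scanner is closed when it is exhausted, when the caller stops
    iterating and the generator is closed, or when a call raises.
    """
    try:
        while True:
            batch = client.scannerGetList(scanner_id, batch_size)
            if not batch:
                break  # an empty list means there are no more rows
            for row in batch:
                yield row
    finally:
        client.scannerClose(scanner_id)


# Intended usage with the objects from the scripts above:
#
#     scanner_id = client.scannerOpenWithScan('test', TScan(), None)
#     for row in iter_scanner(client, scanner_id):
#         print(row)
```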

  • Original article: https://www.cnblogs.com/hitandrew/p/2870419.html