• HBase Usage and Analysis


    Basic commands:

    [root@master shell]# cd ${HBASE_HOME}
    [root@master hbase]# pwd
    /opt/hbase

    [root@master hbase]# ./bin/hbase shell

    hbase(main):001:0> status
    1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load

    hbase(main):002:0> create 'test', 'info'
    0 row(s) in 2.3730 seconds

    => Hbase::Table - test

    hbase(main):005:0* status
    1 active master, 0 backup masters, 2 servers, 0 dead, 1.5000 average load

    hbase(main):009:0> list 'test'
    TABLE
    test
    1 row(s) in 0.0230 seconds

    => ["test"]

    hbase(main):011:0> describe 'test'
    Table test is ENABLED
    test
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    1 row(s) in 0.0860 seconds

    Insert and update:

    hbase(main):012:0> put 'test', 'row_key1', 'info:course','physic'
    0 row(s) in 0.1670 seconds

    hbase(main):013:0> put 'test', 'row_key1', 'info:course','math'   # this put overwrites the value written above
    0 row(s) in 0.0340 seconds

    hbase(main):014:0> put 'test', 'row_key2', 'info:score','98'
    0 row(s) in 0.0060 seconds

    hbase(main):015:0> put 'test', 'row_key2', 'info:score','90'  # this put overwrites the value written above
    0 row(s) in 0.0330 seconds

    hbase(main):016:0> scan 'test'
    ROW COLUMN+CELL
    row_key1 column=info:course, timestamp=1524017746124, value=math
    row_key2 column=info:score, timestamp=1524017756181, value=90
    2 row(s) in 0.0880 seconds
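    The overwrite behavior above follows from HBase's data model: every cell is versioned by timestamp, and with the default VERSIONS => '1' (visible in the describe output earlier) only the newest version is visible. A toy Python sketch of that model — not HBase client code, all names are illustrative:

    ```python
    import time

    # Toy model of an HBase table:
    # {row_key: {"cf:qualifier": [(timestamp, value), ...]}}, newest version first.
    class ToyTable:
        def __init__(self, max_versions=1):  # mirrors VERSIONS => '1'
            self.max_versions = max_versions
            self.rows = {}

        def put(self, row, column, value, ts=None):
            ts = ts if ts is not None else int(time.time() * 1000)
            cells = self.rows.setdefault(row, {}).setdefault(column, [])
            cells.insert(0, (ts, value))
            # Versions beyond max_versions become invisible (and are physically
            # dropped at compaction time in real HBase).
            del cells[self.max_versions:]

        def scan(self):
            # Cells come back sorted by row key, then by column within a row.
            for row in sorted(self.rows):
                for column in sorted(self.rows[row]):
                    ts, value = self.rows[row][column][0]
                    yield row, column, ts, value

    t = ToyTable(max_versions=1)
    t.put("row_key1", "info:course", "physic", ts=1)
    t.put("row_key1", "info:course", "math", ts=2)  # overwrites the previous put
    print(list(t.scan()))  # [('row_key1', 'info:course', 2, 'math')]
    ```

    With max_versions > 1 both values would be retained and `get` could ask for older versions, which is exactly what the VERSIONS attribute of a column family controls.
    
    
    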

    Column families:

    hbase(main):017:0> put 'test', 'row_key2', 'info:course','math'
    0 row(s) in 0.0210 seconds

    hbase(main):018:0> put 'test', 'row_key1', 'info:score','98'
    0 row(s) in 0.0080 seconds

    hbase(main):020:0> scan 'test'
    ROW COLUMN+CELL
    row_key1 column=info:course, timestamp=1524017746124, value=math
    row_key1 column=info:score, timestamp=1524017895247, value=98
    row_key2 column=info:course, timestamp=1524017889154, value=math
    row_key2 column=info:score, timestamp=1524017899494, value=90
    2 row(s) in 0.0150 seconds

    Query:

    hbase(main):023:0> get 'test', 'row_key1'
    COLUMN CELL
    info:course timestamp=1524017746124, value=math
    info:score timestamp=1524017895247, value=98
    2 row(s) in 0.0260 seconds

    hbase(main):024:0>

    Note that the same data can be read from the node2 node:

    [root@node2 ~]# /opt/hbase/bin/hbase shell
    2018-04-18 10:23:54,026 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017

    hbase(main):001:0> scan 'test'
    ROW COLUMN+CELL
    row_key1 column=info:course, timestamp=1524017746124, value=math
    row_key1 column=info:score, timestamp=1524017895247, value=98
    row_key2 column=info:course, timestamp=1524017889154, value=math
    row_key2 column=info:score, timestamp=1524017899494, value=90
    2 row(s) in 0.2680 seconds

    hbase(main):002:0>

    This shows that HBase stores its data on HDFS, so the same table can be read and written from multiple nodes, while the data itself is stored in a distributed fashion (by HDFS).

    Data model diagram: (the image from the original post is not reproduced here)

    [root@master hadoop]# vi /opt/hbase/conf/hbase-site.xml 
    
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
    /**
     *
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    -->
    <configuration>
        <property>
            <name>hbase.rootdir</name>
            <value>hdfs://master:9000/hbase/hbase_db</value>
        </property>

    The hbase.rootdir here is hdfs://master:9000/hbase/hbase_db. You cannot find this directory with a plain shell ls, because it is a path inside HDFS: the local directory backing HDFS is configured in hdfs-site.xml, and the data is stored there as blocks, so it can only be inspected with HDFS commands.
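    As noted, HDFS stores file contents as fixed-size blocks on the datanodes rather than as ordinary files. A rough sketch of the split (128 MB is the dfs.blocksize default in Hadoop 2.x; the helper function is illustrative, not a Hadoop API):

    ```python
    import math

    # dfs.blocksize default in Hadoop 2.x
    DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024

    def block_count(file_size, block_size=DEFAULT_BLOCK_SIZE):
        """Number of HDFS blocks a file of file_size bytes occupies."""
        return max(1, math.ceil(file_size / block_size))

    # A 300 MB file is stored as 3 blocks (128 MB + 128 MB + 44 MB), each kept
    # as an opaque blk_* file under dfs.datanode.data.dir on some datanode and
    # replicated dfs.replication times across the cluster.
    print(block_count(300 * 1024 * 1024))  # 3
    ```

    This is why browsing /opt/hadoop/data on a datanode shows only opaque block files, never the HBase table files by name.
    
    
    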

    Check the HDFS data directory:

    [root@master hadoop]# vi /opt/hadoop/etc/hadoop/hdfs-site.xml 

    <!-- datanode data directory -->
    <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/data</value>
    </property>
    </configuration>

    As the configuration shows, HDFS data lives on the datanode servers, so this directory does not exist on master:

    [root@master hadoop]# cd /opt/hadoop/data
    -bash: cd: /opt/hadoop/data: No such file or directory

    [root@node1 current]# cd /opt/hadoop/data
    [root@node1 data]# ls
    current in_use.lock
    [root@node1 data]#

    [root@node2 hadoop]# cd /opt/hadoop/data
    [root@node2 data]# ls
    current in_use.lock
    [root@node2 data]#

    Check HBase's data directory in HDFS:

    hadoop fs -ls  /hbase/hbase_db

    [root@master hadoop]# hadoop fs -ls  /hbase/hbase_db  (note: in newer versions, hadoop fs is superseded by hdfs dfs)
    Found 9 items
    drwxr-xr-x - root supergroup 0 2018-04-18 10:04 /hbase/hbase_db/.tmp
    drwxr-xr-x - root supergroup 0 2018-04-18 11:10 /hbase/hbase_db/MasterProcWALs
    drwxr-xr-x - root supergroup 0 2018-04-18 09:32 /hbase/hbase_db/WALs
    drwxr-xr-x - root supergroup 0 2018-04-18 10:18 /hbase/hbase_db/archive
    drwxr-xr-x - root supergroup 0 2018-04-17 02:59 /hbase/hbase_db/corrupt
    drwxr-xr-x - root supergroup 0 2018-04-15 10:29 /hbase/hbase_db/data
    -rw-r--r-- 3 root supergroup 42 2018-04-15 10:29 /hbase/hbase_db/hbase.id
    -rw-r--r-- 3 root supergroup 7 2018-04-15 10:29 /hbase/hbase_db/hbase.version
    drwxr-xr-x - root supergroup 0 2018-04-18 11:43 /hbase/hbase_db/oldWALs

    The data for table test lives under /hbase/hbase_db/data/default/test/:

    [root@node1 opt]# hadoop fs -ls /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae
    Found 4 items
    -rw-r--r-- 3 root supergroup 39 2018-04-18 10:05 /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae/.regioninfo
    drwxr-xr-x - root supergroup 0 2018-04-18 11:17 /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae/.tmp
    drwxr-xr-x - root supergroup 0 2018-04-18 11:17 /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae/info
    drwxr-xr-x - root supergroup 0 2018-04-18 10:05 /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae/recovered.edits
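    The long hex component (402710672d039c97b3006531c037d9ae) is the region's encoded name, and the on-disk layout follows a fixed pattern. A sketch of how the path is composed (the helper function is made up for illustration; the pattern matches the listing above):

    ```python
    def region_path(rootdir, table, region_encoded_name, namespace="default"):
        """Compose the HDFS directory of one HBase region:
        <hbase.rootdir>/data/<namespace>/<table>/<encoded region name>

        Inside it: .regioninfo (region metadata), one subdirectory per column
        family (here: info) holding the HFiles, plus .tmp and recovered.edits.
        """
        return f"{rootdir}/data/{namespace}/{table}/{region_encoded_name}"

    p = region_path("/hbase/hbase_db", "test", "402710672d039c97b3006531c037d9ae")
    print(p)  # /hbase/hbase_db/data/default/test/402710672d039c97b3006531c037d9ae
    ```
    
    
    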

    These files are binary and cannot be read directly.

    To inspect the data, export the table first, then cat the result:

    [root@node1 opt]# /opt/hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Export  test /opt/hbase/testdata2

    The output directory /opt/hbase/testdata2 here is an HDFS path:

    [root@node1 opt]# hdfs dfs -ls /opt/hbase/testdata2
    Found 2 items
    -rw-r--r-- 2 root supergroup 0 2018-04-18 12:46 /opt/hbase/testdata2/_SUCCESS
    -rw-r--r-- 2 root supergroup 322 2018-04-18 12:46 /opt/hbase/testdata2/part-m-00000

    [root@node1 opt]# hdfs dfs -cat /opt/hbase/testdata2/part-m-00000
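    The reason plain cat shows mostly unreadable output is that Export writes Hadoop SequenceFiles, a binary container format whose files begin with the magic bytes SEQ. A small sketch of the header check (the sample bytes are fabricated for illustration):

    ```python
    def is_sequence_file(header: bytes) -> bool:
        """Hadoop SequenceFiles start with 'SEQ' followed by a one-byte version."""
        return len(header) >= 4 and header[:3] == b"SEQ"

    # First bytes of a typical SequenceFile: magic 'SEQ' + format version
    sample = b"SEQ\x06" + b"..."
    print(is_sequence_file(sample))  # True
    ```

    In practice, `hdfs dfs -text` (rather than -cat) will decode a SequenceFile's keys and values into readable text.
    
    
    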

    File contents: (the binary output of the cat command is not reproduced here)

    Note: because Export runs as a MapReduce job, it needs the MapReduce libraries on the task classpath, so mapreduce.application.classpath must be set.

    [root@master hadoop]# vi /opt/hadoop/etc/hadoop/mapred-site.xml 

    # add the following property

    <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /opt/hadoop/etc/hadoop,
            /opt/hadoop/share/hadoop/common/*,
            /opt/hadoop/share/hadoop/common/lib/*,
            /opt/hadoop/share/hadoop/hdfs/*,
            /opt/hadoop/share/hadoop/hdfs/lib/*,
            /opt/hadoop/share/hadoop/mapreduce/*,
            /opt/hadoop/share/hadoop/mapreduce/lib/*,
            /opt/hadoop/share/hadoop/yarn/*,
            /opt/hadoop/share/hadoop/yarn/lib/*
        </value>
    </property>

    (to be continued)

  • Original post: https://www.cnblogs.com/zhangmin1987/p/8874151.html