• Table data migration (methods for exporting data within a specified timestamp range)


    1 The CopyTable tool

    Usage:

    CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The target table must already exist. The usage is as follows:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename
    

    Options:

    • starttime Beginning of the time range. If endtime is omitted, the range runs from starttime onward with no upper bound.
    • endtime End of the time range.
    • versions Number of cell versions to copy.
    • new.name New table's name.
    • peer.adr Address of the peer cluster given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
    • families Comma-separated list of ColumnFamilies to copy.
    • all.cells Also copy delete markers and uncollected deleted cells (advanced option).

    Args:

    • tablename Name of table to copy.

    Example of copying 'TestTable' to a cluster that uses replication for a 1 hour window:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
        --starttime=1265875194289 --endtime=1265878794289 \
        --peer.adr=server1,server2,server3:2181:/hbase TestTable
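The two timestamps in the example above are epoch milliseconds, exactly one hour apart. As a minimal sketch, the end of such a window can be derived from the start with plain shell arithmetic. The helper name `window_end_ms` is hypothetical and not part of HBase, and the script only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper (not part of HBase): end of a time window in epoch milliseconds.
# $1 = window start in epoch milliseconds, $2 = window length in hours.
window_end_ms() {
  echo $(( $1 + $2 * 60 * 60 * 1000 ))
}

start=1265875194289
end=$(window_end_ms "$start" 1)

# Print the CopyTable command instead of executing it.
# Peer address and table name are taken from the example above.
echo "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=$start --endtime=$end --peer.adr=server1,server2,server3:2181:/hbase TestTable"
```

Printing the command first is a deliberate choice: a wrong time range silently copies too much or too little data.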

    Scanner Caching

    Caching for the input Scan is configured via hbase.client.scanner.caching in the job configuration.

    Versions

    By default, the CopyTable utility copies only the latest version of row cells unless --versions=n is explicitly specified in the command.
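The scanner caching and versions settings described above can be combined in a single invocation. The following is only a sketch: the values 100 and 3 and the target name `TestTableCopy` are assumptions chosen for illustration, and the generic `-D` option is placed before the CopyTable-specific options because Hadoop's generic option parsing expects it there.

```shell
# Assumed illustrative values, not recommendations from the original text.
caching=100     # rows fetched per scanner round trip
versions=3      # cell versions to copy instead of only the latest

# TestTableCopy is a hypothetical target table that must already exist.
cmd="bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable -Dhbase.client.scanner.caching=$caching --versions=$versions --new.name=TestTableCopy TestTable"

# Print the command for review rather than running it.
echo "$cmd"
```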

    See Jonathan Hsieh's Online HBase Backups with CopyTable blog post for more on CopyTable.

    2 The Export and Import tools

    Export is a utility that will dump the contents of a table to HDFS in a sequence file. Invoke via:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
    

    Note: caching for the input Scan is configured via hbase.client.scanner.caching in the job configuration.

    Import is a utility that will load data that has been exported back into HBase. Invoke via:

    $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
    

    To import 0.94 exported files in a 0.96 cluster or onwards, you need to set system property "hbase.import.version" when running the import command as below:

    $ bin/hbase -Dhbase.import.version=0.94 org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>

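    Concrete usage of Export with a time range:

    $ hbase org.apache.hadoop.hbase.mapreduce.Export member5 hdfs://master24:9000/user/hadoop/dump2 1 1401938590466 1401938590467

    Here 1 is the number of versions, followed by the start time and end time in epoch milliseconds.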

    The export path is an HDFS path; write out the full path.

    The table you import into must already exist, so define it in advance.
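The time-range Export invocation above takes its start and end times as epoch milliseconds, which are awkward to write by hand. Below is a minimal sketch for deriving them from a readable UTC date. The helper `to_epoch_ms` is hypothetical, and it assumes GNU `date` (the `-d` option); BSD and macOS `date` use different flags. To my understanding the underlying scan time range is half-open, so the end timestamp itself is not included.

```shell
# Hypothetical helper: convert a UTC date string to epoch milliseconds.
# Assumes GNU date; this will not work unchanged with BSD/macOS date.
to_epoch_ms() {
  echo $(( $(date -u -d "$1" +%s) * 1000 ))
}

start=$(to_epoch_ms '2014-06-05 03:23:10')
end=$(( start + 1000 ))   # one-second window; the end timestamp is excluded

# Print the Export command for review; table and output path come from the example above.
echo "hbase org.apache.hadoop.hbase.mapreduce.Export member5 hdfs://master24:9000/user/hadoop/dump2 1 $start $end"
```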

  • Original article: https://www.cnblogs.com/yingjie2222/p/6016771.html