• Backing Up and Restoring HBase Data

    There are two strategies for backing up HBase:
    1> Backing it up with a full cluster shutdown
    2> Backing it up on a live cluster
          A full shutdown backup first stops HBase (or disables all tables), then uses Hadoop's distcp command to copy the contents of the HBase directory either to another directory on the same HDFS or to a different HDFS. To restore from a full shutdown backup, copy the backed-up files back to the HBase directory using distcp.
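
The full-shutdown procedure can be sketched as a small dry-run script. This is only a sketch under assumptions: `/hbase` and `/backup/HBaseBackUp` are the paths used in the distcp run later in these notes, `stop-hbase.sh`, `start-hbase.sh` and `hadoop` are assumed to be on the PATH, and using `-overwrite` on restore is an assumption so that the backup is copied onto the existing directory. The function names are hypothetical. The script only prints the commands, so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for a full-shutdown backup and restore.
# Nothing is executed; paths and function names are illustrative assumptions.

HBASE_ROOT="/hbase"                 # HBase root directory on HDFS (assumed)
BACKUP_DIR="/backup/HBaseBackUp"    # backup destination on HDFS (assumed)

print_backup_steps() {
    echo "stop-hbase.sh"                              # 1. stop HBase so the files stop changing
    echo "hadoop distcp $HBASE_ROOT $BACKUP_DIR"      # 2. copy the HBase directory
    echo "start-hbase.sh"                             # 3. bring HBase back up
}

print_restore_steps() {
    echo "stop-hbase.sh"
    # -overwrite: the source is treated as an update of the existing directory
    echo "hadoop distcp -overwrite $BACKUP_DIR $HBASE_ROOT"
    echo "start-hbase.sh"
}

print_backup_steps
print_restore_steps
```

Printing instead of executing keeps the sketch safe to try on any machine; piping the output to `sh` would be the step that actually touches the cluster.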

         There are several approaches for a live cluster backup:
    1> Using the CopyTable utility to copy data from one table to another
    2> Exporting an HBase table to HDFS files, and importing the files back to HBase
    3> HBase cluster replication

           The CopyTable utility can be used to copy data from one table to another, either on the same cluster or on a different cluster. The Export utility dumps the data of a table to HDFS on the same cluster. As the counterpart of Export, the Import utility restores the data from the dump files.

    Method 1:

    landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export

    Usage: Export [-D <property=value>]* <tablename> <outputdir> [<versions> [<starttime> [<endtime>]] [^[regex pattern] or [Prefix] to filter]]
      Note: -D properties will be applied to the conf used.
      For example:
       -D mapred.output.compress=true
       -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
       -D mapred.output.compression.type=BLOCK
      Additionally, the following SCAN properties can be specified
      to control/limit what is exported..
       -D hbase.mapreduce.scan.column.family=<familyName>
       -D hbase.mapreduce.include.deleted.rows=true
    For performance consider the following properties:

    landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export -D mapred.output.compress=true -D mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec -D mapred.output.compression.type=BLOCK -D hbase.mapreduce.scan.column.family=IPAddress(multiple column families can be added, separated by ",") HiddenIPInfo(the HBase table to export) /backup/HBaseExport(this directory is created automatically during the export)
    13/12/10 20:12:15 INFO mapreduce.Export: versions=1, starttime=0, endtime=9223372036854775807, keepDeletedCells=false
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:15 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 20:12:29 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> slave1:,
    13/12/10 20:12:32 INFO mapred.JobClient: Running job: job_201312042044_0033
    13/12/10 20:12:33 INFO mapred.JobClient:  map 0% reduce 0%
    13/12/10 20:12:53 INFO mapred.JobClient:  map 100% reduce 0%
    13/12/10 20:12:58 INFO mapred.JobClient: Job complete: job_201312042044_0033
    13/12/10 20:12:59 INFO mapred.JobClient: Counters: 29
    13/12/10 20:12:59 INFO mapred.JobClient:   Job Counters
    13/12/10 20:12:59 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=11992
    13/12/10 20:12:59 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    13/12/10 20:12:59 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    13/12/10 20:12:59 INFO mapred.JobClient:     Rack-local map tasks=1
    13/12/10 20:12:59 INFO mapred.JobClient:     Launched map tasks=1
    13/12/10 20:12:59 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
    13/12/10 20:12:59 INFO mapred.JobClient:   HBase Counters
    13/12/10 20:12:59 INFO mapred.JobClient:     REMOTE_RPC_CALLS=0
    13/12/10 20:12:59 INFO mapred.JobClient:     RPC_CALLS=6
    13/12/10 20:12:59 INFO mapred.JobClient:     RPC_RETRIES=0
    13/12/10 20:12:59 INFO mapred.JobClient:     NOT_SERVING_REGION_EXCEPTION=0
    13/12/10 20:12:59 INFO mapred.JobClient:     NUM_SCANNER_RESTARTS=0
    13/12/10 20:12:59 INFO mapred.JobClient:     MILLIS_BETWEEN_NEXTS=6
    13/12/10 20:12:59 INFO mapred.JobClient:     BYTES_IN_RESULTS=1493
    13/12/10 20:12:59 INFO mapred.JobClient:     BYTES_IN_REMOTE_RESULTS=0
    13/12/10 20:12:59 INFO mapred.JobClient:     REGIONS_SCANNED=1
    13/12/10 20:12:59 INFO mapred.JobClient:     REMOTE_RPC_RETRIES=0
    13/12/10 20:12:59 INFO mapred.JobClient:   File Output Format Counters
    13/12/10 20:12:59 INFO mapred.JobClient:     Bytes Written=775
    13/12/10 20:12:59 INFO mapred.JobClient:   FileSystemCounters
    13/12/10 20:12:59 INFO mapred.JobClient:     HDFS_BYTES_READ=69
    13/12/10 20:12:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=35024
    13/12/10 20:12:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=775
    13/12/10 20:12:59 INFO mapred.JobClient:   File Input Format Counters
    13/12/10 20:12:59 INFO mapred.JobClient:     Bytes Read=0
    13/12/10 20:12:59 INFO mapred.JobClient:   Map-Reduce Framework
    13/12/10 20:12:59 INFO mapred.JobClient:     Map input records=3
    13/12/10 20:12:59 INFO mapred.JobClient:     Physical memory (bytes) snapshot=94224384
    13/12/10 20:12:59 INFO mapred.JobClient:     Spilled Records=0
    13/12/10 20:12:59 INFO mapred.JobClient:     CPU time spent (ms)=1110
    13/12/10 20:12:59 INFO mapred.JobClient:     Total committed heap usage (bytes)=82116608
    13/12/10 20:12:59 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=395390976
    13/12/10 20:12:59 INFO mapred.JobClient:     Map output records=3
    13/12/10 20:12:59 INFO mapred.JobClient:     SPLIT_RAW_BYTES=69
    landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop fs -ls /backup/HBaseExport/
    Warning: $HADOOP_HOME is deprecated.

    Found 3 items
    -rw-r--r--   1 landen supergroup          0 2013-12-10 20:12 /backup/HBaseExport/_SUCCESS
    drwxr-xr-x   - landen supergroup          0 2013-12-10 20:12 /backup/HBaseExport/_logs
    -rw-r--r--   1 landen supergroup        775 2013-12-10 20:12 /backup/HBaseExport/part-m-00000
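
To restore from this dump, the companion Import utility reads the exported files back into a table. A minimal sketch, assuming the target table (with its `IPAddress` family) already exists, since as far as I know Import in this HBase line does not create it. `build_import_cmd` is a hypothetical helper that only prints the command.

```shell
#!/bin/sh
# Sketch: build the Import command that reloads an Export dump.
# build_import_cmd is a hypothetical helper; it prints, it does not run anything.

build_import_cmd() {
    table="$1"      # existing HBase table to load into
    input_dir="$2"  # HDFS directory written by Export
    if [ -z "$table" ] || [ -z "$input_dir" ]; then
        echo "usage: build_import_cmd <tablename> <inputdir>" >&2
        return 1
    fi
    echo "bin/hbase org.apache.hadoop.hbase.mapreduce.Import $table $input_dir"
}

build_import_cmd HiddenIPInfo /backup/HBaseExport
```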

    Method 2:

    landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
    Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
     rs.class     hbase.regionserver.class of the peer cluster
                  specify if different from current cluster
     rs.impl      hbase.regionserver.impl of the peer cluster
     startrow     the start row
     stoprow      the stop row
     starttime    beginning of the time range (unixtime in millis)
                  without endtime means from starttime to forever
     endtime      end of the time range.  Ignored if no starttime specified.
     versions     number of cell versions to copy
     new.name     new table's name
     peer.adr     Address of the peer cluster given in the format
     families     comma-separated list of families to copy
                  To copy from cf1 to cf2, give sourceCfName:destCfName.
                  To keep the same name, just give "cfName"
     all.cells    also copy delete markers and deleted cells
     tablename    Name of the table to copy
     To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
     $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase(specifies the location of the other cluster) --families=myOldCf:myNewCf,cf2,cf3 TestTable
    For performance consider the following general options:
      -Dhbase.client.scanner.caching=100
      -Dmapred.map.tasks.speculative.execution=false

              CopyTable is a utility that copies the data of one table to another table, either on the same cluster or on a different HBase cluster. Copying within the same cluster works, but if you have another cluster that you want to treat as a backup, CopyTable can serve as a live backup option that copies a table's data to the backup cluster. CopyTable can be configured with a start and an end timestamp. If they are specified, only the data with a timestamp in that time frame is copied. This makes incremental backup of an HBase table possible in some situations.

    "Incremental backup" is a method that backs up only the data that has changed since the last backup.

    Note: Since the cluster keeps running, there is a risk that edits could be missed during the copy process.
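
The time-window arithmetic behind such an incremental copy can be sketched as follows. CopyTable expects `--starttime` and `--endtime` as Unix time in milliseconds, so the sketch converts from seconds. `copytable_window_args` is a hypothetical helper, and the one-hour window and the table names (taken from the run in these notes) are only illustrative.

```shell
#!/bin/sh
# Sketch: compute a CopyTable time window for an incremental backup.
# copytable_window_args is a hypothetical helper; CopyTable takes
# --starttime/--endtime as Unix time in milliseconds.

copytable_window_args() {
    end_sec="$1"   # end of the window, Unix time in seconds
    hours="$2"     # length of the window in hours
    start_sec=$((end_sec - hours * 3600))
    echo "--starttime=$((start_sec * 1000)) --endtime=$((end_sec * 1000))"
}

# Example: print a command that copies the last hour of edits.
now_sec=$(date +%s)
echo "bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable $(copytable_window_args "$now_sec" 1) --new.name=BackUpHiddenIPInfo HiddenIPInfo"
```

Because the window filters on cell timestamps, this approach presumably only captures changes when cells carry their write time as the timestamp, which is one reason it fits only some situations.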

    landen@Master:~/UntarFile/hbase-0.94.12$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --families=IPAddress --new.name=BackUpHiddenIPInfo(copies one table's data into another table as a backup; preferably copy to a different cluster) HiddenIPInfo(the source table whose data is copied)
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.LongWritable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Text, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.output.TextOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.zookeeper.ZooKeeper, using jar /home/landen/UntarFile/hbase-0.94.12/lib/zookeeper-3.4.5.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.protobuf.Message, using jar /home/landen/UntarFile/hbase-0.94.12/lib/protobuf-java-2.4.0a.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class com.google.common.collect.ImmutableSet, using jar /home/landen/UntarFile/hbase-0.94.12/lib/guava-11.0.2.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.util.Bytes, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.io.ImmutableBytesWritable, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Writable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableInputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.io.ImmutableBytesWritable, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.io.Writable, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.hbase.mapreduce.TableOutputFormat, using jar /home/landen/UntarFile/hbase-0.94.12/hbase-0.94.12.jar
    13/12/10 16:15:59 DEBUG mapreduce.TableMapReduceUtil: For class org.apache.hadoop.mapreduce.lib.partition.HashPartitioner, using jar /home/landen/UntarFile/hbase-0.94.12/lib/hadoop-core-1.0.4.jar


    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/landen/UntarFile/hadoop-1.0.4/libexec/../lib/native/Linux-i386-32:/home/landen/UntarFile/hbase-0.94.12/lib/native/Linux-i386-32
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-24-generic-pae
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.name=landen
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/landen
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/landen/UntarFile/hbase-0.94.12
    13/12/10 16:16:04 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=Slave1:2222,Master:2222,Slave2:2222 sessionTimeout=180000 watcher=hconnection
    13/12/10 16:16:04 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 16010@Master
    13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Opening socket connection to server Master/ Will not attempt to authenticate using SASL (unknown error)
    13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Socket connection established to Master/, initiating session
    13/12/10 16:16:04 INFO zookeeper.ClientCnxn: Session establishment complete on server Master/, sessionid = 0x42db7cbd1f0005, negotiated timeout = 180000
    13/12/10 16:16:04 DEBUG client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465; serverName=Slave1,60020,1386661855439
    13/12/10 16:16:04 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is Slave1:60020
    13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=BackUpHiddenIPInfo,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
    13/12/10 16:16:05 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for BackUpHiddenIPInfo,,1386662946878.48312c3f9b8715670432c413ca44f2f6. is Slave1:60020
    13/12/10 16:16:05 INFO mapreduce.TableOutputFormat: Created table instance for BackUpHiddenIPInfo
    13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=HiddenIPInfo,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
    13/12/10 16:16:05 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for HiddenIPInfo,,1386509509553.9e1062d691dd4c25cdc030f8c3fc9860. is Slave1:60020
    13/12/10 16:16:05 DEBUG client.MetaScanner: Scanning .META. starting at row=HiddenIPInfo,,00000000000000 for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@167a465
    13/12/10 16:16:05 ERROR mapreduce.TableInputFormatBase: Cannot resolve the host name for / because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name ''
    13/12/10 16:16:05 DEBUG mapreduce.TableInputFormatBase: getSplits: split -> 0 -> slave1:,
    13/12/10 16:16:07 INFO mapred.JobClient: Running job: job_201312042044_0030
    13/12/10 16:16:08 INFO mapred.JobClient:  map 0% reduce 0%
    13/12/10 16:16:27 INFO mapred.JobClient:  map 100% reduce 0%
    13/12/10 16:16:32 INFO mapred.JobClient: Job complete: job_201312042044_0030
    13/12/10 16:16:32 INFO mapred.JobClient: Counters: 28
    13/12/10 16:16:32 INFO mapred.JobClient:   Job Counters
    13/12/10 16:16:32 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=12305
    13/12/10 16:16:32 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    13/12/10 16:16:32 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    13/12/10 16:16:32 INFO mapred.JobClient:     Rack-local map tasks=1
    13/12/10 16:16:32 INFO mapred.JobClient:     Launched map tasks=1
    13/12/10 16:16:32 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
    13/12/10 16:16:32 INFO mapred.JobClient:   HBase Counters
    13/12/10 16:16:32 INFO mapred.JobClient:     REMOTE_RPC_CALLS=0
    13/12/10 16:16:32 INFO mapred.JobClient:     RPC_CALLS=6
    13/12/10 16:16:32 INFO mapred.JobClient:     RPC_RETRIES=0
    13/12/10 16:16:32 INFO mapred.JobClient:     NOT_SERVING_REGION_EXCEPTION=0
    13/12/10 16:16:32 INFO mapred.JobClient:     NUM_SCANNER_RESTARTS=0
    13/12/10 16:16:32 INFO mapred.JobClient:     MILLIS_BETWEEN_NEXTS=162
    13/12/10 16:16:32 INFO mapred.JobClient:     BYTES_IN_RESULTS=1493
    13/12/10 16:16:32 INFO mapred.JobClient:     BYTES_IN_REMOTE_RESULTS=0
    13/12/10 16:16:32 INFO mapred.JobClient:     REGIONS_SCANNED=1
    13/12/10 16:16:32 INFO mapred.JobClient:     REMOTE_RPC_RETRIES=0
    13/12/10 16:16:32 INFO mapred.JobClient:   File Output Format Counters
    13/12/10 16:16:32 INFO mapred.JobClient:     Bytes Written=0
    13/12/10 16:16:32 INFO mapred.JobClient:   FileSystemCounters
    13/12/10 16:16:32 INFO mapred.JobClient:     HDFS_BYTES_READ=69
    13/12/10 16:16:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=34919
    13/12/10 16:16:32 INFO mapred.JobClient:   File Input Format Counters
    13/12/10 16:16:32 INFO mapred.JobClient:     Bytes Read=0
    13/12/10 16:16:32 INFO mapred.JobClient:   Map-Reduce Framework
    13/12/10 16:16:32 INFO mapred.JobClient:     Map input records=3
    13/12/10 16:16:32 INFO mapred.JobClient:     Physical memory (bytes) snapshot=83361792
    13/12/10 16:16:32 INFO mapred.JobClient:     Spilled Records=0
    13/12/10 16:16:32 INFO mapred.JobClient:     CPU time spent (ms)=170
    13/12/10 16:16:32 INFO mapred.JobClient:     Total committed heap usage (bytes)=55443456
    13/12/10 16:16:32 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=395317248
    13/12/10 16:16:32 INFO mapred.JobClient:     Map output records=3
    13/12/10 16:16:32 INFO mapred.JobClient:     SPLIT_RAW_BYTES=69
    hbase(main):016:0> describe 'BackUpHiddenIPInfo'
    DESCRIPTION                                                                   ENABLED                                  
     'BackUpHiddenIPInfo', {NAME => 'IPAddress', DATA_BLOCK_ENCODING => 'NONE', B true                                     
     LOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSI                                          
     ONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS =>                                           
     'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false                                          
     ', BLOCKCACHE => 'true'}                                                                                              
    1 row(s) in 0.0670 seconds

    hbase(main):017:0> scan 'BackUpHiddenIPInfo'
    ROW                            COLUMN+CELL
                                   column=IPAddress:city, timestamp=1386597147615, value=Ningbo
                                   column=IPAddress:countrycode, timestamp=1386597147615, value=CN
                                   column=IPAddress:countryname, timestamp=1386597147615, value=China
                                   column=IPAddress:latitude, timestamp=1386597147615, value=29.878204
                                   column=IPAddress:longitude, timestamp=1386597147615, value=121.5495
                                   column=IPAddress:region, timestamp=1386597147615, value=02
                                   column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
                                   column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
                                   column=IPAddress:city, timestamp=1386597147615, value=Hangzhou
                                   column=IPAddress:countrycode, timestamp=1386597147615, value=CN
                                   column=IPAddress:countryname, timestamp=1386597147615, value=China
                                   column=IPAddress:latitude, timestamp=1386597147615, value=30.293594
                                   column=IPAddress:longitude, timestamp=1386597147615, value=120.16141
                                   column=IPAddress:region, timestamp=1386597147615, value=02
                                   column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
                                   column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
                                   column=IPAddress:city, timestamp=1386597147615, value=Wenzhou
                                   column=IPAddress:countrycode, timestamp=1386597147615, value=CN
                                   column=IPAddress:countryname, timestamp=1386597147615, value=China
                                   column=IPAddress:latitude, timestamp=1386597147615, value=27.999405
                                   column=IPAddress:longitude, timestamp=1386597147615, value=120.66681
                                   column=IPAddress:region, timestamp=1386597147615, value=02
                                   column=IPAddress:regionname, timestamp=1386597147615, value=Zhejiang
                                   column=IPAddress:timezone, timestamp=1386597147615, value=Asia/Shanghai
    3 row(s) in 0.0600 seconds

    Method 3:

    landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop distcp
    Warning: $HADOOP_HOME is deprecated.
    distcp [OPTIONS] <srcurl>* <desturl>
    -p[rbugp]              Preserve status
                           r: replication number
                           b: block size
                           u: user
                           g: group
                           p: permission
                           -p alone is equivalent to -prbugp
    -i                     Ignore failures
    -log <logdir>          Write logs to <logdir>
    -m <num_maps>          Maximum number of simultaneous copies
    -overwrite             Overwrite destination
    -update                Overwrite if src size different from dst size
    -skipcrccheck          Do not use CRC check to determine if src is 
                           different from dest. Relevant only if -update
                           is specified
    -f <urilist_uri>       Use list at <urilist_uri> as src list
    -filelimit <n>         Limit the total number of files to be <= n
    -sizelimit <n>         Limit the total size to be <= n bytes
    -delete                Delete the files existing in the dst but not in src
    -mapredSslConf <f>     Filename of SSL configuration for mapper task
    NOTE 1: if -overwrite or -update are set, each source URI is 
          interpreted as an isomorphic update to an existing directory.
    For example:
    hadoop distcp -p -update "hdfs://A:8020/user/foo/bar" "hdfs://B:8020/user/foo/baz"
         would update all descendants of 'baz' also in 'bar'; it would 
         *not* update /user/foo/baz/bar
    NOTE 2: The parameter <n> in -filelimit and -sizelimit can be 
         specified with symbolic representation.  For examples,
           1230k = 1230 * 1024 = 1259520
           891g = 891 * 1024^3 = 956703965184
    Generic options supported are
    -conf <configuration file>     specify an application configuration file
    -D <property=value>            use value for given property
    -fs <local|namenode:port>      specify a namenode
    -jt <local|jobtracker:port>    specify a job tracker
    -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]

            distcp (distributed copy) is a tool provided by Hadoop for copying large datasets within one HDFS cluster or between HDFS clusters. It uses MapReduce to copy files in parallel, handle errors and recovery, and report the job status. Because HBase stores all its files, including system files, on HDFS, we can simply use distcp to copy the HBase directory either to another directory on the same HDFS or to a different HDFS in order to back up the source HBase cluster.
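
For the "different HDFS" case, the source and destination are given as full `hdfs://` URIs, as in the example from the usage text above. The following is a sketch only: the NameNode addresses `nn1:8020` and `nn2:8020` are placeholders rather than hosts of this cluster, `build_distcp_cmd` is a hypothetical helper, and `-m` is the option from the usage text that caps the number of simultaneous copies.

```shell
#!/bin/sh
# Sketch: build a distcp command that copies the HBase directory to another HDFS.
# The NameNode hosts and ports passed in are placeholders, not values from this cluster.

build_distcp_cmd() {
    src_nn="$1"    # source NameNode, host:port
    dst_nn="$2"    # destination NameNode, host:port
    max_maps="$3"  # cap on simultaneous copies (-m)
    echo "bin/hadoop distcp -m $max_maps hdfs://$src_nn/hbase hdfs://$dst_nn/backup/HBaseBackUp"
}

build_distcp_cmd nn1:8020 nn2:8020 4
```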

    landen@Master:~/UntarFile/hadoop-1.0.4$ bin/hadoop distcp /hbase /backup/HBaseBackUp
    Warning: $HADOOP_HOME is deprecated.

    13/12/10 15:33:09 INFO tools.DistCp: srcPaths=[/hbase]
    13/12/10 15:33:09 INFO tools.DistCp: destPath=/backup/HBaseBackUp
    13/12/10 15:33:10 INFO tools.DistCp: sourcePathsCount=46
    13/12/10 15:33:10 INFO tools.DistCp: filesToCopyCount=17
    13/12/10 15:33:10 INFO tools.DistCp: bytesToCopyCount=11.7k
    13/12/10 15:33:11 INFO mapred.JobClient: Running job: job_201312042044_0029
    13/12/10 15:33:12 INFO mapred.JobClient:  map 0% reduce 0%
    13/12/10 15:33:37 INFO mapred.JobClient:  map 100% reduce 0%
    13/12/10 15:33:42 INFO mapred.JobClient: Job complete: job_201312042044_0029
    13/12/10 15:33:42 INFO mapred.JobClient: Counters: 22
    13/12/10 15:33:42 INFO mapred.JobClient:   Job Counters
    13/12/10 15:33:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=20465
    13/12/10 15:33:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    13/12/10 15:33:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    13/12/10 15:33:42 INFO mapred.JobClient:     Launched map tasks=1
    13/12/10 15:33:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
    13/12/10 15:33:42 INFO mapred.JobClient:   File Input Format Counters
    13/12/10 15:33:42 INFO mapred.JobClient:     Bytes Read=7904
    13/12/10 15:33:42 INFO mapred.JobClient:   File Output Format Counters
    13/12/10 15:33:42 INFO mapred.JobClient:     Bytes Written=0
    13/12/10 15:33:42 INFO mapred.JobClient:   FileSystemCounters
    13/12/10 15:33:42 INFO mapred.JobClient:     HDFS_BYTES_READ=20070
    13/12/10 15:33:42 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=22644
    13/12/10 15:33:42 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=11988
    13/12/10 15:33:42 INFO mapred.JobClient:   distcp
    13/12/10 15:33:42 INFO mapred.JobClient:     Files copied=17
    13/12/10 15:33:42 INFO mapred.JobClient:     Bytes copied=11988
    13/12/10 15:33:42 INFO mapred.JobClient:     Bytes expected=11988
    13/12/10 15:33:42 INFO mapred.JobClient:   Map-Reduce Framework
    13/12/10 15:33:42 INFO mapred.JobClient:     Map input records=45
    13/12/10 15:33:42 INFO mapred.JobClient:     Physical memory (bytes) snapshot=36737024
    13/12/10 15:33:42 INFO mapred.JobClient:     Spilled Records=0
    13/12/10 15:33:42 INFO mapred.JobClient:     CPU time spent (ms)=470
    13/12/10 15:33:42 INFO mapred.JobClient:     Total committed heap usage (bytes)=15925248
    13/12/10 15:33:42 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=346537984
    13/12/10 15:33:42 INFO mapred.JobClient:     Map input bytes=7804
    13/12/10 15:33:42 INFO mapred.JobClient:     Map output records=0
    13/12/10 15:33:42 INFO mapred.JobClient:     SPLIT_RAW_BYTES=178

