• Sqoop导入mysql数据到Hbase


    sqoop import --driver com.mysql.jdbc.Driver --connect "jdbc:mysql://11.143.18.29:3306/db_1"  --username root --password root --query "select rowkey,dtTime from tb1  where 1=1 and $CONDITIONS" --hbase-table tb1--column-family t --hbase-row-key id  -split-by dtTime -m 2

    导入大表时一定要注意查看数据是否分布均匀,否则很容易出现java heap space错误

    sqoop导入mysql数据到hbase时,如果带有--hbase-create-table选项出现如下错误,据说是兼容问题:

    16/01/24 13:46:33 INFO zookeeper.ClientCnxn: Session establishment complete on server slave1/192.168.2.80:2181, sessionid = 0x252720e0fac0008, negotiated timeout = 40000
    
    16/01/24 13:46:33 INFO mapreduce.HBaseImportJob: Creating missing HBase table exchangeshopflow
    
    Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)V
        at org.apache.sqoop.mapreduce.HBaseImportJob.jobSetup(HBaseImportJob.java:222)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:264)
        at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:729)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

    解决方法:
    手动建hbase表,然后使用sqoop导入

    包含自定义field:

    sqoop import --driver com.mysql.jdbc.Driver --connect "jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --query "select concat(concat(md5(concat(PID,vid)),UNIX_TIMESTAMP(dtTime)), 100000+right(\`__#alibaba_rds_row_id#__\`,5)) as id,dttime,vid,pid,zoneid,type,count,reason,\`__#alibaba_rds_row_id#__\` from achi where  $CONDITIONS" --hbase-table achi --column-family a --hbase-row-key id  -split-by `__#alibaba_rds_row_id#__\`

    直接导入:

    sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --table achi --hbase-table achi --column-family a --hbase-row-key dtTime  -split-by dtTime --hbase-create-table

    定时增量导入

    sqoop job --create t11 -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --table t2 --hbase-table t007117 --column-family a --hbase-row-key id  -split-by id --hbase-create-table --incremental append --check-column id  --last-value 0;

    定时任务

    sqoop job --create t117 -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.80:3306/db_1"  --username root --password root --query "select concat(concat(md5(concat(PID,vid)),UNIX_TIMESTAMP(dtTime)), 100000+right(id,5)) as key_id,t.*  from t2 as t where $CONDITIONS" --hbase-table t007117 --column-family a --hbase-row-key id  -split-by id --hbase-create-table --incremental append --check-column id  --last-value 0;

    使用–query时,–incremental append –check-column id –last-value 0;不起作用

  • 相关阅读:
    秒杀多线程第八篇 经典线程同步 信号量Semaphore
    SURF特征
    (最短路径算法整理)
    中国大推力矢量发动机WS15 跨入 世界先进水平!
    SQL Profile 总结(一)
    Spring下@ResponseBody响应中文内容乱码问题
    Ubuntu12.04下jamvm1.5.4+classpath-0.98成功执行 helloworld.class
    【2012.1.24更新】不要再在网上搜索eclipse的汉化包了!
    [数据结构] N皇后问题
    DG之主库、备库切换(物理备库)
  • 原文地址:https://www.cnblogs.com/ggzone/p/10121172.html
Copyright © 2020-2023  润新知