• Sqoop 1: installation and data import/export testing


    Download

    wget http://mirror.bit.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    Extract

    tar -zxf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

    Copy mysql-connector-java-5.1.39.jar into Sqoop's lib directory:

    [root@host ~]# cp mysql-connector-java-5.1.39.jar /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/lib/

    Configure the environment variables and make them take effect:

    export SQOOP_HOME=/root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0
    export PATH=$PATH:$SQOOP_HOME/bin
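
    To apply the variables, reload them in the current shell. This assumes the two lines above were appended to /etc/profile; adjust if you keep them in your own shell profile:

    # reload the profile so SQOOP_HOME and PATH take effect without re-logging in
    source /etc/profile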

    Test

    [root@host bin]# sqoop help

    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:03:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    usage: sqoop COMMAND [ARGS]

    Available commands:
      codegen            Generate code to interact with database records
      create-hive-table  Import a table definition into Hive
      eval               Evaluate a SQL statement and display the results
      export             Export an HDFS directory to a database table
      help               List available commands
      import             Import a table from a database to HDFS
      import-all-tables  Import tables from a database to HDFS
      import-mainframe   Import datasets from a mainframe server to HDFS
      job                Work with saved jobs
      list-databases     List available databases on a server
      list-tables        List available tables in a database
      merge              Merge results of incremental imports
      metastore          Run a standalone Sqoop metastore
      version            Display version information

    Run a SQL statement and display the result

    [root@host ~]#  sqoop eval --connect jdbc:mysql://localhost:3306/test --username root --password root --query 'select roleid,level from role where sex=0'
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 17:30:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/04 17:30:54 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/04 17:30:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    ----------------------------
    | roleid     | level       |
    ----------------------------
    | 200        | 8           |
    | 400        | 4           |
    | 600        | 91          |
    ----------------------------

    List all tables in a database

    [root@host bin]# sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root -P
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:18:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password:
    18/06/04 14:18:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    gamelog
    log1
    log2
    log_merge
    loginlog
    name_test
    name_test1
    role
    role1

    List all databases

    [root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306/test --username root -P     
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:18:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Enter password:
    18/06/04 14:18:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    information_schema
    gamelog
    metastore
    mysql
    test

    [root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password root
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/04 14:22:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/04 14:22:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/04 14:22:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    information_schema
    gamelog
    metastore
    mysql
    test

    Importing data from MySQL into HDFS

    A successful import automatically creates the directory 20180604; if the directory already exists, the job fails with an error.

    Import the table test.role into hdfs://localhost:9000/sqoop/sqoop1/20180604:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/20180604
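
    Because a pre-existing target directory aborts the job, here is a minimal sketch of making re-runs safe by hand (Sqoop's own --delete-target-dir flag, covered below, does the same thing):

    # delete the target directory if present, so the import can be re-run
    hdfs dfs -rm -r -f /sqoop/sqoop1/20180604
    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root \
        --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/20180604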

    Check HDFS:

    [root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1
    drwxr-xr-x   - root supergroup          0 2018-06-04 14:41 /sqoop/sqoop1/20180604
    -rw-r--r--   1 root supergroup          0 2018-06-04 14:41 /sqoop/sqoop1/20180604/_SUCCESS
    -rw-r--r--   1 root supergroup        152 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00000
    -rw-r--r--   1 root supergroup         52 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00001
    -rw-r--r--   1 root supergroup         47 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00002
    -rw-r--r--   1 root supergroup         92 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00003

    -m, --num-mappers: start N map tasks to import data in parallel; the default is 4. It is best not to set this higher than the number of nodes in the cluster.

    --target-dir <dir>: the HDFS path to write to.

    The command above did not specify -m, so the default of 4 mappers was used.

    Let's try with -m 1:

     sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401 -m 1

    Check HDFS:

    [root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1/2018060401
    -rw-r--r--   1 root supergroup          0 2018-06-04 14:49 /sqoop/sqoop1/2018060401/_SUCCESS
    -rw-r--r--   1 root supergroup        343 2018-06-04 14:49 /sqoop/sqoop1/2018060401/part-m-00000

    [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060401/p*
    1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
    40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
    110,null,2017-11-14 14:50:47.0,beijing,1,20
    200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
    400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
    600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
    650,null,2017-11-01 17:24:34.0,null,1,29
    [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/20180604/p*
    1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
    40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
    110,null,2017-11-14 14:50:47.0,beijing,1,20
    200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
    400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
    600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
    650,null,2017-11-01 17:24:34.0,null,1,29
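
    With more than one mapper, Sqoop needs a column to split the input on; by default it uses the table's primary key. For a table without a primary key you must either fall back to -m 1 or name a split column yourself. A hedged sketch (it assumes role1 has no primary key, and the target path is a made-up example):

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root \
        --table role1 --split-by roleid -m 2 \
        --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060402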

    Exporting from HDFS into MySQL

    The MySQL table before the export:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    7 rows in set (0.01 sec)

    Export:

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --export-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401

    The MySQL table after the export (the exported rows are appended, so every row now appears twice):

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    +--------+------+---------------------+---------------------+------+-------+
    14 rows in set (0.00 sec)

    Importing selected columns with a filter condition:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --columns "roleid,level" --where "sex=0"  --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060403

     [root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060403/p*
    200,8
    400,4
    600,91

    Direct import mode

    --direct: direct import mode, which uses the database's own bulk import/export tool (mysqldump in the case of MySQL). This is reportedly faster.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --delete-target-dir  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405

    --delete-target-dir: delete the target directory if it already exists.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405

    --fields-terminated-by: the field (column) delimiter.

    --lines-terminated-by: the line (record) delimiter.

    Moving data between MySQL and Hive

    Import a MySQL table definition into Hive:

    sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role

    18/06/04 16:10:39 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
    18/06/04 16:10:39 ERROR tool.CreateHiveTableTool: Encountered IOException running create table job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf

    Edit /etc/profile and add:

    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*

    Restart the terminal, or open a new window, and run:

    sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role

    Or:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import  --hive-database sqoopdb --hive-table role

    Check the sqoopdb database in Hive:

    hive> use sqoopdb;
    OK
    Time taken: 0.116 seconds
    hive> show tables;
    OK
    tab_name
    role
    Time taken: 0.251 seconds, Fetched: 1 row(s)
    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    Time taken: 4.798 seconds

    Import the MySQL data into the Hive table sqoopdb.role:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import --hive-table sqoopdb.role

    Check HDFS:

    drwx-wx-wx   - root supergroup          0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
    drwx-wx-wx   - root supergroup          0 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003

    ...............................................

    drwxr-xr-x   - root supergroup          0 2018-06-04 16:36 /user/root
    drwxr-xr-x   - root supergroup          0 2018-06-04 16:36 /user/root/role
    -rw-r--r--   1 root supergroup          0 2018-06-04 16:36 /user/root/role/_SUCCESS

    Check Hive:

    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    Time taken: 0.674 seconds, Fetched: 7 row(s)

    Run the import again:

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role  --hive-import --hive-table sqoopdb.role

    hive> select * from role;
    OK
    role.roleid     role.name       role.dateid     role.addr       role.sex        role.level
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    1       null    2017-11-16 14:49:11.0   henan luohe linying     1       10
    40      null    2017-11-13 14:50:25.0   guangdong shenzhen      1       20
    110     null    2017-11-14 14:50:47.0   beijing 1       20
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    200     null    2017-11-14 14:49:47.0   shandong qingdao        0       8
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    400     null    2017-11-15 14:49:56.0   anhui hefei     0       4
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    600     null    2017-11-15 14:50:05.0   hunan changsha  0       91
    650     null    2017-11-01 17:24:34.0   null    1       29
    Time taken: 0.464 seconds, Fetched: 14 row(s)

    HDFS shows that the second import appended new part files with a _copy_1 suffix instead of overwriting:

    drwx-wx-wx   - root supergroup          0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
    drwx-wx-wx   - root supergroup          0 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
    -rwx-wx-wx   1 root supergroup        152 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00000_copy_1
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
    -rwx-wx-wx   1 root supergroup         52 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00001_copy_1
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
    -rwx-wx-wx   1 root supergroup         47 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00002_copy_1
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003
    -rwx-wx-wx   1 root supergroup         92 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00003_copy_1

    Exporting from Hive into MySQL

      --hive-home <dir>           Override $HIVE_HOME
      --hive-import               Import data into Hive, using Hive's default delimiters
      --hive-overwrite            Overwrite existing data in the Hive table
      --create-hive-table         Create the Hive table; fails if the table already exists
      --hive-table <table-name>   The target table name in Hive
      --hive-drop-import-delims   Drop \n, \r, and \01 from string fields when importing into Hive
      --hive-delims-replacement   Replace \n, \r, and \01 in string fields with a user-defined string
      --hive-partition-key        The Hive partition key
      --hive-partition-value <v>  The Hive partition value
      --map-column-hive <map>     Override the default SQL-to-Hive type mapping for named columns
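
    As a sketch of the partition options listed above (the table role_p and partition key dt are made-up names, not from the test run):

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root \
        --table role --hive-import --hive-table sqoopdb.role_p \
        --hive-partition-key dt --hive-partition-value 20180604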

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir /user/hive/warehouse/sqoopdb.db/role

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role

    In testing, the export failed when the field delimiter was not specified; once it was specified, the job succeeded. (Hive's default field delimiter is \001, which is what the warehouse files use.)

    The MySQL table before the export:

    mysql> select * from role1;

    Empty set (0.00 sec)

    And after the export:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    14 rows in set (0.00 sec)

    --query, -e <sql>: import the results of a SQL query. When this option is used you must also specify --target-dir (or --hive-table), and the query must contain a WHERE clause that includes the token $CONDITIONS.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select * from role1 where \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406

    Testing showed that --split-by cannot be omitted either.

    sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406

    $CONDITIONS is not a Linux environment variable but a placeholder token that Sqoop itself substitutes: each map task replaces it with its own bounding condition, and during code generation Sqoop replaces it with (1 = 0) so that only metadata comes back, as the execution log below shows. When the query is wrapped in double quotes, write it as \$CONDITIONS so the shell does not expand it first. The Sqoop user guide explains:

    If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.
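
    A sketch of the two quoting styles (equivalent queries; the target path reuses the one from the test above):

    # single quotes: the shell passes $CONDITIONS through to Sqoop literally
    sqoop import --connect jdbc:mysql://localhost:3306/test --username root -P \
        --query 'select roleid,level from role1 where sex=0 and $CONDITIONS' \
        --split-by roleid --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406

    # double quotes: the token must be escaped so the shell does not expand it
    sqoop import --connect jdbc:mysql://localhost:3306/test --username root -P \
        --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" \
        --split-by roleid --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406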

    [root@host ~]# sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by "\t" --lines-terminated-by "\n" --delete-target-dir --split-by roleid  --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 10:24:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 10:24:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
    18/06/05 10:24:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
    18/06/05 10:24:26 INFO tool.CodeGenTool: Beginning code generation
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)
    18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and  (1 = 0)

    ................

    export

    --update-key: the column used to match existing rows (normally the table's primary key).

    --update-mode: two modes. allowinsert updates rows that exist and inserts rows that do not; updateonly only updates existing rows.

    If the table has no primary key (or unique constraint), then even with --update-key and allowinsert, rows that already exist are inserted again as duplicates.

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role --update-key roleid --update-mode allowinsert
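
    For comparison, a sketch of the updateonly variant (assuming roleid is the match key; HDFS rows with no matching roleid in MySQL are simply skipped):

    sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root \
        --table role1 --fields-terminated-by '\001' \
        --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role \
        --update-key roleid --update-mode updateonly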

    sqoop job

    Create a job:

    [root@host ~]# sqoop job --create mysqltohive -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 13:52:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 13:52:56 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

    List the available jobs:

    [root@host ~]# sqoop job --list
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 13:55:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    Available jobs:
      mysqltohive

    Run the job:

    MySQL before the run:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+
    7 rows in set (0.00 sec)

    [root@host ~]# sqoop job --exec mysqltohive

    Note: you are prompted for the password during execution.

    MySQL after the run:

    mysql> select * from role1;
    +--------+------+---------------------+---------------------+------+-------+
    | roleid | name | dateid              | addr                | sex  | level |
    +--------+------+---------------------+---------------------+------+-------+
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    |      1 | NULL | 2017-11-16 14:49:11 | henan luohe linying |    1 |    10 |
    |     40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen  |    1 |    20 |
    |    110 | NULL | 2017-11-14 14:50:47 | beijing             |    1 |    20 |
    |    200 | NULL | 2017-11-14 14:49:47 | shandong qingdao    |    0 |     8 |
    |    400 | NULL | 2017-11-15 14:49:56 | anhui hefei         |    0 |     4 |
    |    600 | NULL | 2017-11-15 14:50:05 | hunan changsha      |    0 |    91 |
    |    650 | NULL | 2017-11-01 17:24:34 | NULL                |    1 |    29 |
    +--------+------+---------------------+---------------------+------+-------+

    14 rows in set (0.00 sec)

    The job ran successfully.

    Delete a job

    sqoop job --delete mysqltohive

    All of the sqoop job runs above prompt for a password. To run without the prompt:

    Edit conf/sqoop-site.xml and enable:

      <property>
        <name>sqoop.metastore.client.record.password</name>
        <value>true</value>
        <description>If true, allow saved passwords in the metastore.
        </description>
      </property>
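
    An alternative that avoids saving the password in the metastore is Sqoop's --password-file option. A sketch (the path is a made-up example; the file must contain only the password, with no trailing newline, and should be readable only by you):

    # write the password with no trailing newline and lock the file down
    echo -n 'root' > /root/.mysql.pwd && chmod 400 /root/.mysql.pwd
    sqoop export --connect jdbc:mysql://localhost:3306/test --username root \
        --password-file file:///root/.mysql.pwd --table role1 --fields-terminated-by '\001' \
        --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role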

    Create the job (best done in a new terminal window so the updated configuration is picked up):

    [root@host conf]# sqoop job --create hivetomysql -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/06/05 14:17:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    18/06/05 14:17:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

    Run the job:

    [root@host ~]# sqoop job --exec hivetomysql

    Sure enough, no password prompt this time. Bingo!

    Checking the MySQL table shows the data is there. Test successful!

    Sqoop can likewise move data between SQL Server and HDFS, Hive, or HBase. You need to download the SQL Server JDBC driver, add the MSSQL_CONNECTOR_HOME environment variable, and copy the driver jar into Sqoop's lib directory.
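
    A sketch of connecting once the driver is in place (host, port, and database name are made-up values):

    sqoop list-tables --username sa -P \
        --connect 'jdbc:sqlserver://192.168.1.100:1433;databaseName=test'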
