Download
wget http://mirror.bit.edu.cn/apache/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
Extract
tar -zxf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
Copy mysql-connector-java-5.1.39.jar into Sqoop's lib directory:
[root@host ~]# cp mysql-connector-java-5.1.39.jar /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
Configure the environment variables and make them take effect:
export SQOOP_HOME=/root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
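Assuming the two export lines were added to /etc/profile, reload it in the current shell so they take effect:
source /etc/profile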
Test
[root@host bin]# sqoop help
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/04 14:03:00 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]
Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information
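Each command also has its own detailed help, e.g. for the import tool:
sqoop help import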
Run a SQL statement and display the results
[root@host ~]# sqoop eval --connect jdbc:mysql://localhost:3306/test --username root --password root --query 'select roleid,level from role where sex=0'
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/04 17:30:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/06/04 17:30:54 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/06/04 17:30:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
----------------------------
| roleid | level |
----------------------------
| 200 | 8 |
| 400 | 4 |
| 600 | 91 |
----------------------------
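sqoop eval is not limited to queries; it can also run DML. A sketch, with a hypothetical row and assuming the role table's remaining columns accept NULL:
sqoop eval --connect jdbc:mysql://localhost:3306/test --username root -P --query "insert into role(roleid,sex,level) values(700,1,1)"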
List all tables in a database
[root@host bin]# sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username root -P
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/04 14:18:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
18/06/04 14:18:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
gamelog
log1
log2
log_merge
loginlog
name_test
name_test1
role
role1
List all databases
[root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306/test --username root -P
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/04 14:18:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
18/06/04 14:18:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
gamelog
metastore
mysql
test
[root@host bin]# sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password root
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/04 14:22:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/06/04 14:22:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/06/04 14:22:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
gamelog
metastore
mysql
test
Importing from MySQL to HDFS
A successful import automatically creates the target directory (here 20180604); if the directory already exists, the job fails.
Import the table test.role into hdfs://localhost:9000/sqoop/sqoop1/20180604:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/20180604
Check HDFS:
[root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1
drwxr-xr-x - root supergroup 0 2018-06-04 14:41 /sqoop/sqoop1/20180604
-rw-r--r-- 1 root supergroup 0 2018-06-04 14:41 /sqoop/sqoop1/20180604/_SUCCESS
-rw-r--r-- 1 root supergroup 152 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00000
-rw-r--r-- 1 root supergroup 52 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00001
-rw-r--r-- 1 root supergroup 47 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00002
-rw-r--r-- 1 root supergroup 92 2018-06-04 14:41 /sqoop/sqoop1/20180604/part-m-00003
-m,--num-mappers  Launch N map tasks to import in parallel; the default is 4. It is best not to set this higher than the number of nodes in the cluster.
--target-dir <dir>  Specify the HDFS destination path.
The command above did not specify -m, so the default of 4 mappers was used.
Let's try -m 1:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401 -m 1
Check HDFS:
[root@host ~]# hdfs dfs -ls -R /sqoop/sqoop1/2018060401
-rw-r--r-- 1 root supergroup 0 2018-06-04 14:49 /sqoop/sqoop1/2018060401/_SUCCESS
-rw-r--r-- 1 root supergroup 343 2018-06-04 14:49 /sqoop/sqoop1/2018060401/part-m-00000
[root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060401/p*
1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
110,null,2017-11-14 14:50:47.0,beijing,1,20
200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
650,null,2017-11-01 17:24:34.0,null,1,29
[root@host ~]# hdfs dfs -cat /sqoop/sqoop1/20180604/p*
1,null,2017-11-16 14:49:11.0,henan luohe linying,1,10
40,null,2017-11-13 14:50:25.0,guangdong shenzhen,1,20
110,null,2017-11-14 14:50:47.0,beijing,1,20
200,null,2017-11-14 14:49:47.0,shandong qingdao,0,8
400,null,2017-11-15 14:49:56.0,anhui hefei,0,4
600,null,2017-11-15 14:50:05.0,hunan changsha,0,91
650,null,2017-11-01 17:24:34.0,null,1,29
Exporting from HDFS to MySQL
The MySQL table before the export:
mysql> select * from role1;
+--------+------+---------------------+---------------------+------+-------+
| roleid | name | dateid | addr | sex | level |
+--------+------+---------------------+---------------------+------+-------+
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
+--------+------+---------------------+---------------------+------+-------+
7 rows in set (0.01 sec)
Export:
sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --export-dir hdfs://localhost:9000/sqoop/sqoop1/2018060401
The MySQL table after the export; the same seven rows were appended again, so each now appears twice:
mysql> select * from role1;
+--------+------+---------------------+---------------------+------+-------+
| roleid | name | dateid | addr | sex | level |
+--------+------+---------------------+---------------------+------+-------+
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
+--------+------+---------------------+---------------------+------+-------+
14 rows in set (0.00 sec)
Import selected columns, filtered by a WHERE condition:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --columns "roleid,level" --where "sex=0" --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060403
[root@host ~]# hdfs dfs -cat /sqoop/sqoop1/2018060403/p*
200,8
400,4
600,91
Direct mode
--direct  Use the database's own import/export tool (mysqldump in MySQL's case) instead of going through JDBC; this is said to be faster.
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --delete-target-dir --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405
--delete-target-dir  Delete the target directory if it already exists, avoiding the directory-exists error noted earlier.
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --fields-terminated-by '\t' --lines-terminated-by '\n' --delete-target-dir --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060405
--fields-terminated-by  Specify the field (column) delimiter.
--lines-terminated-by  Specify the line (record) delimiter.
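To verify the delimiters took effect, inspect the files the job wrote, as earlier:
hdfs dfs -cat /sqoop/sqoop1/2018060405/p*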
Transferring between MySQL and Hive
Import a MySQL table definition (schema only) into Hive:
sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role
18/06/04 16:10:39 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
18/06/04 16:10:39 ERROR tool.CreateHiveTableTool: Encountered IOException running create table job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
Edit /etc/profile and add:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
Restart the terminal, or open a new window, and run:
sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-table sqoopdb.role
Or:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-import --hive-database sqoopdb --hive-table role
Check the sqoopdb database in Hive:
hive> use sqoopdb;
OK
Time taken: 0.116 seconds
hive> show tables;
OK
tab_name
role
Time taken: 0.251 seconds, Fetched: 1 row(s)
hive> select * from role;
OK
role.roleid role.name role.dateid role.addr role.sex role.level
Time taken: 4.798 seconds
Import the MySQL data into the Hive table sqoopdb.role:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-import --hive-table sqoopdb.role
Check HDFS:
drwx-wx-wx - root supergroup 0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
drwx-wx-wx - root supergroup 0 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role
-rwx-wx-wx 1 root supergroup 152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
-rwx-wx-wx 1 root supergroup 52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
-rwx-wx-wx 1 root supergroup 47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
-rwx-wx-wx 1 root supergroup 92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003
...............................................
drwxr-xr-x - root supergroup 0 2018-06-04 16:36 /user/root
drwxr-xr-x - root supergroup 0 2018-06-04 16:36 /user/root/role
-rw-r--r-- 1 root supergroup 0 2018-06-04 16:36 /user/root/role/_SUCCESS
Check Hive:
hive> select * from role;
OK
role.roleid role.name role.dateid role.addr role.sex role.level
1 null 2017-11-16 14:49:11.0 henan luohe linying 1 10
40 null 2017-11-13 14:50:25.0 guangdong shenzhen 1 20
110 null 2017-11-14 14:50:47.0 beijing 1 20
200 null 2017-11-14 14:49:47.0 shandong qingdao 0 8
400 null 2017-11-15 14:49:56.0 anhui hefei 0 4
600 null 2017-11-15 14:50:05.0 hunan changsha 0 91
650 null 2017-11-01 17:24:34.0 null 1 29
Time taken: 0.674 seconds, Fetched: 7 row(s)
Run the same import again:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-import --hive-table sqoopdb.role
hive> select * from role;
OK
role.roleid role.name role.dateid role.addr role.sex role.level
1 null 2017-11-16 14:49:11.0 henan luohe linying 1 10
40 null 2017-11-13 14:50:25.0 guangdong shenzhen 1 20
110 null 2017-11-14 14:50:47.0 beijing 1 20
1 null 2017-11-16 14:49:11.0 henan luohe linying 1 10
40 null 2017-11-13 14:50:25.0 guangdong shenzhen 1 20
110 null 2017-11-14 14:50:47.0 beijing 1 20
200 null 2017-11-14 14:49:47.0 shandong qingdao 0 8
200 null 2017-11-14 14:49:47.0 shandong qingdao 0 8
400 null 2017-11-15 14:49:56.0 anhui hefei 0 4
400 null 2017-11-15 14:49:56.0 anhui hefei 0 4
600 null 2017-11-15 14:50:05.0 hunan changsha 0 91
650 null 2017-11-01 17:24:34.0 null 1 29
600 null 2017-11-15 14:50:05.0 hunan changsha 0 91
650 null 2017-11-01 17:24:34.0 null 1 29
Time taken: 0.464 seconds, Fetched: 14 row(s)
Check HDFS again; each part file now has a _copy_1 duplicate, which explains the doubled rows:
drwx-wx-wx - root supergroup 0 2018-06-04 16:24 /user/hive/warehouse/sqoopdb.db
drwx-wx-wx - root supergroup 0 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role
-rwx-wx-wx 1 root supergroup 152 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00000
-rwx-wx-wx 1 root supergroup 152 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00000_copy_1
-rwx-wx-wx 1 root supergroup 52 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00001
-rwx-wx-wx 1 root supergroup 52 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00001_copy_1
-rwx-wx-wx 1 root supergroup 47 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00002
-rwx-wx-wx 1 root supergroup 47 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00002_copy_1
-rwx-wx-wx 1 root supergroup 92 2018-06-04 16:36 /user/hive/warehouse/sqoopdb.db/role/part-m-00003
-rwx-wx-wx 1 root supergroup 92 2018-06-04 16:40 /user/hive/warehouse/sqoopdb.db/role/part-m-00003_copy_1
Exporting from Hive to MySQL
For reference, the Hive-related options of the import tool:
--hive-home <dir>  Override $HIVE_HOME
--hive-import  Import data into Hive, using Hive's default delimiters
--hive-overwrite  Overwrite existing data in the Hive table
--create-hive-table  Create the table; the job fails if the table already exists
--hive-table <table-name>  The table name to use in Hive
--hive-drop-import-delims  Drop \n, \r, and \01 from string fields when importing to Hive
--hive-delims-replacement  Replace \n, \r, and \01 in string fields with a user-defined string when importing to Hive
--hive-partition-key  The Hive partition key
--hive-partition-value <v>  The Hive partition value
--map-column-hive <map>  Override the default SQL-type-to-Hive-type mapping
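For example, re-running the earlier Hive import with --hive-overwrite replaces the table contents instead of appending, avoiding the duplicated rows seen above; a sketch based on the earlier command:
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --table role --hive-import --hive-overwrite --hive-table sqoopdb.role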
Export the Hive warehouse directory to MySQL:
sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir /user/hive/warehouse/sqoopdb.db/role
This is equivalent to:
sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
During testing, omitting the field delimiter caused an error; after specifying '\001' (Hive's default field delimiter) the export succeeded.
Check the MySQL table, before and after the export:
mysql> select * from role1;
Empty set (0.00 sec)
mysql> select * from role1;
+--------+------+---------------------+---------------------+------+-------+
| roleid | name | dateid | addr | sex | level |
+--------+------+---------------------+---------------------+------+-------+
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
+--------+------+---------------------+---------------------+------+-------+
14 rows in set (0.00 sec)
--query, -e <sql>  Import the results of a SQL query. When this option is used you must also specify --target-dir, and the query must contain a WHERE clause that includes the token $CONDITIONS.
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select * from role1 where \$CONDITIONS" --fields-terminated-by '\t' --lines-terminated-by '\n' --delete-target-dir --split-by roleid --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406
Testing shows --split-by cannot be omitted either: Sqoop needs a split column to partition the query across the map tasks.
sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by '\t' --lines-terminated-by '\n' --delete-target-dir --split-by roleid --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406
$CONDITIONS is a placeholder token that Sqoop itself substitutes at runtime (escaped as \$CONDITIONS above so the shell does not expand it). Watching the execution log, you can see it replaced with (1 = 0) in the metadata queries:
From the Sqoop user guide: If you want to import the results of a query in parallel, then each map task will need to execute a copy of the query, with results partitioned by bounding conditions inferred by Sqoop. Your query must include the token $CONDITIONS, which each Sqoop process will replace with a unique condition expression. You must also select a splitting column with --split-by.
[root@host ~]# sqoop import --connect jdbc:mysql://localhost:3306/test --username root --password root --query "select roleid,level from role1 where sex=0 and \$CONDITIONS" --fields-terminated-by '\t' --lines-terminated-by '\n' --delete-target-dir --split-by roleid --direct --target-dir hdfs://localhost:9000/sqoop/sqoop1/2018060406
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/05 10:24:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/06/05 10:24:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/06/05 10:24:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/06/05 10:24:26 INFO tool.CodeGenTool: Beginning code generation
18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and (1 = 0)
18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and (1 = 0)
18/06/05 10:24:27 INFO manager.SqlManager: Executing SQL statement: select roleid,level from role1 where sex=0 and (1 = 0)
................
export
--update-key  Column used to match existing rows for updates (typically the table's primary key)
--update-mode  Two modes: allowinsert (update rows that match, insert rows that don't) and updateonly (only update existing rows)
Note: if the table has no primary key and --update-key plus allowinsert are specified, already-existing rows are simply inserted again, producing duplicates.
sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role --update-key roleid --update-mode allowinsert
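In updateonly mode, by contrast, rows in HDFS that do not match an existing --update-key value are skipped rather than inserted. A sketch based on the command above:
sqoop export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role --update-key roleid --update-mode updateonly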
sqoop job
Create a job (note the standalone -- separating the job options from the tool name and its arguments):
[root@host ~]# sqoop job --create mysqltohive -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/05 13:52:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/06/05 13:52:56 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
List the available jobs:
[root@host ~]# sqoop job --list
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/05 13:55:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Available jobs:
mysqltohive
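A saved job's definition and parameters can be inspected with --show:
sqoop job --show mysqltohive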
Run the job:
The MySQL table before running:
mysql> select * from role1;
+--------+------+---------------------+---------------------+------+-------+
| roleid | name | dateid | addr | sex | level |
+--------+------+---------------------+---------------------+------+-------+
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
+--------+------+---------------------+---------------------+------+-------+
7 rows in set (0.00 sec)
[root@host ~]# sqoop job --exec mysqltohive
Note: you will be prompted for the database password during execution.
The MySQL table after running:
mysql> select * from role1;
+--------+------+---------------------+---------------------+------+-------+
| roleid | name | dateid | addr | sex | level |
+--------+------+---------------------+---------------------+------+-------+
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
| 1 | NULL | 2017-11-16 14:49:11 | henan luohe linying | 1 | 10 |
| 40 | NULL | 2017-11-13 14:50:25 | guangdong shenzhen | 1 | 20 |
| 110 | NULL | 2017-11-14 14:50:47 | beijing | 1 | 20 |
| 200 | NULL | 2017-11-14 14:49:47 | shandong qingdao | 0 | 8 |
| 400 | NULL | 2017-11-15 14:49:56 | anhui hefei | 0 | 4 |
| 600 | NULL | 2017-11-15 14:50:05 | hunan changsha | 0 | 91 |
| 650 | NULL | 2017-11-01 17:24:34 | NULL | 1 | 29 |
+--------+------+---------------------+---------------------+------+-------+
14 rows in set (0.00 sec)
Execution succeeded!
Delete a job:
sqoop job --delete mysqltohive
The sqoop job executions above prompt for a password. To run jobs without the prompt:
Edit sqoop-site.xml and enable the following property:
<property>
<name>sqoop.metastore.client.record.password</name>
<value>true</value>
<description>If true, allow saved passwords in the metastore.
</description>
</property>
Create the job, preferably in a new terminal window so the updated configuration is picked up:
[root@host conf]# sqoop job --create hivetomysql -- export --connect jdbc:mysql://localhost:3306/test --username root --password root --table role1 --fields-terminated-by '\001' --export-dir hdfs://localhost:9000/user/hive/warehouse/sqoopdb.db/role
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/sqoop1/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/06/05 14:17:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
18/06/05 14:17:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
Run the job:
[root@host ~]# sqoop job --exec hivetomysql
Sure enough, no password prompt this time. Bingo!
Check the MySQL table: the data is there. Test successful!
Sqoop can likewise transfer data between SQL Server and HDFS, Hive, and HBase. Download the corresponding JDBC driver, set the MSSQL_CONNECTOR_HOME environment variable, and copy the driver JAR into Sqoop's lib directory.
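As a sketch, a SQL Server import might look like the following, assuming Microsoft's JDBC driver JAR is in Sqoop's lib directory; the host and database here are placeholders:
sqoop import --connect 'jdbc:sqlserver://<host>:1433;database=test' --username sa -P --table role --target-dir hdfs://localhost:9000/sqoop/mssql/role -m 1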