• Sqoop job: from creation to execution


    Before studying sqoop job, it is best to first get familiar with the basic sqoop import and export commands:

    Using sqoop import to load data from MySQL into Hive

    Using sqoop import to load data from MySQL into HDFS

    Using sqoop export to export data from Hive to MySQL

    sqoop job

    A sqoop job saves a set of parameter settings together with the command statement itself, so the same task can be invoked again conveniently.
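    The sqoop job tool takes one of a few subcommands. The sketch below lists the ones used in this post (the job name sqoopimport1 is the one created further down), plus --delete for cleanup:

      sqoop job --list                    # list all saved jobs
      sqoop job --show sqoopimport1       # show the saved parameters of a job
      sqoop job --exec sqoopimport1       # run a saved job
      sqoop job --delete sqoopimport1     # remove a saved job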

    Next, let's implement a task that imports data from MySQL into Hive.

    • Create a table named sqoop_job in MySQL
      CREATE TABLE `sqoop_job` (
        `id` int(11) DEFAULT NULL,
        `name` varchar(255) DEFAULT NULL,
        `jobname` varchar(255) DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1
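      Note that the CREATE TABLE above must be run in the database the JDBC connect string points at (sqooptest). If that database does not exist yet, a minimal setup (an assumption, adjust to your environment) would be:

      -- hypothetical setup: create the database the connect string expects
      CREATE DATABASE sqooptest;
      USE sqooptest;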
    • Insert test data into sqoop_job
      insert into sqoop_job values(1,"name1","jobname1");
      insert into sqoop_job values(2,"name2","jobname2");
      insert into sqoop_job values(3,"name3","jobname3");
    • Sync the MySQL table structure to Hive
      sqoop create-hive-table --connect jdbc:mysql://localhost:3306/sqooptest --username root --password 123qwe --table sqoop_job \
        --hive-table sqoop_job --fields-terminated-by ','
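      To confirm the table definition landed in Hive, you can check from the hive CLI (the exact output depends on your environment):

      hive> show tables;
      hive> describe sqoop_job;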
    • Create a sqoop job for the import task
      sqoop job --create sqoopimport1 -- import --connect jdbc:mysql://localhost:3306/sqooptest --username root --password 123qwe --table sqoop_job \
        --hive-import --hive-table sqoop_job --fields-terminated-by ',' -m 1
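      Note the space between -- and import: everything after the bare -- is the tool name and its arguments, and without that space Sqoop treats --import as an unknown option and the command fails. Saved jobs are especially useful for incremental imports, because the metastore remembers the last imported value between runs. A rough sketch (the job name sqoopincr1 and the target directory are made up for illustration):

      sqoop job --create sqoopincr1 -- import \
        --connect jdbc:mysql://localhost:3306/sqooptest --username root --password 123qwe \
        --table sqoop_job --target-dir /user/FengZhen/sqoop_job_incr \
        --incremental append --check-column id --last-value 0 -m 1

      Each sqoop job --exec sqoopincr1 then updates incremental.last.value in the saved job, so only newly added rows are pulled on the next run.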

      After the job is created successfully, you can list the saved jobs with the following command:

      sqoop job --list
      Sqoop can also show the saved parameter configuration of a job with sqoop job --show <jobname>:
      EFdeMacBook-Pro:sbin FengZhen$ sqoop job --show sqoopimport1
      Job: sqoopimport1
      Tool: import
      Options:
      ----------------------------
      verbose = false
      db.connect.string = jdbc:mysql://localhost:3306/sqooptest
      codegen.output.delimiters.escape = 0
      codegen.output.delimiters.enclose.required = false
      codegen.input.delimiters.field = 0
      hbase.create.table = false
      db.require.password = true
      hdfs.append.dir = false
      db.table = sqoop_job
      codegen.input.delimiters.escape = 0
      import.fetch.size = null
      accumulo.create.table = false
      codegen.input.delimiters.enclose.required = false
      db.username = root
      reset.onemapper = false
      codegen.output.delimiters.record = 10
      import.max.inline.lob.size = 16777216
      hbase.bulk.load.enabled = false
      hcatalog.create.table = false
      db.clear.staging.table = false
      codegen.input.delimiters.record = 0
      enable.compression = false
      hive.overwrite.table = false
      hive.import = true
      codegen.input.delimiters.enclose = 0
      hive.table.name = sqoop_job
      accumulo.batch.size = 10240000
      hive.drop.delims = false
      codegen.output.delimiters.enclose = 0
      hdfs.delete-target.dir = false
      codegen.output.dir = .
      codegen.auto.compile.dir = true
      relaxed.isolation = false
      mapreduce.num.mappers = 1
      accumulo.max.latency = 5000
      import.direct.split.size = 0
      codegen.output.delimiters.field = 44
      export.new.update = UpdateOnly
      incremental.mode = None
      hdfs.file.format = TextFile
      codegen.compile.dir = /tmp/sqoop-FengZhen/compile/546e29b092f451585b5c8547b3e9985e
      direct.import = false
      hive.fail.table.exists = false
      db.batch = false
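      One line in this output worth noticing is db.require.password = true: the password was not stored with the job, so sqoop job --exec will prompt for the MySQL password every time. For unattended runs you could either set sqoop.metastore.client.record.password=true in sqoop-site.xml, or recreate the job with --password-file instead of --password. The sketch below assumes a password file kept under the FengZhen user's HDFS home directory (paths and the job name sqoopimport2 are illustrative):

      # write the password without a trailing newline and lock down the file
      echo -n '123qwe' > mysql.pwd
      hdfs dfs -put mysql.pwd /user/FengZhen/mysql.pwd
      hdfs dfs -chmod 400 /user/FengZhen/mysql.pwd

      sqoop job --create sqoopimport2 -- import \
        --connect jdbc:mysql://localhost:3306/sqooptest --username root \
        --password-file /user/FengZhen/mysql.pwd \
        --table sqoop_job --hive-import --hive-table sqoop_job --fields-terminated-by ',' -m 1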
    • Execute the job
      sqoop job --exec sqoopimport1
      After it finishes successfully, you can check the data in the Hive table:
      hive> select * from sqoop_job;
      OK
      1    name1    jobname1
      2    name2    jobname2
      3    name3    jobname3
      Time taken: 1.618 seconds, Fetched: 3 row(s)
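
      Arguments saved in a job can also be overridden at execution time by appending them after --; for example, to run the same job as a different MySQL user (someuser is just a placeholder) and be prompted for that user's password:

      sqoop job --exec sqoopimport1 -- --username someuser -P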

      Done.

  • Original post: https://www.cnblogs.com/EnzoDin/p/7597761.html