• Getting Started with Sqoop2: Importing Data from a Relational Database into HDFS (sqoop2-1.99.4)


    Operation in sqoop2-1.99.4 differs slightly from sqoop2-1.99.3: the new version uses link in place of the old version's connection; otherwise usage is much the same.
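
    For reference, the rename is mechanical: a connection command in 1.99.3 becomes the corresponding link command in 1.99.4 (the connector id below is illustrative):

    create connection --cid 1        (sqoop2-1.99.3)
    create link --cid 1              (sqoop2-1.99.4)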

    For setting up the sqoop2-1.99.4 environment, see: Sqoop2 Environment Setup

    For the sqoop2-1.99.3 version of this walkthrough, see: Getting Started with Sqoop2: Importing Relational Database Data into HDFS

    Start the sqoop2-1.99.4 client:

    $SQOOP2_HOME/bin/sqoop.sh client 
    set server --host hadoop000 --port 12000 --webapp sqoop
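
    To verify that the client can actually reach the server, check the client and server versions (the exact output varies by build):

    show version --all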

    List all connectors:

    show connector --all
    2 connector(s) to show: 
            Connector with id 1:
                Name: hdfs-connector 
                Class: org.apache.sqoop.connector.hdfs.HdfsConnector
                Version: 1.99.4-cdh5.3.0
    
            Connector with id 2:
                Name: generic-jdbc-connector 
                Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
                Version: 1.99.4-cdh5.3.0
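
    Note that the generic-jdbc-connector talks to the database through JDBC, so the matching driver jar must already be on the sqoop server's classpath before a JDBC link will work. A common approach for a tarball install (jar name and destination path are illustrative, not fixed by Sqoop2):

    cp mysql-connector-java-5.1.34-bin.jar $SQOOP2_HOME/server/lib/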

    List all links:

    show link

    Delete a specific link:

    delete link --lid x

    List all jobs:

    show job

    Delete a specific job:

    delete job --jid 1


    Create a link with the generic-jdbc-connector (connector id 2):

    create link --cid 2
        Name: First Link
        JDBC Driver Class: com.mysql.jdbc.Driver
        JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
        Username: root
        Password: ****
        JDBC Connection Properties: 
        There are currently 0 values in the map:
        entry# protocol=tcp
        There are currently 1 values in the map:
        protocol = tcp
        entry# 
        New link was successfully created with validation status OK and persistent id 3
    show link
    +----+-------------+-----------+---------+
    | Id |    Name     | Connector | Enabled |
    +----+-------------+-----------+---------+
    | 3  | First Link  | 2         | true    |
    +----+-------------+-----------+---------+
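
    If a value was mistyped, the link can be edited in place instead of deleted and recreated; the shell should replay the same prompts pre-filled with the current values (using the id 3 link created above):

    update link --lid 3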

    Create a link with the hdfs-connector (connector id 1):

    create link --cid 1
        Name: Second Link
        HDFS URI: hdfs://hadoop000:8020
        New link was successfully created with validation status OK and persistent id 4
    show link
    +----+-------------+-----------+---------+
    | Id |    Name     | Connector | Enabled |
    +----+-------------+-----------+---------+
    | 3  | First Link  | 2         | true    |
    | 4  | Second Link | 1         | true    |
    +----+-------------+-----------+---------+
    show link --all
        2 link(s) to show: 
        link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 11:28 AM, Updated by null at 15-2-2 11:28 AM)
        Using Connector id 2
          Link configuration
            JDBC Driver Class: com.mysql.jdbc.Driver
            JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
            Username: root
            Password: 
            JDBC Connection Properties: 
              protocol = tcp
        link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 11:32 AM, Updated by null at 15-2-2 11:32 AM)
        Using Connector id 1
          Link configuration
            HDFS URI: hdfs://hadoop000:8020

    Create a job using the from and to link ids (note: link ids, not connector ids):

    create job -f 3 -t 4
        Creating job for links with from id 3 and to id 4
        Please fill following values to create new job object
        Name: Sqoopy
    
        From database configuration
    
        Schema name: hive
        Table name: TBLS
        Table SQL statement: 
        Table column names: 
        Partition column name: 
        Null value allowed for the partition column: 
        Boundary query: 
    
        ToJob configuration
    
        Output format: 
          0 : TEXT_FILE
          1 : SEQUENCE_FILE
        Choose: 0
        Compression format: 
          0 : NONE
          1 : DEFAULT
          2 : DEFLATE
          3 : GZIP
          4 : BZIP2
          5 : LZO
          6 : LZ4
          7 : SNAPPY
          8 : CUSTOM
        Choose: 0
        Custom compression format: 
        Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4
    
        Throttling resources
    
        Extractors: 
        Loaders: 
        New job was successfully created with validation status OK  and persistent id 2
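
    The throttling values are optional; left empty, the server defaults apply. Extractors controls how many tasks read from the source in parallel (similar to map tasks) and Loaders how many write to the destination. Illustrative values for a small table:

        Extractors: 2
        Loaders: 1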

    List all jobs: 

    show job
    +----+--------+----------------+--------------+---------+
    | Id |  Name  | From Connector | To Connector | Enabled |
    +----+--------+----------------+--------------+---------+
    | 2  | Sqoopy | 2              | 1            | true    |
    +----+--------+----------------+--------------+---------+

    Start the specified job (once it completes, verify the output files on HDFS as shown below):

    start job --jid 2
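
    Once the job completes, the imported records land in the output directory given at job creation; list it from any node with the Hadoop client installed:

    hdfs dfs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4/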

    Check the status of a specific job:

    status job --jid 2

    Stop a specific job:

    stop job --jid 2

    A common error when starting a job (e.g., start job --jid 2):

    Exception has occurred during processing command 
    Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

    Enable verbose output in the sqoop client to see the details behind the error:

    set option --name verbose --value true
    show job --jid 2
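
    If the verbose stack trace is still not specific enough, the root cause is normally recorded in the server-side log. The exact location depends on how logging is configured in sqoop.properties; the path below assumes a typical tarball install:

    tail -n 100 $SQOOP2_HOME/server/logs/sqoop.log
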
• Original article: https://www.cnblogs.com/luogankun/p/4267442.html