• Sqoop2 Environment Setup


    Original article: http://www.cnblogs.com/luogankun/p/4209017.html

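    I am currently working on the part of Spark SQL's external data source support that interacts with relational databases, and I am looking at how Sqoop2 handles relational databases for reference.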

    Download location: http://archive.cloudera.com/cdh5/cdh/5/

    Download and install:

    cd /home/spark/app/
    wget http://archive.cloudera.com/cdh5/cdh/5/sqoop2-1.99.3-cdh5.0.0.tar.gz
    tar -zxvf sqoop2-1.99.3-cdh5.0.0.tar.gz
    cd sqoop2-1.99.3-cdh5.0.0

    Note: the local Hadoop installation is 2.3.0-cdh5.0.0, so this walkthrough uses the Sqoop build that matches cdh5.0.0.

    Add Sqoop2 to the system environment variables:

    export SQOOP2_HOME=/home/spark/app/sqoop2-1.99.3-cdh5.0.0
    export CATALINA_BASE=$SQOOP2_HOME/server
    export PATH=.:$SQOOP2_HOME/bin:$PATH
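
    Before going further, it can help to confirm that SQOOP2_HOME really points at the unpacked tree. The following is a minimal sketch; check_sqoop2_home is a hypothetical helper (not something shipped with Sqoop2), and the two paths it tests are simply the ones this article relies on later (bin/sqoop.sh and server/conf).

```shell
#!/usr/bin/env bash
# Sketch: sanity-check that SQOOP2_HOME points at an unpacked Sqoop2 tree.
# check_sqoop2_home is a hypothetical helper; the paths it tests are the
# ones this article uses later (bin/sqoop.sh and server/conf).

check_sqoop2_home() {
  local home="$1"
  if [ -z "$home" ]; then
    echo "SQOOP2_HOME is not set" >&2
    return 1
  fi
  if [ ! -f "$home/bin/sqoop.sh" ]; then
    echo "missing $home/bin/sqoop.sh" >&2
    return 1
  fi
  if [ ! -d "$home/server/conf" ]; then
    echo "missing $home/server/conf" >&2
    return 1
  fi
  echo "ok: $home"
}

# Usage: check_sqoop2_home "$SQOOP2_HOME"
```

    Running the check in a fresh login shell also shows whether the export lines were saved to a profile file or only typed into the current session.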

    Copy the MySQL driver JAR into $SQOOP2_HOME/server/lib:

    cp mysql-connector-java-5.1.10-bin.jar $SQOOP2_HOME/server/lib/ 

    Edit the configuration files:

    In $SQOOP2_HOME/server/conf/sqoop.properties, point Sqoop at the Hadoop configuration directory:

    org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/spark/app/hadoop-2.3.0-cdh5.0.0/etc/hadoop

    In $SQOOP2_HOME/server/conf/catalina.properties, add the Hadoop JAR directories to common.loader:

    common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/common/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/common/lib/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/hdfs/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/hdfs/lib/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/mapreduce/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/mapreduce/lib/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/tools/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/tools/lib/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/yarn/*.jar,/home/spark/app/hadoop-2.3.0-cdh5.0.0/share/hadoop/yarn/lib/*.jar

    To change the Tomcat port number or similar settings, edit $SQOOP2_HOME/server/conf/server.xml.
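
    The common.loader value above repeats the same Hadoop base path for ten entries, which is easy to mistype. As a rough sketch, the Hadoop portion can be generated from a single directory. build_hadoop_loader and the HADOOP_HOME default are illustrative assumptions, not part of Sqoop2 or Tomcat; the module names and their order are copied from the line above.

```shell
#!/usr/bin/env bash
# Sketch: build the Hadoop part of common.loader from one base directory.
# build_hadoop_loader is a hypothetical helper, not a Sqoop2 or Tomcat tool.

build_hadoop_loader() {
  local hadoop_home="$1"
  local entries=()
  local module
  # Same modules, in the same order, as the common.loader line in the article.
  for module in common hdfs mapreduce tools yarn; do
    entries+=("${hadoop_home}/share/hadoop/${module}/*.jar")
    entries+=("${hadoop_home}/share/hadoop/${module}/lib/*.jar")
  done
  # "${entries[*]}" joins on the first character of IFS, here a comma.
  local IFS=,
  printf '%s\n' "${entries[*]}"
}

# Assumed default: the Hadoop location used throughout this article.
HADOOP_HOME="${HADOOP_HOME:-/home/spark/app/hadoop-2.3.0-cdh5.0.0}"
build_hadoop_loader "$HADOOP_HOME"
```

    The printed string is meant to be appended, after a comma, to the existing ${catalina.base} and ${catalina.home} entries of common.loader. The patterns stay quoted, so the shell does not expand *.jar while building the list.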

    Start and stop the Sqoop server:

    $SQOOP2_HOME/bin/sqoop.sh server start
    $SQOOP2_HOME/bin/sqoop.sh server stop
    or
    $SQOOP2_HOME/bin/sqoop2-server start
    $SQOOP2_HOME/bin/sqoop2-server stop

    Verify that the server started successfully:

    Option 1: run jps and look for a Bootstrap process
    Option 2: open http://hadoop000:12000/sqoop/version in a browser
    Option 3: wget -qO - hadoop000:12000/sqoop/version
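
    The server may need a few seconds after start before the version URL responds, so a single check can fail even when the server is healthy. Below is a small sketch of a retry loop around the same wget check. wait_for_sqoop, its attempt count, and its delay are assumptions made for illustration; the host and port are the ones used in this article.

```shell
#!/usr/bin/env bash
# Sketch: poll the Sqoop2 version endpoint until it answers or we give up.
# wait_for_sqoop is a hypothetical helper; defaults are arbitrary choices.

wait_for_sqoop() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # -q: quiet, -O /dev/null: discard the body, only the exit code matters.
    if wget -q -O /dev/null "$url"; then
      echo "up after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "not reachable after ${attempts} attempt(s)" >&2
  return 1
}

# Usage: wait_for_sqoop "http://hadoop000:12000/sqoop/version" 20 3
```

    The function returns a non-zero status on failure, so it can gate later steps in a setup script, for example by chaining the client start after it with &&.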

    Start the Sqoop client:

    $SQOOP2_HOME/bin/sqoop.sh client
    or
    $SQOOP2_HOME/bin/sqoop2-shell

    Point the client at the server:

    sqoop:000> set server --host hadoop000 --port 12000 --webapp sqoop

    Show the server information:

    sqoop:000> show server --all
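
    The two client commands above can also be kept in a script file. As far as I know, the Sqoop 1.99.x client accepts a script path as an argument for batch mode, but treat that as an assumption and check it against your version. The sketch below only writes such a file; write_client_script and the file name are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: put the two client commands into a script file for batch mode.
# write_client_script and the file name are illustrative choices.

write_client_script() {
  local path="$1" host="$2" port="$3"
  cat > "$path" <<EOF
# Point the client at the Sqoop2 server, then print server details
set server --host ${host} --port ${port} --webapp sqoop
show server --all
EOF
}

script="${TMPDIR:-/tmp}/sqoop2-check.sqoop"
write_client_script "$script" hadoop000 12000
cat "$script"
# Then, on a machine with Sqoop2 installed (assumed batch-mode invocation):
#   $SQOOP2_HOME/bin/sqoop.sh client "$script"
```

    Keeping the host and port as parameters makes it easy to reuse the same check against another server.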

  • Reposted from: https://www.cnblogs.com/gw811/p/4630116.html