• Installing and configuring Oozie 4.3.0 with Hadoop 2.7.3


    Installation steps

    • Configuring MySQL
    • Installing Oozie
    • Configuring Oozie
    • Starting Oozie and logging in
    • Common Oozie commands
    1. Configuring MySQL
    Installing MySQL itself is left to you. Once it is running, create an oozie database and an oozie user with a password, and grant that user access to the oozie database.

    mysql -u root -proot
    create database oozie;
    -- grant the oozie user access to the oozie database
    grant all privileges on oozie.* to 'oozie'@'%' identified by 'password';
    FLUSH PRIVILEGES;

    Note: delete the two anonymous (empty-name) users in MySQL, otherwise you will keep getting "user has no permission" errors.
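    The statements above can be collected into one small helper. This is only a sketch: the function name oozie_db_sql and its arguments are made up for illustration, and it assumes a MySQL 5.x server, since GRANT ... IDENTIFIED BY was removed in MySQL 8.0. The final SELECT only lists anonymous accounts; dropping them is left as a manual step.

```shell
# Print the SQL for the step above (hypothetical helper, not part of MySQL or Oozie).
# Arguments: database name, user name, password.
oozie_db_sql() {
  db="$1"
  user="$2"
  pass="$3"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db};
-- MySQL 5.x syntax, as used in the article; MySQL 8.0 needs CREATE USER first.
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%' IDENTIFIED BY '${pass}';
FLUSH PRIVILEGES;
-- List anonymous accounts (empty user name) so they can be dropped by hand.
SELECT user, host FROM mysql.user WHERE user = '';
EOF
}
```

    Usage, assuming root access to a local server: oozie_db_sql oozie oozie password | mysql -u root -p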

     

    2. Installing Oozie

    2.1 Downloading and building Oozie

    https://mirrors.tuna.tsinghua.edu.cn/apache/oozie/4.3.0/oozie-4.3.0.tar.gz

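    The official release contains source code only, so you have to build it yourself. Unpack the archive and build it with the following command: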

    ./mkdistro.sh -Phadoop-2 -Dhadoop.auth.version=2.7.3 -Ddistcp.version=2.7.3 -Dhadoop.version=2.7.3 -Dsqoop.version=1.4.6 -DskipTests 

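    The build fails at three points (the original post links to a reference article here). In each case an artifact cannot be found in the mirror repository, so download it yourself and place it in your local Maven repository directory.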

    After a successful build, the packaged file is at: oozie-4.3.0/distro/target/oozie-4.3.0-distro.tar.gz
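    For the missing artifacts mentioned above, it helps to know where Maven expects a jar to sit in the local repository. The sketch below computes that path from the standard layout (group id with dots turned into directories, then artifact id, then version). The function name m2_path and the M2_REPO override variable are illustrative assumptions, and the coordinates in the usage line are placeholders, not the artifacts that actually failed.

```shell
# Compute where a jar belongs in the local Maven repository (illustrative helper).
# M2_REPO is a hypothetical override; the default is Maven's usual ~/.m2/repository.
m2_path() {
  group="$1"
  artifact="$2"
  version="$3"
  repo="${M2_REPO:-$HOME/.m2/repository}"
  # Dots in the group id become directory separators.
  group_dir=$(printf '%s' "$group" | tr '.' '/')
  printf '%s/%s/%s/%s/%s-%s.jar\n' \
    "$repo" "$group_dir" "$artifact" "$version" "$artifact" "$version"
}
```

    Usage with placeholder coordinates: cp demo-lib-1.0.jar "$(m2_path org.example demo-lib 1.0)" after creating the parent directory. Maven's install:install-file goal is an alternative that also writes the metadata files.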

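    2.2 Installing Oozie

    1)  Unpack oozie-4.3.0-distro.tar.gz into /usr/local/, then enter the oozie-4.3.0 directory and unpack the three archives inside it: oozie-client-4.3.0.tar.gz, oozie-examples.tar.gz and oozie-sharelib-4.3.0.tar.gz.

    The resulting file list is as follows: [screenshot: directory listing]

    2) Create the /user/oozie directory on HDFS, then upload the share directory to /user/oozie on HDFS.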

    First put the MySQL and Oracle drivers under share/lib (the commands below copy them into share/lib/sqoop). Later, Sqoop actions pick up the jars from /user/oozie/share/lib/sqoop/ on HDFS.

    cp ojdbc*.jar /usr/local/oozie-4.3.0/share/lib/sqoop/
    cp mysql-connector-java-5.1.35-bin.jar /usr/local/oozie-4.3.0/share/lib/sqoop/
    hdfs dfs -copyFromLocal /usr/local/oozie-4.3.0/share/ /user/oozie
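    The commands above do not show how /user/oozie is created or how the upload is checked. A possible sequence is sketched below; the helper names run and upload_sharelib and the RUN switch are illustrative assumptions. By default the commands are only printed, so the sketch can be tried on a machine without HDFS.

```shell
# Print a command, or execute it when RUN=1 (illustrative dry-run wrapper).
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Create the HDFS target directory, upload share/, then list the sqoop jars.
# Arguments: local Oozie directory, HDFS target directory.
upload_sharelib() {
  oozie_dir="$1"
  hdfs_dir="$2"
  run hdfs dfs -mkdir -p "$hdfs_dir"
  run hdfs dfs -copyFromLocal "$oozie_dir/share" "$hdfs_dir"
  run hdfs dfs -ls "$hdfs_dir/share/lib/sqoop"
}
```

    Usage: upload_sharelib /usr/local/oozie-4.3.0 /user/oozie prints the three commands; prefix the call with RUN=1 to execute them.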

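    3)  Create a libext folder under /usr/local/oozie, then copy the jars from Hadoop's share/hadoop directories into /usr/local/oozie/libext. The commands below are run from /usr/local and first create the oozie symlink.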

    ln -s oozie-4.3.0 oozie
    cd oozie
    mkdir libext
    cp ${HADOOP_HOME}/share/hadoop/*/*.jar libext/
    cp ${HADOOP_HOME}/share/hadoop/*/lib/*.jar libext/

    Add ext-2.2.zip and the MySQL and Oracle driver jars to libext (the Oracle jar is copied the same way as the MySQL one):

    cp ext-2.2.zip /usr/local/oozie/libext/
    cp mysql-connector-java-5.1.35-bin.jar /usr/local/oozie/libext/
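    Before building the WAR it can be worth confirming that libext holds what the steps above put there. The check below is a sketch; the function name check_libext is made up, it only tests for file names, and it treats any hadoop-common and mysql-connector-java jar as acceptable without checking versions.

```shell
# Report files that the steps above should have placed in libext (illustrative).
# Returns 0 when everything is present, 1 otherwise.
check_libext() {
  dir="$1"
  missing=0
  [ -f "$dir/ext-2.2.zip" ] || { echo "missing: ext-2.2.zip"; missing=1; }
  # ls fails when the glob matches nothing, which is what we test for.
  ls "$dir"/mysql-connector-java-*.jar >/dev/null 2>&1 \
    || { echo "missing: mysql-connector-java jar"; missing=1; }
  ls "$dir"/hadoop-common-*.jar >/dev/null 2>&1 \
    || { echo "missing: hadoop-common jar"; missing=1; }
  return "$missing"
}
```

    Usage: check_libext /usr/local/oozie/libext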

    4)  Edit oozie-4.3.0/oozie-server/conf/server.xml and comment out the following line:

    <!--<Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" />-->

    5)  Build the WAR file

    Run the following command in the bin directory:

    ./oozie-setup.sh prepare-war

    The resulting WAR file is saved under /usr/local/oozie/oozie-server/webapps

     

    3. Configuring Oozie

    3.1 Setting environment variables in /etc/profile

    #180112 oozie path
    export OOZIE_HOME=/usr/local/oozie
    export PATH=$OOZIE_HOME/bin:$PATH
    export OOZIE_CONFIG=/usr/local/oozie/conf

    # Note that the URL must end with /oozie, otherwise you get a 404 error. This one cost me real pain.

    export OOZIE_URL=http://dwtest-name1:11000/oozie
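    Since a missing /oozie suffix only shows up later as a 404, a small guard can catch it early. This is a sketch; the function name check_oozie_url is an illustrative assumption, and it checks the shape of the URL only, not whether the server is reachable.

```shell
# Succeed only when the URL is http(s) and ends with /oozie (illustrative check).
# A single trailing slash is tolerated.
check_oozie_url() {
  case "${1%/}" in
    http://*/oozie|https://*/oozie)
      return 0
      ;;
    *)
      echo "OOZIE_URL should end with /oozie: $1" >&2
      return 1
      ;;
  esac
}
```

    Usage: check_oozie_url "$OOZIE_URL" || echo "fix OOZIE_URL in /etc/profile"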

    3.2 Editing the configuration file /usr/local/oozie/conf/oozie-site.xml

    By default, everything in conf/oozie-site.xml is commented out, so add the following content yourself.

    <?xml version="1.0"?>
    <configuration>
    
        <!--
            Refer to the oozie-default.xml file for the complete list of
            Oozie configuration properties and their default values.
        -->
    
        <!-- Proxyuser Configuration -->
    
        <property>
            <name>oozie.service.ProxyUserService.proxyuser.hadoop.hosts</name>
            <value>*</value>
            <description>
                List of hosts the '#USER#' user is allowed to perform 'doAs'
                operations.
    
                The '#USER#' must be replaced with the username of the user who is
                allowed to perform 'doAs' operations.
    
                The value can be the '*' wildcard or a list of hostnames.
    
                For multiple users copy this property and replace the user name
                in the property name.
            </description>
        </property>
    
        <property>
            <name>oozie.service.ProxyUserService.proxyuser.hadoop.groups</name>
            <value>*</value>
            <description>
                List of groups the '#USER#' user is allowed to impersonate users
                from to perform 'doAs' operations.
    
                The '#USER#' must be replaced with the username of the user who is
                allowed to perform 'doAs' operations.
    
                The value can be the '*' wildcard or a list of groups.
    
                For multiple users copy this property and replace the user name
                in the property name.
            </description>
        </property>
    
        <!-- 20180110 add -->
        <property>
            <name>oozie.service.JPAService.create.db.schema</name>
            <value>false</value>
        </property>
        <property>
            <name>oozie.service.JPAService.jdbc.driver</name>
            <value>com.mysql.jdbc.Driver</value>
        </property>
        <property>
            <name>oozie.service.JPAService.jdbc.url</name>
            <value>jdbc:mysql://dwtest-name1:33061/oozie?createDatabaseIfNotExist=true</value>
        </property>
         
        <property>
            <name>oozie.service.JPAService.jdbc.username</name>
            <value>oozie</value>
        </property>
         
        <property>
            <name>oozie.service.JPAService.jdbc.password</name>
            <value>password</value>
            <description>
                    DB user password.
                    IMPORTANT: if password is empty leave a 1 space string, the service trims the value,
                    if empty Configuration assumes it is NULL.
            </description>
        </property>
    
        <property>
            <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
            <value>*=/usr/local/hadoop/etc/hadoop</value>
        </property>
    
        <property>
            <name>oozie.service.HadoopAccessorService.action.configurations</name>
            <value>*=/usr/local/hadoop/etc/hadoop</value>
        </property>
    
        <property>
             <name>oozie.service.SparkConfigurationService.spark.configurations</name>
             <value>*=/usr/local/spark/conf</value>
        </property>
    
        <!-- This is the sharelib path on HDFS -->
        <property>
             <name>oozie.service.WorkflowAppService.system.libpath</name>
             <value>/user/oozie/share/lib</value>
        </property>
    
        <property>
            <name>oozie.use.system.libpath</name>
            <value>true</value>
            <description>
                    Default value of oozie.use.system.libpath. If the user has not specified oozie.use.system.libpath
                    in job.properties and this value is true, Oozie will include the sharelib jars for the workflow.
            </description>
        </property>
    
        <property>
            <name>oozie.subworkflow.classpath.inheritance</name>
            <value>true</value>
        </property>
    
    
    </configuration>

    3.3 Creating the metadata tables

    From /usr/local/oozie, run the following command to generate the SQL file and create the metadata tables:

    bin/ooziedb.sh create -sqlfile oozie.sql -run

    The following tables then appear in the oozie database in MySQL: [screenshot: table list]

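    3.4 Editing core-site.xml (depends on your environment)

    If the account that submits Oozie jobs differs from the Hadoop account, edit Hadoop's core-site.xml and add the user group of the submitting account. I use the hadoop account for both, so no change was needed.

    After editing, copy the file to the secondary NameNode. The following commands refresh the settings without restarting the Hadoop cluster.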

    scp core-site.xml hadoop@dwtest-name2:/usr/local/hadoop/etc/hadoop/
    hdfs dfsadmin -refreshSuperUserGroupsConfiguration  
    yarn rmadmin -refreshSuperUserGroupsConfiguration
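    The article does not show the core-site.xml change itself. Assuming it refers to Hadoop's standard proxy-user settings (hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups), the snippet to add could be generated as sketched below. The function name proxyuser_xml is made up, and the wildcard values are a permissive assumption that you may want to narrow.

```shell
# Print core-site.xml proxy-user properties for one account (illustrative helper).
# "*" allows all hosts and groups; this is an assumption, not the article's value.
proxyuser_xml() {
  user="$1"
  cat <<EOF
<property>
  <name>hadoop.proxyuser.${user}.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.${user}.groups</name>
  <value>*</value>
</property>
EOF
}
```

    Usage: proxyuser_xml oozie prints two properties to paste inside the <configuration> element of core-site.xml before running the refresh commands.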

      

    4. Starting Oozie and logging in

    Start and stop scripts:

    bin/oozied.sh start
    bin/oozied.sh stop

    Startup output looks like this: [screenshot: startup console output]

    Log in at: http://dwtest-name1:11000/oozie/
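    The server takes a moment to come up after oozied.sh start. One way to wait for it is to poll oozie admin -status, which reports the system mode, until it says NORMAL. This is a sketch: the function name wait_for_oozie and its retry parameters are illustrative assumptions.

```shell
# Poll the Oozie server until it reports "System mode: NORMAL" (illustrative).
# Arguments: Oozie URL, number of tries (default 30), seconds between tries (default 2).
wait_for_oozie() {
  url="$1"
  tries="${2:-30}"
  interval="${3:-2}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if oozie admin -oozie "$url" -status 2>/dev/null | grep -q "System mode: NORMAL"; then
      echo "Oozie is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Oozie did not report NORMAL after $tries tries" >&2
  return 1
}
```

    Usage: bin/oozied.sh start && wait_for_oozie http://dwtest-name1:11000/oozie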

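    Removing incompatible jars:

    After startup, Oozie automatically unpacks the WAR file into an oozie folder.

    At that point, delete or move away the jars in /usr/local/oozie/oozie-server/webapps/oozie/WEB-INF/lib that carry Hadoop version 2.6.0. Otherwise starting a job fails with: Error, java.lang.NoSuchFieldError: HADOOP_CLASSPATH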
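    Moving the old jars aside is safer than deleting them, because they can be restored if something else breaks. The sketch below assumes the offending files match hadoop-*2.6.0*.jar; that pattern, the function name move_old_hadoop_jars and the backup directory are illustrative assumptions, so check the printed list against your own WEB-INF/lib.

```shell
# Move jars of an old Hadoop version out of a lib directory (illustrative).
# Arguments: lib directory, backup directory, version to remove (default 2.6.0).
move_old_hadoop_jars() {
  lib_dir="$1"
  backup_dir="$2"
  old_version="${3:-2.6.0}"
  mkdir -p "$backup_dir"
  for jar in "$lib_dir"/hadoop-*"$old_version"*.jar; do
    # When the glob matches nothing it stays literal, so skip non-existent paths.
    [ -e "$jar" ] || continue
    mv "$jar" "$backup_dir"/ && echo "moved: $(basename "$jar")"
  done
}
```

    Usage: move_old_hadoop_jars /usr/local/oozie/oozie-server/webapps/oozie/WEB-INF/lib /usr/local/oozie/lib-backup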


    5. Common Oozie commands

    Description

    Command

    List the spark jars in the shared library

    oozie admin -oozie http://localhost:11000/oozie -shareliblist spark

    List the shared library

    oozie admin -oozie http://localhost:11000/oozie -shareliblist

    Submit a job

    oozie job -oozie http://localhost:11000/oozie -config job.properties -submit

    Run a job

    oozie job -oozie http://localhost:11000/oozie -config job.properties -run

    Kill a job

    oozie job -oozie http://localhost:11000/oozie -kill jobid

    Rerun a job

    oozie job -oozie http://localhost:11000/oozie -config job.properties -rerun jobid

    Change job parameters (quote the value so the shell does not split it at the semicolon)

    oozie job -oozie http://localhost:11000/oozie -change jobid -value 'concurrency=1000;endtime=2018-01-10T00:00Z'

    Check job status

    oozie job -oozie http://localhost:11000/oozie -info jobid

    View the job log

    oozie job -oozie http://localhost:11000/oozie -log jobid

    Check that the XML conforms to the schema

    oozie validate myapp/workflow.xml

    Help:

    oozie help            # list all commands
    oozie help admin
    oozie help job
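    The submit and run commands above read a job.properties file that the article does not show. A minimal one might be written as sketched below. The property names nameNode, jobTracker and queueName follow the convention of the bundled Oozie examples, the function name write_job_properties is made up, and the hosts, ports and application path in the usage line are placeholders to replace with your own.

```shell
# Write a minimal job.properties for a workflow job (illustrative helper).
# Arguments: output file, NameNode URI, ResourceManager address, HDFS app path.
write_job_properties() {
  out="$1"
  name_node="$2"
  job_tracker="$3"
  app_path="$4"
  {
    printf 'nameNode=%s\n' "$name_node"
    printf 'jobTracker=%s\n' "$job_tracker"
    printf 'queueName=default\n'
    # Use the sharelib uploaded to HDFS earlier.
    printf 'oozie.use.system.libpath=true\n'
    # ${nameNode} is resolved by Oozie, not by the shell, hence the single quotes.
    printf 'oozie.wf.application.path=${nameNode}%s\n' "$app_path"
  } > "$out"
}
```

    Usage with placeholder addresses: write_job_properties job.properties hdfs://dwtest-name1:8020 dwtest-name1:8032 /user/hadoop/apps/myapp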
  • Original article: https://www.cnblogs.com/30go/p/8335523.html