• Building a Big Data Platform: Installing Oozie on CDH 5.11.1


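    I. Introduction

    Oozie is an open-source workflow scheduling engine for the Hadoop platform, used to manage Hadoop jobs. It is a web application made up of an Oozie server and an Oozie client.

    The Oozie server runs inside a Tomcat container.

    An Oozie workflow must be a directed acyclic graph (DAG). When a user needs to run several related MapReduce jobs, it is enough to describe them in workflow.xml and submit that file to Oozie; Oozie then takes over and runs the tasks in order according to the predefined configuration.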
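
    To make the DAG idea concrete, here is a minimal sketch of a workflow.xml that chains two MapReduce actions, written out from the shell. The workflow name, action names, and mapper classes are hypothetical placeholders, not taken from this installation; ${jobTracker} and ${nameNode} are assumed to be supplied by a job.properties file at submit time.

```shell
# Sketch only: writes a two-step workflow definition to a scratch directory.
# demo-wf, first-job, second-job and the com.example.* classes are hypothetical.
app_dir="$(mktemp -d)"

# The quoted heredoc delimiter keeps ${jobTracker}, ${nameNode} and the EL
# function literal, so Oozie (not the shell) resolves them at submit time.
cat > "${app_dir}/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.5" name="demo-wf">
    <start to="first-job"/>
    <action name="first-job">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.mapper.class</name>
                    <value>com.example.FirstMapper</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="second-job"/>
        <error to="fail"/>
    </action>
    <action name="second-job">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.mapper.class</name>
                    <value>com.example.SecondMapper</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed at [${wf:lastErrorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

echo "wrote ${app_dir}/workflow.xml"
```

    Each action names its successor through the ok and error transitions, which is what makes the definition a graph; second-job only starts after first-job succeeds. Once the server from the installation section is running, a definition like this would typically be uploaded to HDFS and submitted with the oozie job command.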

    II. Installation

    1. Download the prebuilt CDH release

    http://archive.cloudera.com/cdh5/cdh/5/

    Downloading the 4.1.0-cdh5.11.1 build of Oozie is sufficient.

    2. Stop HBase and ZooKeeper first

    bin/hbase-daemon.sh stop master
    bin/hbase-daemon.sh stop regionserver
    bin/hbase-daemon.sh stop zookeeper
    3. Then stop the Hadoop cluster
    sbin/stop-dfs.sh
    sbin/stop-yarn.sh
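
    Before continuing, it can help to confirm that nothing is still running. The helper below is a small sketch, not part of the original procedure: it filters the output of jps (shipped with the JDK) for the usual HBase and Hadoop daemon names. The list of names is an assumption about a typical setup and may need adjusting.

```shell
# Reads `jps` output on stdin and prints any HBase/Hadoop daemons still listed.
# Returns 0 when none of the listed daemons are present, 1 otherwise.
remaining_daemons() {
  daemon_pattern='^[0-9]+ (HMaster|HRegionServer|HQuorumPeer|NameNode|DataNode|SecondaryNameNode|ResourceManager|NodeManager)$'
  daemons_found="$(grep -E "${daemon_pattern}" || true)"
  if [ -n "${daemons_found}" ]; then
    printf '%s\n' "${daemons_found}"
    return 1
  fi
  return 0
}

# Usage on a cluster node:
#   jps | remaining_daemons && echo "all daemons stopped"
```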
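    4. Extract the Oozie tarball to a local directory
    5. Configure a Hadoop proxy user in core-site.xml
    (in the property names below, the "hadoop" that follows "proxyuser." is the OS user that runs the Oozie server)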
    <!-- OOZIE -->
    <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>hadoop001</value>
    </property>
    <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
    </property>
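
    A quick way to confirm that both properties made it into the file is sketched below. The function name is hypothetical; it only checks that the property names are present and does not validate the values.

```shell
# Succeeds only if both proxyuser properties for the given user appear
# in the given core-site.xml-style file. Uses fixed-string matching so the
# dots in the property names are not treated as regex wildcards.
has_proxyuser() {
  conf_file="$1"
  proxy_user="$2"
  grep -Fq "<name>hadoop.proxyuser.${proxy_user}.hosts</name>" "${conf_file}" &&
    grep -Fq "<name>hadoop.proxyuser.${proxy_user}.groups</name>" "${conf_file}"
}

# Usage, assuming the Hadoop config directory used later in this article:
#   has_proxyuser /home/hadoop/app/hadoop/etc/hadoop/core-site.xml hadoop \
#     && echo "proxy user configured"
```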
    

     

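    6. In the extracted root directory, extract oozie-hadooplibs-4.1.0-cdh5.11.1.tar.gz into the current directory; this creates one more directory, oozie-4.1.0-cdh5.11.1

    7. Create a libext directory under the Oozie root directory

    Copy the jars that were just extracted into libext: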

    cp -r ./oozie-4.1.0-cdh5.11.1/hadooplibs/hadooplib-2.6.0-cdh5.11.1.oozie-4.1.0-cdh5.11.1/* ~/app/oozie/libext/

    8. Copy ext-2.2.zip into the libext directory

    9. Package Oozie into a WAR file

    bin/oozie-setup.sh prepare-war

    This command bundles the jars under libext into the WAR file.
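
    Because prepare-war only packages what is already in libext, a pre-flight check can save a rebuild. The sketch below is an illustration under two assumptions: that the Hadoop jars copied in step 7 match hadoop-*.jar, and that the MySQL JDBC driver (needed for step 13) is supplied through libext under a name like mysql-connector-java-*.jar.

```shell
# Checks a libext directory before running bin/oozie-setup.sh prepare-war.
# Returns 1 if a required file is missing; a missing MySQL driver only warns.
check_libext() {
  libext_dir="$1"
  rc=0
  if [ ! -f "${libext_dir}/ext-2.2.zip" ]; then
    echo "missing: ext-2.2.zip" >&2
    rc=1
  fi
  # Assumption: the jars copied from hadooplibs include hadoop-*.jar files.
  if ! ls "${libext_dir}"/hadoop-*.jar >/dev/null 2>&1; then
    echo "missing: hadoop-*.jar" >&2
    rc=1
  fi
  # Assumption: the JDBC driver is delivered via libext under this name.
  if ! ls "${libext_dir}"/mysql-connector-java-*.jar >/dev/null 2>&1; then
    echo "warning: no mysql-connector-java-*.jar found" >&2
  fi
  return "${rc}"
}

# Usage:
#   check_libext ~/app/oozie/libext && bin/oozie-setup.sh prepare-war
```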

    10. Start Hadoop

    sbin/start-dfs.sh

    sbin/start-yarn.sh

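    11. Edit oozie-site.xml and add the configuration below. (Newer Oozie versions ship both oozie-default.xml and oozie-site.xml. To change a setting, copy the property into oozie-site.xml; do not edit oozie-default.xml directly, or the change will not take effect.)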

    
    
    <property>
        <name>oozie.service.WorkflowAppService.system.libpath</name>
        <value>/user/oozie/share/lib</value>
        <description>
            System library path to use for workflow applications.
            This path is added to a workflow application if its job properties set
            the property 'oozie.use.system.libpath' to true.
        </description>
    </property>

    <property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=/home/hadoop/app/hadoop/etc/hadoop</value>
        <description>
            Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
            the Hadoop service (JobTracker, YARN, HDFS). The wildcard '*' configuration is
            used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
            the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
            the Oozie configuration directory; the path can also be absolute (i.e. to point
            to Hadoop client conf/ directories in the local filesystem).
        </description>
    </property>

    <property>
        <name>oozie.processing.timezone</name>
        <value>GMT+0800</value>
        <description>
            Oozie server timezone. Valid values are UTC and GMT(+/-)####, for example 'GMT+0530' would be India
            timezone. All dates parsed and generated by Oozie Coordinator/Bundle will be done in the specified
            timezone. The default value of 'UTC' should not be changed under normal circumstances. If for any reason
            it is changed, note that GMT(+/-)#### timezones do not observe DST changes.
        </description>
    </property>
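
    The timezone description above only allows two shapes of value, so a typo such as GMT+8 is easy to catch before restarting the server. The helper below is a hypothetical sketch that checks a value against that documented format and nothing more.

```shell
# Returns 0 if the value matches the format documented for
# oozie.processing.timezone: UTC, or GMT followed by a sign and four digits.
valid_oozie_tz() {
  printf '%s\n' "$1" | grep -Eq '^(UTC|GMT[+-][0-9]{4})$'
}

# Usage:
#   valid_oozie_tz 'GMT+0800' && echo "timezone value looks well-formed"
```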

    
    

      

    
    

    12. Upload the shared library (sharelib) to HDFS

    bin/oozie-setup.sh sharelib create -fs hdfs://hadoop004:8020 -locallib oozie-sharelib-4.1.0-cdh5.11.1-yarn.tar.gz
    13. Configure MySQL as the Oozie database
    Add the following to oozie-site.xml:

    
    
    <property>
        <name>oozie.service.JPAService.jdbc.driver</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>
            JDBC driver class.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.url</name>
        <value>jdbc:mysql://hadoop001:3306/oozie?createDatabaseIfNotExist=true</value>
        <description>
            JDBC URL.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.username</name>
        <value>root</value>
        <description>
            DB user name.
        </description>
    </property>

    <property>
        <name>oozie.service.JPAService.jdbc.password</name>
        <value>123456</value>
        <description>
            DB user password.

            IMPORTANT: if the password is empty, leave a 1 space string; the service trims the value,
                       and if it is empty, Configuration assumes it is NULL.
        </description>
    </property>
    
    
    

      

     

    Use the following command to create the table schema and initial data in the database:

    bin/ooziedb.sh create -sqlfile oozie.sql -run

    14. Start Oozie

    bin/oozied.sh start

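    15. Access the web console:

    Oozie is now reachable at hadoop001:11000 (the web console lives under http://hadoop001:11000/oozie).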
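
    The server can also be checked from the command line. The oozie admin -status command prints a line of the form "System mode: NORMAL" when the server is healthy; the helper below is a small sketch (the function name is hypothetical) that turns that output into an exit status for use in scripts.

```shell
# Reads the output of `oozie admin ... -status` on stdin and
# succeeds only if the server reports NORMAL mode.
oozie_is_normal() {
  grep -q '^System mode: NORMAL'
}

# Usage from the Oozie root directory:
#   bin/oozie admin -oozie http://hadoop001:11000/oozie -status | oozie_is_normal \
#     && echo "Oozie is up"
```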
  • Original article: https://www.cnblogs.com/nicekk/p/9043486.html