• Configuring Oozie 4.1.0 with Hadoop 2.5.2


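    I finally got Oozie, the mysterious "elephant tamer", installed and configured. It had been troubling me for several days, but seeing the console page finally come up made it all worth it!

    Enough talk. Here is how the compile and install went: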

    (Prerequisites: a Hadoop 2.5.2 HA environment was already up, along with Hive, HBase, Flume, and Storm.

      Linux environment: CentOS 6.5, 64-bit

      JDK: 1.7

      MySQL: already installed

      Apache Maven 3.1.1

    Download the Oozie package oozie-4.1.0.tar.gz from http://mirror.bit.edu.cn/apache/oozie/

    Download ext-2.2.zip; the page http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html links to ExtJS

    )

    1. Compiling

    From http://mirrors.cnnic.cn/apache/oozie/4.2.0/

    I downloaded oozie-4.2.0.tar.gz

    and then unpacked it:

    tar -zxvf oozie-4.2.0.tar.gz 

    cd oozie-4.2.0/bin

    ./mkdistro.sh -DskipTests -Phadoop-2 -Dhadoop.auth.version=2.5.2 -Ddistcp.version=2.5.2 -Dsqoop.version=1.4.4 -Dhive.version=0.13.1 \
        -Dtomcat.version=7.0.52

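    Then came a long wait. Network problems kept interrupting the build, so I re-ran the command above again and again. Eventually it got as far as the point below,

     where the build failed: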
     
    [INFO] ------------------------------------------------------------------------
    [INFO] Reactor Summary:
    [INFO]
    [INFO] Apache Oozie Main ................................. SUCCESS [6.824s]
    [INFO] Apache Oozie Hadoop Utils hadoop-2-4.2.0 .......... SUCCESS [9.525s]
    [INFO] Apache Oozie Hadoop Distcp hadoop-2-4.2.0 ......... SUCCESS [0.444s]
    [INFO] Apache Oozie Hadoop Auth hadoop-2-4.2.0 Test ...... SUCCESS [1.027s]
    [INFO] Apache Oozie Hadoop Libs .......................... SUCCESS [0.101s]
    [INFO] Apache Oozie Client ............................... SUCCESS [5:08.683s]
    [INFO] Apache Oozie Share Lib Oozie ...................... SUCCESS [9.351s]
    [INFO] Apache Oozie Share Lib HCatalog ................... SUCCESS [11.656s]
    [INFO] Apache Oozie Share Lib Distcp ..................... SUCCESS [3.151s]
    [INFO] Apache Oozie Core ................................. SUCCESS [3:53.804s]
    [INFO] Apache Oozie Share Lib Streaming .................. SUCCESS [13.230s]
    [INFO] Apache Oozie Share Lib Pig ........................ SUCCESS [15.454s]
    [INFO] Apache Oozie Share Lib Hive ....................... SUCCESS [13.747s]
    [INFO] Apache Oozie Share Lib Hive 2 ..................... SUCCESS [14.417s]
    [INFO] Apache Oozie Share Lib Sqoop ...................... SUCCESS [5.546s]
    [INFO] Apache Oozie Examples ............................. SUCCESS [10.178s]
    [INFO] Apache Oozie Share Lib Spark ...................... SUCCESS [15.450s]
    [INFO] Apache Oozie Share Lib ............................ SUCCESS [52.422s]
    [INFO] Apache Oozie Docs ................................. FAILURE [9.477s]
    [INFO] Apache Oozie WebApp ............................... SKIPPED
    [INFO] Apache Oozie Tools ................................ SKIPPED
    [INFO] Apache Oozie MiniOozie ............................ SKIPPED
    [INFO] Apache Oozie Distro ............................... SKIPPED
    [INFO] Apache Oozie ZooKeeper Security Tests ............. SKIPPED
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD FAILURE
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 12:21.113s
    [INFO] Finished at: Wed Oct 26 05:39:28 CST 2016
    [INFO] Final Memory: 174M/482M
    [INFO] ------------------------------------------------------------------------
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-site-plugin:2.0-beta-6:site (default) on project oozie-docs: The site descriptor cannot be resolved from the repository: Could not transfer artifact org.apache:apache:xml:site_en:16 from/to Codehaus repository (http://repository.codehaus.org/): repository.codehaus.org: Name or service not known
    [ERROR] org.apache:apache:xml:16
    [ERROR]
    [ERROR] from the specified remote repositories:
    [ERROR] central (http://repo1.maven.org/maven2, releases=true, snapshots=false),
    [ERROR] ce d (https://repository.cloudera.com/cloudera/ext-release-local/, releases=true, snapshots=false),
    [ERROR] Codehaus repository (http://repository.codehaus.org/, releases=true, snapshots=false),
    [ERROR] cloudera com (https://repository.cloudera.com/content/repositories/releases/, releases=true, snapshots=false),
    [ERROR] central maven (http://central.maven.org/maven2/, releases=true, snapshots=false),
    [ERROR] apache.snapshots.repo (https://repository.apache.org/content/groups/snapshots, releases=true, snapshots=true),
    [ERROR] datanucleus (http://www.datanucleus.org/downloads/maven2, releases=true, snapshots=false),
    [ERROR] apache.snapshots (http://repository.apache.org/snapshots, releases=false, snapshots=true): Unknown host repository.codehaus.org: Name or service not known
    [ERROR] -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please read the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn <goals> -rf :oozie-docs
    
    ERROR, Oozie distro creation failed
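
    Re-running the build by hand after each network hiccup can be automated. The following is a minimal sketch of a retry wrapper; the function name retry and the attempt count and delay are illustrative assumptions, not part of Oozie or Maven. It only helps with transient failures; a host that never resolves will fail on every attempt.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# (Hypothetical helper for illustration; not shipped with Oozie.)
retry() {
    local max=$1 delay=$2 attempt=1
    shift 2
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage (commented out so that sourcing this file does not start a build):
# cd oozie-4.2.0/bin
# retry 5 30 ./mkdistro.sh -DskipTests -Phadoop-2 -Dhadoop.auth.version=2.5.2
```

    Maven's own hint in the log, mvn <goals> -rf :oozie-docs, resumes from the failed module instead of starting over, which may save time when calling mvn directly.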
    

      

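    Out of options, I then tried compiling version 3.3.2 instead. The result:

    that build also got stuck and could not go any further.

    At this point I went online to look for the cause.

    Everything I found said it was a problem with the Maven repository addresses, so I changed the repository configuration.

    In the Oozie root directory,

    in pom.xml, I changed the repositories inside <repositories></repositories> as follows: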

    <repositories>
        <repository>
            <id>cloudera com</id>
            <url>https://repository.cloudera.com/content/repositories/releases/</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>

        <repository>
            <id>central</id>
            <url>http://repo1.maven.org/maven2</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>

        <repository>
            <id>central maven</id>
            <url>http://central.maven.org/maven2/</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>

        <repository>
            <id>Codehaus repository</id>
            <url>http://repository.codehaus.org/</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>

        <repository>
            <id>apache.snapshots.repo</id>
            <url>https://repository.apache.org/content/groups/snapshots</url>
            <name>Apache Snapshots Repository</name>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>

        <repository>
            <id>datanucleus</id>
            <url>http://www.datanucleus.org/downloads/maven2</url>
            <name>Datanucleus</name>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
    

     

    After that I ran the build again:

    [INFO] Apache Oozie Docs ................................. FAILURE [9.477s] 

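    Pinging the addresses seemed to work, so I could not tell why the build kept failing. Note, though, that the log above reports repository.codehaus.org as an unknown host: the Codehaus hosting service was shut down in 2015, so that repository can no longer be reached.

    I tried several more times without success. Something presumably still needs to change; I had no fix at the time and will come back to it later.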
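
    Rather than pinging addresses one at a time, the hosts named in pom.xml can be checked for DNS resolution in one pass. This is a rough sketch under a few assumptions: the helper names url_host and check_repo_hosts are made up for illustration, getent is available (it is on a typical CentOS install), and every <url> element in the file is checked, not only those inside <repositories>.

```shell
#!/usr/bin/env bash
# url_host URL: print the host part of an http(s) URL.
url_host() {
    printf '%s\n' "$1" | sed -E 's#^[a-zA-Z]+://([^/:]+).*#\1#'
}

# check_repo_hosts POM: for every <url>...</url> found in POM, report
# whether its host name resolves. (Illustrative helper, not a Maven feature.)
check_repo_hosts() {
    grep -o '<url>[^<]*</url>' "$1" | sed -e 's#<url>##' -e 's#</url>##' |
    while read -r url; do
        host=$(url_host "$url")
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "OK    $host"
        else
            echo "FAIL  $host"
        fi
    done
}

# Usage:
# check_repo_hosts pom.xml
```

    A host reported as FAIL here matches the "Unknown host" error from Maven, which points to name resolution rather than to the repository contents.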

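    2. Changing approach: use a prebuilt distribution

    So I used Cloudera's build from

    http://archive.cloudera.com/cdh5/cdh/5/

    specifically http://archive.cloudera.com/cdh5/cdh/5/oozie-4.1.0-cdh5.8.2.tar.gz

    After downloading, unpack it: tar -zxvf oozie-4.1.0-cdh5.8.2.tar.gz

    This build targets Hadoop 2.6,

    so I switched it over to my own Hadoop version, 2.5.2.

     The exact steps are below (many thanks to the author of the reference).

    Reference: http://www.mamicode.com/info-detail-490284.html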

    1.      Unpack

    cp oozie-4.1.0-distro.tar.gz /home/hadoop

    cd /home/hadoop

    tar xvzf oozie-4.1.0-distro.tar.gz

    /home/hadoop/oozie-4.1.0 is now the Oozie root directory.

    2.      Set environment variables

    vi  /etc/profile

    export OOZIE_HOME=/home/hadoop/oozie-4.1.0
    
    export PATH=$PATH:$OOZIE_HOME/bin
    

    Afterwards, run source /etc/profile to make the changes take effect.

    3.      Add the jar files

    Create a libext folder under OOZIE_HOME:

    mkdir libext

    Copy all of the Hadoop jars into that directory:

    cp  $HADOOP_HOME/share/hadoop/*/hadoop-*.jar  ./libext/

    cp  $HADOOP_HOME/share/hadoop/*/lib/*.jar  ./libext/

    cp  mysql-connector-java-5.1.29-bin.jar   ./libext/

    Delete jasper*.jar, servlet-api.jar, and jsp-api.jar from libext. They conflict with the jars under oozie-4.0.1/oozie-server/lib/, and if they are left in, the WAR reports:

    org.eclipse.jdt.internal.compiler.CompilationResult.getProblems()[Lorg/eclipse/jdt/core/compiler/IProblem
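
    The deletion step is described above without a command. One possible way to do it is sketched here; the function name remove_conflicting_jars is hypothetical, and the trailing wildcards are an assumption so that versioned file names (for example servlet-api-2.5.jar) are matched too.

```shell
#!/usr/bin/env bash
# remove_conflicting_jars DIR: delete the jars named in the text from DIR.
# The * before .jar is an assumption, added to match versioned file names.
remove_conflicting_jars() {
    local dir=$1
    rm -f -- "$dir"/jasper*.jar "$dir"/servlet-api*.jar "$dir"/jsp-api*.jar
}

# Usage:
# remove_conflicting_jars "$OOZIE_HOME/libext"
```

    Listing the directory first with ls libext | grep -E 'jasper|servlet-api|jsp-api' shows what would be removed.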

    4.       Build the WAR

    bin/oozie-setup.sh prepare-war

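    This generates $OOZIE_HOME/oozie-server/webapps/oozie.war.

    Unzipping ext-2.2.zip produces an ext-2.2 folder, which has to be packed into oozie.war. The reference author's approach: after the service is started later on, oozie.war is unpacked into an oozie directory, and ext-2.2 can then be dropped straight into it.

    (My approach was to download oozie.war to my desktop, open it with an archive tool, and drag ext-2.2.zip into it. I later found this was unnecessary: when I opened the WAR, the files were already there.)

    Notes: 1. I saw online that the following command can generate oozie.war with ext-2.2.zip already packed into it: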

    ./addtowar.sh -inputwar $OOZIE_HOME/oozie.war -outputwar $OOZIE_HOME/oozie-server/webapps/oozie.war -hadoop 2.3.0  $HADOOP_HOME -extjs /home/oozie/ext-2.2.zip

    2. You need the zip and unzip commands, otherwise an error is reported. As root, install them with yum -y install unzip and yum -y install zip.
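
    Before building the WAR it may be worth confirming that both tools are on the PATH. A minimal sketch; the function name missing_cmds is an illustrative assumption.

```shell
#!/usr/bin/env bash
# missing_cmds NAME...: print each command that is not found on PATH.
# Returns non-zero if at least one is missing.
missing_cmds() {
    local status=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return $status
}

# Usage:
# missing_cmds zip unzip || echo "install the commands listed above with yum"
```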

    5.      Edit the configuration

    vi $OOZIE_HOME/conf/oozie-site.xml

    <property>
    
       <name>oozie.service.JPAService.jdbc.driver</name>
    
       <value>com.mysql.jdbc.Driver</value>
    
        <description>
    
            JDBC driver class.
    
        </description>
    
    </property>
    
    <property>
    
       <name>oozie.service.JPAService.jdbc.url</name>
    
       <value>jdbc:mysql://mysql-server:3306/oozie</value>
    
        <description>
    
            JDBC URL.
    
        </description>
    
    </property>
    
    <property>
    
       <name>oozie.service.JPAService.jdbc.username</name>
    
        <value>root</value>
    
        <description>
    
            DB user name.
    
        </description>
    
    </property>
    
    <property>
    
       <name>oozie.service.JPAService.jdbc.password</name>
    
        <value>mapengbo</value>
    
        <description>
    
            DB user password.
    
        </description>
    
    </property>
    

      

    6.      Create the database

    Create a database named oozie and grant privileges on it:

    CREATE DATABASE oozie;

    GRANT ALL ON oozie.* TO 'shirdrn'@'oozie-server' IDENTIFIED BY '0o21e';

    FLUSH PRIVILEGES;

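    Generate the required database tables and run the script:

    bin/ooziedb.sh create -sqlfile oozie.sql -run

    Check that the Oozie tables have been created in the oozie database.

    7.      Start the service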

    bin/oozied.sh start

    Open the console at http://hadoop1:11000/oozie (hadoop1 is my hostname).

    IV. Configure Hadoop's JobHistory server and proxy user

    Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml

    and $OOZIE_HOME/conf/hadoop-conf/core-site.xml, adding the following configuration:

    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>node3:10020</value>
    </property>

    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>node3:19888</value>
    </property>

    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>${hadoop.tmp.dir}/mr/history-tmp</value>
    </property>

    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>${hadoop.tmp.dir}/mr/history-done</value>
    </property>
    

      

    The following needs to be added to Hadoop's core-site.xml:

     <property>
    
                    <name>hadoop.proxyuser.root.hosts</name>
    
                    <value>*</value>
    
             </property>
    
             <property>
    
                    <name>hadoop.proxyuser.root.groups</name>
    
                    <value>*</value>
    
             </property>
    

      

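             Here root is the Hadoop user. hadoop.proxyuser.root.hosts lists the hosts from which that user may impersonate other users, and hadoop.proxyuser.root.groups lists the groups whose members it may impersonate. Restart Hadoop once the configuration is done.

       In general the properties take the form hadoop.proxyuser.[USER].hosts and hadoop.proxyuser.[USER].groups.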
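
    To make the [USER] pattern concrete, here is a small sketch that prints the two properties for any user name. The function name proxyuser_xml and the example user oozie are assumptions for illustration; the wildcard values mirror the ones used above and allow every host and group.

```shell
#!/usr/bin/env bash
# proxyuser_xml USER: print the hosts and groups proxyuser properties for USER,
# ready to paste into core-site.xml. (Illustrative helper only.)
proxyuser_xml() {
    local user=$1 suffix
    for suffix in hosts groups; do
        cat <<EOF
<property>
    <name>hadoop.proxyuser.${user}.${suffix}</name>
    <value>*</value>
</property>
EOF
    done
}

# Usage:
# proxyuser_xml oozie
```

    The user name should be the account that the Oozie server runs as; in this setup that is root.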

     

    Start the Hadoop JobHistory service:

             $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver    // I restarted the whole Hadoop cluster instead

             Restart Oozie:

             bin/oozied.sh start

    V. Client test

    tar -zxvf oozie-client-4.1.0.tar.gz   // I used the client built during my earlier oozie 4.2.0 compile; the Cloudera download did not include one

                  // Download link: http://pan.baidu.com/s/1eSBOdEi  password: q1nw

    tar -zxvf oozie-examples.tar.gz

    tar -zxvf oozie-sharelib-4.1.0.tar.gz

    hdfs dfs -put examples  hdfs://myserver/user/hadoop/

    hdfs dfs -put share  /user/hadoop/      // I later found this did not work; oozie-site.xml has to point to a local directory path instead,

    // by setting oozie.service.WorkflowAppService.system.libpath

    A. Edit $OOZIE_HOME/conf/oozie-site.xml and add the following:

    <property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>file:///home/${user.name}/oozie-4.1.0-cdh5.8.2/share/lib</value>
    </property>
    

    B. Edit $OOZIE_HOME/conf/hadoop-conf/core-site.xml and add the following:

    <property>

       <name>yarn.resourcemanager.address</name>

       <!-- Must match the Hadoop configuration. I found the value at
            http://YOUR-MAPREDUCE-HOSTNAME:8088/conf and set it accordingly;
            the same applies to the property below. -->
       <value>node1:8032</value>

    </property>

    <property>

         <name>yarn.resourcemanager.scheduler.address</name>

          <value>node1:8030</value>

    </property>
    

      

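    C. Change the oozie.service.HadoopAccessorService.hadoop.configurations property, setting its value to *=HADOOP_HOME/etc/hadoop (with HADOOP_HOME written out as the actual Hadoop installation path).

         // I did not configure this one myself; see how this author set it up: http://heylinux.com/archives/2836.html

    D. Edit $OOZIE_HOME/examples/apps/map-reduce/job.properties. (YARN no longer has a JobTracker, so set jobTracker below to the value of yarn.resourcemanager.address. oozie.wf.application.path is the HDFS path of the Oozie example application.)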

        nameNode=hdfs://node1:9000
    
        jobTracker=node1:8032
    
        queueName=default
    
        examplesRoot=examples
    
        oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce
    
        outputDir=map-reduce
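
    Since jobTracker has to equal yarn.resourcemanager.address, a quick consistency check can catch a mismatch before the job is submitted. The sketch below uses only sed and awk; the helper names prop_value, xml_value, and check_jobtracker are hypothetical, and the XML lookup is a line-based approximation that assumes each <name> and <value> element sits on a single line, as in the snippets above.

```shell
#!/usr/bin/env bash
# prop_value FILE KEY: print the value of KEY from a Java-style properties file.
prop_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# xml_value FILE NAME: print the <value> that follows <name>NAME</name>
# in a Hadoop-style *-site.xml file (line-based, not a real XML parser).
xml_value() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1 }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
        }' "$1"
}

# check_jobtracker PROPS SITE: succeed only if jobTracker in PROPS equals
# yarn.resourcemanager.address in SITE.
check_jobtracker() {
    [ "$(prop_value "$1" jobTracker)" = "$(xml_value "$2" yarn.resourcemanager.address)" ]
}

# Usage:
# check_jobtracker "$OOZIE_HOME/examples/apps/map-reduce/job.properties" \
#     "$OOZIE_HOME/conf/hadoop-conf/core-site.xml" || echo "jobTracker mismatch"
```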
    

      

    Run the oozie script from $OOZIE_HOME/oozie-client-4.0.1/bin to execute the workflow:

       ./oozie job -oozie http://node3:11000/oozie -config $OOZIE_HOME/examples/apps/map-reduce/job.properties -run
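
    After submission, the job can also be followed from the command line. This sketch assumes the -run command prints a line of the form "job: <id>"; the helper name job_id_from_output is made up for illustration. OOZIE_URL is the environment variable the Oozie client reads as the default for the -oozie option.

```shell
#!/usr/bin/env bash
# job_id_from_output: read the output of `oozie job ... -run` on stdin
# and print the job id. Assumes a line of the form "job: <id>".
job_id_from_output() {
    sed -n 's/^job:[[:space:]]*//p' | head -n 1
}

# Usage (commented out; needs a running Oozie server):
# export OOZIE_URL=http://node3:11000/oozie
# id=$(./oozie job -config "$OOZIE_HOME/examples/apps/map-reduce/job.properties" -run | job_id_from_output)
# ./oozie job -info "$id"
```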

    Open the console at http://hadoop1:11000/oozie

    Done!

  • Original article: https://www.cnblogs.com/nucdy/p/6010589.html