• You are currently running the HMaster without HDFS append support enabled. This may result in data loss. Please see the


    Hadoop version            HBase version   Compatible?
    0.20.2 release            0.90.2          NO
    0.20-append               0.90.2          YES
    0.21.0 release            0.90.2          NO
    0.22.x (in development)   0.90.2          NO

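    As the table above shows, HBase 0.90.2 is not compatible with the Hadoop 0.20.2 release. The combination will run, but in a production environment it can lose data.

    For example, the HBase web UI shows the following warning: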

    You are currently running the HMaster without HDFS append support enabled. This may result in data loss. Please see the HBase wiki for details. 

    As of today, Hadoop 0.20.2 is the latest stable release of Apache Hadoop that is marked as ready for production (neither 0.21 nor 0.22 are). 

    Unfortunately, the Hadoop 0.20.2 release is not compatible with the latest stable version of HBase: if you run HBase on top of Hadoop 0.20.2, you risk losing data! Hence HBase users are required to build their own Hadoop 0.20.x version if they want to run HBase on a production cluster of Hadoop. In this article, I describe how to build such a production-ready version of Hadoop 0.20.x that is compatible with HBase 0.90.2.

    The official HBase book also mentions this:

    This version of HBase will only run on Hadoop 0.20.x. It will not run on hadoop 0.21.x (nor 0.22.x). HBase will lose data unless it is running on an HDFS that has a durable sync. Currently only the branch-0.20-append branch has this attribute [1]. No official releases have been made from this branch up to now so you will have to build your own Hadoop from the tip of this branch. Check it out using this url, branch-0.20-append. Scroll down in the Hadoop How To Release to the section Build Requirements for instruction on how to build Hadoop.

    Or rather than build your own, you could use Cloudera's CDH3. CDH has the 0.20-append patches needed to add a durable sync (CDH3 betas will suffice; b2, b3, or b4).

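    This article therefore describes how to build Hadoop's append branch and integrate the result into a Hadoop release installation.

    First install git (a version control tool similar to svn).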

    $ apt-get install git

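    Use git to fetch the source code and create a local repository. The download takes a long time.

    $ git clone git://git.apache.org/hadoop-common.git

    Enter the repository:

    $ cd hadoop-common

    The fresh clone shows only Hadoop's latest trunk code. git has in fact fetched every branch, so you have to switch to the append branch manually: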

    $ git checkout -t remotes/origin/branch-0.20-append

    You are now on the append branch.
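
    One way to confirm the switch is to ask git which branch HEAD points at. The sketch below is only an illustration: current_branch and on_append_branch are hypothetical helper names (not part of git or Hadoop), and it assumes a git version that supports symbolic-ref --short.

```shell
# Print the name of the branch currently checked out in working copy DIR.
current_branch() {
    (cd "$1" && git symbolic-ref --short HEAD)
}

# Succeed only if DIR is on the append branch named in this article.
on_append_branch() {
    [ "$(current_branch "$1")" = "branch-0.20-append" ]
}
```

    For example, on_append_branch hadoop-common && echo "ready to build" prints the message only when the checkout worked.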

    On this branch we can prepare the build.

    First create build.properties in the hadoop-common directory with the following content:

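    resolvers=internal
    # Set this to the version string you want; the jar names listed later in this article correspond to version=0.20-append-for-hbase
    version=0.20.2
    project.version=${version}
    hadoop.version=${version}
    hadoop-core.version=${version}
    hadoop-hdfs.version=${version}
    hadoop-mapred.version=${version}

    Still in the hadoop-common directory, confirm one last time that you are on the append branch:

    $ git checkout branch-0.20-append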

    The directory contents are now completely different: you are on the append branch.
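
    The build.properties file described above can also be generated from the shell, which avoids stray trailing spaces after ${version} (property files keep trailing whitespace as part of the value). This is a minimal sketch: write_build_properties is a hypothetical helper name, and it assumes the resolvers value is meant to be internal.

```shell
# Write build.properties into DIR (the hadoop-common checkout).
# VERSION is whatever version string you want the build to carry.
write_build_properties() {
    dir=$1 version=$2
    # \$ keeps ${version} literal so that Ant, not the shell, expands it.
    cat > "$dir/build.properties" <<EOF
resolvers=internal
version=$version
project.version=\${version}
hadoop.version=\${version}
hadoop-core.version=\${version}
hadoop-hdfs.version=\${version}
hadoop-mapred.version=\${version}
EOF
}
```

    Called as write_build_properties . 0.20.2 from inside hadoop-common, it writes the seven lines shown earlier.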

    Now for the build itself. Install ant first.

    Start the build. It takes a while to finish (about 4 minutes).

    $ ant mvn-install

    Note: if you need to re-run this command, first remove the previously generated files:

    $ rm -rf $HOME/.m2/repository

    Then, in the hadoop-common directory, run:

    $ ant clean-cache

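    Once the build has finished, the next stage is testing.

    # Optional: run the full test suite or just the core test suite
    $ ant test
    $ ant test-core

    The first command runs the full test suite; the second tests only the core functionality.

    ant test takes a very long time: about 10 hours on a non-server machine.

    Where can you find the resulting jar files?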

    $ find $HOME/.m2/repository -name "hadoop-*.jar"
    .../repository/org/apache/hadoop/hadoop-examples/0.20-append-for-hbase/hadoop-examples-0.20-append-for-hbase.jar
    .../repository/org/apache/hadoop/hadoop-test/0.20-append-for-hbase/hadoop-test-0.20-append-for-hbase.jar
    .../repository/org/apache/hadoop/hadoop-tools/0.20-append-for-hbase/hadoop-tools-0.20-append-for-hbase.jar
    .../repository/org/apache/hadoop/hadoop-streaming/0.20-append-for-hbase/hadoop-streaming-0.20-append-for-hbase.jar
    .../repository/org/apache/hadoop/hadoop-core/0.20-append-for-hbase/hadoop-core-0.20-append-for-hbase.jar

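    The next step is to replace the old jars with the new ones (this assumes you already have a Hadoop 0.20.2 release installation set up):

    1. Replace the old Hadoop jars;

    2. Replace the jars in HBase's lib folder.

    Note that the replacement jars must be renamed.

    The Hadoop 0.20.2 release names its jars hadoop-VERSION-PACKAGE.jar, for example hadoop-0.20.2-examples.jar

    The newly built jars are named hadoop-PACKAGE-VERSION.jar, for example hadoop-examples-0.20-append-for-hbase.jar

    So rename them as follows: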

    hadoop-examples-0.20-append-for-hbase.jar  --> hadoop-0.20-append-for-hbase-examples.jar
    hadoop-test-0.20-append-for-hbase.jar      --> hadoop-0.20-append-for-hbase-test.jar
    hadoop-tools-0.20-append-for-hbase.jar     --> hadoop-0.20-append-for-hbase-tools.jar
    hadoop-streaming-0.20-append-for-hbase.jar --> hadoop-0.20-append-for-hbase-streaming.jar
    hadoop-core-0.20-append-for-hbase.jar      --> hadoop-0.20-append-for-hbase-core.jar
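
    The renaming above is mechanical, so it can be scripted. The following is a sketch under stated assumptions rather than a tested deployment script: hadoop_release_name and copy_renamed_jars are hypothetical helper names, and the destination directory is whichever directory holds the jars of your existing Hadoop installation.

```shell
# Map hadoop-PACKAGE-VERSION.jar (build output) to
# hadoop-VERSION-PACKAGE.jar (Hadoop 0.20.2 release naming).
hadoop_release_name() {
    version=$1
    base=${2##*/}                 # strip the directory part
    pkg=${base#hadoop-}           # drop the "hadoop-" prefix
    pkg=${pkg%-"$version".jar}    # drop the "-VERSION.jar" suffix
    printf 'hadoop-%s-%s.jar\n' "$version" "$pkg"
}

# Copy every matching jar found under REPO into DEST under its release-style name.
copy_renamed_jars() {
    version=$1 repo=$2 dest=$3
    find "$repo" -name "hadoop-*-$version.jar" | while IFS= read -r jar; do
        cp "$jar" "$dest/$(hadoop_release_name "$version" "$jar")"
    done
}
```

    For example, copy_renamed_jars 0.20-append-for-hbase "$HOME/.m2/repository" /path/to/hadoop copies the five jars under their new names. The originals stay in the Maven repository, so they remain available for the HBase step.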

    HBase, by contrast, uses the hadoop-PACKAGE-VERSION.jar naming scheme, so jars copied into $HBASE_HOME/lib do not need to be renamed; keep their original names.
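
    The HBase side of the swap might look like the sketch below. It assumes that the only Hadoop jar bundled in HBase's lib directory is the core jar and that it matches hadoop-core-*.jar; check your own lib directory before relying on that. replace_hbase_hadoop_jar is a hypothetical helper name, and the old jar is moved to a backup directory rather than deleted.

```shell
# Swap the Hadoop core jar inside an HBase lib directory.
# NEW_JAR keeps its build-time name (hadoop-PACKAGE-VERSION.jar).
replace_hbase_hadoop_jar() {
    new_jar=$1 hbase_lib=$2 backup_dir=$3
    mkdir -p "$backup_dir"
    # Assumption: the bundled jar(s) match hadoop-core-*.jar.
    for old in "$hbase_lib"/hadoop-core-*.jar; do
        [ -e "$old" ] || continue      # glob matched nothing
        mv "$old" "$backup_dir/"
    done
    cp "$new_jar" "$hbase_lib/"
}
```

    The new jar should come from outside the lib directory (for example straight from the Maven repository), otherwise the backup step would move it away as well.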

  • Original article: https://www.cnblogs.com/ylqmf/p/2371669.html