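Note: the links below are references I found recently while setting up a Hadoop cluster and developing MapReduce applications. I used Hadoop 2.6.0, and each of the articles below was partly consulted during the cluster configuration.
Hadoop cluster configuration: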
---------------------------------------------------------------------------------------------------------------------------------------------------------
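Hadoop Three-Host Cluster Setup Explained (the main reference for the cluster setup; it covers an older Hadoop version, so its core-site.xml needs to be modified, and it has no yarn-site.xml configuration steps)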
http://www.cnblogs.com/shitouer/archive/2012/05/21/2511060.html
Implementing Hadoop with Cloudera (older Hadoop version, but well laid out and easy to read)
http://wiki.ubuntu.org.cn/%E5%88%A9%E7%94%A8Cloudera%E5%AE%9E%E7%8E%B0Hadoop
CentOS 6.4 Hadoop-2.6.0 Cluster Configuration and Installation Guide (Hadoop 2.6.0, with sample XML configuration files)
http://blog.csdn.net/tianya846/article/details/42176507
Cluster Configuration and Usage Tips in Hadoop (the MapReduce concepts introduced in this article are worth reading)
http://www.infoq.com/cn/articles/hadoop-config-tip
Installing and Configuring Hadoop 2.6 and Integrating the Eclipse Development Environment (mainly the Eclipse setup steps)
http://blog.csdn.net/crazyzhb2012/article/details/43083785
MapReduce applications: WordCount explained + XML parsing:
---------------------------------------------------------------------------------------------------------------------------------------------------------
Hadoop Cluster Series 7: Running WordCount in Detail (2)
http://developer.51cto.com/art/201206/345334_1.htm
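As a rough illustration of what WordCount does, here is a minimal sketch in plain Java with no Hadoop dependency. It is not the code from the linked article: the class and method names (WordCountSketch, map, reduce, wordCount) are hypothetical, and the in-memory grouping step only imitates the shuffle that the framework performs between the map and reduce phases on a real cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token in one line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce phase: sum all counts that were emitted for a single word.
    public static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: map every line, group values by key (a stand-in for the shuffle),
    // then reduce each group. TreeMap keeps the words in sorted order.
    public static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello yarn");
        System.out.println(wordCount(lines)); // prints {hadoop=1, hello=2, yarn=1}
    }
}
```

In an actual Hadoop job the same two steps would live in Mapper and Reducer subclasses, and the grouping would be done by the framework rather than by a local map.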
How does XML be parsed in hadoop in parallel
http://stackoverflow.com/questions/25485126/how-does-xml-be-parsed-in-hadoop-in-parallel
Code for the XML parsing class:
YARN configuration options explained:
---------------------------------------------------------------------------------------------------------------------------------------------------------
Hadoop MapReduce Next Generation - Cluster Setup (the official site's description of the cluster configuration parameters for the YARN framework)
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
Hadoop's New MapReduce Framework YARN Explained (a comparison of the old and new Hadoop frameworks)
http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop-yarn/
Hadoop YARN Configuration Parameters Analyzed (1): RM and NM Parameters (detailed explanations of the YARN configuration parameters)
http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/