For the cluster setup, see: http://blog.csdn.net/jediael_lu/article/details/45145767
I. Environment preparation
1. Install Linux and the JDK
2. Download Hadoop 2.6.0 and unpack it
3. Set up passwordless ssh
(1) Check whether passwordless login already works:
$ ssh localhost
(2) If not, generate a key and authorize it:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
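If ssh localhost still prompts for a password after this, sshd is usually rejecting the key because the file permissions are too open; tightening them (assuming the default ~/.ssh location) normally fixes it:
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh localhost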
4. Add the following to /etc/profile:
#hadoop setting
export PATH=$PATH:/mnt/jediael/hadoop-2.6.0/bin:/mnt/jediael/hadoop-2.6.0/sbin
export HADOOP_HOME=/mnt/jediael/hadoop-2.6.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
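After saving, reload the profile in the current shell and confirm that the hadoop command is found (assuming the installation path above matches yours):
$ source /etc/profile
$ hadoop version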
II. Installing HDFS
1. Configure etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
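Optionally, Hadoop's working data can be moved out of /tmp, where it lives by default (hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}) and may be wiped on reboot. A property like the one below can be added inside the <configuration> element of core-site.xml; the path is only an example and should point to a persistent directory you own:
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/mnt/jediael/hadoop-2.6.0/tmp</value>
    </property>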
2. Configure etc/hadoop/hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
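Before formatting, make sure the Hadoop scripts can find the JDK. If the next step fails with an error that JAVA_HOME is not set, set it explicitly in etc/hadoop/hadoop-env.sh; the path below is only an example and should match your own JDK installation:
export JAVA_HOME=/usr/java/jdk1.7.0_79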
3. Format the NameNode
$ bin/hdfs namenode -format
4. Start HDFS
$ sbin/start-dfs.sh
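If the daemons started cleanly, jps should show a NameNode, a DataNode and a SecondaryNameNode process (the exact PIDs will differ):
$ jps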
5. Open the web UI to verify that HDFS is up
http://localhost:50070/
6. Run the bundled examples
(1) Create directories in HDFS
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/jediael
(2) Copy files into HDFS
$ bin/hdfs dfs -put etc/hadoop input
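The upload can be verified by listing the directory (the relative path resolves to /user/<current user>/input in HDFS):
$ bin/hdfs dfs -ls input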
(3) Run the example
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
(4) Check the output
$ bin/hdfs dfs -cat output/*
6 dfs.audit.logger
4 dfs.class
3 dfs.server.namenode.
2 dfs.period
2 dfs.audit.log.maxfilesize
2 dfs.audit.log.maxbackupindex
1 dfsmetrics.log
1 dfsadmin
1 dfs.servers
1 dfs.replication
1 dfs.file
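Alternatively, the output directory can be copied back to the local filesystem and inspected there:
$ bin/hdfs dfs -get output output
$ cat output/*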
(5) Stop HDFS
$ sbin/stop-dfs.sh
III. Installing YARN
1. Configure etc/hadoop/mapred-site.xml
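Note that the 2.6.0 release ships only a template for this file, so create it first (run from the hadoop-2.6.0 directory) and then add the configuration below:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml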
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
2. Configure etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
3. Start YARN
$ sbin/start-yarn.sh
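As with HDFS, jps can be used to confirm that a ResourceManager and a NodeManager process are now running alongside the HDFS daemons:
$ jps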
4. Open the web UI to check YARN
http://localhost:8088/
5. Run a MapReduce job
$ bin/hadoop fs -mkdir /input
$ bin/hadoop fs -copyFromLocal /etc/profile /input
$ cd /mnt/jediael/hadoop-2.6.0/share/hadoop/mapreduce
$ /mnt/jediael/hadoop-2.6.0/bin/hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output
View the result:
$ /mnt/jediael/hadoop-2.6.0/bin/hadoop fs -cat /output/*
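When finished, the daemons can be stopped in the reverse order of startup:
$ sbin/stop-yarn.sh
$ sbin/stop-dfs.sh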