• Hadoop 2.x Cluster Installation and Usage


    Reference: "hadoop (2.x): a complete, highly reliable distributed installation guide, using hadoop 2.2 as the example (very detailed)" — http://www.aboutyun.com/thread-7684-1-1.html

    Step 1: Download and install the JDK

    JDK下载地址:http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

    [root@master vmware-share-folder]# cp * /usr/local/src/

    [root@master src]# tar xf jdk-8u141-linux-x64.tar.gz

    After extraction, set the environment variables:

    [root@master src]# vim /etc/profile.d/java.sh
    export JAVA_HOME=/usr/local/src/jdk1.8.0_141/
    export PATH=$PATH:$JAVA_HOME/bin

    [root@master src]# source /etc/profile.d/java.sh

    [root@master src]# java -version
    java version "1.8.0_141"
    Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

    This output confirms the JDK is installed correctly.

    Repeat the same steps on slave1 and slave2. If the scp command is missing, install it first: yum install -y openssh-clients openssh

    [root@master src]# scp -r jdk1.8.0_141 slave1:/usr/local/src/

    [root@master src]# scp -r jdk1.8.0_141 slave2:/usr/local/src/

    Then set the environment variables on each slave exactly as on master.
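The per-slave copies above can be wrapped in a loop; a minimal sketch, assuming the hostnames slave1/slave2 from this document. With DRY_RUN=1 (the default here) it only prints the commands, so it can be sanity-checked without live hosts:

```shell
# Copy the extracted JDK and the java.sh profile script to every slave.
# DRY_RUN=1 (default) echoes each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
JDK_DIR=/usr/local/src/jdk1.8.0_141
run() { if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi; }
for host in slave1 slave2; do
    run scp -r "$JDK_DIR" "root@$host:/usr/local/src/"
    run scp /etc/profile.d/java.sh "root@$host:/etc/profile.d/"
done
```

Set DRY_RUN=0 once the printed commands look right.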

    Step 2: Edit the hosts file (on every node)

    [root@master src]# cat /etc/hosts

    192.168.244.200 master
    192.168.244.201 slave1
    192.168.244.202 slave2
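Adding these three entries on every node can be made idempotent; a sketch of a small helper (the target file is a parameter so it can be tried on a scratch file first):

```shell
# Append the cluster's host entries to the given hosts file, but only if the
# master entry is not already present, so the function is safe to re-run.
add_cluster_hosts() {
    grep -q '^192\.168\.244\.200 master$' "$1" 2>/dev/null || cat >> "$1" <<'EOF'
192.168.244.200 master
192.168.244.201 slave1
192.168.244.202 slave2
EOF
}
# usage: add_cluster_hosts /etc/hosts   (run on every node)
```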

    Step 3: Disable the firewall (on every node)

    [root@master src]# /etc/init.d/iptables stop

    [root@master src]# chkconfig iptables off

    [root@master src]# vim /etc/sysconfig/selinux 

    SELINUX=disabled

    This file takes effect after a reboot; run setenforce 0 to put SELinux in permissive mode immediately for the current session.

    Step 4: Set up passwordless SSH

    [root@master src]# ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    d6:9f:53:20:59:93:a4:08:c2:2d:b6:68:2e:01:a1:32 root@master
    The key's randomart image is:
    (randomart omitted)
    [root@master src]# cd /root/.ssh/
    [root@master .ssh]# ll
    total 12
    -rw------- 1 root root 1675 Jun 29 02:37 id_rsa
    -rw-r--r-- 1 root root 393 Jun 29 02:37 id_rsa.pub
    -rw-r--r-- 1 root root 794 Jun 29 02:20 known_hosts
    [root@master .ssh]# cp id_rsa.pub authorized_keys

    Also copy the public key to each slave, so that master can log in to them without a password:

    [root@master .ssh]# ssh-copy-id root@slave1
    [root@master .ssh]# ssh-copy-id root@slave2

    Test:

    [root@master .ssh]# ssh slave1

    [root@master .ssh]# ssh slave2

    If both logins succeed without a password prompt, the setup works.

    Step 5: Download Hadoop 2.x and extract it

    https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-2.7.3/

    [root@master src]# tar xf hadoop-2.7.3.tar.gz

    Step 6: Edit the configuration files

    Seven configuration files are involved:
    /usr/local/src/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
    /usr/local/src/hadoop-2.7.3/etc/hadoop/yarn-env.sh
    /usr/local/src/hadoop-2.7.3/etc/hadoop/slaves
    /usr/local/src/hadoop-2.7.3/etc/hadoop/core-site.xml
    /usr/local/src/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
    /usr/local/src/hadoop-2.7.3/etc/hadoop/mapred-site.xml
    /usr/local/src/hadoop-2.7.3/etc/hadoop/yarn-site.xml
    Some of these do not exist by default; copy them from the corresponding .template files.
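The template copies mentioned above can be done in one pass; a sketch of a small helper (the directory path matches this document's layout):

```shell
# For every *.template in the config directory, create the real file if it
# does not exist yet; existing files are never overwritten.
conf_from_templates() {
    for t in "$1"/*.template; do
        [ -e "$t" ] || continue       # glob matched nothing
        f=${t%.template}
        [ -e "$f" ] || cp "$t" "$f"
    done
}
# usage: conf_from_templates /usr/local/src/hadoop-2.7.3/etc/hadoop
```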

    1. Edit hadoop-env.sh

    [root@master hadoop]# pwd
    /usr/local/src/hadoop-2.7.3/etc/hadoop

    [root@master hadoop]# cat hadoop-env.sh |grep -i java_home

    export JAVA_HOME=/usr/local/src/jdk1.8.0_141/

    2. Edit yarn-env.sh

    [root@master hadoop]# cat yarn-env.sh

    export JAVA_HOME=/usr/local/src/jdk1.8.0_141/

    3. Edit the slaves file

    [root@master hadoop]# cat slaves 
    slave1
    slave2

    4. Edit core-site.xml

    [root@master hadoop]# cat core-site.xml

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
      </property>

      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>

      <!-- The proxy-user name "aboutyun" below is carried over from the
           referenced guide; substitute the account that will actually submit
           jobs (root in this document) if you use the proxy-user feature. -->
      <property>
        <name>hadoop.proxyuser.aboutyun.hosts</name>
        <value>*</value>
      </property>

      <property>
        <name>hadoop.proxyuser.aboutyun.groups</name>
        <value>*</value>
      </property>

      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop-2.7.3/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
    </configuration>

    5. Edit hdfs-site.xml

    Create the NameNode and DataNode directories first:

    [root@master hadoop-2.7.3]# mkdir -p dfs/{name,data}

    [root@master hadoop]# cat hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
      </property>


      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop-2.7.3/dfs/name</value>
      </property>

      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop-2.7.3/dfs/data</value>
      </property>

      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>

      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
      </property>
    </configuration>
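Note that this cluster has only two DataNodes (slave1 and slave2) while dfs.replication is set to 3, so every block will be reported as under-replicated; either add a DataNode or lower the value to 2. A sketch of a sanity check comparing the setting against the slaves file:

```shell
# Succeed only if the requested replication factor does not exceed the
# number of (non-blank) entries in the slaves file.
check_replication() {
    datanodes=$(grep -c '[^[:space:]]' "$1")
    [ "$2" -le "$datanodes" ]
}
# usage: check_replication /usr/local/src/hadoop-2.7.3/etc/hadoop/slaves 3
```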

    6. Edit mapred-site.xml (this file does not exist by default; copy it from mapred-site.xml.template first)

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      

      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
      </property>
      

      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
      </property>
    </configuration>

    7. Edit yarn-site.xml

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8035</value>
      </property>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
      </property>
    </configuration>

    8. Copy the Hadoop directory from master to slave1 and slave2:

    [root@master src]# scp -r hadoop-2.7.3  root@slave1:/usr/local/src/

    [root@master src]# scp -r hadoop-2.7.3  root@slave2:/usr/local/src/

    9. Set the environment variables (on every node)

    [root@master profile.d]# cat hadoop2.sh
    export HADOOP2_HOME=/usr/local/src/hadoop-2.7.3
    export PATH=$PATH:$HADOOP2_HOME/bin:$HADOOP2_HOME/sbin
    [root@master profile.d]# source hadoop2.sh

    10. Start Hadoop

    Format the NameNode first (only before the very first start; reformatting wipes HDFS metadata):

    [root@master sbin]# hdfs namenode -format

    [root@master sbin]# ./start-dfs.sh

    [root@master sbin]# ./start-yarn.sh

    [root@master sbin]# jps
    2714 NameNode
    3051 ResourceManager
    2892 SecondaryNameNode
    3310 Jps

    [root@slave1 hadoop]# jps
    1904 NodeManager
    2004 Jps
    1797 DataNode
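The jps listings above can be checked mechanically; a sketch, with the role-to-daemon mapping taken from this document (master runs NameNode, SecondaryNameNode and ResourceManager; slaves run DataNode and NodeManager). The listing is passed in as text so the function can be tried without a cluster:

```shell
# Succeed only if the jps listing contains every daemon expected for the role.
daemons_ok() {
    case $1 in
        master) expected="NameNode SecondaryNameNode ResourceManager" ;;
        slave)  expected="DataNode NodeManager" ;;
        *)      return 2 ;;
    esac
    for d in $expected; do
        printf '%s\n' "$2" | grep -qw "$d" || return 1
    done
}
# usage: daemons_ok master "$(jps)"
```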

    Important: Hadoop 2.7 requires JDK 1.7 or later; with an older JDK, formatting the NameNode fails with assorted ClassNotFound errors.

    Hadoop 2.6 still works with JDK 1.6.
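The JDK requirement can also be checked from the version string; a minimal sketch that parses the "1.x" scheme used by JDK 6-8 (the sample strings below match this document):

```shell
# Succeed if a "1.x.y_zz"-style Java version is 1.7 or later.
jdk_ok_for_hadoop27() {
    minor=$(printf '%s\n' "$1" | cut -d. -f2)
    [ "$minor" -ge 7 ]
}
# usage: jdk_ok_for_hadoop27 "$(java -version 2>&1 | awk -F'"' 'NR==1{print $2}')"
```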

    ######## Snapshot Example #################

    HDFS snapshots are set up per directory: a snapshot is a read-only image of that directory at a single point in time.

    [root@master ~]# jps
    3604 Jps
    2714 NameNode
    3051 ResourceManager
    2892 SecondaryNameNode
    [root@master ~]# hadoop fs -ls /
    [root@master ~]# hadoop fs -mkdir /kuaizhao_dir
    [root@master ~]# hadoop fs -ls /
    Found 1 items
    drwxr-xr-x   - root supergroup          0 2017-07-20 11:02 /kuaizhao_dir
    [root@master ~]# hadoop fs -mkdir /kuaizhao_dir/kz_test
    [root@master ~]# hadoop fs -put /etc/passwd /kuaizhao_dir/kz_test
    [root@master ~]# hadoop fs -cat /kuaizhao_dir/kz_test/passwd |head
    root:x:0:0:root:/root:/bin/bash
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
    [root@master ~]# hdfs dfsadmin -allowSnapshot /kuaizhao_dir/kz_test
    Allowing snaphot on /kuaizhao_dir/kz_test succeeded
    [root@master ~]# hdfs dfs -createSnapshot /kuaizhao_dir/kz_test s0
    Created snapshot /kuaizhao_dir/kz_test/.snapshot/s0
    [root@master ~]# hadoop fs -ls /kuaizhao_dir/kz_test/.snapshot/s0
    Found 1 items
    -rw-r--r--   3 root supergroup        854 2017-07-20 11:07 /kuaizhao_dir/kz_test/.snapshot/s0/passwd
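A file deleted from the live directory can still be read from (and copied back out of) the snapshot, since .snapshot/&lt;name&gt; is a read-only view; e.g. hdfs dfs -cp /kuaizhao_dir/kz_test/.snapshot/s0/passwd /kuaizhao_dir/kz_test/. A trivial sketch of how the snapshot path is built:

```shell
# Build the read-only snapshot path for a file:
#   <dir>/.snapshot/<snapshot-name>/<file>
snapshot_path() {
    printf '%s/.snapshot/%s/%s\n' "$1" "$2" "$3"
}
# usage: hdfs dfs -cat "$(snapshot_path /kuaizhao_dir/kz_test s0 passwd)"
```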
  • Original post: https://www.cnblogs.com/shanhua-fu/p/7091054.html