• Setting up Hadoop 2.6.0 HDFS HA and YARN HA


    Final result:
    [hadoop@h41 ~]$ jps
    12723 ResourceManager
    12995 Jps
    12513 NameNode
    12605 DFSZKFailoverController

    [hadoop@h42 ~]$ jps
    12137 ResourceManager
    12233 Jps
    12009 DFSZKFailoverController
    11930 NameNode

    [hadoop@h43 ~]$ jps
    12196 DataNode
    12322 NodeManager
    12435 Jps
    11965 QuorumPeerMain
    12050 JournalNode

    [hadoop@h44 ~]$ jps
    11848 QuorumPeerMain
    11939 JournalNode
    12309 Jps
    12156 NodeManager
    12032 DataNode

    [hadoop@h45 ~]$ jps
    12357 Jps
    11989 JournalNode
    11904 QuorumPeerMain
    12204 NodeManager
    12080 DataNode

    Role assignment:

     

    h41    NameNode     DFSZKFailoverController    ResourceManager
    h42    NameNode     DFSZKFailoverController    ResourceManager
    h43    NodeManager  JournalNode  QuorumPeerMain  DataNode
    h44    NodeManager  JournalNode  QuorumPeerMain  DataNode
    h45    NodeManager  JournalNode  QuorumPeerMain  DataNode


    Note: In Hadoop 2.x an HA cluster normally consists of two NameNodes, one in the active state and one in standby. The active NameNode serves client requests; the standby NameNode serves none and only keeps in sync with the active NameNode's state so that it can take over quickly if the active one fails.
    Hadoop 2.0 officially offers two HDFS HA solutions, one based on NFS and one based on QJM (proposed by Cloudera, similar in principle to ZooKeeper). Here I use QJM. The active and standby NameNodes synchronize metadata through a group of JournalNodes; an edit is considered written once it has been accepted by a majority of them, so with three JournalNodes a write succeeds once two of them acknowledge it. An odd number of JournalNodes is normally configured.

    1. Prepare the environment:
    Disable the firewall and SELinux (all VMs)
    service iptables stop
    chkconfig iptables off    (disable automatic start at boot)

    setenforce 0
    vi /etc/selinux/config
    SELINUX=disabled

    Configure the hostname and hosts file (all VMs)
    vi /etc/sysconfig/network
    Change HOSTNAME to the matching host name, HOSTNAME=h41 through h45

    vi /etc/hosts    (the previous step does not seem strictly necessary, but this one is required; on all VMs you can delete the original contents and add the following)
    192.168.8.41    h41
    192.168.8.42    h42
    192.168.8.43    h43
    192.168.8.44    h44
    192.168.8.45    h45

    Synchronize the time on all machines
    ntpdate 202.120.2.101    (this did not work for me; it seems you first need to install ntpdate with yum and have internet access. References: https://my.oschina.net/myaniu/blog/182959, http://www.cnblogs.com/liuyou/archive/2012/07/29/2614330.html and http://blog.csdn.net/lixianlin/article/details/7045321)
    Here I used the crudest method: on every VM, as root, run date -s "2017-05-05 12:00:00" and reboot the VM.
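
    If the VMs do have internet access, a hedged sketch of the NTP route (assuming a CentOS-style yum repository and a reachable public NTP pool) looks like this:

    # run as root on every VM; requires network access and a configured yum repository
    yum install -y ntpdate
    ntpdate pool.ntp.org        # one-off clock sync against a public NTP pool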

    Create the hadoop user and group (all VMs)
    groupadd hadoop
    useradd -g hadoop hadoop
    passwd hadoop

    Switch to the hadoop user (all VMs)
    su - hadoop

    Configure key-based passwordless SSH login [do this on every VM]
    h41:
    ssh-keygen -t rsa -P ''
    cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
    chmod 700 ~/.ssh/
    chmod 600 ~/.ssh/authorized_keys
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h42
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h43
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h44
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h45

    h42:
    ssh-keygen -t rsa -P ''
    cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
    chmod 700 ~/.ssh/
    chmod 600 ~/.ssh/authorized_keys
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h41
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h43
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h44
    ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@h45
    ......... (repeat the same steps on h43, h44, and h45)

    Verify: ssh 'hadoop@h42'    (likewise, h42 through h45 should be able to log in to the other VMs)
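
    A quick hedged check, run from each VM as the hadoop user (the host list simply mirrors the hosts file above), is to loop over all nodes and confirm that no password prompt appears:

    # should print every hostname without asking for a password
    for h in h41 h42 h43 h44 h45; do
        ssh hadoop@$h hostname
    done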

    Create the directories that will be used later
    mkdir -pv /home/hadoop/storage/hadoop/tmp
    mkdir -pv /home/hadoop/storage/hadoop/name
    mkdir -pv /home/hadoop/storage/hadoop/data
    mkdir -pv /home/hadoop/storage/hadoop/journal
    mkdir -pv /home/hadoop/storage/yarn/local
    mkdir -pv /home/hadoop/storage/yarn/logs
    mkdir -pv /home/hadoop/storage/hbase
    mkdir -pv /home/hadoop/storage/zookeeper/data
    mkdir -pv /home/hadoop/storage/zookeeper/logs

    scp -r /home/hadoop/storage h42:/home/hadoop/
    scp -r /home/hadoop/storage h43:/home/hadoop/
    scp -r /home/hadoop/storage h44:/home/hadoop/
    scp -r /home/hadoop/storage h45:/home/hadoop/

    Install JDK 1.7 and Hadoop and configure the environment variables. They can be set globally (edit /etc/profile) or per user (edit ~/.bashrc); here I configure the current user's variables (I also set up the HBase and Hive ones, even though they are not used yet).

    Switch to the root user on h41
    Install the JDK
    [root@h41 usr]# tar -zxvf jdk-7u25-linux-i586.tar.gz
    [root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h42:/usr/    (these steps prompt for the root password...)
    [root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h43:/usr/
    [root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h44:/usr/
    [root@h41 usr]# scp -r /usr/jdk1.7.0_25/ h45:/usr/

    Switch back to the hadoop user on h41
    vi ~/.bashrc

    export JAVA_HOME=/usr/jdk1.7.0_25
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
    ##java
    export HADOOP_HOME=/home/hadoop/hadoop
    export HIVE_HOME=/home/hadoop/hive
    export HBASE_HOME=/home/hadoop/hbase
    ##hadoop hbase hive
    export HADOOP_MAPRED_HOME=${HADOOP_HOME}
    export HADOOP_COMMON_HOME=${HADOOP_HOME}
    export HADOOP_HDFS_HOME=${HADOOP_HOME}
    export YARN_HOME=${HADOOP_HOME}
    export HADOOP_YARN_HOME=${HADOOP_HOME}
    export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$HIVE_HOME/bin

    scp ~/.bashrc h42:~/.bashrc
    scp ~/.bashrc h43:~/.bashrc
    scp ~/.bashrc h44:~/.bashrc
    scp ~/.bashrc h45:~/.bashrc

    Make the environment variables take effect, and do it on every VM; otherwise the jps command will not work properly, which ultimately keeps ZooKeeper from starting successfully.
    [hadoop@h41 ~]$ source ~/.bashrc
    [hadoop@h42 ~]$ source ~/.bashrc
    [hadoop@h43 ~]$ source ~/.bashrc
    [hadoop@h44 ~]$ source ~/.bashrc
    [hadoop@h45 ~]$ source ~/.bashrc

    2. Deploy Hadoop 2.6.0 NameNode HA and ResourceManager HA
    Extract and rename
    tar -zxvf hadoop-2.6.0.tar.gz -C /home/hadoop
    cd /home/hadoop
    mv hadoop-2.6.0 hadoop

    Configure the Hadoop environment variables [already done during environment preparation, skipped]

    Verify that Hadoop is installed correctly
    hadoop version

    Edit the Hadoop configuration files
    vi /home/hadoop/hadoop/etc/hadoop/core-site.xml

    Add:

    <!-- Set the HDFS nameservice to gagcluster; this is the NameNode URI (hdfs://host:port/) -->
    (Here I was badly misled by the author of http://www.it610.com/article/3334284.htm, whose article used <value>hdfs://gagcluster:9000</value>. The correct form drops the port, i.e. <value>hdfs://gagcluster</value>; otherwise, after the cluster is up, running hadoop fs -mkdir /input fails with:
    mkdir: Port 9000 specified in URI hdfs://gagcluster:9000 but host 'gagcluster' is a logical (HA) namenode and does not use port information. A command-line check for this value is sketched right after this core-site.xml block.)
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://gagcluster</value>
      </property>
     
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
      </property>
     
    <!-- Hadoop temporary directory -->
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/storage/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
      </property>
     
    <!-- Allow the proxy user to connect from any host -->
      <property>
        <name>hadoop.proxyuser.hduser.hosts</name>
        <value>*</value>
      </property>
     
    <!-- Allow the proxy user to act on behalf of any group -->
      <property>
        <name>hadoop.proxyuser.hduser.groups</name>
        <value>*</value>
      </property>
     
    <!-- ZooKeeper quorum addresses -->
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>h43:2181,h44:2181,h45:2181</value>
      </property>
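
    Once this configuration is in place, a hedged sanity check of the port-less value from the command line (hdfs getconf is a standard client tool) is:

    # should print hdfs://gagcluster with no port
    hdfs getconf -confKey fs.defaultFS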


    vi /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml
    Add:

    <!-- Node blacklist (exclude) file, used later to decommission Hadoop nodes -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/home/hadoop/hadoop/etc/hadoop/exclude</value>
    </property>
     
    <!-- HDFS block size: 64 MB -->
      <property> 
        <name>dfs.block.size</name> 
        <value>67108864</value>
      </property>
     
    <!-- HDFS nameservice is gagcluster; must match core-site.xml -->
      <property>
        <name>dfs.nameservices</name>
        <value>gagcluster</value>
      </property>
     
    <!-- gagcluster has two NameNodes: nn1 and nn2 -->
      <property>
        <name>dfs.ha.namenodes.gagcluster</name>
        <value>nn1,nn2</value>
      </property>
     
    <!-- RPC address of nn1 -->
      <property>
        <name>dfs.namenode.rpc-address.gagcluster.nn1</name>
        <value>h41:9000</value>
      </property>
     
    <!-- HTTP address of nn1 -->
      <property>
        <name>dfs.namenode.http-address.gagcluster.nn1</name>
        <value>h41:50070</value>
      </property>
     
    <!-- RPC address of nn2 -->
      <property>
        <name>dfs.namenode.rpc-address.gagcluster.nn2</name>
        <value>h42:9000</value>
      </property>
     
    <!-- HTTP address of nn2 -->
      <property>
        <name>dfs.namenode.http-address.gagcluster.nn2</name>
        <value>h42:50070</value>
      </property>
     
    <!-- Shared edits location: where the NameNode metadata is written on the JournalNodes -->
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://h43:8485;h44:8485;h45:8485/gagcluster</value>
      </property>
     
    <!-- Client-side failover proxy implementation -->
      <property>
        <name>dfs.client.failover.proxy.provider.gagcluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
     
    <!-- Fencing method -->
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
     
    <!-- sshfence requires passwordless SSH; private key used for fencing -->
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
      </property>
     
    <!-- Local directory where each JournalNode stores the edits -->
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/storage/hadoop/journal</value>
      </property>
     
    <!-- Enable automatic HA failover -->
      <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>
     
    <!-- NameNode namespace (fsimage) storage location -->
      <property>  
        <name>dfs.namenode.name.dir</name>  
        <value>/home/hadoop/storage/hadoop/name</value> 
      </property>
     
     <!-- DataNode data storage location -->
      <property>  
        <name>dfs.datanode.data.dir</name>  
        <value>file:/home/hadoop/storage/hadoop/data</value> 
      </property>
     
    <!-- Number of replicas -->
      <property>  
        <name>dfs.replication</name>  
        <value>3</value>
      </property>
     
    <!-- Allow HDFS access over WebHDFS -->
      <property> 
        <name>dfs.webhdfs.enabled</name> 
        <value>true</value>
      </property>
     
    <!-- JournalNode listen addresses (needed for data recovery) -->
      <property> 
        <name>dfs.journalnode.http-address</name> 
        <value>0.0.0.0:8480</value> 
      </property>
     
      <property> 
        <name>dfs.journalnode.rpc-address</name> 
        <value>0.0.0.0:8485</value> 
      </property>
     
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>h43:2181,h44:2181,h45:2181</value>
      </property>


    vi /home/hadoop/hadoop/etc/hadoop/mapred-site.xml
    Add:

    <configuration>
    <!-- Run MapReduce on YARN -->
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
     
    <!-- MapReduce JobHistory Server address, default port 10020 -->
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>0.0.0.0:10020</value>
      </property>
     
    <!-- MapReduce JobHistory Server web UI address, default port 19888 -->
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>0.0.0.0:19888</value>
      </property>
    </configuration>


    vi /home/hadoop/hadoop/etc/hadoop/yarn-site.xml
    Add:

    <!-- Enable log aggregation -->
      <property>
         <name>yarn.log-aggregation-enable</name>
         <value>true</value>
      </property>
     
    <!-- How long aggregated logs are kept on HDFS, in seconds (3 days) -->
      <property>
         <name>yarn.log-aggregation.retain-seconds</name>
         <value>259200</value>
      </property>
     
    <!-- Retry interval for reconnecting to the RM after losing contact, in ms -->
      <property>
         <name>yarn.resourcemanager.connect.retry-interval.ms</name>
         <value>2000</value>
      </property>
     
    <!-- Enable ResourceManager HA (default: false) -->
      <property>
         <name>yarn.resourcemanager.ha.enabled</name>
         <value>true</value>
      </property>
     
    <!-- ResourceManager IDs -->
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
      </property>
     
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>h43:2181,h44:2181,h45:2181</value>
      </property>
     
    <!-- Enable automatic failover -->
      <property>
         <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
         <value>true</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>h41</value>
      </property>
                        
      <property>
         <name>yarn.resourcemanager.hostname.rm2</name>
         <value>h42</value>
      </property>
     
    <!-- Set rm1 on namenode1 and rm2 on namenode2. Note: it is common to copy the finished config to the other machines, but this value must be changed on the other YARN machine -->
      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
      <description>If we want to launch more than one RM in single node, we need this configuration</description>
      </property>
     
    <!-- Enable automatic recovery -->
      <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
      </property>
     
    <!-- ZooKeeper address for the RM state store -->
      <property>
        <name>yarn.resourcemanager.zk-state-store.address</name>
        <value>h43:2181,h44:2181,h45:2181</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>h43:2181,h44:2181,h45:2181</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>gagcluster-yarn</value>
      </property>
     
    <!-- How long the AM waits before reconnecting to the scheduler, in ms -->
      <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
      </property>
     
    <!-- rm1 addresses -->
      <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>h41:8132</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>h41:8130</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>h41:8188</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>h41:8131</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>h41:8033</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.ha.admin.address.rm1</name>
        <value>h41:23142</value>
      </property>
     
    <!-- rm2 addresses -->
      <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>h42:8132</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>h42:8130</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>h42:8188</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>h42:8131</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>h42:8033</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.ha.admin.address.rm2</name>
        <value>h42:23142</value>
      </property>
     
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
     
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
     
      <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/hadoop/storage/yarn/local</value>
      </property>
     
      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/hadoop/storage/yarn/logs</value>
      </property>
     
      <property>
        <name>mapreduce.shuffle.port</name>
        <value>23080</value>
      </property>
     
    <!-- Client failover proxy provider class -->
      <property>
        <name>yarn.client.failover-proxy-provider</name>
        <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
      </property>
     
      <property>
          <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
          <value>/yarn-leader-election</value>
          <description>Optional setting. The default value is /yarn-leader-election</description>
      </property>

    Configure the DataNode (slave) nodes
    vi /home/hadoop/hadoop/etc/hadoop/slaves

    h43
    h44
    h45

    Create the exclude file, used later for decommissioning Hadoop nodes (see the sketch below)
    touch /home/hadoop/hadoop/etc/hadoop/exclude
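
    For reference, a hedged sketch of how that file would eventually be used (h45 here is only a placeholder hostname; nothing needs to be decommissioned now):

    # add the node to the exclude file, then ask the active NameNode to re-read it
    echo "h45" >> /home/hadoop/hadoop/etc/hadoop/exclude
    hdfs dfsadmin -refreshNodes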

    Sync the Hadoop tree to the h42 through h45 machines
    scp -r /home/hadoop/hadoop h42:/home/hadoop/
    scp -r /home/hadoop/hadoop h43:/home/hadoop/
    scp -r /home/hadoop/hadoop h44:/home/hadoop/
    scp -r /home/hadoop/hadoop h45:/home/hadoop/

    Modify the yarn-site.xml on nn2 (h42)
    Change this one spot to:

    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm2</value>
      <description>If we want to launch more than one RM in single node, we need this configuration</description>
      </property>
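
    If you prefer not to edit the file by hand, a hedged one-liner run from h41 (it assumes <value>rm1</value> appears only in the yarn.resourcemanager.ha.id property, which is the case in the config above) is:

    # rewrite rm1 to rm2 in the copy that was just scp'd to h42
    ssh h42 "sed -i 's#<value>rm1</value>#<value>rm2</value>#' /home/hadoop/hadoop/etc/hadoop/yarn-site.xml"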

    3. Deploy a fully distributed three-node ZooKeeper 3.4.5 cluster
    Install ZooKeeper on three servers under the hadoop user (an odd number of ZooKeeper nodes is best; any three of my five machines would do, and I chose h43, h44, and h45)
    h43 192.168.8.43
    h44 192.168.8.44
    h45 192.168.8.45

    Extract and rename (on h43)
    tar xf zookeeper-3.4.5.tar.gz -C /home/hadoop/
    mv /home/hadoop/zookeeper-3.4.5/ /home/hadoop/zookeeper
    cd /home/hadoop/zookeeper

    Edit the configuration file
    vi /home/hadoop/zookeeper/conf/zoo.cfg

    tickTime=2000
    initLimit=5
    syncLimit=2
    dataDir=/home/hadoop/storage/zookeeper/data
    dataLogDir=/home/hadoop/storage/zookeeper/logs
    clientPort=2181
    server.1=h43:2888:3888
    server.2=h44:2888:3888
    server.3=h45:2888:3888

    Sync to the h44 and h45 nodes
    scp -r /home/hadoop/zookeeper h44:/home/hadoop
    scp -r /home/hadoop/zookeeper h45:/home/hadoop

    Create the ZooKeeper data and log directories [already done during environment preparation, skipped]

    Set the myid value on h43, h44, and h45 respectively
    echo 1 > /home/hadoop/storage/zookeeper/data/myid    (on h43)
    echo 2 > /home/hadoop/storage/zookeeper/data/myid    (on h44)
    echo 3 > /home/hadoop/storage/zookeeper/data/myid    (on h45)

     

    ###########################################################################################
    Hadoop cluster first-time startup procedure
    ###########################################################################################
     1. If the ZooKeeper cluster is not running yet, start ZooKeeper on each node first.
    /home/hadoop/zookeeper/bin/zkServer.sh start    (remember to start it on every ZooKeeper machine)
    /home/hadoop/zookeeper/bin/zkServer.sh status    (one leader, n-1 followers)
    Running jps should show the started process: QuorumPeerMain

    2. Then, on the primary NameNode (h41), run the following command to create the namespace in ZooKeeper
    /home/hadoop/hadoop/bin/hdfs zkfc -formatZK

    3. On h43, h44, and h45, start the JournalNode daemons with the following command
    /home/hadoop/hadoop/sbin/hadoop-daemon.sh start journalnode

    4. On the primary NameNode, format the NameNode and JournalNode directories with ./bin/hadoop namenode -format
    /home/hadoop/hadoop/bin/hadoop namenode -format
    Verify success by running the following on a ZooKeeper node:
    /home/hadoop/zookeeper/bin/zkCli.sh
    [zk: localhost:2181(CONNECTED) 0] ls /
    [hadoop-ha, zookeeper]
    [zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha 
    [gagcluster]
    [zk: localhost:2181(CONNECTED) 2] quit

    5. Start the NameNode process on the primary NameNode
    /home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode

    6. On the standby NameNode, run the first command below; it formats the standby NameNode's directory and copies the metadata over from the primary NameNode (it does not re-format the JournalNode directories!). Then start the standby NameNode process with the second command.
    /home/hadoop/hadoop/bin/hdfs namenode -bootstrapStandby    [or simply scp -r /home/hadoop/storage/hadoop/name h42:/home/hadoop/storage/hadoop]
    /home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode

    7. Run the following command on both NameNode nodes
    /home/hadoop/hadoop/sbin/hadoop-daemon.sh start zkfc

    8. Start the DataNodes
    Method 1:
    Run the following command on the DataNode nodes (after I ran it on h43, the DataNodes on h43, h44, and h45 all started; hadoop-daemons.sh reads the slaves file and starts the daemon on every listed node, whereas hadoop-daemon.sh without the "s" would start only the local one)
    /home/hadoop/hadoop/sbin/hadoop-daemons.sh start datanode
    Method 2:
    When there are many DataNodes, you can run the same command once on the primary NameNode (nn1) to start all of them at once
    /home/hadoop/hadoop/sbin/hadoop-daemons.sh start datanode

    9. Start YARN (run on namenode1 and namenode2)
    /home/hadoop/hadoop/sbin/start-yarn.sh

    Note:
    When this command is run on namenode2 it will complain that the NodeManagers already exist; ignore that. The point is to start the ResourceManager on namenode2 so it can back up the one on namenode1. I have not found a way to start only the ResourceManager by itself (the per-daemon script ${HADOOP_HOME}/sbin/yarn-daemon.sh start resourcemanager should do it).
    After startup, open http://192.168.8.41:50070 and http://192.168.8.42:50070 in a browser to see that the NameNodes are Standby and Active respectively.
    On namenode1, run ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1 (and rm2) to see that rm1 and rm2 are active and standby respectively; you can also check the state in a browser at http://192.168.8.41:8188.

     

    Verify YARN:
    Then I ran a small MapReduce job:
    (for details see my other article: how to fix a freshly installed Hadoop 2 that cannot run MapReduce)
    It runs successfully on h41 and h42; a hedged sketch of such a job is below.
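
    This sketch uses the wordcount example bundled with the distribution; the jar path assumes a standard Hadoop 2.6.0 layout, and words.txt is just a hypothetical local sample file:

    # put some text into HDFS, then count the words
    hadoop fs -mkdir /input
    hadoop fs -put ~/words.txt /input
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output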
    [hadoop@h41 ~]$ hadoop fs -cat /output/part-00000
    hadoop  1
    hello   3
    hive    1
    world   1
    [hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1
    active
    [hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm2
    standby
    [hadoop@h41 ~]$ jps
    12723 ResourceManager
    14752 Jps
    12513 NameNode
    12605 DFSZKFailoverController
    [hadoop@h41 ~]$ kill -9 12723
    [hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1

    [hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm2
    active
    Manually restart the ResourceManager that was killed
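    A hedged guess at the restart command, run on h41 with the standard per-daemon script:

    /home/hadoop/hadoop/sbin/yarn-daemon.sh start resourcemanager    # starts only the local ResourceManager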
    [hadoop@h42 ~]$ ${HADOOP_HOME}/bin/yarn rmadmin -getServiceState rm1
    standby

    Verify HDFS HA:
    Then kill -9 the active NameNode
    Visit http://192.168.8.41:50070 in a browser
    The NameNode on h41 has now become active
    Run the command again: hadoop fs -cat /output/part-00000
    The file written earlier is still there!!!
    Manually restart the NameNode that was killed
    /home/hadoop/hadoop/sbin/hadoop-daemon.sh start namenode
    Visit http://192.168.8.42:50070 in a browser
    NameNode 'h42' (standby)
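
    As an alternative to the web UI, a hedged command-line check (nn1 and nn2 are the NameNode IDs defined in hdfs-site.xml above):

    hdfs haadmin -getServiceState nn1    # expected: active  (h41)
    hdfs haadmin -getServiceState nn2    # expected: standby (h42)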

  • Original article: https://www.cnblogs.com/jieran/p/9314136.html