• Hadoop 2.5 Setup Walkthrough


    1 Resources used for the setup

    VMware Workstation 9

    ubuntu-14.04.2-desktop-amd64.iso

    jdk-7u80-linux-x64.tar.gz

    hadoop-2.5.0.tar.gz

    zookeeper-3.4.5-cdh5.1.0.tar.gz

    hbase-0.98.6-cdh5.3.0.tar.gz

    One lab server

    (The latest Hadoop release is not used here because this setup follows an existing tutorial.)

    2 Preparation

    2.1 Installing the virtual machines

    Install four virtual machines in VMware, using the Ubuntu image.

    If the left-hand panel fails to display, shut the VM down and, in the VM settings, uncheck 3D graphics acceleration under Display.

    Set a username and password.

    2.2 Configuring IP addresses

    Assign IP addresses to the master and the slaves, for example:

    (master01)10.109.252.94,

    (slave01)10.109.252.95,

    (slave02)10.109.252.96,

    (slave03)10.109.252.97。

     

    Subnet mask: 255.255.255.0

    Default gateway: 10.109.252.1

    Preferred DNS: 10.3.9.4, 10.3.9.5

     

    Set the VM's network connection to bridged mode.

     

    Run ifconfig in a terminal.

    The name at the far left of the first line is the machine's network interface; here it is eth0, but it may differ on other machines.

     

    Run the command:

    sudo gedit /etc/network/interfaces

     

    In the opened file, enter the following (comments in this file use #, not //):

    auto eth0                   # the network interface found above
    iface eth0 inet static      # give eth0 a static IP configuration
    address 10.109.252.94       # IP address
    netmask 255.255.255.0       # subnet mask
    gateway 10.109.252.1        # gateway
    dns-nameservers 10.3.9.4    # DNS server address

     

    Set the DNS:

    sudo gedit /etc/resolv.conf

    and add:

    nameserver 10.3.9.4

    nameserver 10.3.9.5

     

    Apply the network settings with either of the following commands:

    sudo service networking restart

    sudo /etc/init.d/networking restart
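
    To confirm the settings took effect, a quick check (using the example addresses above):

    ifconfig eth0            # should now show the static address 10.109.252.94
    ping -c 3 10.109.252.1   # the gateway should reply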

    2.3 Changing the hostname

    The hostname is stored in /etc/hostname:

    sudo gedit /etc/hostname

    The hostname is master01, slave01, and so on, for the respective machine.

     

    Then edit the /etc/hosts file:

    sudo gedit /etc/hosts

     

    127.0.0.1 localhost

    10.109.252.94   master01

    10.109.252.95   slave01

    10.109.252.96   slave02

    10.109.252.97   slave03
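
    Once /etc/hosts is saved on every machine, name resolution can be spot-checked, e.g.:

    ping -c 3 master01   # should resolve to 10.109.252.94
    ping -c 3 slave01    # should resolve to 10.109.252.95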

    2.4 Installing and configuring SSH

    The goal is passwordless remote login.

    First, make sure the VMs have Internet access.

    Then run:

    sudo apt-get update
    sudo apt-get install ssh

     

    Run

    ssh localhost

    to check that the installation succeeded.

     

    Disable the firewall:

    sudo ufw disable

     

    Configure passwordless login:

     

    Step 1: generate a key pair

    $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

    Step 2: append the public key to authorized_keys

    $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
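
    If passwordless login later still prompts for a password, a common cause is loose permissions on the key files; OpenSSH expects roughly:

    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys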

     

    The master controls each slave over passwordless SSH, so the master's public key must be copied to every slave.

    On the master, run:

     

    $ scp ~/.ssh/authorized_keys mcc@slave01:~/.ssh/

    $ scp ~/.ssh/authorized_keys mcc@slave02:~/.ssh/

    $ scp ~/.ssh/authorized_keys mcc@slave03:~/.ssh/

    Replace mcc@slave01 with your own username@hostname.

    To log in to slave01 from master01 without a password, run:

    ssh slave01

     

    Problem encountered:

    Agent admitted failure to sign using the key.

    Solution:

    Use ssh-add to load the private key into the agent (adjust id_dsa if your key file is named differently):
    ssh-add ~/.ssh/id_dsa

    After that, the login succeeds.

    2.5 Installing Java

    Install Java 7 on the master and on each slave.

    Create the directory:

    sudo mkdir /usr/lib/jvm

    Extract the archive into it:

    sudo tar -zxvf jdk-7u80-linux-x64.tar.gz -C /usr/lib/jvm

     

    Edit the environment variables:

    sudo gedit ~/.bashrc

     

    Append the following at the end of the file:

    #set oracle jdk environment

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80  

    export JRE_HOME=${JAVA_HOME}/jre  

    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  

    export PATH=${JAVA_HOME}/bin:$PATH  

     

    Apply the changes immediately:

     source ~/.bashrc
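
    A quick sanity check that the JDK is wired up (paths as configured above):

    java -version      # should report java version "1.7.0_80"
    echo $JAVA_HOME    # should print /usr/lib/jvm/jdk1.7.0_80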

    3 Deploying Hadoop

    Create a directory:

    sudo mkdir /opt/modules

    Extract into /opt/modules:

    sudo tar -zxf hadoop-2.5.0.tar.gz -C /opt/modules

    Rename hadoop-2.5.0 to hadoop (run inside /opt/modules):

    sudo mv hadoop-2.5.0 hadoop

    Before configuring, create the following directories on the master's local filesystem:

    ~/dfs/name

    ~/dfs/data

    ~/tmp

    mcc@master01:~$ mkdir /home/mcc/tmp

    mcc@master01:~$ mkdir /home/mcc/dfs

    mcc@master01:~$ mkdir /home/mcc/dfs/name

    mcc@master01:~$ mkdir /home/mcc/dfs/data

    Use ll to check that the directories are owned by the current user and group.
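
    Equivalently, the directories can be created and checked in one step:

    mkdir -p ~/tmp ~/dfs/name ~/dfs/data
    ls -ld ~/tmp ~/dfs/name ~/dfs/data   # owner and group should both be mcc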

    Seven configuration files are involved here:

    ~/hadoop-2.5.0/etc/hadoop/hadoop-env.sh

    ~/hadoop-2.5.0/etc/hadoop/yarn-env.sh

    ~/hadoop-2.5.0/etc/hadoop/slaves

    ~/hadoop-2.5.0/etc/hadoop/core-site.xml

    ~/hadoop-2.5.0/etc/hadoop/hdfs-site.xml

    ~/hadoop-2.5.0/etc/hadoop/mapred-site.xml

    ~/hadoop-2.5.0/etc/hadoop/yarn-site.xml

    Files that do not exist by default can be created by copying the corresponding .template file.

    Enter etc/hadoop/.

    Edit hadoop-env.sh:

    sudo gedit hadoop-env.sh

    Change this line to:

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80

    Edit yarn-env.sh:

    sudo gedit yarn-env.sh

    Change this line to:

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80

    Edit slaves:

    sudo gedit slaves

    Change it to:

    slave01

    slave02

    slave03

    Edit core-site.xml:

    sudo gedit core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://master01:9000</value>
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131072</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/home/mcc/tmp</value>
            <description>A base for other temporary directories.</description>
        </property>
    </configuration>

    Edit hdfs-site.xml:

    sudo gedit hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>master01:9001</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/home/mcc/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/home/mcc/dfs/data</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
    </configuration>

    Rename mapred-site.xml.template to mapred-site.xml:

    sudo mv mapred-site.xml.template mapred-site.xml

    Edit mapred-site.xml:

    sudo gedit mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>master01:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>master01:19888</value>
        </property>
    </configuration>

    Edit yarn-site.xml:

    sudo gedit yarn-site.xml

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>master01:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>master01:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>master01:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>master01:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>master01:8088</value>
        </property>
    </configuration>

    Update the environment variables in ~/.bashrc:

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80  

    export JRE_HOME=${JAVA_HOME}/jre  

    export HADOOP_HOME=/opt/modules/hadoop

    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  

    export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:${HADOOP_HOME}/bin:$PATH  
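
    After sourcing ~/.bashrc again, a quick check that the hadoop binary resolves:

    source ~/.bashrc
    hadoop version   # should report Hadoop 2.5.0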

    Format the NameNode:

    mcc@master01:/opt/modules/hadoop$ sudo bin/hdfs namenode -format

    Starting the cluster afterwards threw errors.

    Solution:

    Because the archives were extracted (and the format run) with sudo, the files are owned by root. Run the following on every VM to change the owner of the directory:

    sudo chown -R mcc:mcc /opt/modules/

    Then another problem appeared: jps did not show the NameNode.

    Solution:

    sudo chmod -R 777 /home/mcc/dfs

    The configuration on the slaves must match the master's exactly; do not change the places that say master to slave.

    On the master, run (from inside the hadoop directory):

    sbin/start-all.sh

    The Hadoop cluster then starts successfully.
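
    To verify, jps on each node should show roughly the following daemons (a sanity check; exact PIDs will differ):

    # on master01
    jps   # expect: NameNode, SecondaryNameNode, ResourceManager, Jps
    # on each slave
    jps   # expect: DataNode, NodeManager, Jps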

    4 Configuring ZooKeeper

    Extract:

    tar -zxf zookeeper-3.4.5-cdh5.1.0.tar.gz -C /opt/modules/

    Create a new directory:

    mcc@slave01:/opt/modules/zookeeper-3.4.5-cdh5.1.0$ mkdir zkData

    In this directory, create a file named myid:

    mcc@slave01:/opt/modules/zookeeper-3.4.5-cdh5.1.0/zkData$ touch myid

    On slave01, slave02, and slave03, write the numbers 1, 2, and 3 respectively (for example, as shown below).
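
    For instance, the files can be filled in with echo (one command per machine):

    # on slave01
    echo 1 > /opt/modules/zookeeper-3.4.5-cdh5.1.0/zkData/myid
    # on slave02
    echo 2 > /opt/modules/zookeeper-3.4.5-cdh5.1.0/zkData/myid
    # on slave03
    echo 3 > /opt/modules/zookeeper-3.4.5-cdh5.1.0/zkData/myid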

    Rename zoo_sample.cfg:

    mcc@slave01:/opt/modules/zookeeper-3.4.5-cdh5.1.0/conf$ mv zoo_sample.cfg zoo.cfg

    Edit zoo.cfg:

    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just
    # example sakes.
    dataDir=/opt/modules/zookeeper-3.4.5-cdh5.1.0/zkData
    # the port at which the clients will connect
    clientPort=2181
    #
    # Be sure to read the maintenance section of the
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1

    server.1=slave01:2888:3888
    server.2=slave02:2888:3888
    server.3=slave03:2888:3888

    Copy the zookeeper directory to slave02 and slave03, then edit myid on each:

    scp -r zookeeper-3.4.5-cdh5.1.0/ slave02:/opt/modules/

    scp -r zookeeper-3.4.5-cdh5.1.0/ slave03:/opt/modules/

    Update the environment variables:

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80  

    export JRE_HOME=${JAVA_HOME}/jre  

    export HADOOP_HOME=/opt/modules/hadoop

    export ZOOKEEPER_HOME=/opt/modules/zookeeper-3.4.5-cdh5.1.0

    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  

    export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:${HADOOP_HOME}/bin:${ZOOKEEPER_HOME}/bin:$PATH

    Start ZooKeeper (run on each of slave01, slave02, and slave03):

    $ZOOKEEPER_HOME/bin/zkServer.sh start

    jps shows the QuorumPeerMain process, confirming a successful start.
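
    The ensemble state can also be checked on each node; one node should report leader and the other two follower:

    $ZOOKEEPER_HOME/bin/zkServer.sh status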

    5 Configuring HBase

    Extract:

    tar -zxf hbase-0.98.6-cdh5.3.0.tar.gz -C /opt/modules/

     

    Configure hbase-env.sh:

    sudo gedit hbase-env.sh

     

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_80

    export HBASE_MANAGES_ZK=false

     

    Configure hbase-site.xml:

    sudo gedit hbase-site.xml

     

    <configuration>
        <property>
            <name>hbase.rootdir</name>
            <value>hdfs://master01:9000/hbase</value>
        </property>
        <property>
            <name>hbase.cluster.distributed</name>
            <value>true</value>
        </property>
        <property>
            <name>hbase.zookeeper.quorum</name>
            <value>slave01,slave02,slave03</value>
        </property>
    </configuration>

     

     

    Configure regionservers:

    sudo gedit regionservers

     

    slave01

    slave02

    slave03

     

    Next, replace the Hadoop jars bundled with HBase with the ones from our Hadoop installation.

    In HBase's lib directory, delete all the Hadoop-related jars:

    rm -rf hadoop*.jar

     

    Then copy in the replacements:

    find /opt/modules/hadoop/share/hadoop -name "hadoop*jar" | xargs -i cp {} /opt/modules/hbase-0.98.6-cdh5.3.0/lib

     

    HBase depends on Hadoop and expects matching Hadoop jars to be deployed under its lib directory.
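
    A quick way to confirm the swap (the jar names should now carry the 2.5.0 version):

    ls /opt/modules/hbase-0.98.6-cdh5.3.0/lib | grep hadoop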

     

    Then a new problem appeared:

     

    FATAL [master:master01:60000] master.HMaster: Unhandled exception. Starting shutdown.

    java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "master01":9000; java.net.UnknownHostException; For more details see:  

     

     

    Solution (a bit of black magic):

    Change this property in hbase-site.xml to use the IP address instead of the hostname:

    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://10.109.252.94:9000/hbase</value>
    </property>

    Copy the hbase directory to slave01, slave02, and slave03:

    scp -r hbase-0.98.6-cdh5.3.0/ slave01:/opt/modules

    scp -r hbase-0.98.6-cdh5.3.0/ slave02:/opt/modules

    scp -r hbase-0.98.6-cdh5.3.0/ slave03:/opt/modules

    Start HBase: on the master, run

    mcc@master01:/opt/modules/hbase-0.98.6-cdh5.3.0$ ./bin/start-hbase.sh

    The startup succeeds.
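
    As a final check, the cluster can be inspected from the HBase shell; the status command reports the live region servers:

    ./bin/hbase shell
    # inside the shell:
    status   # should report 3 servers (slave01-03), 0 dead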
