• Hadoop setup and deployment


    HDFS (Hadoop Distributed File System) and MapReduce are the two core components of Hadoop:

    HDFS (the file system) provides the underlying support for distributed storage.

    MapReduce (the programming model) provides the program-level support for distributed parallel task processing.

    JobTracker corresponds to the NameNode.

    TaskTracker corresponds to the DataNode.

    NameNode and DataNode are concerned with data storage.

    JobTracker and TaskTracker are concerned with MapReduce execution. (Note that in Hadoop 2.x, which is installed below, MapReduce runs on YARN, where the ResourceManager and NodeManager take over the roles that JobTracker and TaskTracker played in Hadoop 1.x.)

    Download the installation package from the official site:

    wget  http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
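
    A minimal sketch of unpacking the tarball into /usr/local/hadoop (the HADOOP_HOME used below); it assumes the archive was downloaded to the current directory:

    tar -zxvf hadoop-2.7.1.tar.gz
    mv hadoop-2.7.1 /usr/local/hadoop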

    JDK installation, passwordless SSH, and similar prerequisites are not covered here.

    Configure the Hadoop environment variables:

    vim /etc/profile.d/hadoop.sh 

    export HADOOP_HOME=/usr/local/hadoop
    export HADOOP_HEAPSIZE=2048
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export PATH=$HADOOP_HOME/bin:$PATH
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
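
    One way to apply the variables in the current shell and confirm the binaries are on the PATH:

    source /etc/profile.d/hadoop.sh
    hadoop version    # should report Hadoop 2.7.1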

    Then the main work is to configure the following five files (all under $HADOOP_HOME/etc/hadoop):

    core-site.xml

    hdfs-site.xml

    mapred-site.xml

    yarn-site.xml

    slaves

    Default values for the parameters in each of these configuration files:

    http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

    http://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

    vim core-site.xml, adding the following inside the <configuration> element:

    <configuration>
         <property>
             <name>fs.defaultFS</name>
             <value>hdfs://dataMaster30:9000</value>
         </property>
         <property>
             <name>hadoop.tmp.dir</name>
             <value>file:/usr/local/hadoop/tmp</value>
            <description>Abase for other temporary directories.</description>
         </property>
         <property>
             <name>io.file.buffer.size</name>
             <value>131072</value>
         </property>
    </configuration>
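
    Since hadoop.tmp.dir points at /usr/local/hadoop/tmp, that directory should exist before the NameNode is formatted; a one-line sketch (run on every node, adjusting the path if you change the setting):

    mkdir -p /usr/local/hadoop/tmp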

    vim hdfs-site.xml

    <configuration>
           <property>
                    <name>dfs.namenode.secondary.http-address</name>
                    <value>dataMaster30:9001</value>
            </property>
            <property>
                   <name>dfs.replication</name>
                   <value>3</value>
            </property>
            <property>
                   <name>dfs.blocksize</name>
                   <value>512m</value>
            </property>
            <property>
                  <name>dfs.namenode.name.dir</name>
                  <value>file:/data/hadoop/name</value>
            </property>
            <property>
                 <name>dfs.datanode.data.dir</name> 
                 <value>file:/data/hadoop/hdfs</value>
           </property>
           <property>
                 <name>dfs.webhdfs.enabled</name>
                 <value>true</value>
           </property>  
           <property>
                 <name>dfs.permissions.enabled</name>
                 <value>false</value>
           </property>
    </configuration>
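
    dfs.namenode.name.dir and dfs.datanode.data.dir point at local paths that must exist and be writable by the user running Hadoop; a small sketch, assuming the paths from the config above:

    mkdir -p /data/hadoop/name    # on the master (NameNode)
    mkdir -p /data/hadoop/hdfs    # on every slave (DataNode)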

    vim mapred-site.xml (in 2.7.1 this file does not exist by default; copy it from mapred-site.xml.template first)

    <configuration>
            <property>
                    <name>mapreduce.framework.name</name>
                    <value>yarn</value>
            </property>
            <property>
                    <name>mapreduce.jobhistory.address</name>
                    <value>dataMaster30:10020</value>
            </property>
            <property>
                    <name>mapreduce.jobhistory.webapp.address</name>
                    <value>dataMaster30:19888</value>
            </property>
    
            <property>
                    <name>mapreduce.map.memory.mb</name>
                    <value>2048</value>
                    <description>Physical memory limit for each Map task</description>
            </property>
    
            <property>
                    <name>mapreduce.reduce.memory.mb</name>
                    <value>2048</value>
                    <description>Physical memory limit for each Reduce task</description>
            </property>
    </configuration>
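
    The two jobhistory addresses above only matter if the JobHistory Server is running; it is not launched by start-all.sh, so start it separately on the master once the cluster is up:

    sbin/mr-jobhistory-daemon.sh start historyserver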

    vim yarn-site.xml

    <configuration>
            <property>
                     <name>yarn.resourcemanager.hostname</name>
                     <value>dataMaster30</value>
            </property>
            <property>
                     <name>yarn.nodemanager.aux-services</name>
                     <value>mapreduce_shuffle</value>
            </property>
            <property>
                    <name>yarn.nodemanager.resource.memory-mb</name>
                    <value>65366</value>
                    <description>Memory available to YARN on this node, in MB</description>
            </property>
      
            <property>
                    <name>yarn.scheduler.minimum-allocation-mb</name>
                    <value>2048</value>
                    <description>Minimum memory a single task can request; default is 1024 MB</description>
            </property>
      
            <property>
                    <name>yarn.scheduler.maximum-allocation-mb</name>
                    <value>16384</value>
                    <description>Maximum memory a single task can request; default is 8192 MB</description>
            </property>
             <property>
                    <name>yarn.nodemanager.resource.cpu-vcores</name>
                    <value>16</value>
                    <description>Number of CPU vcores available to YARN on this node</description>
            </property>
    </configuration>
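
    The memory-mb and cpu-vcores values above should reflect the actual hardware of each node; a quick way to check what a node has (plain Linux commands, not Hadoop-specific):

    free -m    # total physical memory, in MB
    nproc      # number of CPU cores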

    vim slaves

    #localhost
    dataSlave31
    dataSlave32
    dataSlave33
    dataSlave34
    dataSlave35

    When done, distribute the configured Hadoop directory to the corresponding location on each slave node, as sketched below.
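
    A minimal sketch of that distribution step, assuming the same /usr/local/hadoop path on every node and passwordless SSH to the slaves (adjust user and paths to your environment):

    for host in dataSlave31 dataSlave32 dataSlave33 dataSlave34 dataSlave35; do
        rsync -a /usr/local/hadoop/ ${host}:/usr/local/hadoop/
        scp /etc/profile.d/hadoop.sh ${host}:/etc/profile.d/
    done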

    Start the Hadoop cluster from the master node; the slave nodes are started automatically. Change into the Hadoop directory, then:
    (1) Initialize (format) HDFS: bin/hdfs namenode -format
    (2) Start everything with sbin/start-all.sh, or start the pieces separately with sbin/start-dfs.sh and sbin/start-yarn.sh
    (3) To stop, run sbin/stop-all.sh
    (4) Run jps to list the relevant processes and verify that startup succeeded.

    If jps shows a "process information unavailable" message, go into the /tmp directory, delete the folders named hsperfdata_{username}, and then restart Hadoop.
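
    A sketch of that cleanup (stop the cluster before removing the directories; the hsperfdata_* pattern follows from the note above):

    sbin/stop-all.sh
    rm -rf /tmp/hsperfdata_*
    sbin/start-all.sh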

    # jps (master node)

    1701 SecondaryNameNode
    1459 NameNode
    2242 Jps
    1907 ResourceManager

    # jps (slave node)

    4520 Jps
    9677 NodeManager
    9526 DataNode

    You can now open IP:8088 (YARN) and IP:50070 (NameNode) in a browser to view the cluster status and NameNode information.
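
    The same status is available from the command line, which helps when the web UIs are unreachable:

    hdfs dfsadmin -report    # DataNodes and capacity, the information behind the 50070 UI
    yarn node -list          # live NodeManagers, the information behind the 8088 UI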

    Hadoop shell command references:

    http://blog.csdn.net/wuwenxiang91322/article/details/22166423

    http://hadoop.apache.org/docs/r1.0.4/cn/hdfs_shell.html
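
    A few common HDFS shell commands to verify the cluster end to end (the paths here are arbitrary examples):

    hdfs dfs -mkdir -p /user/test           # create a directory in HDFS
    hdfs dfs -put /etc/hosts /user/test/    # upload a local file
    hdfs dfs -ls /user/test                 # list the directory
    hdfs dfs -cat /user/test/hosts          # print the uploaded file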
