• Hadoop cluster setup


    I. Prepare the environment

    1. Download Hadoop 2.9.2

    https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz

    2. Download Java 8

    https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

    3. Extract both archives, then set JAVA_HOME and PATH (replace the * in the paths below with your own user name)

    export JAVA_HOME=/home/*/hadoop/jdk1.8.0_191
    export J2SDKDIR=${JAVA_HOME}
    export J2REDIR=${JAVA_HOME}/jre
    export DERBY_HOME=${JAVA_HOME}/db
    
    export PATH=${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin:${JAVA_HOME}/db/bin:$PATH
    export MANPATH=${JAVA_HOME}/man:$MANPATH

    4. Set HADOOP_HOME and edit etc/hadoop/hadoop-env.sh

    export HADOOP_HOME=/home/*/hadoop/hadoop-2.9.2
    export PATH=$PATH:$HADOOP_HOME/bin
    
    vim etc/hadoop/hadoop-env.sh
    Change the JAVA_HOME line to:
    export JAVA_HOME=/home/*/hadoop/jdk1.8.0_191
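
    Before moving on, it can help to confirm that both variables point at real installations. The following is a minimal sketch; check_home is a hypothetical helper name, not part of Hadoop or the JDK.

```shell
#!/usr/bin/env bash
# check_home VAR_NAME DIR RELATIVE_BINARY  (hypothetical helper)
# Succeeds only if DIR is non-empty and DIR/RELATIVE_BINARY is executable.
check_home() {
    local name="$1" dir="$2" binary="$3"
    if [ -z "$dir" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -x "$dir/$binary" ]; then
        echo "$name=$dir does not contain an executable $binary"
        return 1
    fi
    echo "$name OK: $dir"
}

# Typical use after the exports above have been sourced:
#   check_home JAVA_HOME "$JAVA_HOME" bin/java
#   check_home HADOOP_HOME "$HADOOP_HOME" bin/hadoop
```

    If both calls report OK, `hadoop version` should also run from any directory.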

    II. Configure the Hadoop cluster

    1. Configure etc/hadoop/core-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/home/*/hadoop/tmp</value>
            <description>A base for other temporary directories.</description>
        </property>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hostname:8800</value>
        </property>
    </configuration>
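
    Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` prints the value the cluster will actually use. For a quick check of the file itself, the sketch below reads a property straight out of a *-site.xml. get_site_property is a hypothetical helper, and it assumes <name> and <value> sit on separate lines, as in the file above.

```shell
#!/usr/bin/env bash
# get_site_property FILE KEY  (hypothetical helper)
# Prints the <value> that follows <name>KEY</name> in a Hadoop *-site.xml.
get_site_property() {
    local file="$1" key="$2"
    awk -v key="$key" '
        # remember that the wanted <name> line has been seen
        index($0, "<name>" key "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # drop the 7-character <value> and the 8-character </value>
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$file"
}

# Example:
#   get_site_property etc/hadoop/core-site.xml fs.defaultFS
```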

    2. Configure etc/hadoop/hdfs-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
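        <property>
            <!-- dfs.name.dir is the deprecated name of this key -->
            <name>dfs.namenode.name.dir</name>
            <value>/home/*/hadoop/hdfs/name</value>
        </property>
        <property>
            <!-- dfs.data.dir is the deprecated name of this key -->
            <name>dfs.datanode.data.dir</name>
            <value>/home/*/hadoop/hdfs/data</value>
        </property>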
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hostname:8801</value>
            <description>Web address of the SecondaryNameNode</description>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
            <description>Enables web access to HDFS (WebHDFS)</description>
        </property>
    </configuration>

    3. Configure etc/hadoop/yarn-site.xml

    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
    
    <!-- Site specific YARN configuration properties -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hostname</value>
        </property>
    </configuration>

    4. Configure etc/hadoop/mapred-site.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobtracker.address</name>
            <value>master:8010</value>
        </property>
    </configuration>

    5. Configure etc/hadoop/slaves

    hostname
    hostname-slave

    6. Configure the slave node

    a. Copy all the configuration to the same path on the slave; or

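    b. Use a different path on the slave. In that case you must start the DataNode manually with the command below. If jps shows that it did not start, check the logs to locate the problem; if there is a port conflict, change the port in hdfs-site.xml.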

    sbin/hadoop-daemon.sh --config etc/hadoop --script hdfs start datanode
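
    For option a, the copy can be scripted from the slaves file. This is only a sketch: sync_conf and its dry-run argument are hypothetical, and it assumes rsync is installed on both machines and that the remote path is identical to the local one.

```shell
#!/usr/bin/env bash
# sync_conf CONF_DIR LOCAL_HOST [DRY_RUN]  (hypothetical helper)
# Copies CONF_DIR to the same path on every host listed in CONF_DIR/slaves,
# skipping LOCAL_HOST. With a non-empty third argument it only prints the commands.
sync_conf() {
    local conf_dir="$1" local_host="$2" dry_run="${3:-}"
    local host
    # the "|| [ -n ... ]" part keeps a last line that has no trailing newline
    while read -r host || [ -n "$host" ]; do
        [ -z "$host" ] && continue                 # skip blank lines
        [ "$host" = "$local_host" ] && continue    # skip this machine
        if [ -n "$dry_run" ]; then
            echo "rsync -a $conf_dir/ $host:$conf_dir/"
        else
            # stdin is redirected so rsync cannot swallow the rest of the slaves file
            rsync -a "$conf_dir/" "$host:$conf_dir/" < /dev/null
        fi
    done < "$conf_dir/slaves"
}

# Example, printing the commands first:
#   sync_conf "$HADOOP_HOME/etc/hadoop" "$(hostname)" dry-run
```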

    III. Start the cluster

    1. Enable passwordless SSH login between the cluster machines

    ssh-keygen -t rsa
    ssh-copy-id -i /home/*/.ssh/id_rsa.pub work@hostname-slave
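
    The start scripts log in to every host over SSH, so it is worth checking that no password prompt remains. A minimal sketch: check_ssh is a hypothetical helper, and SSH_BIN is an assumed override that exists only so the function can be tried without a network.

```shell
#!/usr/bin/env bash
# check_ssh HOST...  (hypothetical helper)
# Reports, per host, whether key-based login works without any prompt.
# SSH_BIN may name a replacement for ssh; it defaults to the real client.
check_ssh() {
    local ssh_bin="${SSH_BIN:-ssh}" host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password
        if "$ssh_bin" -o BatchMode=yes -o ConnectTimeout=5 "$host" true < /dev/null; then
            echo "$host: passwordless login OK"
        else
            echo "$host: passwordless login FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   check_ssh work@hostname-slave
```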

    2. Format the HDFS file system (skip this step if it has already been formatted)

    bin/hdfs namenode -format
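
    Formatting again generates a new cluster ID, after which DataNodes that still hold the old one typically refuse to register. A guard such as the following sketch can avoid that; is_formatted is a hypothetical helper that relies on the NameNode keeping a current/VERSION file in its name directory.

```shell
#!/usr/bin/env bash
# is_formatted NAME_DIR  (hypothetical helper)
# Succeeds if NAME_DIR already contains NameNode metadata.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example, using the name directory configured in hdfs-site.xml:
#   if is_formatted /home/*/hadoop/hdfs/name; then
#       echo "already formatted, skipping"
#   else
#       bin/hdfs namenode -format
#   fi
```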

    3. Start HDFS

    sbin/start-dfs.sh

    4. Check the startup status (NameNode, SecondaryNameNode, DataNode)

    $ jps
    47249 NameNode
    53393 Jps
    49952 SecondaryNameNode
    48514 DataNode
    $ bin/hdfs dfsadmin -report
    $ bin/hdfs dfs -ls /

    5. Start YARN

    sbin/start-yarn.sh

    6. Check the YARN status (NodeManager, ResourceManager)

    $ jps
    45656 NodeManager
    47249 NameNode
    45375 ResourceManager
    52477 Jps
    49952 SecondaryNameNode
    48514 DataNode

    Web UI: http://hostname-master:8088/cluster
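
    Reading the jps listing by eye works on one machine; with more nodes a small filter is handier. The sketch below uses a hypothetical helper, missing_daemons, which takes the expected daemon names as arguments and reads jps output on standard input.

```shell
#!/usr/bin/env bash
# missing_daemons NAME...  (hypothetical helper)
# Reads `jps` output on stdin and prints every expected daemon that is absent.
missing_daemons() {
    local listing name
    listing="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <class name>"; compare the whole second field so
        # that NameNode is not satisfied by SecondaryNameNode
        if ! printf '%s\n' "$listing" |
                awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }'; then
            echo "$name"
        fi
    done
}

# Example on the master after both start scripts have run:
#   jps | missing_daemons NameNode SecondaryNameNode DataNode ResourceManager NodeManager
```

    No output means every listed daemon is running.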

    7. Check the cluster status

    $ bin/hdfs dfsadmin -report
    Configured Capacity: 15501430390784 (14.10 TB)
    Present Capacity: 13610185841062 (12.38 TB)
    DFS Remaining: 13606793785344 (12.38 TB)
    DFS Used: 3392055718 (3.16 GB)
    DFS Used%: 0.02%
    Under replicated blocks: 569
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Pending deletion blocks: 0
    
    -------------------------------------------------
    Live datanodes (2):
    
    Name: 10.156.88.35:50010 (slave)
    Hostname: hostname
    Decommission Status : Normal
    Configured Capacity: 3875357597696 (3.52 TB)
    DFS Used: 1600417310 (1.49 GB)
    Non DFS Used: 1882846601698 (1.71 TB)
    DFS Remaining: 1990893801472 (1.81 TB)
    DFS Used%: 0.04%
    DFS Remaining%: 51.37%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jan 29 11:30:15 CST 2019
    Last Block Report: Tue Jan 29 11:29:33 CST 2019
    
    
    Name: 10.182.48.147:50010 (master)
    Hostname: hostname
    Decommission Status : Normal
    Configured Capacity: 11626072793088 (10.57 TB)
    DFS Used: 1791639552 (1.67 GB)
    Non DFS Used: 8330874880 (7.76 GB)
    DFS Remaining: 11615899947008 (10.56 TB)
    DFS Used%: 0.02%
    DFS Remaining%: 99.91%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Tue Jan 29 11:30:13 CST 2019
    Last Block Report: Tue Jan 29 11:29:37 CST 2019
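
    Two figures in this report deserve a second look: the number of live DataNodes and the under-replicated block count. With dfs.replication set to 3 and only two live DataNodes, a non-zero count such as the 569 above is a plausible consequence, since no block can reach three replicas. The sketch below pulls both figures out of the report; report_summary is a hypothetical helper and assumes the report layout shown above.

```shell
#!/usr/bin/env bash
# report_summary  (hypothetical helper)
# Reads `hdfs dfsadmin -report` output on stdin and prints two key figures.
report_summary() {
    awk '
        # "Live datanodes (2):" -> keep only the digits
        /Live datanodes \(/ { gsub(/[^0-9]/, ""); live = $0 }
        # "Under replicated blocks: 569" -> last field
        /Under replicated blocks:/ { under = $NF }
        END { printf "live_datanodes=%s under_replicated=%s\n", live, under }
    '
}

# Example:
#   bin/hdfs dfsadmin -report | report_summary
```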
  • Original article: https://www.cnblogs.com/yuanzhenliu/p/10331247.html