• Hadoop 2.7.3 pseudo-distributed installation

    A pseudo-distributed Hadoop deployment needs only a single server, which makes it very practical for testing and development, so it is worth writing down the setup process; a good memory is no match for written notes. Software used:
    Hadoop 2.7.3
    JDK 1.8.0_91

    Download the Hadoop binary tarball from the Apache website.
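
    A minimal sketch of the download step, assuming the Apache archive URL for this release (any Apache mirror works equally well):

    # download hadoop-2.7.3 into the soft directory used below (URL is an assumption; pick a closer mirror if needed)
    wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz -P /home/fuxin.zhao/soft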

    cd /home/fuxin.zhao/soft
    tar -xzvf hadoop-2.7.3.tar.gz
    cd hadoop-2.7.3
    cd etc/hadoop/
    pwd

    1. Set up passwordless SSH from the local machine to itself

    ssh-keygen -t rsa -P ""
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh localhost
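
    If ssh localhost still prompts for a password, sshd is usually rejecting the key because the permissions on ~/.ssh are too open; a quick check:

    # sshd requires the key directory and authorized_keys to be readable only by the owner
    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys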
    

    2. Modify the Hadoop configuration files

    Edit the following configuration files under $HADOOP_HOME/etc/hadoop: slaves, core-site.xml,
    hdfs-site.xml, mapred-site.xml and yarn-site.xml, plus the JAVA_HOME setting in hadoop-env.sh and yarn-env.sh.

    vi etc/hadoop/yarn-env.sh

    export JAVA_HOME=/usr/local/jdk
    

    vi etc/hadoop/hadoop-env.sh

    export JAVA_HOME=/usr/local/jdk
    

    vi slaves

    ## add the hostname of the local machine
    ubuntuServer01
    

    vi core-site.xml

    <configuration>
     <property>
       <name>fs.defaultFS</name>
       <value>hdfs://ubuntuServer01:9000</value>
     </property>
     <property>
       <name>hadoop.tmp.dir</name>
       <value>file:/home/fuxin.zhao/hadoop/tmp</value>
       <description>Abase for other temporary directories.</description>
     </property>
    </configuration>
    

    vi hdfs-site.xml

    <configuration>
        <property>
             <name>dfs.replication</name>
             <value>1</value>
        </property>
        <property>
             <name>dfs.namenode.name.dir</name>
             <value>file:/home/fuxin.zhao/hadoop/tmp/dfs/name</value>
        </property>
        <property>
             <name>dfs.datanode.data.dir</name>
             <value>file:/home/fuxin.zhao/hadoop/tmp/dfs/data</value>
        </property>
       <property>
        <!-- 64 MB blocks; in Hadoop 2.x the property name is dfs.blocksize (dfs.block.size is the deprecated 1.x name) -->
        <name>dfs.blocksize</name>
        <value>67108864</value>
       </property>
    </configuration>
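
    The local directories referenced by hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir do not exist yet; creating them up front avoids permission surprises (a sketch using the same paths as the configs above):

    mkdir -p /home/fuxin.zhao/hadoop/tmp/dfs/name
    mkdir -p /home/fuxin.zhao/hadoop/tmp/dfs/data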
    

    vi yarn-site.xml

    <configuration>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>512</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>2</value>
    </property>
    </configuration>
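
    The 2.7.3 distribution ships only mapred-site.xml.template, so the file is normally created by copying the template first (path relative to the hadoop-2.7.3 directory used above):

    cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml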
    

    vi mapred-site.xml

    <configuration>
    <property>
    	<name>mapreduce.framework.name</name>
    	<value>yarn</value>
    </property>
    <property>
    	<name>yarn.app.mapreduce.am.resource.mb</name>
    	<value>512</value>
    </property>
    <property>
    	<name>mapreduce.map.memory.mb</name>
    	<value>512</value>
    </property>
    <property>
    	<name>mapreduce.reduce.memory.mb</name>
    	<value>512</value>
    </property>
    </configuration>
    

    vi .bashrc

    export JAVA_HOME=/usr/local/jdk
    export HADOOP_HOME=/home/fuxin.zhao/soft/hadoop-2.7.3
    export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
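
    After editing .bashrc, reload it and confirm that the hadoop command resolves:

    source ~/.bashrc
    hadoop version   # should report Hadoop 2.7.3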
    

    After the configuration is complete, format the NameNode and start the daemons:

    ./bin/hdfs namenode -format
    ./sbin/start-dfs.sh
    ./sbin/start-yarn.sh
    mr-jobhistory-daemon.sh start historyserver
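
    Once the daemons are up, jps gives a quick sanity check; on a pseudo-distributed node it should list roughly the following processes (PIDs will differ):

    jps
    # expected: NameNode, DataNode, SecondaryNameNode,
    #           ResourceManager, NodeManager, JobHistoryServer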

    Check the web UIs (port 50070 is the HDFS NameNode UI, port 8088 the YARN ResourceManager UI):
    http://ubuntuserver01:50070/
    http://ubuntuserver01:8088/

    # create the HDFS home directory for the current user and an empty test file in it
    hadoop fs -ls /
    hadoop fs -mkdir /user
    hadoop fs -mkdir /user/fuxin.zhao
    hadoop fs -touchz textFile

    Run the bundled example jobs (teragen and terasort):

    # teragen writes 1,000,000 rows of 100 bytes each (about 100 MB) to /tmp/terasort/1000000-input
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teragen 1000000 /tmp/terasort/1000000-input
    
    # terasort sorts the generated data and writes the result to /tmp/terasort/1000000-output
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar terasort /tmp/terasort/1000000-input /tmp/terasort/1000000-output
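    
    # (optional) teravalidate checks that the terasort output is globally sorted;
    # the report directory /tmp/terasort/1000000-validate is an arbitrary name chosen here
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar teravalidate /tmp/terasort/1000000-output /tmp/terasort/1000000-validate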
    
    # remove the temporary files
    hadoop fs -rm -r /tmp/terasort/1000000-input
    hadoop fs -rm -r /tmp/terasort/1000000-output
    
    
  • Original article: https://www.cnblogs.com/honeybee/p/6400709.html