• Pseudo-distributed setup: Hadoop 3.1.3 + ZooKeeper 3.5.7 + HBase 2.2.2


    Packages

    • Hadoop 3.1.3

    • ZooKeeper 3.5.7

    • HBase 2.2.2

    Download link for the required tools:
    Link: https://pan.baidu.com/s/1jcenv7SeGX1gjPT9RnBsIQ
    Extraction code: rkca

    A pseudo-distributed setup has no production use; it serves only as a classroom test environment. These are my notes from configuring it.

    Configuration

    Hadoop

    Add the following to core-site.xml:

    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <!-- Change to your actual host -->
        <value>hdfs://hadoop104:8020</value>
    </property>

    <!-- Base directory for files Hadoop generates at runtime -->
    <property>
        <name>hadoop.data.dir</name>
        <value>/opt/module/hadoop-3.1.3/data</value>
    </property>

    <!-- User shown when browsing HDFS from the web UI -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>nevesettle</value>
    </property>
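Once core-site.xml is in place, the value Hadoop actually resolves can be checked from the command line (assuming Hadoop's bin directory is on the PATH):

```shell
# Prints the effective value of fs.defaultFS
hdfs getconf -confKey fs.defaultFS
# With the config above this should print hdfs://hadoop104:8020
```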
    
    

    Add the following to hdfs-site.xml:

    <!-- Number of block replicas to keep -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <!-- Storage directory for NameNode data -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.data.dir}/name</value>
    </property>

    <!-- Storage directory for DataNode data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.data.dir}/data</value>
    </property>

    <!-- Storage directory for SecondaryNameNode data -->
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file://${hadoop.data.dir}/namesecondary</value>
    </property>

    <!-- Compatibility setting -->
    <property>
        <name>dfs.client.datanode-restart.timeout</name>
        <value>30s</value>
    </property>

    <!-- Web UI address of the SecondaryNameNode (2NN) -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop104:9868</value>
    </property>

    <!-- Web UI address of the NameNode (NN) -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop104:9870</value>
    </property>
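Once HDFS is running, the state behind the web UIs configured above can also be checked from the command line:

```shell
# Summarizes capacity, replication, and the single live DataNode
hdfs dfsadmin -report
```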
    

    Add the following to yarn-site.xml:

    <!-- How reducers fetch data -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- Hostname of the YARN ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop104</value>
    </property>

    <!-- Environment variables inherited by containers -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
    </property>

    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- URL for viewing aggregated logs -->
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hadoop104:19888/jobhistory/logs</value>
    </property>

    <!-- Log retention time (seconds; 604800 = 7 days) -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>

    <!-- Fixes "Could not find or load main class" errors -->
    <property>
        <name>yarn.application.classpath</name>
        <value>/opt/module/hadoop-3.1.3/etc/hadoop:/opt/module/hadoop-3.1.3/share/hadoop/common/lib/*:/opt/module/hadoop-3.1.3/share/hadoop/common/*:/opt/module/hadoop-3.1.3/share/hadoop/hdfs:/opt/module/hadoop-3.1.3/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.1.3/share/hadoop/hdfs/*:/opt/module/hadoop-3.1.3/share/hadoop/mapreduce/lib/*:/opt/module/hadoop-3.1.3/share/hadoop/mapreduce/*:/opt/module/hadoop-3.1.3/share/hadoop/yarn:/opt/module/hadoop-3.1.3/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.1.3/share/hadoop/yarn/*</value>
    </property>
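The long classpath value above does not need to be typed by hand; Hadoop will print the exact string to paste in (run on the node where Hadoop is installed):

```shell
# Prints the colon-separated classpath for yarn.application.classpath
hadoop classpath
```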
    
    

    Add the following to mapred-site.xml:

    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <!-- JobHistory server address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop104:10020</value>
    </property>

    <!-- JobHistory server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop104:19888</value>
    </property>
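These properties only configure the JobHistory server; it is not started by the start-dfs/start-yarn scripts and has to be launched separately:

```shell
# Start the MapReduce JobHistory server as a daemon (Hadoop 3 syntax)
mapred --daemon start historyserver
# The web UI is then available at http://hadoop104:19888
```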
    
    

    workers (under the Hadoop root at etc/hadoop/workers):

    hadoop104    (replace with your own hostname or mapping)
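One step the configuration files do not cover: before the very first start, the NameNode has to be formatted, after which HDFS and YARN can be brought up. A sketch, using the paths from this post:

```shell
# One-time only -- reformatting later destroys existing HDFS metadata
/opt/module/hadoop-3.1.3/bin/hdfs namenode -format

/opt/module/hadoop-3.1.3/sbin/start-dfs.sh
/opt/module/hadoop-3.1.3/sbin/start-yarn.sh

# NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager
# should all appear in the JVM process list
jps
```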
    

    ZooKeeper

    Under zookeeper/conf:

    zoo.cfg

    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial 
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between 
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    # do not use /tmp for storage, /tmp here is just 
    # example sakes.
    dataDir=/opt/module/zookeeper-3.5.7/zkData
    # the port at which the clients will connect
    clientPort=2181
    # the maximum number of client connections.
    # increase this if you need to handle more clients
    #maxClientCnxns=60
    #
    # Be sure to read the maintenance section of the 
    # administrator guide before turning on autopurge.
    #
    # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
    #
    # The number of snapshots to retain in dataDir
    #autopurge.snapRetainCount=3
    # Purge task interval in hours
    # Set to "0" to disable auto purge feature
    #autopurge.purgeInterval=1
    4lw.commands.whitelist=*
    server.1=hadoop104:2888:3888
    
    

    Under the ZooKeeper root, create a directory named zkData (the name must match the dataDir setting above).

    In that directory, create a file named myid (this name is fixed).

    Its content is 1: the id of this node, used during leader election. Even though there is only one node, it still has to be written.
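The two steps above can be sketched as a small helper (the function name is mine, not part of ZooKeeper):

```shell
# Create a ZooKeeper data directory and write the node id into myid
make_myid() {
    local datadir="$1" id="$2"
    mkdir -p "$datadir"
    echo "$id" > "$datadir/myid"
}

# For the layout used in this post:
# make_myid /opt/module/zookeeper-3.5.7/zkData 1
```

Once myid exists, `zkServer.sh start` followed by `zkServer.sh status` should report `Mode: standalone`.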

    HBase

    hbase-env.sh is modified the same way as in the previous post, so it is not repeated here.

    Add the following to hbase-site.xml:

    <!-- HBase data root on HDFS; must match fs.defaultFS in core-site.xml -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop104:8020/hbase</value>
    </property>

    <!-- Pseudo-distributed mode still runs HBase as distributed -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>

    <!-- ZooKeeper quorum -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop104:2181</value>
    </property>

    <!-- Must match dataDir in zoo.cfg -->
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/opt/module/zookeeper-3.5.7/zkData</value>
    </property>

    <!-- Relax stream capability (hsync/hflush) checks -->
    <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
    </property>
    

    Add to regionservers:

    hadoop104
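With ZooKeeper and HDFS already running, HBase can then be started and checked from its shell (paths as in this post):

```shell
/opt/module/hbase-2.2.2/bin/start-hbase.sh

# 'status' should report 1 active master and 1 region server
echo "status" | /opt/module/hbase-2.2.2/bin/hbase shell -n
```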
    

    Start/stop script

    Starting and stopping every service one by one is tedious, so here is a ready-made script.

    #!/bin/bash
    if [ $# -lt 1 ]
    then
        echo "Usage: $0 {start|stop}"
        exit 1
    fi


    case $1 in
    start)

        echo '========== start hdfs =========='
        /opt/module/hadoop-3.1.3/sbin/start-dfs.sh

        echo '========== start yarn =========='
        /opt/module/hadoop-3.1.3/sbin/start-yarn.sh

       # echo '========== start history =========='
       # /opt/module/hadoop-3.1.3/bin/mapred --daemon start historyserver

        echo '========== start zookeeper =========='
        /opt/module/zookeeper-3.5.7/bin/zkServer.sh start

        echo '========== start hbase =========='
        /opt/module/hbase-2.2.2/bin/start-hbase.sh
    ;;

    stop)

        echo '========== stop hbase =========='
        /opt/module/hbase-2.2.2/bin/stop-hbase.sh

        echo '========== stop zookeeper =========='
        /opt/module/zookeeper-3.5.7/bin/zkServer.sh stop

        echo '========== stop yarn =========='
        /opt/module/hadoop-3.1.3/sbin/stop-yarn.sh

        echo '========== stop hdfs =========='
        /opt/module/hadoop-3.1.3/sbin/stop-dfs.sh
    ;;

    *)
        echo "Usage: $0 {start|stop}"
    ;;
    esac
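To use it, save the script (the file name cluster.sh is my choice, not fixed), make it executable, and pass start or stop:

```shell
chmod +x cluster.sh
./cluster.sh start
./cluster.sh stop
```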
    

    Summary

    That covers all of the configuration; once it is complete you can work with the cluster through its APIs from IDEA.

  • Original post: https://www.cnblogs.com/wuren-best/p/13831527.html