• Hadoop-2.7.3 Local Mode Installation - wordcount Example


    1. Prepare the virtual machine: linux-rhel-7.4-server. Since the VM does not need internet access, the host-only network mode is used. For this, create a new virtual network adapter in the Host Network Manager found in VirtualBox's management menu. After the OS is installed, the default network adapter is disabled and needs to be enabled, as follows:
      [root@hadoop-01 network-scripts]# vi ifcfg-enp0s3 # default NIC configuration
      
      
      TYPE=Ethernet
      PROXY_METHOD=none
      BROWSER_ONLY=no
      BOOTPROTO=dhcp
      DEFROUTE=yes
      IPV4_FAILURE_FATAL=no
      IPV6INIT=yes
      IPV6_AUTOCONF=yes
      IPV6_DEFROUTE=yes
      IPV6_FAILURE_FATAL=no
      IPV6_ADDR_GEN_MODE=stable-privacy
      NAME=enp0s3
      UUID=9e448496-ecd5-4122-a91f-91f91bd15f5e
      DEVICE=enp0s3
      ONBOOT=yes # change to yes (the default is no), then restart the VM
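
      A full reboot is not strictly necessary; a minimal alternative, assuming the standard RHEL 7 network service is in use:

      # Apply the ONBOOT change without rebooting
      systemctl restart network
      # Or bring up just this one interface:
      ifup enp0s3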
    2. Now check the machine's network configuration again, as shown below. The VM can communicate with the host's virtual adapter on the same subnet, and if network sharing is enabled on the host, the VM can also reach the internet.
      [root@hadoop-01 network-scripts]# ifconfig
      enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500 # an IP address has been assigned
              inet 192.168.56.101  netmask 255.255.255.0  broadcast 192.168.56.255
              inet6 fe80::bcf9:1d0d:e75d:500f  prefixlen 64  scopeid 0x20<link>
              ether 08:00:27:fb:11:51  txqueuelen 1000  (Ethernet)
              RX packets 5763894  bytes 8204104505 (7.6 GiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 310622  bytes 23522131 (22.4 MiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      
      lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
              inet 127.0.0.1  netmask 255.0.0.0
              inet6 ::1  prefixlen 128  scopeid 0x10<host>
              loop  txqueuelen 1  (Local Loopback)
              RX packets 1698  bytes 134024 (130.8 KiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 1698  bytes 134024 (130.8 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
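
      A quick way to confirm the host-only link works is to ping the host side of the adapter (192.168.56.1 is the usual VirtualBox host-only default; adjust to your actual host address):

      ping -c 3 192.168.56.1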
    3. Upload the prepared software packages:
      ZBMAC-C03VQ091H:实验介质 hadoop$ ls
      ZooInspector.zip                        mysql-5.7.19-1.el7.x86_64.rpm-bundle.tar
      ZooViewer.zip                           mysql-connector-java-5.1.43-bin.jar
      apache-flume-1.7.0-bin.tar.gz           pig-0.17.0.tar.gz
      apache-hive-2.3.0-bin.tar.gz            sqoop-1.4.5.bin__hadoop-0.23.tar.gz
      hadoop-2.7.3.tar.gz                     virtualbox
      hbase-1.3.1-bin.tar.gz                  winscp513setup.exe
      hue-4.0.1.tgz                           zookeeper-3.4.10.tar.gz
      jdk-8u144-linux-x64.tar.gz
      
      # Use the scp command:
      scp ./* hadoop-01@192.168.56.101:/home/hadoop-01/
    4. Install JDK 1.8 by extracting the archive:
      tar -zxvf jdk-8u144-linux-x64.tar.gz
    5. Set JAVA_HOME for the current user by editing ~/.bash_profile:
      JAVA_HOME=/home/hadoop-02/sdk-home/jdk1.8.0_144
      export JAVA_HOME
      
      PATH=$JAVA_HOME/bin:$PATH
      export PATH
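
      Reload the profile so the variables take effect in the current shell:

      source ~/.bash_profile
      echo $JAVA_HOME # should print /home/hadoop-02/sdk-home/jdk1.8.0_144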
    6. Once the Java environment variables are set, run java -version to verify that the version is correct.
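      If everything is in place, the output should look similar to this (exact build numbers may differ):

      java version "1.8.0_144"
      Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
      Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)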
    7. Extract Hadoop:
      tar -zxvf hadoop-2.7.3.tar.gz
    8. Overview of the Hadoop directory structure:
      [hadoop-02@hadoop-02 ~]$ tree -L 3 /home/hadoop-02/sdk-home/hadoop-2.7.3/
      /home/hadoop-02/sdk-home/hadoop-2.7.3/
      |-- bin # executable commands
      |   |-- container-executor
      |   |-- hadoop
      |   |-- ...
      |   |-- yarn
      |   `-- yarn.cmd
      |-- etc
      |   `-- hadoop # configuration file directory
      |       |-- capacity-scheduler.xml
      |       |-- configuration.xsl
      |       |-- ...
      |       |-- yarn-env.sh
      |       `-- yarn-site.xml
      |-- include
      |   |-- hdfs.h
      |   |-- ...
      |   `-- TemplateFactory.hh
      |-- lib
      |   `-- native
      |       |-- libhadoop.a
      |       |-- libhadooppipes.a
      |       |-- ...
      |       `-- libhdfs.so.0.0.0
      |-- libexec
      |   |-- hadoop-config.cmd
      |   |-- hadoop-config.sh
      |   `-- ...
      |-- LICENSE.txt
      |-- logs
      |   |-- hadoop-hadoop-02-datanode-hadoop-02.log
      |   |-- hadoop-hadoop-02-datanode-hadoop-02.out
      |   `-- ...
      |-- NOTICE.txt
      |-- README.txt
      |-- sbin # start/stop scripts
      |   |-- distribute-exclude.sh
      |   |-- hadoop-daemon.sh
      |   |-- ...
      |   `-- yarn-daemons.sh
      `-- share
          |-- doc # documentation
          |   `-- hadoop
          `-- hadoop # all the jar packages
              |-- common
              |-- hdfs
              |-- httpfs
              |-- kms
              |-- mapreduce # contains the example jars
              |-- tools
              `-- yarn
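
      The examples jar used in step 13 ships under share/hadoop/mapreduce; it can be located like this:

      ls /home/hadoop-02/sdk-home/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar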
    9. Set Hadoop's environment variables:
      # /hadoop-2.7.3/etc/hadoop/hadoop-env.sh
      
      # Change JAVA_HOME to the actual JDK directory:
      # The java implementation to use.
      export JAVA_HOME=/home/hadoop-02/sdk-home/jdk1.8.0_144/
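
      So that the hadoop command can be run from any directory (step 13 relies on this), it is also convenient to add Hadoop to the PATH in ~/.bash_profile, mirroring the JDK entry from step 5:

      HADOOP_HOME=/home/hadoop-02/sdk-home/hadoop-2.7.3
      export HADOOP_HOME

      PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
      export PATH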
      

        

    10. At this point the local environment is ready. Go to Hadoop's sbin directory and run start-all.sh:
      [hadoop-02@hadoop-02 sbin]$ ./start-all.sh
      This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
      Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
      Starting namenodes on []
      hadoop-02@localhost's password:
      localhost: starting namenode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-namenode-hadoop-02.out
      hadoop-02@localhost's password:
      localhost: starting datanode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-datanode-hadoop-02.out
      Starting secondary namenodes [0.0.0.0]
      hadoop-02@0.0.0.0's password:
      0.0.0.0: starting secondarynamenode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-secondarynamenode-hadoop-02.out
      0.0.0.0: Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:472)
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:462)
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:455)
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:229)
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
      0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
      starting yarn daemons
      starting resourcemanager, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/yarn-hadoop-02-resourcemanager-hadoop-02.out
      hadoop-02@localhost's password:
      localhost: starting nodemanager, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/yarn-hadoop-02-nodemanager-hadoop-02.out
      [hadoop-02@hadoop-02 sbin]$
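
      The daemons can later be shut down with the matching stop script:

      ./stop-all.sh # likewise deprecated in favor of stop-dfs.sh and stop-yarn.sh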
      

        

    11. If passwordless SSH login has not been configured, you will be prompted for a password four times along the way; watch the log messages to see which service is being started each time. The SecondaryNameNode exception above is expected: fs.defaultFS still has its default value of file:///, i.e. HDFS is not configured, which is fine for local mode since MapReduce jobs run directly against the local filesystem. The password prompts can be avoided by setting up passwordless SSH, as sketched below.
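      A minimal passwordless-SSH setup for the current user, using standard OpenSSH tooling:

      ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa # generate a key pair with an empty passphrase
      ssh-copy-id hadoop-02@localhost # append the public key to ~/.ssh/authorized_keys
      ssh localhost true # verify: this should no longer prompt for a password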
    12. Check that the services are up. Because HDFS is not configured, only the YARN daemons (plus jps itself) appear here, which is enough for this local-mode example:
      [hadoop-02@hadoop-02 sbin]$ jps
      6305 Jps
      6178 NodeManager
      5883 ResourceManager
      [hadoop-02@hadoop-02 sbin]$
      

        

    13. Run the wordcount example:
      [hadoop-02@hadoop-02 sbin]$ hadoop jar /home/hadoop-02/sdk-home/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /home/hadoop-02/test_hadoop/wordcount.txt /home/hadoop-02/test_hadoop/wordcount_output/
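
      The input file must exist before the job is submitted; a minimal way to create one (the words below are arbitrary sample content):

      mkdir -p /home/hadoop-02/test_hadoop
      echo "hello hadoop hello world" > /home/hadoop-02/test_hadoop/wordcount.txt

      Note that the output directory (wordcount_output here) must not already exist, otherwise the job aborts with an output-directory-already-exists error.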
      

        

    14. The result files can be seen in the output directory:
      [hadoop-02@hadoop-02 sbin]$ cd /home/hadoop-02/test_hadoop/wordcount_output
      [hadoop-02@hadoop-02 wordcount_output]$ ls
      _SUCCESS  part-r-00000
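
      _SUCCESS is an empty marker file; the actual word counts are in part-r-00000. Assuming the sample input created in step 13, the output would look like this (one tab-separated word/count pair per line, sorted by word):

      [hadoop-02@hadoop-02 wordcount_output]$ cat part-r-00000
      hadoop	1
      hello	2
      world	1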
      

        

    15. This concludes the walkthrough of the local-mode environment setup.
  • Original article: https://www.cnblogs.com/sunlightlee/p/10235921.html