• Installing and deploying Hadoop 2.7.3 and Spark 2.0.0 on CentOS 7


    I. Preparation

    Download the installation packages:


    VMware Workstation Pro 12

    Three CentOS 7.1 minimal virtual machines

    The virtual machines use a NAT network.


    II. Create the hadoop user and group

    1. groupadd hadoop

    2. useradd -g hadoop hadoop

    3. Set a password for the hadoop user

        As root: passwd hadoop, then enter the new password

    III. Disable the firewall and SELinux

    1. yum install -y firewalld

    2. systemctl stop firewalld

    3. systemctl disable firewalld

    4. vi /etc/selinux/config

    Set SELINUX=disabled in that file.

    5. Once everything is configured, reboot the virtual machines (a quick check is sketched below)
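    After the reboot, a quick sanity check (just a sketch using standard CentOS 7 tools) confirms both are really off:

    getenforce                      # should print Disabled
    systemctl is-active firewalld   # should print inactive
    systemctl is-enabled firewalld  # should print disabled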

    IV. Passwordless SSH among the three virtual machines

    As the hadoop user:

    1. ssh-keygen -t rsa

    2. ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@<VM IP>

    3. ssh namenode / dnode1 / dnode2 to confirm each login works without a password (a combined sketch for all three hosts follows)
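    The same steps for all three hosts in one go (a sketch; it assumes namenode, dnode1 and dnode2 already resolve, e.g. via /etc/hosts):

    # as the hadoop user on each machine
    ssh-keygen -t rsa -N "" -f /home/hadoop/.ssh/id_rsa
    for host in namenode dnode1 dnode2; do
        ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@$host
    done
    ssh dnode1 hostname    # should run without a password prompt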

    V. Install Java

    1. Download the JDK 1.8 RPM from the Oracle site

    2.  rpm -ivh jdk1.8.rpm

    VI. Install Hadoop

    1. Download hadoop-2.7.3.tar.gz from the Apache site

    2. tar xzvf hadoop-2.7.3.tar.gz

    3. mv hadoop-2.7.3 /usr/local/

    4. chown -R hadoop:hadoop /usr/local/hadoop-2.7.3

    VII. Configure environment variables

    1. vim /home/hadoop/.bash_profile, append the lines below, then run source /home/hadoop/.bash_profile to apply them (a quick verification sketch follows the list).

    export JAVA_HOME=/usr/java/jdk1.8.0_101
    export JRE_HOME=$JAVA_HOME/jre
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export HADOOP_HOME=/usr/local/hadoop-2.7.3
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export SCALA_HOME=/usr/local/scala-2.11.8
    export SPARK_HOME=/usr/local/spark-2.0.0
    PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin
    export PATH
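    A quick check (just a sketch) that the new environment is picked up after sourcing the file:

    java -version            # should report 1.8.0_101
    hadoop version           # should report Hadoop 2.7.3
    echo $HADOOP_CONF_DIR    # should print /usr/local/hadoop-2.7.3/etc/hadoop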


    VIII. Create the Hadoop directories

    mkdir -p /home/hadoop/hd_space/tmp
    mkdir -p /home/hadoop/hd_space/hdfs/name
    mkdir -p /home/hadoop/hd_space/hdfs/data
    mkdir -p /home/hadoop/hd_space/mapred/local
    mkdir -p /home/hadoop/hd_space/mapred/system
    chown -R hadoop:hadoop /home/hadoop

    Note: at this point you can clone the virtual machine instead of installing CentOS on each node; set up the SSH connectivity again after cloning (a sketch of the hostname and hosts-file changes follows).
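    After cloning, each machine still needs its own hostname and a shared hosts file; a minimal sketch, where the IP addresses are placeholders for whatever the NAT network actually assigned:

    # as root, on each clone, with that machine's own name
    hostnamectl set-hostname dnode1

    # append to /etc/hosts on all three machines (example addresses only)
    192.168.8.10   namenode
    192.168.8.11   dnode1
    192.168.8.12   dnode2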

    IX. Configure Hadoop

    1. Configure core-site.xml

        vim /usr/local/hadoop-2.7.3/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://namenode:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/home/hadoop/hd_space/tmp</value>
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131702</value>
        </property>
    </configuration>


    2. Configure hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/hadoop/hd_space/hdfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/hadoop/hd_space/hdfs/data</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>dnode1:50090</value>
        </property>
        <property>
            <name>dfs.namenode.secondary.https-address</name>
            <value>dnode1:50091</value>
        </property>
    </configuration>


    3. Configure mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.cluster.local.dir</name>
            <value>/home/hadoop/hd_space/mapred/local</value>
        </property>
        <property>
            <name>mapreduce.cluster.system.dir</name>
            <value>/home/hadoop/hd_space/mapred/system</value>
        </property>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>namenode:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>namenode:19888</value>
        </property>
    </configuration>


    4. Configure yarn-site.xml

    <configuration>
        <!-- Site specific YARN configuration properties -->
        <property>
            <description>The hostname of the RM.</description>
            <name>yarn.resourcemanager.hostname</name>
            <value>namenode</value>
        </property>
        <property>
            <description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>


    5. Configure slaves with the two data nodes, dnode1 and dnode2:

    dnode1
    dnode2

    6. Copy the configuration directory to the other two data nodes

    for target in dnode1 dnode2; do
        scp -r /usr/local/hadoop-2.7.3/etc/hadoop $target:/usr/local/hadoop-2.7.3/etc
    done

    X. Start Hadoop

    1. Log in as the hadoop user

    2. Format HDFS

    hdfs namenode -format

    3. Start DFS

    start-dfs.sh

    4. Start YARN (the JobHistory server can optionally be started as well, see below)

    start-yarn.sh
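    Optionally, since mapred-site.xml points the JobHistory server at namenode, it can be started too (the script ships under $HADOOP_HOME/sbin, which is already on PATH):

    mr-jobhistory-daemon.sh start historyserver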

    XI. Troubleshooting

    1. An error that the JAVA_HOME environment variable cannot be found.

    Fix: set JAVA_HOME explicitly in /usr/local/hadoop-2.7.3/libexec/hadoop-config.sh (sketch below).
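    A minimal sketch of the fix, reusing the JDK path from the environment variables above (the same line can also be placed in etc/hadoop/hadoop-env.sh):

    # near the top of /usr/local/hadoop-2.7.3/libexec/hadoop-config.sh
    export JAVA_HOME=/usr/java/jdk1.8.0_101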


    2. Fixing the following Oracle database error

    TNS-12555: TNS:permission denied — the listener will not start. lsnrctl start (start) / status (check status) / stop (stop)

    Fix: chown -R hadoop:hadoop /var/tmp/.oracle

    chmod 777 /var/tmp/.oracle



    XII. Client verification


    1. Check the processes with jps
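    With the configuration above, the roles should be distributed roughly as follows (a sketch; NameNode and ResourceManager run on namenode, and the SecondaryNameNode was placed on dnode1):

    jps
    # expected on namenode: NameNode, ResourceManager, Jps
    # expected on dnode1:   DataNode, NodeManager, SecondaryNameNode, Jps
    # expected on dnode2:   DataNode, NodeManager, Jps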


    At this point, the Hadoop installation is complete.

    XIII. Test Hadoop (logged in as the hadoop user)

        Run the WordCount example that ships with Hadoop

    1. Create the input file

    mkdir -p -m 755 /home/hadoop/test

    cd /home/hadoop/test

    echo "My first hadoop example. Hello Hadoop in input. " > testfile.txt

    2. Create the HDFS directory

    hadoop fs -mkdir /test

    # hadoop fs -rmr /test   (deletes the directory)

    3. Upload the file

    hadoop fs -put testfile.txt /test

    4. Run the WordCount program

    hadoop jar /usr/local/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /test/testfile.txt /test/output

    5. View the results (expected output sketched below)

    hadoop fs -cat /test/output/part-r-00000
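    For the one-line input above, every token appears exactly once, so the output should look roughly like this (WordCount splits on whitespace and keeps punctuation attached):

    Hadoop    1
    Hello     1
    My        1
    example.  1
    first     1
    hadoop    1
    in        1
    input.    1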


    XIV. Install Spark 2.0.0

    1. Extract scala-2.11.8.tar.gz to /usr/local

    tar xzvf scala-2.11.8.tar.gz

    mv scala-2.11.8 /usr/local

    2. Extract spark-2.0.0.tgz to /usr/local

    tar xzvf spark-2.0.0.tgz

    mv spark-2.0.0 /usr/local

    3. Configure Spark

    cd /usr/local/spark-2.0.0/conf
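    A fresh Spark 2.0.0 distribution only ships *.template files in conf/, so (as a sketch) copy them before editing:

    cp spark-env.sh.template spark-env.sh
    cp slaves.template slaves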

    vim spark-env.sh

    export JAVA_HOME=/usr/java/jdk1.8.0_101
    export SCALA_HOME=/usr/local/scala-2.11.8
    export SPARK_HOME=/usr/local/spark-2.0.0
    export SPARK_MASTER_IP=namenode
    export SPARK_WORKER_MEMORY=1024m
    export master=spark://namenode:7077


    vim slaves

    dnode1
    dnode2

    Sync the binaries and configuration files to the other two nodes (as the hadoop user):

    for target in dnode1 dnode2; do
        scp -r /usr/local/scala-2.11.8 $target:/usr/local
        scp -r /usr/local/spark-2.0.0 $target:/usr/local
    done

    4. Start the Spark cluster

    cd $SPARK_HOME 

    # Start Master 

    ./sbin/start-master.sh  

    # Start Workers 

    ./sbin/start-slaves.sh

    5. Client verification
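    One way to verify from a client (a sketch: it assumes the standard master port 7077 and the examples jar that ships with the Spark 2.0.0 binary package, so adjust the jar name if your build differs). The master web UI should also be reachable at http://namenode:8080.

    spark-submit --master spark://namenode:7077 \
        --class org.apache.spark.examples.SparkPi \
        /usr/local/spark-2.0.0/examples/jars/spark-examples_2.11-2.0.0.jar 10
    # look for a line like "Pi is roughly 3.14..." in the output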


  • Original article: https://www.cnblogs.com/panliu/p/6093195.html