• Hadoop 2.7.5 Pseudo-Distributed Installation


    • Copy the installation package to the /soft directory


    • Extract the archive
    [hadoop@localhost soft]$ sudo tar -zxvf hadoop-2.7.5.tar.gz

    • Delete the archive
    [hadoop@localhost soft]$ sudo rm -rf hadoop-2.7.5.tar.gz
    

    • Create a symbolic link
    [hadoop@localhost soft]$ sudo ln -s hadoop-2.7.5 hadoop2.7
    [hadoop@localhost soft]$ ll
    total 8
    lrwxrwxrwx. 1 root  root   12 Dec 21 03:02 hadoop2.7 -> hadoop-2.7.5
    drwxr-xr-x. 9 20415  101 4096 Dec 15 20:12 hadoop-2.7.5
    lrwxrwxrwx. 1 root  root   11 Dec 21 02:43 jdk1.8 -> jdk1.8.0_66
    drwxr-xr-x. 8    10  143 4096 Oct  6  2015 jdk1.8.0_66
    [hadoop@localhost soft]$

    • Edit the environment variables
    [hadoop@localhost sbin]$ sudo nano /etc/profile
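    The exact lines depend on your layout; with the symlinks created above, the additions would look something like this (a sketch, adjust the paths if yours differ):
    export JAVA_HOME=/soft/jdk1.8
    export HADOOP_HOME=/soft/hadoop2.7
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin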
    



    • Apply the environment variables and check whether Hadoop installed successfully
    [hadoop@localhost sbin]$ source /etc/profile
    [hadoop@localhost sbin]$ hadoop version
    Hadoop 2.7.5
    Subversion https://shv@git-wip-us.apache.org/repos/asf/hadoop.git -r 18065c2b6806ed4aa6a3187d77cbe21bb3dba075
    Compiled by kshvachk on 2017-12-16T01:06Z
    Compiled with protoc 2.5.0
    From source with checksum 9f118f95f47043332d51891e37f736e9
    This command was run using /soft/hadoop-2.7.5/share/hadoop/common/hadoop-common-2.7.5.jar
    

    • Enter the etc/hadoop directory
    [hadoop@localhost hadoop2.7]$ cd etc/hadoop
    

    • Configure hadoop-env.sh (set up the Java environment)
    [hadoop@localhost hadoop]$ nano hadoop-env.sh
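    The essential change is pointing JAVA_HOME at a concrete JDK path instead of the shipped "export JAVA_HOME=${JAVA_HOME}" line, since that variable is not visible to daemons launched over SSH. With the symlink created earlier:
    export JAVA_HOME=/soft/jdk1.8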
    



    • Configure core-site.xml ===> specify the HDFS address (i.e., the namenode) and the directory where Hadoop stores files generated at runtime
    [hadoop@localhost hadoop]$ sudo nano core-site.xml
    [sudo] password for hadoop:
    

    Add the following:
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://hadoop:9000</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/soft/hadoop2.7/tmp</value>
    </property>
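    Since hadoop.tmp.dir points at /soft/hadoop2.7/tmp, that directory should exist and be writable by the hadoop user, or the daemons may hit permission errors (like the chown failure seen later). A quick sketch:
    [hadoop@localhost hadoop]$ sudo mkdir -p /soft/hadoop2.7/tmp
    [hadoop@localhost hadoop]$ sudo chown -R hadoop:hadoop /soft/hadoop2.7/tmp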

    • Configure hdfs-site.xml ====> specify the number of HDFS replicas
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.permissions</name>
      <value>false</value>
    </property>

    • Configure mapred-site.xml ===> specify that MapReduce runs on YARN
    [hadoop@localhost hadoop]$ sudo cp mapred-site.xml.template mapred-site.xml  # create mapred-site.xml from the template
    [hadoop@localhost hadoop]$ sudo nano mapred-site.xml
    [hadoop@localhost hadoop]$
    
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
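    <!-- Note: mapred.job.tracker is a Hadoop 1.x (JobTracker) property; with
         mapreduce.framework.name set to yarn it has no effect, so this block is optional. -->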
    <property>
      <name>mapred.job.tracker</name>
      <value>hadoop:9001</value>
    </property>
    
    • Configure yarn-site.xml ====> specify the YARN ResourceManager address and the way reducers fetch data
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>hadoop</value>
    </property>
    I added this property the first time, but startup then kept failing with java.net.BindException: Problem binding to [hadoop:8031] java.net.BindException: Cannot assign requested address; removing the property fixed it.

    • Start configuring SSH
    • Check whether the SSH-related packages are installed
    [hadoop@localhost hadoop]$ yum list installed|grep ssh
    libssh2.x86_64                         1.4.3-8.el7                     @anaconda
    openssh.x86_64                         6.4p1-8.el7                     @anaconda
    openssh-clients.x86_64                 6.4p1-8.el7                     @anaconda
    openssh-server.x86_64                  6.4p1-8.el7                     @anaconda
    [hadoop@localhost hadoop]$
    

    • Check whether the sshd process is running
    [hadoop@localhost hadoop]$ ps -Af |grep sshd
    root       2208      1  0 02:22 ?        00:00:00 /usr/sbin/sshd -D
    root      14509   2208  0 02:28 ?        00:00:00 sshd: hadoop [priv]
    hadoop    14574  14509  0 02:29 ?        00:00:00 sshd: hadoop@pts/1
    root      16620   2208  0 02:40 ?        00:00:03 sshd: root@notty
    root      45692   2208  0 03:31 ?        00:00:00 sshd: hadoop [priv]
    hadoop    45704  45692  0 03:31 ?        00:00:00 sshd: hadoop@pts/2
    hadoop    45985  16486  0 03:48 pts/1    00:00:00 grep --color=auto sshd
    [hadoop@localhost hadoop]$
    

    • Generate a public/private key pair on the client
    [hadoop@localhost hadoop]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa  # -f names the output file: the pair is written to ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub
    Generating public/private rsa key pair.
    Created directory '/home/hadoop/.ssh'.
    Your identification has been saved in /home/hadoop/.ssh/id_rsa.
    Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
    The key fingerprint is:
    36:db:ae:21:48:b6:e4:ed:5e:80:cf:1c:cc:47:db:d9 hadoop@localhost.localdomain
    The key's randomart image is:
    +--[ RSA 2048]----+
    |                 |
    |                 |
    |        .        |
    |     + . o o     |
    |    = = S o E    |
    |   = B = +       |
    |    + * + .      |
    |     . o o       |
    |     .o ...      |
    +-----------------+
    

    • View the key pair (public and private keys)
    [hadoop@localhost ~]$ cd .ssh
    [hadoop@localhost .ssh]$ ls
    id_rsa  id_rsa.pub
    • Append the public key to the server's authorized_keys
    [hadoop@localhost .ssh]$ cat id_rsa.pub>>authorized_keys
    

    • Change the permissions of authorized_keys to 644
    [hadoop@localhost .ssh]$ chmod 644 authorized_keys
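    If sshd still asks for a password afterwards, its StrictModes check may be rejecting overly loose directory permissions; tightening them is a harmless extra step:
    [hadoop@localhost .ssh]$ chmod 700 ~/.ssh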
    

    • Test whether SSH can log in without a password after the session exits
    [hadoop@localhost .ssh]$ ssh localhost
    The authenticity of host 'localhost (::1)' can't be established.
    ECDSA key fingerprint is a7:5b:2c:55:73:e9:9a:2e:8d:48:a5:8b:98:dd:f8:05.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
    Last login: Thu Dec 21 03:31:45 2017 from 192.168.109.1
    [hadoop@localhost ~]$ exit
    logout
    Connection to localhost closed.
    [hadoop@localhost .ssh]$ ssh localhost
    Last login: Thu Dec 21 04:01:32 2017 from localhost
    [hadoop@localhost ~]$
    • Hadoop fully distributed mode
    • Switching between local, pseudo-distributed, and fully distributed modes
    • Copy a configuration for local mode (local)
    [hadoop@localhost etc]$ sudo cp -r hadoop local
    • Clear out the configuration just added from core-site.xml, mapred-site.xml, ...
    [hadoop@localhost local]$ nano core-site.xml
    [hadoop@localhost local]$ nano mapred-site.xml
    [hadoop@localhost local]$ nano hdfs-site.xml
    [hadoop@localhost local]$ nano yarn-site.xml
    [hadoop@localhost local]$
    

    • Copy configurations for pseudo-distributed and fully distributed modes
    [hadoop@localhost etc]$ sudo cp -r hadoop pseudo
    [hadoop@localhost etc]$ ls
    hadoop  local  pseudo
    [hadoop@localhost etc]$ sudo cp -r hadoop full
    [hadoop@localhost etc]$ ls
    full  hadoop  local  pseudo
    [hadoop@localhost etc]$
    • Delete the hadoop folder and create a symbolic link to switch modes
    [hadoop@localhost etc]$ sudo rm -rf hadoop
    
    [hadoop@localhost etc]$ sudo ln -s pseudo hadoop
    [hadoop@localhost etc]$ ll
    total 12
    drwxr-xr-x. 2 root   root   4096 Dec 21 04:28 full
    lrwxrwxrwx. 1 root   root      6 Dec 21 04:30 hadoop -> pseudo
    drwxr-xr-x. 2 hadoop hadoop 4096 Dec 21 04:10 local
    drwxr-xr-x. 2 root   root   4096 Dec 21 04:28 pseudo
    [hadoop@localhost etc]$
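    Switching modes later is then just a matter of repointing the symlink, for example (a sketch, run from /soft/hadoop2.7/etc):
    [hadoop@localhost etc]$ sudo ln -sfn full hadoop      # switch to fully distributed
    [hadoop@localhost etc]$ sudo ln -sfn pseudo hadoop    # switch back to pseudo-distributed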

    • Now that we have switched to pseudo-distributed mode, the Hadoop processes need to be started; before starting them, the HDFS filesystem must be formatted
    [hadoop@localhost:/soft/hadoop2.7/bin]hadoop namenode -format
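    On Hadoop 2.x this form still works but prints a deprecation warning; the current equivalent is:
    [hadoop@localhost:/soft/hadoop2.7/bin]hdfs namenode -format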
    

    • Start all the processes. Here, running start-all.sh as the hadoop user kept failing; much of what I found said SSH was not running, but my SSH had started successfully, and to this day I don't know the cause.
    The errors were as follows:
    hadoop: ssh: connect to host hadoop port 22: Connection refused
    localhost: chown: changing ownership of ‘/soft/hadoop-2.7.5/logs’: Operation not permitted
    localhost: starting datanode, logging to /soft/hadoop-2.7.5/logs/hadoop-hadoop-datanode-localhost.localdomain.out
    Check the sshd processes:
    [hadoop@localhost:/home/hadoop]yum list installed|grep ssh
    libssh2.x86_64                         1.4.3-8.el7                     @anaconda
    openssh.x86_64                         6.4p1-8.el7                     @anaconda
    openssh-clients.x86_64                 6.4p1-8.el7                     @anaconda
    openssh-server.x86_64                  6.4p1-8.el7                     @anaconda
    [hadoop@localhost:/home/hadoop]ps -Af|grep sshd
    root       1712      1  0 06:51 ?        00:00:00 /usr/sbin/sshd -D  # sshd here runs as root; I suspect this is the problem and the hadoop user needs to be brought into sshd -D
    root      12975   1712  0 06:53 ?        00:00:00 sshd: hadoop [priv]
    hadoop    12981  12975  0 06:53 ?        00:00:00 sshd: hadoop@pts/1
    root      13876   1712  0 07:04 ?        00:00:00 sshd: hadoop [priv]
    hadoop    13879  13876  0 07:04 ?        00:00:00 sshd: hadoop@pts/2
    hadoop    18834  18788  0 07:37 pts/2    00:00:00 grep --color=auto sshd
    [hadoop@localhost:/home/hadoop]ssh localhost
    
    Later I switched to the root user to start it and it succeeded, but when starting as root, every node prompts for the root password, which is a hassle.
    [root@localhost:/soft/hadoop2.7/etc]start-all.sh
    
    [root@localhost:/soft/hadoop2.7/etc]jps
    18579 NodeManager
    18691 Jps
    18423 ResourceManager
    18249 SecondaryNameNode
    18043 DataNode

    
    Notice that NameNode is missing from the jps output above, so start the namenode process on its own:
    [root@localhost:/soft/hadoop2.7/sbin]hadoop-daemon.sh start namenode
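    Running jps again should now list NameNode alongside the other daemons; if the process exits immediately, the namenode log under /soft/hadoop-2.7.5/logs usually says why:
    [root@localhost:/soft/hadoop2.7/sbin]jps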
    

    • Inspect the HDFS filesystem
    Then this problem appeared again, a real headache:
    Call From localhost/127.0.0.1 to hadoop:9000 failed on connection exception: java.net.ConnectException: Connection refused;

    Finally, modifying core-site.xml fixed it. The change:
    [root@localhost:/soft/hadoop2.7/etc/hadoop]cat core-site.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
    
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value> <!-- changed "hadoop" to "localhost" -->
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/root/hadoop_tmp</value>
      <!-- recreated Hadoop's temp directory as a new folder under the home directory and
           used it as Hadoop's tmp dir; see my other post
           http://blog.csdn.net/xiaoqiu_cr/article/details/78868420 -- this is also related
           to the namenode always needing a manual start whenever I ran start-all.sh -->
    </property>
    
    </configuration>
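    One caveat worth flagging (based on how HDFS lays out its metadata): after changing hadoop.tmp.dir, the namenode's metadata still lives under the old directory, so the filesystem usually has to be re-formatted before the namenode will start cleanly:
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hdfs namenode -format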
    Honestly rather dispiriting; every step felt like a struggle. Now check the HDFS filesystem again:
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hadoop fs -ls 
    • Create directories
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hadoop fs -mkdir -p /user/centos/hadoop
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hadoop fs -lsr /
    lsr: DEPRECATED: Please use 'ls -R' instead.
    drwxr-xr-x   - root supergroup          0 2017-12-21 09:38 /user
    drwxr-xr-x   - root supergroup          0 2017-12-21 09:38 /user/centos
    drwxr-xr-x   - root supergroup          0 2017-12-21 09:38 /user/centos/hadoop
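    As a further sanity check, you can round-trip a file through HDFS (a sketch; /etc/hosts is just an arbitrary small local file):
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hadoop fs -put /etc/hosts /user/centos/hadoop/
    [root@localhost:/soft/hadoop2.7/etc/hadoop]hadoop fs -cat /user/centos/hadoop/hosts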
    

    • Access the web UI on port 50070
    Access failed yet again (truly ill-fated); it turned out to be the firewall.
    First, stop the firewall:
    [root@localhost:/soft/hadoop2.7/etc/hadoop]systemctl stop firewalld.service
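    Note that stop only lasts until the next reboot. To make it permanent you can disable the service, or (the narrower option, assuming CentOS 7's firewalld) open just the web UI port:
    [root@localhost:/soft/hadoop2.7/etc/hadoop]systemctl disable firewalld.service
    [root@localhost:/soft/hadoop2.7/etc/hadoop]firewall-cmd --permanent --add-port=50070/tcp
    [root@localhost:/soft/hadoop2.7/etc/hadoop]firewall-cmd --reload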

    • Visit http://192.168.109.130:50070 again
    Finally, the long-awaited page appeared.