• Data Warehouse: Hadoop (1)


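    1. Install Hadoop HDFS in pseudo-distributed mode
    2. Common hadoop fs commands
    3. Where to find the configuration files in the official docs
    4. Summary: JDK, SSH, and the hosts file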

    1. Install Hadoop HDFS in pseudo-distributed mode

    1.1 Create the user and directories

    [root@aliyun ~]# useradd hadoop
    [root@aliyun ~]# su - hadoop
    [hadoop@aliyun ~]$ mkdir app software sourcecode log tmp data lib
    [hadoop@aliyun ~]$ ll
    total 28
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 app    # extracted packages and their symlinks
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 data   # data
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 lib    # third-party jars
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 log    # log files
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 software # downloaded archives
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 sourcecode  # source code to compile
    drwxrwxr-x 2 hadoop hadoop 4096 Nov 28 11:26 tmp    # temporary files
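
The layout above can also be created from a script that is safe to run more than once. A minimal sketch, assuming a hypothetical helper named `make_layout` (not part of the original notes) that takes the base directory as its argument and falls back to `$HOME`:

```shell
# make_layout is a hypothetical helper for illustration only.
# It creates the seven working directories under the given base directory.
make_layout() {
    base_dir="${1:-$HOME}"
    for d in app software sourcecode log tmp data lib; do
        # -p makes the call repeatable: existing directories are left alone
        mkdir -p "$base_dir/$d" || return 1
    done
}
```

Called as the hadoop user with no argument, `make_layout` creates the directories in the home directory; a second call changes nothing.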

    1.2 Download or upload the archive

    [hadoop@aliyun ~]$ cd software/
    [hadoop@aliyun software]$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.16.2.tar.gz

    1.3 Extract

    [hadoop@aliyun software]$ tar -xzvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ../app/
    ...
    ...
    ...
    [hadoop@aliyun software]$ cd ../app/
    [hadoop@aliyun app]$ ln -s hadoop-2.6.0-cdh5.16.2/ hadoop
    [hadoop@aliyun app]$ ll
    total 4
    lrwxrwxrwx  1 hadoop hadoop   23 Nov 28 11:36 hadoop -> hadoop-2.6.0-cdh5.16.2/
    drwxr-xr-x 14 hadoop hadoop 4096 Jun  3 19:11 hadoop-2.6.0-cdh5.16.2
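
Pointing a version-neutral symlink at the versioned directory means paths such as HADOOP_HOME stay the same when the version changes. A minimal sketch of repointing the link, assuming a hypothetical helper `point_link` and a GNU or BSD `ln` (the `-n` flag is not in POSIX):

```shell
# point_link is a hypothetical helper for illustration only.
# Usage: point_link <target-dir> <link-path>
point_link() {
    target="$1"
    link="$2"
    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as the link itself, not as a place to create a new link in
    ln -sfn "$target" "$link"
}
```

From inside `~/app`, `point_link hadoop-2.6.0-cdh5.16.2 hadoop` reproduces the link shown above, and calling it later with a different directory switches versions in one step.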

    1.4 Environment requirements (JDK)

    [root@aliyun ~]# mkdir /usr/java
    [root@aliyun ~]# cd /usr/java
    [root@aliyun java]# rz -E    # upload jdk-8u144-linux-x64.tar.gz
    [root@aliyun java]# tar -xzvf jdk-8u144-linux-x64.tar.gz
    [root@aliyun java]# chown -R  root:root jdk1.8.0_144/
    [root@aliyun java]# ln -s jdk1.8.0_144/ jdk
    [root@aliyun java]# ll
    total 4
    lrwxrwxrwx 1 root root   13 Nov 28 12:01 jdk -> jdk1.8.0_144/
    drwxr-xr-x 8 root root 4096 Jul 22  2017 jdk1.8.0_144
    [root@aliyun java]# vim /etc/profile
        #env
        export JAVA_HOME=/usr/java/jdk
        export PATH=$JAVA_HOME/bin:$PATH
    [root@aliyun java]# source /etc/profile
    [root@aliyun java]# which java
    /usr/java/jdk/bin/java
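
`which java` only proves that some `java` is on the PATH. A minimal sketch of checking JAVA_HOME itself, assuming a hypothetical helper `check_java_home`:

```shell
# check_java_home is a hypothetical helper for illustration only.
# It succeeds when JAVA_HOME is set and contains an executable bin/java.
check_java_home() {
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no executable at $JAVA_HOME/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME ok: $JAVA_HOME"
}
```

Run it in a fresh login shell after editing /etc/profile, so the check sees what new sessions will see.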

    1.5 Set JAVA_HOME explicitly

    [hadoop@aliyun hadoop]$ vi hadoop-env.sh
    export JAVA_HOME=/usr/java/jdk
    [root@aliyun java]# cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    
    172.16.39.48 aliyun
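
The configuration files refer to the machine by the hostname aliyun, so that name needs an entry in /etc/hosts. A minimal sketch of checking for it, assuming a hypothetical helper `has_host_entry` that looks only at the name columns and ignores comments:

```shell
# has_host_entry is a hypothetical helper for illustration only.
# Usage: has_host_entry <hosts-file> <hostname>
has_host_entry() {
    hosts_file="$1"
    name="$2"
    awk -v name="$name" '
        /^[ \t]*#/ { next }                  # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {      # field 1 is the IP address
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$hosts_file"
}
```

For example, `has_host_entry /etc/hosts aliyun || echo "aliyun is missing from /etc/hosts"`.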

    1.6 Configuration files

    etc/hadoop/core-site.xml:
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://aliyun:9000</value>
        </property>
    </configuration>
    
    etc/hadoop/hdfs-site.xml:
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
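
After editing, it helps to read a value back from the file. A minimal sketch, assuming a hypothetical helper `get_property` and the layout used above, with `<name>` and `<value>` on separate lines; it is a line-based reader, not a real XML parser:

```shell
# get_property is a hypothetical helper for illustration only.
# Usage: get_property <site-xml-file> <property-name>
# Prints the <value> that follows the matching <name> line, or nothing if absent.
get_property() {
    conf_file="$1"
    prop="$2"
    awk -v prop="$prop" '
        index($0, "<name>" prop "</name>") { want = 1; next }
        want && match($0, /<value>[^<]*<\/value>/) {
            # strip the 7-character <value> and the 8-character </value>
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$conf_file"
}
```

Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more reliable check.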

    1.7 Passwordless SSH trust

    Run the following in the home directory:
      $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
      $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
      $ chmod 0600 ~/.ssh/authorized_keys
    [hadoop@aliyun ~]$ ssh aliyun date
    Thu Nov 28 12:15:08 CST 2019
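
The chmod step matters because sshd, with its default StrictModes setting, refuses key login when authorized_keys or ~/.ssh is writable by other users. A minimal sketch of checking the permission bits portably, assuming a hypothetical helper `has_mode`:

```shell
# has_mode is a hypothetical helper for illustration only.
# Usage: has_mode <path> <octal-mode>; succeeds when the permission bits match exactly.
has_mode() {
    # -prune keeps find from descending when <path> is a directory
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# Example check, kept as a comment because it depends on the local account:
#   has_mode ~/.ssh 700 && has_mode ~/.ssh/authorized_keys 600 && echo "permissions ok"
```

If `ssh aliyun date` still asks for a password, these two modes are a reasonable first thing to look at.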

    1.8 Hadoop environment variables

    [hadoop@aliyun ~]$ vi .bashrc
    export HADOOP_HOME=/home/hadoop/app/hadoop
    export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
    [hadoop@aliyun ~]$ source .bashrc 
    [hadoop@aliyun ~]$ which hadoop
    ~/app/hadoop/bin/hadoop
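
Editing .bashrc by hand works; when the setup is scripted, the lines should not be appended twice. A minimal sketch, assuming a hypothetical helper `append_once`:

```shell
# append_once is a hypothetical helper for illustration only.
# Usage: append_once <file> <line>; appends the line unless an identical line already exists.
append_once() {
    file="$1"
    line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the text literally, not as a regex
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, kept as comments; single quotes keep ${HADOOP_HOME} unexpanded in the file:
#   append_once ~/.bashrc 'export HADOOP_HOME=/home/hadoop/app/hadoop'
#   append_once ~/.bashrc 'export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH'
```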

    1.9 Format the NameNode

    [hadoop@aliyun ~]$ hdfs namenode -format
    has been successfully formatted.
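
Formatting is a one-time step: running it again on a NameNode that already holds metadata discards that metadata and assigns a new cluster ID. A minimal sketch of a guard, assuming a hypothetical helper `is_formatted` and the Hadoop 2.x default metadata directory under /tmp, which applies only while hadoop.tmp.dir and dfs.namenode.name.dir are left unset as in the configuration above:

```shell
# is_formatted is a hypothetical helper for illustration only.
# A formatted NameNode directory contains current/VERSION.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Assumed default location when no name directory is configured (Hadoop 2.x):
NAME_DIR="/tmp/hadoop-$(id -un)/dfs/name"

if is_formatted "$NAME_DIR"; then
    echo "already formatted: $NAME_DIR"
else
    echo "not formatted yet; run: hdfs namenode -format"
fi
```

Because the default directory sits under /tmp, it may be cleaned up by the operating system; pointing hadoop.tmp.dir somewhere persistent is a common follow-up.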

    1.10 First start

    [hadoop@aliyun ~]$ start-dfs.sh 
    [hadoop@aliyun ~]$ jps
    10804 SecondaryNameNode
    10536 NameNode
    10907 Jps
    10654 DataNode
    [hadoop@aliyun ~]$ 
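
The three HDFS daemons in the `jps` output above can be checked from a script instead of by eye. A minimal sketch, assuming a hypothetical helper `all_hdfs_daemons_up` that reads `jps` output on standard input:

```shell
# all_hdfs_daemons_up is a hypothetical helper for illustration only.
# Usage: jps | all_hdfs_daemons_up
all_hdfs_daemons_up() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode; do
        # compare the second column exactly, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }' || {
                echo "missing: $daemon" >&2
                return 1
            }
    done
}
```

When a daemon is missing, its log file under the Hadoop logs directory is the place to look next.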

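    Pitfall: the first start asks you to type yes to confirm the host trust relationship. The accepted host keys are stored in the known_hosts file under ~/.ssh: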

    [hadoop@aliyun .ssh]$ cat known_hosts
    aliyun,172.16.39.48 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
    localhost ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=
    0.0.0.0 ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBCjHBKn/7LF5sfbae1OLkK5QoWm11Xn8RZs1JTc7K8v4RFum1OKIjArocvRjLOYPsq5ezYo8TlBHTrAgeUcvkBM=

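    If a later start of Hadoop keeps asking for a password, a likely cause is that known_hosts still holds the old trust entry for the host while the key pair is new. Deleting the file, or the stale entries in it, resolves this.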
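
Removing only the stale entry is gentler than deleting the whole file. OpenSSH provides `ssh-keygen -R <hostname>` for this. The sketch below shows the same idea in plain awk, assuming a hypothetical helper `drop_known_host` and unhashed hostnames in the first column, as in the file shown above:

```shell
# drop_known_host is a hypothetical helper for illustration only.
# Usage: drop_known_host <known_hosts-file> <host>
# Prints the file without the lines whose comma-separated host list contains <host>.
drop_known_host() {
    awk -v host="$2" '
        {
            n = split($1, names, ",")
            for (i = 1; i <= n; i++) if (names[i] == host) next
            print
        }
    ' "$1"
}
```

The helper only prints the filtered content; redirect it to a temporary file and move that over known_hosts once the result looks right.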

    1.11 Make NN, DN and SNN all start under the hostname aliyun

      NN: controlled by fs.defaultFS in core-site.xml
      DN: the slaves file
      2NN: hdfs-site.xml

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>aliyun:50090</value>       <!-- mind the port: 50090 is the Hadoop 2.x default, 3.x uses 9868 -->
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>aliyun:50091</value>       <!-- mind the port: 50091 is the Hadoop 2.x default, 3.x uses 9869 -->
    </property>
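
The list above names the slaves file for the DataNode but does not show its contents. In Hadoop 2.x the file is etc/hadoop/slaves, holds one worker hostname per line, and ships with the single entry localhost. A minimal sketch of replacing that entry, assuming a hypothetical helper `write_slaves`:

```shell
# write_slaves is a hypothetical helper for illustration only.
# Usage: write_slaves <slaves-file> <host>...; writes one hostname per line, replacing the file.
write_slaves() {
    slaves_file="$1"
    shift
    [ "$#" -gt 0 ] || { echo "need at least one hostname" >&2; return 1; }
    printf '%s\n' "$@" > "$slaves_file"
}

# Example, kept as a comment because the path depends on the installation:
#   write_slaves "$HADOOP_HOME/etc/hadoop/slaves" aliyun
```

The change takes effect on the next `stop-dfs.sh` and `start-dfs.sh` cycle.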

    2. Common hadoop fs commands

    hadoop fs -mkdir /
    hadoop fs -put
    hadoop fs -get
    hadoop fs -cat
    hadoop fs -rm
    hadoop fs -ls
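
The commands above are listed without arguments. A minimal sketch of how they fit together in one round trip, assuming a hypothetical helper `hdfs_smoke_test`, a running HDFS, and an arbitrary HDFS path /tmp/smoke chosen for illustration:

```shell
# hdfs_smoke_test is a hypothetical helper for illustration only.
# /tmp/smoke is an arbitrary HDFS path chosen for this sketch.
hdfs_smoke_test() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop not found on PATH" >&2
        return 2
    fi
    local_file="$(mktemp)" || return 1
    echo "hello hdfs" > "$local_file"

    hadoop fs -mkdir -p /tmp/smoke &&                          # create the directory and any parents
    hadoop fs -put -f "$local_file" /tmp/smoke/hello.txt &&    # upload, overwriting if present
    hadoop fs -ls /tmp/smoke &&                                # list the directory
    hadoop fs -cat /tmp/smoke/hello.txt &&                     # print the file contents
    hadoop fs -get /tmp/smoke/hello.txt "$local_file.copy" &&  # download to the local disk
    hadoop fs -rm -r /tmp/smoke                                # remove the directory recursively
}
```

The chain stops at the first failing command, so the exit status shows whether the whole round trip worked.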

    3. Where to find the configuration files in the official docs

    https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation

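    4. Summary: JDK, SSH, and the hosts file

    The JDK and SSH are prerequisites for running Hadoop.

    The hosts file stores the mapping between hostnames and IP addresses.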

    Still learning. These posts are my own study notes and are updated and corrected continuously.
  • Original post: https://www.cnblogs.com/Tunan-Ki/p/11949217.html