• Hadoop Cluster - Local (Standalone) Mode

    1>. Consult the official documentation for your Apache Hadoop version
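To locate the matching documentation, one option is to build the URL from the release number. This is a minimal sketch: the helper name `hadoop_docs_url` is hypothetical, and the URL pattern is assumed to follow the layout used for release documentation on hadoop.apache.org.

```shell
# Hypothetical helper: print the single-node setup guide URL for a given Hadoop release.
# The URL layout is an assumption based on how hadoop.apache.org organizes release docs.
hadoop_docs_url() {
    printf 'https://hadoop.apache.org/docs/r%s/hadoop-project-dist/hadoop-common/SingleCluster.html\n' "$1"
}

# With Hadoop on the PATH, the first line of `hadoop version` reads "Hadoop <version>",
# so the release number can be taken from there instead of being typed by hand:
#   hadoop_docs_url "$(hadoop version | awk 'NR==1 {print $2}')"
hadoop_docs_url 2.10.0
```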








    [root@hadoop101.yinzhengjie.org.cn ~]# mkdir -pv bigdata/inputDir


    [root@hadoop101.yinzhengjie.org.cn ~]# cp /yinzhengjie/softwares/hadoop-2.10.0/etc/hadoop/*.xml bigdata/inputDir/

    4>. Run the Grep example from the official Apache Hadoop documentation

    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/
    total 0
    drwxr-xr-x 2 root root 187 Mar 10 22:24 inputDir
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.10.0.jar grep bigdata/inputDir bigdata/outputDir 'dfs[a-z.]+'
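MapReduce refuses to start a job whose output directory already exists, so running the command above a second time fails until `bigdata/outputDir` is removed. A minimal sketch of a guard for local mode follows; the helper name `clean_output_dir` is hypothetical, and it assumes the output path is on the local filesystem, as it is in this mode.

```shell
# Hypothetical helper: delete a stale local output directory so the job can be rerun.
# Does nothing if the directory is absent.
clean_output_dir() {
    if [ -d "$1" ]; then
        rm -rf -- "$1"
    fi
}

# Possible usage before rerunning the Grep job:
#   clean_output_dir bigdata/outputDir
#   hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.10.0.jar grep bigdata/inputDir bigdata/outputDir 'dfs[a-z.]+'
```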


    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/
    total 0
    drwxr-xr-x 2 root root 187 Mar 10 22:24 inputDir
    drwxr-xr-x 2 root root  88 Mar 10 22:35 outputDir
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/outputDir/
    total 4
    -rw-r--r-- 1 root root 11 Mar 10 22:35 part-r-00000
    -rw-r--r-- 1 root root  0 Mar 10 22:35 _SUCCESS
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# cat bigdata/outputDir/part-r-00000 
    1    dfsadmin
    [root@hadoop101.yinzhengjie.org.cn ~]# 
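To sanity-check a result like the one above without Hadoop, the same kind of count can be approximated with ordinary shell tools. This is only a sketch of the idea (extract each regex match, count identical matches, list the most frequent first), not the example's actual implementation; the sample file name and contents are made up for illustration.

```shell
# Build a small sample input; the file name and contents are invented for this sketch.
workdir="$(mktemp -d)"
printf '%s\n' 'run dfsadmin here' 'set dfs.replication' 'dfs.replication again' > "$workdir/sample.xml"

# -o prints only the matched text, -h omits file names, -E enables extended regex.
# Count identical matches, sort by count (highest first), and print "count<TAB>match".
grep_counts="$(grep -ohE 'dfs[a-z.]+' "$workdir"/* | sort | uniq -c | sort -rn | awk '{print $1 "\t" $2}')"
printf '%s\n' "$grep_counts"

rm -rf "$workdir"
```

To compare against the real job, point the `grep` command at `bigdata/inputDir/*` instead of the temporary directory.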



    [root@hadoop101.yinzhengjie.org.cn ~]# mkdir -v bigdata/wcinput
    mkdir: created directory ‘bigdata/wcinput’
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# vim bigdata/wcinput/hadoop.txt
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# cat bigdata/wcinput/hadoop.txt 
    The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
    [root@hadoop101.yinzhengjie.org.cn ~]#


    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/
    total 0
    drwxr-xr-x 2 root root 187 Mar 10 22:24 inputDir
    drwxr-xr-x 2 root root  88 Mar 10 22:35 outputDir
    drwxr-xr-x 2 root root  24 Mar 10 22:55 wcinput
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/wcinput/
    total 4
    -rw-r--r-- 1 root root 662 Mar 10 22:47 hadoop.txt
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.10.0.jar wordcount bigdata/wcinput bigdata/wcoutput


    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata
    total 0
    drwxr-xr-x 2 root root 187 Mar 10 22:24 inputDir
    drwxr-xr-x 2 root root  88 Mar 10 22:35 outputDir
    drwxr-xr-x 2 root root  24 Mar 10 22:55 wcinput
    drwxr-xr-x 2 root root  88 Mar 10 23:01 wcoutput
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# ll bigdata/wcoutput/
    total 4
    -rw-r--r-- 1 root root 708 Mar 10 23:01 part-r-00000
    -rw-r--r-- 1 root root   0 Mar 10 23:01 _SUCCESS
    [root@hadoop101.yinzhengjie.org.cn ~]# 
    [root@hadoop101.yinzhengjie.org.cn ~]# cat bigdata/wcoutput/part-r-00000 
    Apache    1
    Apache™    1
    Hadoop    1
    Hadoop®    1
    It    1
    Rather    1
    The    2
    a    3
    across    1
    allows    1
    and    2
    application    1
    at    1
    be    1
    cluster    1
    clusters    1
    computation    1
    computers    1
    computers,    1
    computing.    1
    data    1
    deliver    1
    delivering    1
    designed    2
    detect    1
    develops    1
    distributed    2
    each    2
    failures    1
    failures.    1
    for    2
    framework    1
    from    1
    handle    1
    hardware    1
    high-availability,    1
    highly-available    1
    is    3
    itself    1
    large    1
    layer,    1
    library    2
    local    1
    machines,    1
    may    1
    models.    1
    of    6
    offering    1
    on    2
    open-source    1
    processing    1
    programming    1
    project    1
    prone    1
    reliable,    1
    rely    1
    scalable,    1
    scale    1
    servers    1
    service    1
    sets    1
    simple    1
    single    1
    so    1
    software    2
    storage.    1
    than    1
    that    1
    the    3
    thousands    1
    to    5
    top    1
    up    1
    using    1
    which    1
    [root@hadoop101.yinzhengjie.org.cn ~]# 
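The output above suggests that the wordcount example splits on whitespace only (so `computers` and `computers,` are counted separately) and lists keys in byte order (uppercase before lowercase). Under those assumptions, a rough shell equivalent looks like the sketch below; the sample file is made up for illustration, and this is not how the Hadoop example is implemented.

```shell
# Build a small sample input; the file name and contents are invented for this sketch.
workdir="$(mktemp -d)"
printf '%s\n' 'The library is a framework.' 'The library itself is designed' > "$workdir/sample.txt"

# Put one whitespace-separated token per line, sort in byte order (LC_ALL=C),
# count identical tokens, and print "word<TAB>count" like the part-r-00000 file.
word_counts="$(cat "$workdir"/* | tr -s '[:space:]' '\n' | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}')"
printf '%s\n' "$word_counts"

rm -rf "$workdir"
```

Replacing the temporary directory with `bigdata/wcinput` gives a list that can be compared with `bigdata/wcoutput/part-r-00000`.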

  • Original article: https://www.cnblogs.com/yinzhengjie2020/p/12423980.html