• Compiling the Hadoop Eclipse plugin (Hadoop 1.0)


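    Original article. If you repost it, please credit the source: 工学1号馆

    You are welcome to follow my personal blog, www.wuyudong.com, for more articles on cloud computing and big data.

    Unlike version 0.20.2, Hadoop 1.0 does not ship with a ready-made eclipse-plugin package. Instead, the source of the Eclipse plugin is placed under HADOOP_HOME/src/contrib/eclipse-plugin. In this article I record in detail how I compiled that source into an Eclipse plugin that works with Hadoop 1.0.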

    1. Environment

    Operating system: Ubuntu 14.04
    Software:
    eclipse
    java
    Hadoop 1.0

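    2. Build steps

    (1) First, download the Ant and Ivy packages

    Unpack both packages into a directory of your choice, copy ivy-2.2.0.jar from the Ivy package into the lib directory of the Ant installation, and then add the following lines to /etc/profile to set up the environment: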

    export ANT_HOME=/home/wu/opt/apache-ant-1.8.3
    export PATH="$ANT_HOME/bin:$PATH"
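
    After reloading the profile, a quick sanity check can confirm that Ant and the Ivy jar are where they are expected. The sketch below is only an illustration: check_ant_ivy is a hypothetical helper (not part of Ant or Hadoop), and it assumes the Ivy jar was copied into the lib directory of the Ant installation as described above.

```shell
# check_ant_ivy is a hypothetical helper, not part of Ant or Hadoop.
# It takes the Ant installation directory (default: $ANT_HOME) and reports
# whether the ant launcher and an Ivy jar are in the expected places.
check_ant_ivy() {
    local ant_home="${1:-${ANT_HOME:-}}"
    if [ -z "$ant_home" ] || [ ! -d "$ant_home" ]; then
        echo "ANT_HOME is not set or is not a directory"
        return 1
    fi
    if [ ! -x "$ant_home/bin/ant" ]; then
        echo "missing executable: $ant_home/bin/ant"
        return 1
    fi
    # Any ivy-*.jar in lib is accepted; this article uses ivy-2.2.0.jar.
    if ! ls "$ant_home"/lib/ivy-*.jar >/dev/null 2>&1; then
        echo "no ivy-*.jar found in $ant_home/lib"
        return 1
    fi
    echo "ok"
}
```

    A typical use would be to run `. /etc/profile`, then `check_ant_ivy`, then `ant -version` to confirm that the shell picks up the intended Ant installation.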

    (2) In a terminal, change to the Hadoop installation directory and run ant compile. The output looks like this:

    ……………………

    compile:
    [echo] contrib: vaidya
    [javac] /home/wu/opt/hadoop-1.0.1/src/contrib/build-contrib.xml:185: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 14 source files to /home/wu/opt/hadoop-1.0.1/build/contrib/vaidya/classes
    [javac] Note: /home/wu/opt/hadoop-1.0.1/src/contrib/vaidya/src/java/org/apache/hadoop/vaidya/statistics/job/JobStatistics.java uses unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

    compile-ant-tasks:
    [javac] /home/wu/opt/hadoop-1.0.1/build.xml:2170: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 5 source files to /home/wu/opt/hadoop-1.0.1/build/ant

    compile:

    BUILD SUCCESSFUL

    Total time: 12 minutes 29 seconds

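    The build succeeded. It takes quite a while, so this is a good moment to make a pot of tea and take a break.

    (3) Next, change to HADOOP_HOME/src/contrib/eclipse-plugin in the terminal and run the following command: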

    ant -Declipse.home=/home/wu/opt/eclipse -Dversion=1.0.1 jar

    Once the build finishes, the Eclipse plugin jar has been generated.
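
    The article does not say where the jar ends up. Judging from the build/contrib/<name> paths in the ant compile output above, it is probably somewhere under HADOOP_HOME/build, but that is an assumption. The sketch below simply searches the build tree; find_plugin_jar is a hypothetical helper, and the file name pattern hadoop-eclipse-plugin-*.jar is also an assumption.

```shell
# find_plugin_jar is a hypothetical helper, not part of Hadoop.
# It searches the build directory of a Hadoop tree for the plugin jar.
# The name pattern hadoop-eclipse-plugin-*.jar is an assumption.
find_plugin_jar() {
    local hadoop_home="${1:-${HADOOP_HOME:-.}}"
    find "$hadoop_home/build" -type f -name 'hadoop-eclipse-plugin-*.jar' 2>/dev/null
}
```

    For the layout used in this article the call would be `find_plugin_jar /home/wu/opt/hadoop-1.0.1`. If nothing is printed, the jar name or location differs from what is assumed here.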

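    3. Installation steps

    (1) Configuring pseudo-distributed mode is also simple: only a few files need to be changed. They are all in the conf folder of the Hadoop directory. I won't go through the process in detail; here is my configuration: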

    core-site.xml

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/wu/hadoop-0.20.2/tmp</value>
      </property>
    </configuration>

    hdfs-site.xml

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

    mapred-site.xml

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>hdfs://localhost:9001</value>
      </property>
    </configuration>

    In the conf folder, edit hadoop-env.sh: uncomment the JAVA_HOME line and set it to the correct path of your JDK.
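
    To double-check the edit, the value that hadoop-env.sh actually exports can be read back. This is a minimal sketch: java_home_from_env is a hypothetical helper, and it assumes the setting is written as a plain, unquoted `export JAVA_HOME=...` line.

```shell
# java_home_from_env is a hypothetical helper, not part of Hadoop.
# It prints the value of the last uncommented "export JAVA_HOME=..." line
# in the given file. Commented lines (starting with #) are ignored.
# Assumes the value is not wrapped in quotes.
java_home_from_env() {
    local env_file="${1:-conf/hadoop-env.sh}"
    sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$env_file" | tail -n 1
}
```

    Run from the Hadoop directory, `[ -x "$(java_home_from_env)/bin/java" ] && echo "JAVA_HOME looks valid"` confirms that the configured path really contains a java binary.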

    (2) Run Hadoop

    Change to the Hadoop directory. Before the first run, the file system has to be formatted:

    bin/hadoop namenode -format

    Then start all the daemons:

    bin/start-all.sh

    To shut Hadoop down, use:

    bin/stop-all.sh

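    Finally, verify that Hadoop is installed correctly. Open a browser and visit:

    http://localhost:50030/ (MapReduce web UI)

    http://localhost:50070/ (HDFS web UI)

    Run jps to see which Java processes are running. If the list looks like the following, everything is fine: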

    wu@ubuntu:~/opt/hadoop-1.0.1$ jps
    4113 SecondaryNameNode
    4318 TaskTracker
    3984 DataNode
    3429 
    3803 NameNode
    4187 JobTracker
    4415 Jps
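
    The same check can be scripted. The sketch below reads jps output and prints any of the five daemons from the listing above that is missing; missing_daemons is a hypothetical helper, not a Hadoop command.

```shell
# missing_daemons is a hypothetical helper, not part of Hadoop.
# It reads jps output on stdin and prints the name of every expected
# Hadoop 1.0 daemon that does not appear in it.
missing_daemons() {
    local jps_output name
    jps_output="$(cat)"
    for name in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare against the whole second field so that
        # "SecondaryNameNode" does not count as "NameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

    Used as `jps | missing_daemons`, it prints nothing when all five daemons are up.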

    With the system up and running, let's run a program:

    $ mkdir input
    $ cd input
    $ echo "hello world" > test1.txt
    $ echo "hello hadoop" > test2.txt
    $ cd ..
    $ bin/hadoop dfs -put input in
    $ bin/hadoop jar hadoop-examples-1.0.1.jar wordcount in out
    $ bin/hadoop dfs -cat out/*

    This prints a long job log:

    ****hdfs://localhost:9000/user/wu/in
    15/05/29 10:51:41 INFO input.FileInputFormat: Total input paths to process : 2
    15/05/29 10:51:42 INFO mapred.JobClient: Running job: job_201505291029_0001
    15/05/29 10:51:43 INFO mapred.JobClient: map 0% reduce 0%
    15/05/29 10:52:13 INFO mapred.JobClient: map 100% reduce 0%
    15/05/29 10:52:34 INFO mapred.JobClient: map 100% reduce 100%
    15/05/29 10:52:39 INFO mapred.JobClient: Job complete: job_201505291029_0001
    15/05/29 10:52:39 INFO mapred.JobClient: Counters: 29
    15/05/29 10:52:39 INFO mapred.JobClient: Job Counters 
    15/05/29 10:52:39 INFO mapred.JobClient: Launched reduce tasks=1
    15/05/29 10:52:39 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=43724
    15/05/29 10:52:39 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
    15/05/29 10:52:39 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
    15/05/29 10:52:39 INFO mapred.JobClient: Launched map tasks=2
    15/05/29 10:52:39 INFO mapred.JobClient: Data-local map tasks=2
    15/05/29 10:52:39 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=20072
    15/05/29 10:52:39 INFO mapred.JobClient: File Output Format Counters 
    15/05/29 10:52:39 INFO mapred.JobClient: Bytes Written=25
    15/05/29 10:52:39 INFO mapred.JobClient: FileSystemCounters
    15/05/29 10:52:39 INFO mapred.JobClient: FILE_BYTES_READ=55
    15/05/29 10:52:39 INFO mapred.JobClient: HDFS_BYTES_READ=239
    15/05/29 10:52:39 INFO mapred.JobClient: FILE_BYTES_WRITTEN=64837
    15/05/29 10:52:39 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25
    15/05/29 10:52:39 INFO mapred.JobClient: File Input Format Counters 
    15/05/29 10:52:39 INFO mapred.JobClient: Bytes Read=25
    15/05/29 10:52:39 INFO mapred.JobClient: Map-Reduce Framework
    15/05/29 10:52:39 INFO mapred.JobClient: Map output materialized bytes=61
    15/05/29 10:52:39 INFO mapred.JobClient: Map input records=2
    15/05/29 10:52:39 INFO mapred.JobClient: Reduce shuffle bytes=61
    15/05/29 10:52:39 INFO mapred.JobClient: Spilled Records=8
    15/05/29 10:52:39 INFO mapred.JobClient: Map output bytes=41
    15/05/29 10:52:39 INFO mapred.JobClient: CPU time spent (ms)=7330
    15/05/29 10:52:39 INFO mapred.JobClient: Total committed heap usage (bytes)=247275520
    15/05/29 10:52:39 INFO mapred.JobClient: Combine input records=4
    15/05/29 10:52:39 INFO mapred.JobClient: SPLIT_RAW_BYTES=214
    15/05/29 10:52:39 INFO mapred.JobClient: Reduce input records=4
    15/05/29 10:52:39 INFO mapred.JobClient: Reduce input groups=3
    15/05/29 10:52:39 INFO mapred.JobClient: Combine output records=4
    15/05/29 10:52:39 INFO mapred.JobClient: Physical memory (bytes) snapshot=338845696
    15/05/29 10:52:39 INFO mapred.JobClient: Reduce output records=3
    15/05/29 10:52:39 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1139433472
    15/05/29 10:52:39 INFO mapred.JobClient: Map output records=4

    Check the contents of the out folder:

    wu@ubuntu:~/opt/hadoop-1.0.1$ bin/hadoop dfs -cat out/*

    hadoop 1
    hello 2
    world 1
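
    As a cross-check, the same counts can be computed locally from the two input files with standard Unix tools. This is only a sketch for comparison and not how Hadoop computes the result; local_wordcount is a hypothetical helper, and it separates word and count with a space to match the listing above.

```shell
# local_wordcount is a hypothetical helper, not part of Hadoop.
# It splits the given files on whitespace, counts each word, and prints
# "word count" lines sorted by word.
local_wordcount() {
    cat "$@" |
        tr -s '[:space:]' '\n' |        # one word per line
        grep -v '^$' |                  # drop empty lines
        LC_ALL=C sort |                 # group identical words together
        uniq -c |                       # prefix each word with its count
        awk '{ print $2, $1 }'          # reorder to "word count"
}
```

    Running `local_wordcount input/test1.txt input/test2.txt` on the files created earlier should give the same three lines as the job output.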

  • Original article: https://www.cnblogs.com/wuyudong/p/4539750.html