• Experiment: running a C++ program from Hadoop


    Suppose we have a C++ program boss.exe, invoked as follows (the first argument is the input file, the second is the output file):

    ./boss.exe ADDRESS_BOOK_FILE NEW_ADDRESS_BOOK_FILE

    Now boss.exe needs to be launched from Hadoop's Map function, with its input and output files both residing in HDFS, in one of these forms:

    hdfs://127.0.0.1:8020/user/donal/address1.txt or hdfs:///user/donal/address1.txt

    Approach for the Map function:

    1. Copy the input file from HDFS to the local file system:

    hadoop fs -copyToLocal /user/donal/address1.txt /tmp/address1.txt

    2. Run the C++ program:

    ./boss.exe /tmp/address1.txt /tmp/address2.txt

    3. Copy the result file back to HDFS:

    hadoop fs -copyFromLocal /tmp/address2.txt /user/donal/address2.txt

    The full code is as follows:

    Map.java

    import java.lang.Runtime;
    import java.util.Arrays;

    class Map {
        // Runs an external command, echoes its stdout/stderr, and returns its exit code.
        public static int RunProcess(String[] args) {
            int exitcode = -1;
            System.out.println(Arrays.toString(args));
            try {
                Runtime runtime = Runtime.getRuntime();
                final Process process = runtime.exec(args);
                // any error message?
                new StreamGobbler(process.getErrorStream(), "ERROR").start();
                // any output?
                new StreamGobbler(process.getInputStream(), "OUTPUT").start();
                process.getOutputStream().close();
                exitcode = process.waitFor();
            } catch (Throwable t) {
                t.printStackTrace();
            }
            return exitcode;
        }

        public static void main(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.println("Usage: Map ADDRESS_BOOK_FILE NEW_ADDRESS_BOOK_FILE");
                System.exit(-1);
            }
            String inFileName = args[0];
            String outFileName = args[1];
            // Default to the given paths; overridden below when the paths are HDFS URIs.
            String localInFileName = args[0];
            String localOutFileName = args[1];
            try {
                if (args[0].startsWith("hdfs://")) {
                    // Strip the hdfs:// scheme (and host:port, if present), keeping the absolute path.
                    inFileName = args[0].substring(args[0].indexOf('/', 7));
                    localInFileName = "/tmp/" + inFileName.substring(inFileName.lastIndexOf('/') + 1);
                    // copy the input file from HDFS
                    RunProcess(new String[]{"/bin/sh", "-c",
                            "/usr/lib/hadoop/bin/hadoop fs -copyToLocal " + inFileName + " " + localInFileName});
                }
                if (args[1].startsWith("hdfs://")) {
                    outFileName = args[1].substring(args[1].indexOf('/', 7));
                    localOutFileName = "/tmp/" + outFileName.substring(outFileName.lastIndexOf('/') + 1);
                }
                String[] commandArgs = {"./boss.exe", localInFileName, localOutFileName};
                int exitcode = RunProcess(commandArgs);
                if (args[1].startsWith("hdfs://")) {
                    // copy the result file to HDFS
                    RunProcess(new String[]{"/bin/sh", "-c",
                            "/usr/lib/hadoop/bin/hadoop fs -copyFromLocal " + localOutFileName + " " + outFileName});
                }
                System.out.println("finish:" + exitcode);
            } catch (Throwable t) {
                t.printStackTrace();
            }
        }
    }
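
    The Map class above is a standalone driver; to launch boss.exe from an actual Hadoop map task, a thin Mapper wrapper could delegate to it. The sketch below is hypothetical and not from the original post: the BossMapper class name and its assumed input format (one "<hdfs-input-path> <hdfs-output-path>" pair per line) are my own choices.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical wrapper (not in the original post): each input line is assumed to
    // contain "<hdfs-input-path> <hdfs-output-path>" for one run of boss.exe.
    public class BossMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] paths = value.toString().trim().split("\\s+");
            if (paths.length != 2) {
                return; // skip malformed lines
            }
            try {
                // Reuse the standalone driver above: copy in, run boss.exe, copy out.
                Map.main(new String[]{paths[0], paths[1]});
            } catch (Exception e) {
                throw new IOException("boss.exe run failed for " + paths[0], e);
            }
            context.write(new Text(paths[0]), new Text("done"));
        }
    }

    Note that boss.exe itself must also be present on every task node, for example by shipping it with the job via Hadoop's distributed cache.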

    StreamGobbler.java:

    import java.util.*;
    import java.io.*;

    // Drains one of the child process's streams on its own thread and echoes each line,
    // so the child cannot block on a full stdout/stderr pipe buffer.
    class StreamGobbler extends Thread {
        InputStream is;
        String type;

        StreamGobbler(InputStream is, String type) {
            this.is = is;
            this.type = type;
        }

        public void run() {
            try {
                InputStreamReader isr = new InputStreamReader(is);
                BufferedReader br = new BufferedReader(isr);
                String line = null;
                while ((line = br.readLine()) != null)
                    System.out.println(type + ">" + line);
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
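
    The two StreamGobbler threads started in RunProcess exist only to keep the child's stdout and stderr drained. On Java 7+, a ProcessBuilder with redirectErrorStream(true) can do the same with a single reader; the sketch below is an alternative, not part of the original code (the RunWithProcessBuilder class name is made up here):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    class RunWithProcessBuilder {
        // Runs a command, echoes its combined stdout/stderr, and returns the exit code.
        static int run(String... command) throws Exception {
            ProcessBuilder pb = new ProcessBuilder(command);
            pb.redirectErrorStream(true);      // merge stderr into stdout
            Process process = pb.start();
            process.getOutputStream().close(); // the child reads no stdin
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(process.getInputStream()))) {
                String line;
                while ((line = br.readLine()) != null) {
                    System.out.println("OUTPUT>" + line);
                }
            }
            return process.waitFor();
        }
    }

    With such a helper, RunProcess could be reduced to a single call like RunWithProcessBuilder.run("./boss.exe", localInFileName, localOutFileName).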


    Notes:

    1. Copying files to and from HDFS can also be done with the Hadoop Java API; this example shells out to the hadoop command instead (a sketch using the Java API follows these notes).

    2. StreamGobbler.java exists to print the child process's error stream and output stream.
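
    As note 1 says, the copies can go through the Hadoop Java API instead of a shell command. A minimal sketch using org.apache.hadoop.fs.FileSystem follows; the HdfsCopy class name and the assumption that fs.defaultFS is picked up from a core-site.xml on the classpath are mine, not the original author's.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch only: copies files between HDFS and the local file system via the Java API.
    // Assumes fs.defaultFS is read from the core-site.xml on the classpath.
    class HdfsCopy {
        static void copyToLocal(String hdfsPath, String localPath) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            fs.copyToLocalFile(new Path(hdfsPath), new Path(localPath));
        }

        static void copyFromLocal(String localPath, String hdfsPath) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            fs.copyFromLocalFile(new Path(localPath), new Path(hdfsPath));
        }
    }

    This avoids forking a shell and a second JVM (the hadoop fs command) for every copy.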



  • Original article: https://www.cnblogs.com/Donal/p/2387873.html