• MR并行算法编程过程中遇到问题的思考


    1. Reducer 类中 reduce函数外定义的变量是在Reducer机器上属于全局变量的,因此,一台机器上reduce函数均可以对该变量的值做出贡献。如代码:(sum和count数据Reducer机器上的全局变量)‘

    	public static class AvgCalReducer extends Reducer<EntityEntityWritable,FloatWritable,EntityEntityWritable,FloatWritable>
    	{
    		FloatWritable avg;
    		float sum=0;
    		int count=0;		
    		public void reduce(EntityEntityWritable key,Iterable<FloatWritable>values,Context context) throws IOException, InterruptedException
    		{
    
    			System.out.println("reducer starting:");
    			for (FloatWritable value:values)
    			{
    				sum=sum+value.get();
    				count++;
    				System.out.println(" key = "+key+" value = "+value.get());
    			}
    			System.out.println("average:"+sum/count);
    			System.out.println("this reducer ending...");
    			avg=new FloatWritable(sum/count);
    			context.write(key, avg);
    		}
    	}

    如果想使sum和count的值仅通过reduce函数进行改变,即只计算同一个key对应value的sum和count,则需要将sum和count放入reduce函数内,如下:

    	public static class AvgCalReducer extends Reducer<EntityEntityWritable,FloatWritable,EntityEntityWritable,FloatWritable>
    	{
    		FloatWritable avg;
    		
    		public void reduce(EntityEntityWritable key,Iterable<FloatWritable>values,Context context) throws IOException, InterruptedException
    		{
    			float sum=0;
    			int count=0;
    			System.out.println("reducer starting:");
    			for (FloatWritable value:values)
    			{
    				sum=sum+value.get();
    				count++;
    				System.out.println(" key = "+key+" value = "+value.get());
    			}
    			System.out.println("average:"+sum/count);
    			System.out.println("this reducer ending...");
    			avg=new FloatWritable(sum/count);
    			context.write(key, avg);
    		}
    	}

    2. 对于顺序组合式MapReduce作业:用两个job举例:

    		Configuration conf1=new Configuration();
    		Job job1=new Job(conf1,"Job1");
    		job1.waitForCompletion(true);
    
    		Configuration conf2=new Configuration();
    		Job job2=new Job(conf2,"Job2");
    		job2.waitForCompletion(true);

    注意我们之前经常写的System.exit(job.waitForCompletion(true)?0:1)在这里不可以使用,比如第一个job处的(job1.waitForCompletion(true)改成System.exit(job.waitForCompletion(true)?0:1),则系统成功完成job1后正常退出系统,没有机会再去运行job2了。


  • 相关阅读:
    问题 A: 【递归入门】全排列
    第一个struct2程序(2)
    第一个struct2程序
    Java学习 第二节
    重学Java
    Servlet过滤器
    struct2
    Java web struct入门基础知识
    one by one 项目 part 6
    软件工程导论 桩模块和驱动模块
  • 原文地址:https://www.cnblogs.com/eva_sj/p/3971164.html
Copyright © 2020-2023  润新知