• Finding Common Friends with MapReduce


    1. Test file (each line is a user, a colon, then that user's friends separated by commas)

    A:B,C,D,F,E,O
    B:A,C,E,K
    C:F,A,D,I
    D:A,E,F,L
    E:B,C,D,M,L
    F:A,B,C,D,E,O,M
    G:A,C,D,E,F
    H:A,C,D,E,O
    I:A,O
    J:B,O
    K:A,C,D
    L:D,E,F
    M:E,F,G
    O:A,H,I,J

    2. Approaches

    2-1. Approach 1:

    1. For each input line, emit each friend as the key and the user as the value
      {B,C,D,F,E,O}:A
      {A,C,E,K}:B
    
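    2. From the first line, B, C, D, F, E and O each have A as a friend (this treats friendship as mutual), so any two of them share A as a common friend

    3. Pair up A's friends two at a time (a nested loop, as in bubble sort) and emit each pair as the key with A as the value

    4. After the shuffle, all values emitted for the same pair, e.g. "BC", are grouped into one collection, which holds that pair's common friends

    5. Iterate over the collection and write out the pair together with its common friends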
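
The steps of Approach 1 can be sketched in plain Java, without Hadoop, to make the data flow visible. This is only an illustration under stated assumptions: the class name `PairwiseCommonFriends` and the helpers `mapLine` and `commonFriends` are made up for this sketch, an in-memory `TreeMap` stands in for the shuffle, and the small input is a hypothetical data set in which every friendship is mutual. The test file above is not fully mutual (A lists E, but E does not list A), so on that file this approach would not necessarily give the same result as intersecting two users' friend lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PairwiseCommonFriends {

    // "Map" step: for one line such as "A:B,C,D", emit every unordered pair
    // of the listed friends as the key, with the line's owner as the value.
    static void mapLine(String line, Map<String, List<String>> shuffled) {
        String[] parts = line.split(":", 2);
        String owner = parts[0].trim();
        String[] friends = parts[1].trim().split("\\s*,\\s*");
        Arrays.sort(friends); // sorted so that "B,C" and "C,B" become the same key
        for (int i = 0; i < friends.length; i++) {
            for (int j = i + 1; j < friends.length; j++) {
                String pair = friends[i] + "," + friends[j];
                // Grouping values under the same key stands in for the shuffle.
                shuffled.computeIfAbsent(pair, k -> new ArrayList<>()).add(owner);
            }
        }
    }

    // "Reduce" step is trivial here: each key already holds its common friends.
    static Map<String, List<String>> commonFriends(List<String> lines) {
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String line : lines) {
            mapLine(line, shuffled);
        }
        return shuffled;
    }

    public static void main(String[] args) {
        // Hypothetical input in which every friendship is mutual.
        List<String> lines = Arrays.asList("A:B,C,D", "B:A,C", "C:A,B,D", "D:A,C");
        commonFriends(lines).forEach((pair, friends) ->
                System.out.println(pair + "\t" + String.join(",", friends)));
    }
}
```

With this input the pair "A,C" ends up with the common friends B and D, which is also what intersecting A's list (B,C,D) with C's list (A,B,D) gives.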
    

    2-2. Approach 2:

    1. Emit the user as the key and the friend list as the value
    
      A:B,C,D,F,E,O     --A:B,C,D,F,E,O
      B:A,C,E,K     --B:A,C,E,K
      C:F,A,D,I     --C:A,D,F,I
      D:A,E,F,L     --D:A,E,F,L
      E:B,C,D,M,L       --E:B,C,D,L,M
    
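    2. Put all key/value pairs into a map

    3. Take the map's keys (all users) as an array

    4. Iterate over the array; for each user, e.g. "A", look up that user's friends in the map

    5. Iterate over the remaining users, e.g. "B", and look up their friends;

      any user that appears in the friend lists of both "A" and "B"

      is a common friend of "AB"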
    
      --A:{B,C,D,F,E,O}
      --B:{A,C,E,K}

      A's list contains C and E, and B's list also contains C and E, so C and E are the common friends of AB

    6. Emit "AB" as the key and the common friends as the value
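
The core of Approach 2 is a set intersection over every pair of users. The following is a minimal plain-Java sketch of that idea, independent of Hadoop; the class name `IntersectCommonFriends` and the methods `parse` and `commonFriends` are hypothetical names chosen for this illustration, and sorted sets are used only so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class IntersectCommonFriends {

    // Steps 1-2: parse lines like "A:B,C,E" into a map of user -> friend set.
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> friendsOf = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":", 2);
            String[] friends = parts[1].trim().split("\\s*,\\s*");
            friendsOf.put(parts[0].trim(), new TreeSet<>(Arrays.asList(friends)));
        }
        return friendsOf;
    }

    // Steps 3-6: for each unordered pair of users, intersect the two friend sets.
    static Map<String, Set<String>> commonFriends(Map<String, Set<String>> friendsOf) {
        List<String> users = new ArrayList<>(friendsOf.keySet());
        Map<String, Set<String>> result = new TreeMap<>();
        for (int i = 0; i < users.size(); i++) {
            for (int j = i + 1; j < users.size(); j++) {
                // Copy the first set so retainAll does not modify the map's data.
                Set<String> common = new TreeSet<>(friendsOf.get(users.get(i)));
                common.retainAll(friendsOf.get(users.get(j)));
                if (!common.isEmpty()) {
                    result.put(users.get(i) + "," + users.get(j), common);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The first two lines of the test file.
        List<String> lines = Arrays.asList("A:B,C,D,F,E,O", "B:A,C,E,K");
        System.out.println(commonFriends(parse(lines))); // prints {A,B=[C, E]}
    }
}
```

Comparing whole sets rather than searching for a name inside a comma-separated string avoids false matches once user names are longer than one character.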
    

    3. Code (Approach 2)

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.TreeMap;
    import java.util.TreeSet;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class Friends {

        // Map: split each line "user:friend1,friend2,..." into key = user, value = friend list
        public static class MRMapper extends Mapper<LongWritable, Text, Text, Text> {

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] parts = value.toString().split(":", 2);
                if (parts.length < 2) {
                    return; // skip blank or malformed lines
                }
                context.write(new Text(parts[0].trim()), new Text(parts[1].trim()));
            }
        }

        // Reduce: collect every user's friends, then compare all pairs of users in cleanup().
        // The default run() already calls reduce() per key and cleanup() at the end,
        // so it does not need to be overridden.
        public static class MRReducer extends Reducer<Text, Text, Text, Text> {

            // user -> friends; sorted collections keep the output order stable
            private final Map<String, Set<String>> friendsOf = new TreeMap<>();

            @Override
            protected void reduce(Text key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                Set<String> friends = new TreeSet<>();
                for (Text t : values) {
                    for (String f : t.toString().split(",")) {
                        friends.add(f.trim());
                    }
                }
                friendsOf.put(key.toString(), friends);
            }

            @Override
            protected void cleanup(Context context) throws IOException, InterruptedException {
                List<String> users = new ArrayList<>(friendsOf.keySet()); // all users

                for (int i = 0; i < users.size(); i++) { // start at 0 so the first user is not skipped
                    String a = users.get(i);

                    for (int j = i + 1; j < users.size(); j++) { // every user after a
                        String b = users.get(j);

                        // Common friends = intersection of the two friend sets
                        Set<String> common = new TreeSet<>(friendsOf.get(a));
                        common.retainAll(friendsOf.get(b));

                        if (!common.isEmpty()) {
                            // key = the pair "a,b", value = their common friends
                            context.write(new Text(a + "," + b), new Text(String.join(",", common)));
                        }
                    }
                }
            }
        }

        public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

            Configuration conf = new Configuration();

            Job job = Job.getInstance(conf);
            job.setJarByClass(Friends.class);

            job.setMapperClass(MRMapper.class);
            job.setReducerClass(MRReducer.class);
            // No combiner: MRReducer emits pair results, not the (user, friends) records it consumes.
            // A single reducer is required because cleanup() must see every user.
            job.setNumReduceTasks(1);

            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            FileInputFormat.setInputPaths(job, new Path("hdfs://hadoop5:9000/input/friends.txt"));
            FileOutputFormat.setOutputPath(job, new Path("hdfs://hadoop5:9000/output/friends"));

            // Exit with 0 on success and 1 on failure
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

    If you know a more concise approach, feel free to leave a comment for the author.

  • Original article: https://www.cnblogs.com/lyjing/p/7571002.html