LeetCode 721. Accounts Merge

721. Accounts Merge

Difficulty: Medium

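Given a list accounts, each element accounts[i] is a list of strings, where the first element accounts[i][0] is a name and the remaining elements are emails belonging to that account.

We now want to merge these accounts. Two accounts definitely belong to the same person if they share at least one email. Note that two accounts with the same name may still belong to different people, because people can share a name. A person can start with any number of accounts, but all of their accounts have the same name.

After merging, return the accounts in the following format: the first element of each account is the name, and the remaining elements are its emails sorted in ASCII order. The accounts themselves may be returned in any order.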

Example 1:

Input:
accounts = [["John", "johnsmith@mail.com", "john00@mail.com"], ["John", "johnnybravo@mail.com"], ["John", "johnsmith@mail.com", "john_newyork@mail.com"], ["Mary", "mary@mail.com"]]
Output:
[["John", "john00@mail.com", "john_newyork@mail.com", "johnsmith@mail.com"], ["John", "johnnybravo@mail.com"], ["Mary", "mary@mail.com"]]
Explanation:
The first and third John are the same person because they share the email "johnsmith@mail.com".
The second John and Mary are different people because none of their emails are used by any other account.
The lists may be returned in any order; for example, the answer [["Mary", "mary@mail.com"], ["John", "johnnybravo@mail.com"],
["John", "john00@mail.com", "john_newyork@mail.com", "johnsmith@mail.com"]] is also correct.
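
Because the accounts may come back in any order, checking a solution against the expected output needs an order-insensitive comparison. Below is a minimal sketch of one way to do that. The helper class `AccountsCompare` and its methods are made up for this illustration and are not part of the original post. It ignores the order of the accounts but still requires the emails inside each account to appear in the same order.

```java
import java.util.*;

// Hypothetical helper for checking answers; not part of the original solution.
class AccountsCompare {
    // Turns each account into one string key and sorts the keys,
    // so the order of the accounts no longer matters.
    static List<String> sortedKeys(List<List<String>> accounts) {
        List<String> keys = new ArrayList<>();
        for (List<String> account : accounts) {
            keys.add(String.join(",", account));
        }
        Collections.sort(keys);
        return keys;
    }

    // True when both results contain the same accounts, in any order.
    // The emails inside each account must already be in the same order.
    static boolean sameAccounts(List<List<String>> a, List<List<String>> b) {
        return sortedKeys(a).equals(sortedKeys(b));
    }
}
```

Joining with "," is safe here only under the assumption that names and emails contain no commas, which holds for the example data.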

Note:

The length of accounts is in the range [1, 1000].
The length of accounts[i] is in the range [1, 10].
The length of accounts[i][j] is in the range [1, 30].

Solution

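Approach: union-find (disjoint set). Treat every email as a node and use union-find to merge the emails that belong to the same person.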
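
To show the idea in isolation, here is a minimal sketch of a union-find keyed directly by email strings. This is not the code used in either version below; the class name `EmailUnionFind` is made up for this illustration, and it stores parents in a `HashMap` purely to keep the example short.

```java
import java.util.*;

// Illustrative union-find keyed by email string (an assumption of this sketch,
// not the structure used in the solutions below).
class EmailUnionFind {
    private final Map<String, String> parent = new HashMap<>();

    // Returns the representative email of the set containing `email`,
    // registering the email as its own set on first sight.
    String find(String email) {
        parent.putIfAbsent(email, email);
        String p = parent.get(email);
        if (p.equals(email)) {
            return email;
        }
        String root = find(p);
        parent.put(email, root);   // path compression
        return root;
    }

    // Merges the set containing b into the set containing a.
    void union(String a, String b) {
        String rootB = find(b);
        String rootA = find(a);
        parent.put(rootB, rootA);
    }
}
```

With Example 1, unioning "johnsmith@mail.com" with "john00@mail.com" and then with "john_newyork@mail.com" puts all three emails in one set, while "johnnybravo@mail.com" stays in a set of its own.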

Language: java

import java.util.*;

/**
 * Version 2
 */
class Solution {
    public List<List<String>> accountsMerge(List<List<String>> accounts) {
        List<String> info = new ArrayList<>();      // flatten every email in accounts into this list (2-D data becomes 1-D)
        Map<String, String> emailToName = new HashMap<>();
        for(List<String> account : accounts){
            for (int j = 1; j < account.size(); j++) {
                info.add(account.get(j));
                emailToName.put(account.get(j), account.get(0));
            }
        }
        UnionFind unionFind = new UnionFind(info.size());      // initialize the union-find
        for (int i = 0, cnt = 0, n = 0; i < accounts.size(); i++) {      // coarse partition: all emails under accounts[i] start in the same set, one set per account
            for (int j = 1; j < accounts.get(i).size(); j++) {
                unionFind.parent[cnt++] = n;      // point each email of this account at the account's first email; note that duplicate emails still exist at this stage
            }
            }
            n += accounts.get(i).size() - 1;
        }
        Map<String, Integer> emailToIdx = new HashMap<>();
        for(int i=0; i<info.size(); i++){
            if(emailToIdx.containsKey(info.get(i))){
                int idx = emailToIdx.get(info.get(i));      // a repeated email means the two sets belong to the same person, so merge them
                unionFind.merge(idx, i);
            }else{
                emailToIdx.put(info.get(i), i);
            }
        }
        List<List<String>> res = new ArrayList<>();
        Map<Integer, Integer> idxToIdx = new HashMap<>();
        Set<String> set = new HashSet<>();
        for(int i=0; i<info.size(); i++){
            if(set.contains(info.get(i))) continue;      // skip duplicate emails; emails in the same set go into one list, and all lists go into res
            set.add(info.get(i));
            int pidx = unionFind.find(i);
            int idx = idxToIdx.getOrDefault(pidx, res.size());
            if (idx == res.size()) {
                idxToIdx.put(pidx, idx);
                res.add(new ArrayList<>());
            }
            res.get(idx).add(info.get(i));
        }

        for (int i = 0; i < res.size(); i++) {
            Collections.sort(res.get(i));
            res.get(i).add(0, emailToName.get(res.get(i).get(0)));
        }
        return res;
    }

    class UnionFind{      // union-find (disjoint set) over email indices
        int[] parent;

        public UnionFind(int n) {
            this.parent = new int[n];
            for(int i=0; i<n; i++){
                this.parent[i] = i;
            }
        }

        public int find(int x){
            return x == parent[x] ? x : (parent[x] = find(parent[x]));
        }

        public void merge(int x, int y){
            parent[find(y)] = find(x);
        }
    }
}

Runtime: 37 ms, faster than 86.20% of Java submissions
Memory usage: 43.8 MB, less than 25.25% of Java submissions
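
To make the coarse partition step of version 2 concrete, here is a small sketch that reproduces only that step. The class `InitialPartitionDemo` and the method `initialParents` are names made up for this illustration. For Example 1 the flattened list is [johnsmith, john00, johnnybravo, johnsmith, john_newyork, mary] (domains omitted), and the parent array starts as [0, 0, 2, 3, 3, 5]. The duplicate check that follows then finds johnsmith again at index 3 and merges the set rooted at index 3 into the set rooted at index 0.

```java
import java.util.*;

// Illustrative only: rebuilds the initial parent array of version 2.
class InitialPartitionDemo {
    // Every email gets the index of the first email of its own account as parent.
    static int[] initialParents(List<List<String>> accounts) {
        int total = 0;
        for (List<String> account : accounts) {
            total += account.size() - 1;          // number of emails in this account
        }
        int[] parent = new int[total];
        int cnt = 0;                              // index of the current email in the flattened list
        int start = 0;                            // index of the first email of the current account
        for (List<String> account : accounts) {
            for (int j = 1; j < account.size(); j++) {
                parent[cnt++] = start;
            }
            start += account.size() - 1;
        }
        return parent;
    }
}
```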

import java.util.*;

/**
 * Version 1
 */
class Solution {
    private List<Account> fa = new ArrayList<>();      // union-find storage: each Account carries the index of its parent
    public List<List<String>> accountsMerge(List<List<String>> accounts) {
        for(int i=0, idx=0; i<accounts.size(); i++){
            for(int j=1; j<accounts.get(i).size(); j++){
                fa.add(new Account(accounts.get(i).get(0), accounts.get(i).get(j), idx));      // initialize: the parent is the first email of the same account
            }
            idx += accounts.get(i).size()-1;      // the emails under accounts[i] form one set
        }
        Map<Account, Integer> map = new HashMap<>();
        for(Account act : fa){
            if(map.containsKey(act)){
                merge(fa.get(map.get(act)), act);      // act was seen before, so it is a duplicate; merge the two sets, since all their emails belong to the same person
            }else{
                map.put(act, act.num);
            }
        }
        Map<Integer, Integer> idxMap = new HashMap<>();
        Set<Account> set = new HashSet<>();
        List<List<String>> res = new ArrayList<>();
        for(int i=0; i<fa.size(); i++){
            if(set.contains(fa.get(i))) continue;      // skip duplicates
            set.add(fa.get(i));
            int idx = idxMap.getOrDefault(find(fa.get(i)).num, res.size());
            if(idx == res.size()){
                idxMap.put(fa.get(i).num, idx);
                res.add(new ArrayList<>());
                res.get(idx).add(fa.get(i).name);
            }
            res.get(idx).add(fa.get(i).email);
        }
        for(int i=0; i<res.size(); i++){      // sort the emails of each account, keeping the name first
            String name = res.get(i).remove(0);
            Collections.sort(res.get(i));
            res.get(i).add(0, name);
        }
        return res;
    }
    private Account find(Account account){
        if(account == fa.get(account.num)){
            return account;
        }else{
            Account t = find(fa.get(account.num));
            account.num = t.num;
            return t;
        }
    }
    private void merge(Account a, Account b){
        find(b).num = find(a).num;
    }

    class Account{      // account record that also stores the index of its parent
        String name;
        String email;
        int num;      // position of the parent node in fa
        public Account(String name, String email, int num){
            this.name = name;
            this.email = email;
            this.num = num;
        }

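        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Account)) return false;
            Account account = (Account) obj;
            // compare the fields separately; comparing name+email as one string
            // could treat different (name, email) pairs as equal
            return this.name.equals(account.name) && this.email.equals(account.email);
        }

        @Override
        public int hashCode() {
            // the originally posted `name.hashCode()<<16 + email.hashCode()>>16` parses as
            // (name.hashCode() << (16 + email.hashCode())) >> 16, because + binds tighter than the shifts
            return 31 * this.name.hashCode() + this.email.hashCode();
        }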
    }
}

Runtime: 1662 ms, faster than 5.11% of Java submissions (measured on the code as originally submitted)
Memory usage: 46.5 MB, less than 11.53% of Java submissions
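
One detail of version 1 worth a closer look is the `hashCode` expression as originally posted, `name.hashCode()<<16 + email.hashCode()>>16`. In Java, `+` binds tighter than `<<` and `>>`, so the expression does not combine two shifted halves. The sketch below demonstrates the parsing; the class `ShiftPrecedenceDemo` is made up for this illustration, and `withParentheses` is only my guess at what was intended. Whether this contributed to the slow runtime was not measured.

```java
// Illustrative only: shows how Java parses the unparenthesized shift expression.
class ShiftPrecedenceDemo {
    // Same shape as the posted hashCode body, with a = name hash, b = email hash.
    static int asWritten(int a, int b) {
        return a << 16 + b >> 16;
    }

    // What the compiler actually evaluates: the sum is the shift distance.
    static int asParsed(int a, int b) {
        return (a << (16 + b)) >> 16;
    }

    // A guess at the intended form (assumption): shift each half, then add.
    static int withParentheses(int a, int b) {
        return (a << 16) + (b >> 16);
    }
}
```

The value is still a deterministic function of the two fields, so hash-based collections keep working with it; it just mixes the bits differently from the parenthesized form.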

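Note:

Same idea, different code, and the difference in performance is huge. Version 1 was my first attempt, but it was far too slow. After a quick look at the official solution, I saw that its union-find records only indices, and it clicked: the index does not have to be tied to the data. In version 2 the union-find holds nothing but a parent array that records each node's parent position, whereas in version 1 the parent pointer has to live in the same structure as the data (name, email), which is too restrictive. I am writing this down as something to watch for in future code.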
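
The point about keeping indices separate from data can be pushed one step further by giving each distinct email an integer id up front. The following is a sketch under my own assumptions, not the official solution's code and not either version above: the class `AccountsMergeSketch` and its method names are made up, `find` uses path halving instead of recursive path compression, and a `TreeSet` does the sorting. Like both versions above, it produces no output row for an account that has no emails.

```java
import java.util.*;

// Illustrative sketch: union-find over integer ids, one id per distinct email.
class AccountsMergeSketch {
    static List<List<String>> merge(List<List<String>> accounts) {
        Map<String, Integer> idOf = new HashMap<>();     // email -> integer id
        Map<String, String> nameOf = new HashMap<>();    // email -> account name
        for (List<String> account : accounts) {
            for (int j = 1; j < account.size(); j++) {
                idOf.putIfAbsent(account.get(j), idOf.size());
                nameOf.put(account.get(j), account.get(0));
            }
        }
        int[] parent = new int[idOf.size()];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        // Union every email of an account with that account's first email.
        for (List<String> account : accounts) {
            for (int j = 2; j < account.size(); j++) {
                int a = find(parent, idOf.get(account.get(1)));
                int b = find(parent, idOf.get(account.get(j)));
                parent[b] = a;
            }
        }
        // Group emails by root; TreeSet keeps each group sorted.
        Map<Integer, TreeSet<String>> groups = new HashMap<>();
        for (Map.Entry<String, Integer> e : idOf.entrySet()) {
            int root = find(parent, e.getValue());
            groups.computeIfAbsent(root, k -> new TreeSet<>()).add(e.getKey());
        }
        List<List<String>> res = new ArrayList<>();
        for (TreeSet<String> emails : groups.values()) {
            List<String> row = new ArrayList<>();
            row.add(nameOf.get(emails.first()));
            row.addAll(emails);
            res.add(row);
        }
        return res;
    }

    // Iterative find with path halving.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }
}
```

Here the data (name, email) lives in ordinary maps and the union-find sees only integers, which is the separation the note describes.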

Original post: https://www.cnblogs.com/liuyongyu/p/14294436.html