• [LintCode] String Homomorphism Review


    Given two strings s and t, determine if they are isomorphic.

    Two strings are isomorphic if the characters in s can be replaced to get t.

    All occurrences of a character must be replaced with another character while preserving the order of characters. No two characters may map to the same character but a character may map to itself.

    Example

    Given s = "egg", t = "add", return true.

    Given s = "foo", t = "bar", return false.

    Given s = "paper", t = "title", return true.
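    Before optimizing, the definition can be translated almost literally into code. The sketch below is not one of the solutions in this review; it keeps one map per direction so that "no two characters may map to the same character" is checked explicitly. The class name IsomorphicByDefinition and the map names forward and backward are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class IsomorphicByDefinition {
    static boolean isIsomorphic(String s, String t) {
        if (s == null || t == null || s.length() != t.length()) {
            return false;
        }
        // forward: character of s -> character of t; backward: the reverse direction.
        Map<Character, Character> forward = new HashMap<>();
        Map<Character, Character> backward = new HashMap<>();
        for (int i = 0; i < s.length(); i++) {
            char a = s.charAt(i);
            char b = t.charAt(i);
            // putIfAbsent returns the existing value, or null if the key was new.
            Character mappedTo = forward.putIfAbsent(a, b);
            Character mappedFrom = backward.putIfAbsent(b, a);
            if (mappedTo != null && mappedTo != b) {
                return false; // a was already replaced by a different character
            }
            if (mappedFrom != null && mappedFrom != a) {
                return false; // b is already the replacement of a different character
            }
        }
        return true;
    }
}
```

    On the three examples above, this returns true for "egg"/"add", false for "foo"/"bar", and true for "paper"/"title".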

    Solution 1. 

    Based on the definition above, s and t must have the same structure, i.e., repeated characters must appear at the same indices in both strings. If s has the same character at indices 2 and 3, then t must also have the same character at indices 2 and 3.

    From the above analysis, we can derive the following steps.

    1. Use two hash maps to store each character's first appearance index in s and t.

    2. Construct an index array for each string using the hash maps. For example, given the string "title", we get the structure [0 1 0 3 4]. Each number in this structure is the first appearance index of "title".charAt(i) in "title".

    3. Compare the two structures for s and t.
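    To make the "structure" concrete, here is a small sketch that computes it for a single string. It folds steps 1 and 2 into one pass, which is a simplification of the steps as listed; the class name FirstIndexStructure and the method structureOf are illustrative names only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class FirstIndexStructure {
    // Returns, for every position i, the index at which str.charAt(i) first appears.
    static int[] structureOf(String str) {
        Map<Character, Integer> first = new HashMap<>();
        int[] structure = new int[str.length()];
        for (int i = 0; i < str.length(); i++) {
            first.putIfAbsent(str.charAt(i), i); // only the first appearance is recorded
            structure[i] = first.get(str.charAt(i));
        }
        return structure;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(structureOf("title"))); // [0, 1, 0, 3, 4]
        System.out.println(Arrays.toString(structureOf("paper"))); // [0, 1, 0, 3, 4]
    }
}
```

    "paper" and "title" produce the same structure, which is why they are isomorphic.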

    Time/Space efficiency

    1. Run time is O(n). This is the BCR (best conceivable runtime), as we have to scan both strings at least once.

    2. Space is O(n): two hash maps and two index arrays, each with up to n entries, so O(4 * n) to be exact.

    Q: We can't do better in runtime since we've already got the BCR. But can we do better with space efficiency?

    A:  We sure can as shown in solution 2.

    import java.util.HashMap;

    public class Solution {
        public boolean isIsomorphic(String s, String t) {
            if(s == null || t == null){
                return false;
            }
            if(s.length() != t.length()){
                return false;
            }
            int n = s.length();
            // First appearance index of each character in s and in t.
            HashMap<Character, Integer> map1 = new HashMap<Character, Integer>();
            HashMap<Character, Integer> map2 = new HashMap<Character, Integer>();
            int[] index1 = new int[n];
            int[] index2 = new int[n];
            for(int i = 0; i < n; i++){
                if(!map1.containsKey(s.charAt(i))){
                    map1.put(s.charAt(i), i);
                }
                if(!map2.containsKey(t.charAt(i))){
                    map2.put(t.charAt(i), i);
                }
            }
            // Build the index structure of each string.
            for(int i = 0; i < n; i++){
                index1[i] = map1.get(s.charAt(i));
                index2[i] = map2.get(t.charAt(i));
            }
            // The strings are isomorphic only if the two structures match.
            for(int i = 0; i < n; i++){
                if(index1[i] != index2[i]){
                    return false;
                }
            }
            return true;
        }
    }

    Solution 2. Optimization on space efficiency 

    Assume the input strings contain only ASCII characters, of which there are 128 in total. Then we can use two arrays of size 128 to store the mapping information as we scan through s and t. O(2 * 128) is O(1), as it is a constant that does not grow with the input size.

    1. Initialize both map arrays to Integer.MAX_VALUE, indicating that no character has a mapping yet.

    2. Iterate through s and t.

    If neither s.charAt(i) nor t.charAt(i) has a mapping, establish one: m1[s.charAt(i)] = t.charAt(i).

    A value of Integer.MIN_VALUE for m2[t.charAt(i)] marks that t.charAt(i) is already the target of a mapping from some character in s.

    If s.charAt(i) has no mapping but t.charAt(i) is already mapped to, return false.

    If s.charAt(i) has a mapping but it is not to t.charAt(i), return false, regardless of whether t.charAt(i) is already mapped to.

    public class Solution {
        // Assumes s and t are non-null, of equal length, and ASCII-only.
        public boolean isIsomorphic(String s, String t) {
            int[] m1 = new int[128];
            int[] m2 = new int[128];
            for(int i = 0; i < 128; i++){
                m1[i] = Integer.MAX_VALUE;
                m2[i] = Integer.MAX_VALUE;
            }
            for (int i = 0; i < s.length(); ++i) {
                int cs = (int) s.charAt(i);
                int ts = (int) t.charAt(i);
                if(m1[cs] == Integer.MAX_VALUE){
                    //neither s.charAt(i) nor t.charAt(i) has a mapping
                    if(m2[ts] == Integer.MAX_VALUE){
                        m1[cs] = ts;
                        m2[ts] = Integer.MIN_VALUE;
                    }
                    //s.charAt(i) has no mapping but t.charAt(i) is already
                    //the target of some other character
                    else{
                        return false;
                    }
                }
                //s.charAt(i) already has a mapping, so it must map to
                //t.charAt(i)
                else if(m1[cs] != ts){
                    return false;
                }
            }
            return true;
        }
    }
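    The second array in Solution 2 only ever answers a yes/no question: has this character of t already been used as a target? One possible variation, sketched here as an alternative rather than as the original approach, replaces the Integer.MIN_VALUE sentinel with a boolean array. The names BooleanTargetIsomorphic, target and taken are illustrative, and the added null and length checks are an assumption about the desired behavior.

```java
import java.util.Arrays;

class BooleanTargetIsomorphic {
    static boolean isIsomorphic(String s, String t) {
        if (s == null || t == null || s.length() != t.length()) {
            return false;
        }
        int[] target = new int[128];         // target[c] = character of t that c maps to, -1 if none
        Arrays.fill(target, -1);
        boolean[] taken = new boolean[128];  // taken[c] = true once c is the target of some mapping
        for (int i = 0; i < s.length(); i++) {
            int cs = s.charAt(i);
            int ct = t.charAt(i);
            if (target[cs] == -1) {
                if (taken[ct]) {
                    return false; // ct already belongs to another character of s
                }
                target[cs] = ct;
                taken[ct] = true;
            } else if (target[cs] != ct) {
                return false;     // cs is already mapped to a different character
            }
        }
        return true;
    }
}
```

    The space is still two arrays of 128 entries, so the O(1) bound is unchanged; only the meaning of each entry is easier to read.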

    Solution 3. Instead of mapping one character to another, map both characters to the same integer.

    It seems straightforward to use the characters' indices as that integer, as shown below.

    class Solution {
        public boolean isIsomorphic(String s, String t) {
            int[] m1 = new int[128], m2 = new int[128];
            for(int i = 0; i < s.length(); i++) {
                if(m1[s.charAt(i) - '\0'] != m2[t.charAt(i) - '\0']) {
                    return false;
                }
                m1[s.charAt(i) - '\0'] = i;
                m2[t.charAt(i) - '\0'] = i;
            }
            return true;
        }
    }

    But this does not work for s = "aa", t = "ab": it returns true although the answer is false. The reason is that the mapping arrays are initialized to all 0s by default. In this counterexample, the first mapping a -> a uses index 0, which makes 0 ambiguous: it can mean either that there is no mapping yet or that a mapping with value 0 exists. We need to either initialize the maps to all -1 or use integers greater than 0 to make the distinction.
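    The failure is easy to reproduce. The harness below repeats the flawed logic under a hypothetical class name, BuggyIndexMapping, purely to show the wrong answer; it is not meant to be used as a solution.

```java
class BuggyIndexMapping {
    // Same logic as the flawed version: 0 means both "unmapped" and "mapped at index 0".
    static boolean isIsomorphic(String s, String t) {
        int[] m1 = new int[128], m2 = new int[128];
        for (int i = 0; i < s.length(); i++) {
            if (m1[s.charAt(i)] != m2[t.charAt(i)]) {
                return false;
            }
            m1[s.charAt(i)] = i;
            m2[t.charAt(i)] = i;
        }
        return true;
    }

    public static void main(String[] args) {
        // At i = 1, m1['a'] is 0 (mapped at index 0) and m2['b'] is 0 (never mapped),
        // so the mismatch goes unnoticed.
        System.out.println(isIsomorphic("aa", "ab")); // true, but the correct answer is false
    }
}
```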

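    Correct implementations. The first initializes the maps to -1; the second stores i + 1 so that 0 still means unmapped.

    import java.util.Arrays;

    class Solution {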
        public boolean isIsomorphic(String s, String t) {
            int[] m1 = new int[128], m2 = new int[128];
            Arrays.fill(m1, -1); Arrays.fill(m2, -1);
            for(int i = 0; i < s.length(); i++) {
                if(m1[s.charAt(i) - '\0'] != m2[t.charAt(i) - '\0']) {
                    return false;
                }
                else if(m1[s.charAt(i) - '\0'] < 0) {
                    m1[s.charAt(i) - '\0'] = i;
                    m2[t.charAt(i) - '\0'] = i;                
                }
            }
            return true;
        }
    }

    class Solution {
        public boolean isIsomorphic(String s, String t) {
            int[] m1 = new int[128], m2 = new int[128];
            for(int i = 0; i < s.length(); i++) {
                if(m1[s.charAt(i) - '\0'] != m2[t.charAt(i) - '\0']) {
                    return false;
                }
                else if(m1[s.charAt(i) - '\0'] == 0) {
                    m1[s.charAt(i) - '\0'] = i + 1;
                    m2[t.charAt(i) - '\0'] = i + 1;                
                }
            }
            return true;
        }
    }

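    The actual integer value doesn't matter as long as it differs from the unmapped value and no two character pairs share the same value. The following code draws the values at random, so it still yields the correct result unless two different pairs happen to draw the same value (for example, s = "aba", t = "cdd" would then wrongly return true).

    import java.util.Random;

    class Solution {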
        public boolean isIsomorphic(String s, String t) {
            int[] m1 = new int[128], m2 = new int[128];
            Random rand = new Random();
            for(int i = 0; i < s.length(); i++) {
                if(m1[s.charAt(i) - '\0'] != m2[t.charAt(i) - '\0']) {
                    return false;
                }
                else if(m1[s.charAt(i) - '\0'] == 0) {
                    int idx = 1 + rand.nextInt(1005);
                    m1[s.charAt(i) - '\0'] = idx;
                    m2[t.charAt(i) - '\0'] = idx;                 
                }  
            }
            return true;
        }
    }
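    One caveat with random values is that two different character pairs can draw the same number, in which case a mismatch such as s = "aba", t = "cdd" can slip through. A sketch that avoids this by handing out values from a counter is shown below; the class name CounterIsomorphic and the variable next are illustrative, and the null and length checks are added assumptions.

```java
class CounterIsomorphic {
    static boolean isIsomorphic(String s, String t) {
        if (s == null || t == null || s.length() != t.length()) {
            return false;
        }
        int[] m1 = new int[128], m2 = new int[128];
        int next = 1; // 0 is reserved for "unmapped", so labels start at 1
        for (int i = 0; i < s.length(); i++) {
            char cs = s.charAt(i);
            char ct = t.charAt(i);
            if (m1[cs] != m2[ct]) {
                return false;
            }
            if (m1[cs] == 0) {
                // Both characters are unmapped: give this pair a fresh, unique label.
                m1[cs] = next;
                m2[ct] = next;
                next++;
            }
        }
        return true;
    }
}
```

    Because every new pair gets its own label, equal labels always mean the same pair, and the result no longer depends on chance.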

    Key Notes

    For string problems, always ask what the possible character set of the input is. You may be able to optimize the space usage if there are only ASCII characters.
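    If the character set is not guaranteed, one option is to check the assumption before using the fixed-size arrays, and to fall back to the hash map approach otherwise. A minimal sketch of such a check follows; the class name CharsetCheck and the method isAsciiOnly are hypothetical helpers, not part of the solutions above.

```java
class CharsetCheck {
    // Returns true if every character of str fits in a table of size 128.
    static boolean isAsciiOnly(String str) {
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) >= 128) {
                return false; // this character would index past the end of an int[128]
            }
        }
        return true;
    }
}
```

    The check costs one extra O(n) pass and no extra space, so it does not change the overall complexity.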

    Related Problems

    Anagrams

  • Original article: https://www.cnblogs.com/lz87/p/6943163.html