• [LintCode] Substring Anagrams


    Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

    Both strings consist of lowercase English letters only, and neither s nor p is longer than 40,000 characters.

    The order of output does not matter.

    Example

    Given s = "cbaebabacd", p = "abc"

    return [0, 6]

    The substring with start index = 0 is "cba", which is an anagram of "abc".
    The substring with start index = 6 is "bac", which is an anagram of "abc".
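
    To make the definition concrete, here is a minimal brute-force sketch. It is not one of the solutions below: it simply sorts every window of length |p| and compares it with the sorted p, which is far too slow for the stated limits but easy to verify by hand. The class name BruteForceAnagrams and its method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BruteForceAnagrams {
    // Two strings are anagrams when their sorted characters are identical.
    static boolean isAnagram(String a, String b) {
        char[] x = a.toCharArray();
        char[] y = b.toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    // Checks every window of length p.length(); O(n * m log m), illustration only.
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i + p.length() <= s.length(); i++) {
            if (isAnagram(s.substring(i, i + p.length()), p)) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(findAnagrams("cbaebabacd", "abc")); // prints [0, 6]
    }
}
```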




    Solution 1.
    1. Store the frequency of each character of p in an array pMap.
    2. As in other sliding-window problems, scan the characters of s one by one;
    add each character to a queue and update its frequency in an array sMap.
    3. When the queue's size equals p's length, compare pMap with sMap
    entry by entry. If they are identical, the current window is an anagram
    of p and its start index (current index - p's length + 1) is part of the answer.
    4. Remove the head character from the queue, decrease its frequency in sMap,
    then repeat the above steps until all characters of s have been checked.

    Runtime: O(n) * O(26) = O(n), where n is the length of s. Iterating through
    both s and p takes O(n) (p is never longer than s once the length check has
    passed), and each anagram check of a full window takes O(26).
    Space: O(26) * 2 + O(m) = O(m), where m is the length of p.

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.Queue;

    public class Solution {
        public List<Integer> findAnagrams(String s, String p) {
            List<Integer> indices = new ArrayList<Integer>();
            if (s == null || p == null || s.length() < p.length()) {
                return indices;
            }
            int[] sMap = new int[26]; // letter frequencies of the current window of s
            int[] pMap = new int[26]; // letter frequencies of p
            for (int i = 0; i < p.length(); i++) {
                pMap[p.charAt(i) - 'a']++;
            }
            // the queue holds the characters of the current window, oldest first
            Queue<Character> queue = new LinkedList<Character>();
            int index = 0;
            while (index < s.length()) {
                queue.add(s.charAt(index));
                sMap[s.charAt(index) - 'a']++;
                if (queue.size() == p.length()) {
                    if (isAnagrams(sMap, pMap)) {
                        indices.add(index - p.length() + 1);
                    }
                    // slide the window: drop its oldest character
                    sMap[queue.poll() - 'a']--;
                }
                index++;
            }
            return indices;
        }

        // the window is an anagram of p when all 26 letter frequencies agree
        private boolean isAnagrams(int[] sMap, int[] pMap) {
            for (int i = 0; i < sMap.length; i++) {
                if (sMap[i] != pMap[i]) {
                    return false;
                }
            }
            return true;
        }
    }
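
    The queue in Solution 1 only remembers which character leaves the window next. Since s can be indexed directly, a variant can read that character as s.charAt(i - m) and drop the queue, and java.util.Arrays.equals can perform the 26-entry comparison. The following is a sketch under those assumptions, not the author's code; the class name FixedWindowAnagrams is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FixedWindowAnagrams {
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> indices = new ArrayList<>();
        if (s == null || p == null || p.isEmpty() || s.length() < p.length()) {
            return indices;
        }
        int m = p.length();
        int[] sMap = new int[26]; // letter frequencies of the current window
        int[] pMap = new int[26]; // letter frequencies of p
        for (int i = 0; i < m; i++) {
            pMap[p.charAt(i) - 'a']++;
        }
        for (int i = 0; i < s.length(); i++) {
            // Character entering the window on the right.
            sMap[s.charAt(i) - 'a']++;
            // Once the window would exceed m characters, drop the leftmost one.
            if (i >= m) {
                sMap[s.charAt(i - m) - 'a']--;
            }
            // Window [i - m + 1, i] is full: compare the 26 counters.
            if (i >= m - 1 && Arrays.equals(sMap, pMap)) {
                indices.add(i - m + 1);
            }
        }
        return indices;
    }
}
```

    This keeps the O(26) comparison per window but needs only the two fixed-size arrays, so the extra space no longer depends on m.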

    Solution 2.

    Can we do better?  

    For runtime, we've already achieved the best conceivable runtime (BCR), so can we reduce the constant factor?

    For space efficiency, we used O(m) extra space in Solution 1. Is it possible to use only O(1) space?

    We sure can.

    In Solution 1, we used 2 arrays of size 26 and 1 queue of maximum size m (m is the length of p);

    we can reduce the space usage to a single array of size 26.

    The key idea is to initialize this array with the number of times each character appears in p.

    Then we modify its elements and restore them as we scan through s.

    Every character that enters the window decreases its count by 1, and it counts as a match only if its count was still positive beforehand. Every character that leaves the window increases its count by 1 again.

    To replace the queue used in Solution 1, we mark the window with two indices, start and end, and introduce a new variable matched that keeps track of how many characters we've matched so far. This matched variable also saves us from the O(26) anagram check that Solution 1 performs for every window of length m.

    import java.util.ArrayList;
    import java.util.List;

    public class Solution {
        public List<Integer> findAnagrams(String s, String p) {
            List<Integer> ans = new ArrayList<Integer>();
            int[] sum = new int[26];
            int plength = p.length(), slength = s.length();
            // store the frequency of each character of p
            for (char c : p.toCharArray()) {
                sum[c - 'a']++;
            }
            int start = 0, end = 0, matched = 0;
            while (end < slength) {
                // s.charAt(end) is a match if p still needs this character
                if (sum[s.charAt(end) - 'a'] >= 1) {
                    matched++;
                }
                sum[s.charAt(end) - 'a']--;
                end++;
                // all needed characters are matched: add index start to the result
                if (matched == plength) {
                    ans.add(start);
                }
                // sliding window: the window is full, so s.charAt(start) must leave
                if (end - start == plength) {
                    // a non-negative count means s.charAt(start) was counted as a match,
                    // so decrease matched by 1 as it leaves the sliding window
                    if (sum[s.charAt(start) - 'a'] >= 0) {
                        matched--;
                    }
                    // restore the frequency of s.charAt(start) for later checks
                    sum[s.charAt(start) - 'a']++;
                    start++;
                }
            }
            return ans;
        }
    }
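
    One way to read the matched counter: sum[c] always equals (occurrences of c in p) minus (occurrences of c in the current window), so matched should equal the sum over all letters of min(count in p, count in window). It reaches p's length only when the window contains exactly p's letters. The sketch below re-checks this reading after every update. The class MatchedInvariantDemo, the helper names, and the two extra arrays are assumptions added purely for the check and are not part of Solution 2.

```java
import java.util.ArrayList;
import java.util.List;

class MatchedInvariantDemo {
    // Recomputes the sum over all letters of min(count in p, count in window).
    static int overlap(int[] pCount, int[] windowCount) {
        int total = 0;
        for (int c = 0; c < 26; c++) {
            total += Math.min(pCount[c], windowCount[c]);
        }
        return total;
    }

    // Throws if matched ever disagrees with the recomputed overlap.
    static void check(int matched, int[] pCount, int[] windowCount) {
        if (matched != overlap(pCount, windowCount)) {
            throw new IllegalStateException("invariant broken");
        }
    }

    // Assumes p is non-empty and both strings use only 'a'..'z'.
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> ans = new ArrayList<>();
        int m = p.length();
        int[] pCount = new int[26];      // used only by the invariant check
        int[] windowCount = new int[26]; // used only by the invariant check
        int[] sum = new int[26];         // the single array of Solution 2
        for (char c : p.toCharArray()) {
            sum[c - 'a']++;
            pCount[c - 'a']++;
        }
        int start = 0, matched = 0;
        for (int end = 0; end < s.length(); end++) {
            int in = s.charAt(end) - 'a';
            if (sum[in] > 0) {
                matched++;               // p still needed this letter
            }
            sum[in]--;
            windowCount[in]++;
            check(matched, pCount, windowCount);
            if (matched == m) {
                ans.add(start);
            }
            if (end - start + 1 == m) {
                int out = s.charAt(start) - 'a';
                if (sum[out] >= 0) {
                    matched--;           // the leaving letter had been a match
                }
                sum[out]++;
                windowCount[out]--;
                start++;
                check(matched, pCount, windowCount);
            }
        }
        return ans;
    }
}
```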

    Rewrite of Solution 2 so that it has the same code structure as Minimum Window Substring.

    import java.util.ArrayList;
    import java.util.List;

    public class Solution {
        public List<Integer> findAnagrams(String s, String p) {
            List<Integer> ans = new ArrayList<Integer>();
            int[] sum = new int[26];
            int plength = p.length(), slength = s.length();
            // store the frequency of each character of p
            for (char c : p.toCharArray()) {
                sum[c - 'a']++;
            }
            int start = 0, matched = 0;
            for (int end = 0; end < slength; end++) {
                // s.charAt(end) is a match if p still needs this character
                if (sum[s.charAt(end) - 'a'] > 0) {
                    matched++;
                }
                sum[s.charAt(end) - 'a']--;
                // all needed characters are matched: add index start to the result
                if (matched == plength) {
                    ans.add(start);
                }
                // sliding window: the window is full, so s.charAt(start) must leave
                if (end - start + 1 == plength) {
                    // restore the frequency of s.charAt(start) for later checks
                    sum[s.charAt(start) - 'a']++;
                    // a positive count after restoring means s.charAt(start) was counted
                    // as a match, so decrease matched by 1 as it leaves the window
                    if (sum[s.charAt(start) - 'a'] > 0) {
                        matched--;
                    }
                    start++;
                }
            }
            return ans;
        }
    }
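
    Because the single-array bookkeeping is easy to get subtly wrong, a quick randomized comparison against a recount-from-scratch reference can help. The following is a sketch only: AnagramCrossCheck, fast, slow, and agree are hypothetical names, fast mirrors the structure of the rewrite above, and the three-letter alphabet is an arbitrary choice that makes anagram hits frequent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class AnagramCrossCheck {
    // Single-array sliding window, same structure as the rewrite above.
    static List<Integer> fast(String s, String p) {
        List<Integer> ans = new ArrayList<>();
        int[] sum = counts(p);
        int start = 0, matched = 0;
        for (int end = 0; end < s.length(); end++) {
            if (sum[s.charAt(end) - 'a'] > 0) {
                matched++;
            }
            sum[s.charAt(end) - 'a']--;
            if (matched == p.length()) {
                ans.add(start);
            }
            if (end - start + 1 == p.length()) {
                sum[s.charAt(start) - 'a']++;
                if (sum[s.charAt(start) - 'a'] > 0) {
                    matched--;
                }
                start++;
            }
        }
        return ans;
    }

    // Reference answer: recount every window from scratch.
    static List<Integer> slow(String s, String p) {
        List<Integer> ans = new ArrayList<>();
        int[] want = counts(p);
        for (int i = 0; i + p.length() <= s.length(); i++) {
            if (Arrays.equals(want, counts(s.substring(i, i + p.length())))) {
                ans.add(i);
            }
        }
        return ans;
    }

    // Letter frequencies of a lowercase string.
    static int[] counts(String t) {
        int[] c = new int[26];
        for (char ch : t.toCharArray()) {
            c[ch - 'a']++;
        }
        return c;
    }

    // Random string over the letters a, b, c.
    static String randomString(Random rnd, int length) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(3)));
        }
        return sb.toString();
    }

    // Returns true when both implementations agree on every random case.
    static boolean agree(int trials, long seed) {
        Random rnd = new Random(seed);
        for (int t = 0; t < trials; t++) {
            String s = randomString(rnd, rnd.nextInt(12));    // length 0..11
            String p = randomString(rnd, 1 + rnd.nextInt(4)); // length 1..4, never empty
            if (!fast(s, p).equals(slow(s, p))) {
                return false;
            }
        }
        return true;
    }
}
```

    Calling AnagramCrossCheck.agree(2000, 42L) runs 2000 random cases with a fixed seed, so a failure is reproducible.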


    Related Problem
    Minimum Window Substring
  • Original article: https://www.cnblogs.com/lz87/p/6948738.html