Find all repeating substrings of a specified length in a large string sequence.
For example:
Input string: "ABCACBABC", repeated substring length: 3. Output: ABC
Another example:
Input string: "ABCABCA", repeated substring length: 2. Output: AB, BC, CA
Solution
Similar to [Amazon] Longest Repeating Substring, the best solution is to use a suffix tree or a suffix array. With a suffix tree, we then print the nodes at the level matching the given length that have more than one descendant.
However, since the substring length is given, we can also simply iterate: insert every substring of the given length into a HashSet and check for repetitions.
Code
Suffix tree solution: not written.
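As a stand-in, here is a minimal sketch of the suffix array idea. It is not the original author's code. The class name `SuffixArrayRepeats` and method name `repeatsOfLength` are made up for illustration, and the suffix array is built naively by sorting suffix start positions, which is simple but far slower than a proper construction algorithm on large inputs. It relies on the fact that suffixes sharing the same first k characters sit next to each other in sorted order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SuffixArrayRepeats {

    // Hypothetical helper: returns every substring of length k that occurs
    // at least twice, in lexicographic order, each listed once.
    public static List<String> repeatsOfLength(String input, int k) {
        List<String> ans = new ArrayList<>();
        int n = input.length();
        if (k <= 0 || k > n) {
            return ans;
        }

        // Naive suffix array: sort start positions by the suffix they denote.
        Integer[] sa = new Integer[n];
        for (int i = 0; i < n; i++) {
            sa[i] = i;
        }
        Arrays.sort(sa, (a, b) -> input.substring(a).compareTo(input.substring(b)));

        // Compare each suffix with its predecessor in sorted order.
        for (int i = 1; i < n; i++) {
            int prev = sa[i - 1];
            int cur = sa[i];
            if (prev + k > n || cur + k > n) {
                continue; // one of the two suffixes is shorter than k
            }
            if (input.regionMatches(prev, input, cur, k)) {
                String sub = input.substring(cur, cur + k);
                // Equal prefixes are grouped together, so checking the last
                // reported entry is enough to avoid duplicates.
                if (ans.isEmpty() || !ans.get(ans.size() - 1).equals(sub)) {
                    ans.add(sub);
                }
            }
        }
        return ans;
    }
}
```

For the two examples above, `repeatsOfLength("ABCACBABC", 3)` gives `[ABC]` and `repeatsOfLength("ABCABCA", 2)` gives `[AB, BC, CA]`.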
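HashSet code:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public List<String> solve(String input, int k) {
    List<String> ans = new ArrayList<String>();
    Set<String> seen = new HashSet<String>();     // every substring seen so far
    Set<String> reported = new HashSet<String>(); // substrings already added to ans
    // If k > input.length(), the loop body never runs and ans stays empty.
    for (int i = 0; i <= input.length() - k; i++) {
        String sub = input.substring(i, i + k);
        // add() returns false when sub was seen before; report each repeat only once.
        if (!seen.add(sub) && reported.add(sub)) {
            ans.add(sub);
        }
    }
    return ans;
}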
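If the number of occurrences matters as well, the same single pass can use a map instead of a set. The following is only a sketch of that variant. The names `RepeatCounts` and `countRepeats` are hypothetical, and it assumes the whole input fits in memory as one `String`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RepeatCounts {

    // Hypothetical helper: maps each length-k substring that occurs more than
    // once to its number of occurrences, in order of first appearance.
    public static Map<String, Integer> countRepeats(String input, int k) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        if (k <= 0 || k > input.length()) {
            return counts;
        }
        // Count every window of length k.
        for (int i = 0; i + k <= input.length(); i++) {
            counts.merge(input.substring(i, i + k), 1, Integer::sum);
        }
        // Drop substrings that appeared only once.
        counts.values().removeIf(c -> c < 2);
        return counts;
    }
}
```

With the examples above, `countRepeats("ABCABCA", 2)` yields `{AB=2, BC=2, CA=2}` and `countRepeats("ABCACBABC", 3)` yields `{ABC=2}`.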
reference: http://www.shuatiblog.com/blog/2015/01/11/all-repeating-substring-given-length/