Repeated DNA Sequences
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
每个长度为10的字符串,通过查找map,判断之前是否出现过。
用hash可以降低空间复杂度。
class Solution { public: vector<string> findRepeatedDnaSequences(string s) { vector<string> ret; if(s.size() < 10) return ret; unordered_map<size_t, int> m; hash<string> strh; for(int i = 0; i <= s.size()-10; i ++) { string sub = s.substr(i, 10); m[strh(sub)] ++; if(m[strh(sub)] == 2) {//find repeat ret.push_back(sub); } } return ret; } };