Substring with Concatenation of All Words
题目:
You are given a string, S, and a list of words, L, that are all of the same length. Find all starting indices of substring(s) in S that is a concatenation of each word in L exactly once and without any intervening characters.
For example, given:
S: "barfoothefoobarman"
L: ["foo", "bar"]
You should return the indices: [0,9]
.
(order does not matter).
开始想的维持一个窗口,c++实现代码如下:
vector<int> findSubstring(string S, vector<string> &L) { vector<int> result; map<string, int> lmap; int llen = L.size(); if(llen == 0)return result; int l = L[0].length(); int i,j,k,start,window = 0; //vector to map lmap.clear(); vector<string>::iterator iter; for(iter = L.begin(); iter != L.end(); iter++) { lmap.insert(pair<string,int>(*iter, -1)); } //find for(i = 0; i < l; i++) { start = i; for(auto it = lmap.begin(); it != lmap.end(); ++it) (*it).second = -1; window = 0; for(j = i; j < S.length(); j+=l) { if(j+l >= S.length())break; map<string, int>::iterator itr = lmap.find(S.substr(j, l)); if(itr == lmap.end()) { //如果没有匹配上,则前面的都不匹配,window=0,前面匹配的字符串对应的值都改为-1 for(k = start; k < j; k+=l) { lmap.find(S.substr(k,l)) -> second = -1; } start = j+l; window = 0; }else { int cur = itr ->second; if(cur == -1) { itr -> second = j; window++; if(window == llen) { result.push_back(start); window--; lmap.find(S.substr(j-l*(llen - 1),l)) -> second = -1; start += l; } }else { for(k = start; k <= cur; k+=l) { lmap.find(S.substr(k,l)) -> second = -1; window--; } start += l; itr->second = j; window++; } } } } return result; } };
但是有一个问题遇到L里有相同字符串时匹配不了。现在还没解决。
后来是参考了http://blog.csdn.net/ojshilu/article/details/22212703这篇博客。
思路是:假设L中的单位长度为n,依次从S中取长度为n的子串,如果在L中,就记下来。需要借助hash或map,如果整个L都匹配完了,就算是一个concatenation;当匹配错误的时候,S右移一个位置。但是这样效率很低,用了742ms,因为有些查找是可以跳过而没有跳过。还是想办法把窗口思想的实现吧。
vector<int> findSubstring(string S, vector<string> &L) { vector<int> result; int n = L[0].size(); int len = n * L.size(); if(len > S.size()) return result; map<string, int> m; for(int i=0;i<L.size();i++) m[L[i]]++; // 发现可以不用特意初始化直接开始自增 int idx = 0; map<string, int> tmp; while(idx <= S.size() - len) { bool flag = true; tmp.clear(); for(int i=idx;i<=idx+n*(L.size()-1);i+=n) { string now = S.substr(i, n); if(m.find(now) == m.end()) { flag = false; break; } tmp[now]++; if(tmp[now] > m[now]) { flag = false; break; } } if(flag == true) result.push_back(idx); idx++; } return result; }