KMP算法实现就是字符查找问题,假设现在有这样一个问题,有一个文本串S和一个模式串P,要查找P在S中的位置,即从文本串S中找出模式串P第一次出现的位置。
如何比较字符串
int j;
j=0;//j可以看做表示当前已经匹配完的模式串的最后一位的位置
//如果看不懂,你也可以理解为j表示模式串匹配到第几位了
for(int i=1;i<=la;i++)
{
while(j&&b[j+1]!=a[i])j=kmp[j];
//如果失配 ,那么就不断向回跳,直到可以继续匹配
if (b[j+1]==a[i]) j++;
//如果匹配成功,那么对应的模式串位置++
if (j==lb)
{
cout<<i-lb+1<<endl;
j=kmp[j];
//继续匹配
}
}
如何求kmp
j=0;
for (int i=2;i<=lb;i++)
{
while(j&&b[i]!=b[j+1])
//此处判断j是否为0的原因在于,如果回跳到第一个字符就不 用再回跳了
j=kmp[j];
//通过自己匹配自己来得出每一个点的kmp值
if(b[j+1]==b[i])j++;
kmp[i]=j;
//i+1失配后应该如何跳
}
其他应用
求字符串周期
思路:利用kmp数组的含义来解。
kmp数组中储存的是这个字符串前缀和后缀中相同字符串的最长长度
1.一个串的最小循环节长度:len - kmp[len]。
2.若len%(len-kmp[len]) == 0, 则这个字符串的最小周期为len-kmp[len]。一定要注意前提是 len % (len - next[len]) == 0,否则不存在循环周期。
char a[100];
int kmp[100];
main(void)
{
int n;
cin>>n;
_0for(p,n)
{
cin>>a+1;
int j=0;
int flag=0;
int l=strlen(a+1);
memset(kmp,0,sizeof(kmp));
for(int i=2;i<=l;i++)
{
while(j>0&&a[j+1]!=a[i])
j=kmp[j];
if(a[j+1]==a[i])
{
j++;
}
kmp[i]=j;
}
if(l%(l-kmp[l])||kmp[l]==0)
cout<<l<<endl;
else
{
cout<<l-kmp[l]<<endl;
}
if(p!=n-1)cout<<"
";
}
}
字符串求前后缀
题意
The little cat is so famous, that many couples tramp over hill and dale to Byteland, and asked the little cat to give names to their newly-born babies. They seek the name, and at the same time seek the fame. In order to escape from such boring job, the innovative little cat works out an easy but fantastic algorithm:
Step1. Connect the father's name and the mother's name, to a new string S.
Step2. Find a proper prefix-suffix string of S (which is not only the prefix, but also the suffix of S).
Example: Father='ala', Mother='la', we have S = 'ala'+'la' = 'alala'. Potential prefix-suffix strings of S are {'a', 'ala', 'alala'}. Given the string S, could you help the little cat to write a program to calculate the length of possible prefix-suffix strings of S? (He might thank you by giving your baby a name:)
Input
The input contains a number of test cases. Each test case occupies a single line that contains the string S described above.
Restrictions: Only lowercase letters may appear in the input. 1 <= Length of S <= 400000.
Output
For each test case, output a single line with integer numbers in increasing order, denoting the possible length of the new baby's name.
Sample Input
ababcababababcabab
aaaaa
Sample Output
2 4 9 18
1 2 3 4 5
要求出前后缀相同可能的长度
思路:最长当然是l,求得样例1的kmp
0 0 1 2 0 1 2 3 4 3 4 3 4 5 6 7 8 9
j=L
step 1 :
ababcababababcabab
ababcababababcabab
前后移位j-kmp[j],依然匹配而且长度为kmp[j],j=kmp[j]=9(有点周期的感觉)
step2:
ababcabab
ababcabab
step3
前后移位j-kmp[j],依然匹配而且长度为kmp[j],j=kmp[j]=4
abab
abab
step4
前后移位j-kmp[j],依然匹配而且长度为kmp[j],j=kmp[j]=2
ab
ab
最后kmp[j]=0;
这样就求得了所有前后缀相等可能的长度。
int kmp[400005];
char s[400005];
int ans[400005];
main(void)
{
while(cin>>s+1)
{
int j=0;
int l=strlen(s+1);
for(int i=2;i<=l;i++)
{
while(j>0&&s[j+1]!=s[i])j=kmp[j];
if(s[j+1]==s[i])j++;
kmp[i]=j;
}
int cnt=0;
j=l;
while(kmp[j]!=0)
{
ans[++cnt]=kmp[j];
j=kmp[j];
}
for(int i=cnt;i>=1;i--)
printf("%d ",ans[i]);
printf("%d
",l);
}
}