Time Limit: 1000MS | Memory Limit: 65536K | |
Total Submissions: 29759 | Accepted: 11986 |
Description
The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:
Tout avait Pair normal, mais tout s’affirmait faux. Tout avait Fair normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : stir son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…
Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.
So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
- One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
- One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3 BAPC BAPC AZA AZAZAZA VERDI AVERDXIVYERDIAN
Sample Output
1 3 0
Source
1.按照递推的思想:
根据定义next[0]=-1,假设next[j]=k, 即P[0...k-1]==P[j-k,j-1]
1)若P[j]==P[k],则有P[0..k]==P[j-k,j],很显然,next[j+1]=next[j]+1=k+1;
2)若P[j]!=P[k],则可以把其看做模式匹配的问题,即匹配失败的时候,k值如何移动,显然k=next[k]。
因此可以这样去实现:
1 void getNext(char *p,int *next) 2 { 3 int j,k; 4 next[0]=-1; 5 j=0; 6 k=-1; 7 while(j<strlen(p)-1) 8 { 9 if(k==-1||p[j]==p[k]) //匹配的情况下,p[j]==p[k] 10 { 11 j++; 12 k++; 13 next[j]=k; 14 } 15 else //p[j]!=p[k] 16 k=next[k]; 17 } 18 }
问题: next应该到数组长度……
优化一点:
1 void getNext(char *p,int *next) 2 { 3 int j,k; 4 int lastIndex=strlen(p)-1; 5 next[0]=-1; 6 j=0; 7 k=-1; 8 while(j < lastIndex) 9 { 10 if(k==-1||p[j]==p[k]) //匹配的情况下,p[j]==p[k] 11 { 12 ++j; 13 ++k; 14 15 //若p[j]==p[k],则需要修正 16 if(p[j]==p[k]) 17 next[j]=next[k]; 18 else 19 next[j]=k; 20 } 21 else //p[j]!=p[k] 22 k=next[k]; 23 } 24 }
2.直接求解方法
1 void getNext(char *p,int *next) 2 { 3 int i,j,temp; 4 for(i=0;i<strlen(p);i++) 5 { 6 if(i==0) 7 { 8 next[i]=-1; //next[0]=-1 9 } 10 else if(i==1) 11 { 12 next[i]=0; //next[1]=0 13 } 14 else 15 { 16 temp=i-1; 17 for(j=temp;j>0;j--) 18 { 19 if(equals(p,i,j)) 20 { 21 next[i]=j; //找到最大的k值 22 break; 23 } 24 } 25 if(j==0) 26 next[i]=0; 27 } 28 } 29 } 30 31 bool equals(char *p,int i,int j) //判断p[0...j-1]与p[i-j...i-1]是否相等 32 { 33 int k=0; 34 int s=i-j; 35 for(;k<=j-1&&s<=i-1;k++,s++) 36 { 37 if(p[k]!=p[s]) 38 return false; 39 } 40 return true; 41 }
本题代码:
1 #include<iostream> 2 #include<cstdio> 3 #include<cstring> 4 5 using namespace std; 6 7 char w[10008], t[1000008]; 8 int next[10008]; 9 10 void getNext() 11 { 12 int j, k; 13 int i = strlen(w); 14 j = 0; 15 k = -1; 16 next[0] = -1; 17 18 while(j < i) 19 { 20 if(k == -1 || w[j] == w[k]) 21 { 22 j++; 23 k++; 24 next[j] = k; 25 } 26 else 27 k = next[k]; 28 } 29 } 30 31 int main() 32 { 33 int c; 34 scanf("%d", &c); 35 while(c--) 36 { 37 memset(next, 0, sizeof(next)); 38 scanf("%s", w); 39 scanf("%s", t); 40 getNext(); 41 int i, j, l1, l2; 42 l1 = strlen(w); 43 l2 = strlen(t); 44 i = j = 0; 45 int sum = 0; 46 while(i < l1 && j < l2) 47 { 48 if(w[i] == t[j] || i == -1) 49 { 50 i++; 51 j++; 52 } 53 else 54 i = next[i]; 55 56 if(i == l1) 57 { 58 sum++; 59 i = next[i]; 60 } 61 } 62 printf("%d ", sum); 63 } 64 return 0; 65 }