Problem###
strcmp() is a library function in C/C++ that compares two strings. It takes two strings as input
parameters and decides which one is lexicographically larger: if the first string is greater it
returns a positive value, if the second string is greater it returns a negative value, and if the two
strings are equal it returns zero. The code used to compare two strings in the C/C++ library is shown
below:
int strcmp(char *s,char *t)
{
    int i;
    for(i=0;s[i]==t[i];i++)
        if(s[i]=='\0') return 0;
    return s[i]-t[i];
}
Figure: The standard strcmp() code provided for this problem.
The number of comparisons required to compare two strings in the strcmp() function is never returned
by the function, but for this problem you will have to compute just that, at a larger scale. strcmp()
continues to compare characters in the same position of the two strings until two different characters
are found or both strings come to an end. Of course, it assumes that the last character of a string is
a null ('\0') character. For example, comparing "than" with "that", or "therE" with "the", takes 7
comparisons in each case. To understand how 7 comparisons are needed in both cases, please consult the
code block given above.
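The count of 7 can be checked by instrumenting the loop. The following is a minimal sketch, not part of the problem statement: `count_comparisons` is a hypothetical helper that mirrors the strcmp() loop and counts both the `s[i]==t[i]` test and the end-of-string test each time one is evaluated.

```cpp
#include <cstddef>

// Hypothetical helper (an assumption for illustration): walks the two strings
// exactly like the strcmp() code above and counts every character comparison.
long long count_comparisons(const char *s, const char *t) {
    long long count = 0;
    for (std::size_t i = 0;; i++) {
        count++;                   // the loop test s[i] == t[i]
        if (s[i] != t[i]) break;   // mismatch: only this one test was made
        count++;                   // the test against the terminating null
        if (s[i] == '\0') break;   // both strings ended at the same position
    }
    return count;
}
```

Each matching position costs two comparisons and the first mismatch costs one, so "than" against "that" gives 3×2+1 = 7, and so does "therE" against "the".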
Input###
The input file contains at most 10 sets of inputs. The description of each set is given below:
Each set starts with an integer N (0 < N < 4001) which denotes the total number of strings. Each
of the next N lines contains one string. Strings contain only alphanumerals (‘0’...‘9’, ‘A’...‘Z’, ‘a’...‘z’)
and have a maximum length of 1000 and a minimum length of 1.
Input is terminated by a line containing a single zero.
Output###
For each set of input, produce one line of output. This line contains the serial of output followed by
an integer T. This T denotes the total number of comparisons that are required in the strcmp()
function if all the strings are compared with one another exactly once. So for N strings the function
strcmp() will be called exactly N(N−1)/2 times. You have to calculate the total number of comparisons
inside the strcmp() function in those N(N−1)/2 calls. You can assume that the value of T will fit safely
in a 64-bit signed integer. Please note that the most straightforward solution (worst-case complexity
O(N²×1000)) will time out for this problem.
Sample Input###
2
a
b
4
cat
hat
mat
sir
0
Sample Output###
Case 1: 1
Case 2: 6
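Idea##
At first glance this problem looks fairly simple.
A trie is the obvious tool: at every branching point of the tree, count the comparisons made between the subtrees that split there.
The details are where it gets tricky, though!
There are three cases to handle:
1. The two words are identical####
Let len be the length of the word.
The number of comparisons is 2×(len+1).
2. One word is a proper prefix of the other####
Let len be the length of the shorter word.
The number of comparisons is 2×len+1.
3. All other cases####
Let len be the length of the common prefix of the two words.
The number of comparisons is 2×len+1.
Each trie node stores flag (the number of words that end at this node) and size (the number of words in the subtree rooted at this node, including the words that end at it).
All three cases have to be accounted for during the calculation.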
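The three cases can be written down directly as a per-pair cost. The sketch below is only a reference for checking small inputs; it is the quadratic approach that the problem statement says will time out, and `pair_cost` and `brute_force_total` are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: comparisons strcmp() makes for one pair of words,
// following the three cases listed above.
long long pair_cost(const std::string &a, const std::string &b) {
    std::size_t len = 0;  // length of the common prefix
    while (len < a.size() && len < b.size() && a[len] == b[len]) len++;
    long long l = static_cast<long long>(len);
    if (len == a.size() && len == b.size()) return 2 * (l + 1);  // case 1
    return 2 * l + 1;                                            // cases 2 and 3
}

// Sums the cost over all N(N-1)/2 pairs. Far too slow for the real limits;
// useful only to cross-check a faster solution on small inputs.
long long brute_force_total(const std::vector<std::string> &words) {
    long long total = 0;
    for (std::size_t i = 0; i < words.size(); i++)
        for (std::size_t j = i + 1; j < words.size(); j++)
            total += pair_cost(words[i], words[j]);
    return total;
}
```

On the second sample set (cat, hat, mat, sir) every pair differs at the first character, so each of the 6 pairs costs 1, which matches the expected answer of 6.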
Code:
(Note: the answer must be computed as long long.)
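#include<cstdio>
#include<iostream>
#include<cstring>
#include<algorithm>
using namespace std;

const int N = 4005;
typedef long long ll;

struct trie {
    trie *ch[65];    // children; only indices 0..61 are used (digits, upper, lower)
    int flag,size;   // flag: words ending here; size: words in this subtree
    void clear(){
        flag=size=0;
        for(int i=0;i<63;i++) ch[i]=NULL;
    }
}pool[N*1005],*root;
int cnt;

// Map an alphanumeric character to a child index in 0..61.
int get(char s){
    if(s>='0' && s<='9') return s-'0';
    else if(s>='A' && s<='Z') return s-'A'+10;
    return s-'a'+36;
}

// Read one word and insert it, updating size along the path and flag at the end.
void add(){
    char s[1005];
    scanf("%s",s);
    int len=strlen(s),id;
    trie *p=root;
    p->size++;
    for(int i=0;i<len;i++){
        id=get(s[i]);
        if(!p->ch[id]){
            pool[++cnt].clear();
            p->ch[id]=&pool[cnt];
        }
        p=p->ch[id];
        p->size++;
    }
    p->flag++;
}

// Total comparisons over all pairs of words in the subtree of p (at depth dep).
ll dfs(trie *p,int dep){
    int size=0;
    ll ret=0;
    size=p->size-p->flag;   // words that continue past this node
    for(int i=0;i<63;i++)
        if(p->ch[i])
            // Pairs that split here are counted once from each side, and the
            // child's result is doubled, so halving below restores both.
            ret+=((ll)size-p->ch[i]->size)*p->ch[i]->size*((ll)dep*2+1)+2*dfs(p->ch[i],dep+1);
    ret/=2;
    if(p->flag){
        ret+=(ll)p->flag*(p->flag-1)*(dep+1);   // case 1: identical words ending here
        ret+=(ll)p->flag*size*(dep*2+1);        // case 2: word ending here vs. longer words
    }
    return ret;
}

int n;
int main()
{
    int i,kase=0;
    root=&pool[++cnt];
    while(scanf("%d",&n)==1 && n){
        cnt=1;
        pool[1].clear();
        for(i=0;i<n;i++) add();
        printf("Case %d: %lld\n",++kase,dfs(root,0));
    }
    return 0;
}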
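The node pool above reserves 65 child pointers for every possible node, which on a typical 64-bit build works out to roughly 2 GB if the whole array were touched. As a possible alternative, and not the method used in this post, the sketch below stores children as a sibling list inside a `std::vector` and replaces the DFS with a closed form. It assumes the common prefix of a pair has a length equal to the number of non-root nodes both words pass through, so the total is C(n,2) plus size×(size−1) over non-root nodes plus C(flag,2) over all nodes. The names `Node` and `total_comparisons` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// First-child / next-sibling trie node (an assumption for illustration).
struct Node {
    char c;          // character on the edge leading into this node
    int child;       // index of the first child, or -1
    int sibling;     // index of the next sibling, or -1
    long long size;  // words passing through this node
    long long flag;  // words ending at this node
};

long long total_comparisons(const std::vector<std::string> &words) {
    std::vector<Node> t;
    t.push_back(Node{0, -1, -1, 0, 0});  // root
    for (const std::string &w : words) {
        int p = 0;
        t[0].size++;
        for (char ch : w) {
            int q = t[p].child;
            while (q != -1 && t[q].c != ch) q = t[q].sibling;
            if (q == -1) {  // no such child yet: prepend a new one
                q = static_cast<int>(t.size());
                t.push_back(Node{ch, -1, t[p].child, 0, 0});
                t[p].child = q;
            }
            p = q;
            t[p].size++;
        }
        t[p].flag++;
    }
    long long n = static_cast<long long>(words.size());
    long long total = n * (n - 1) / 2;  // every pair costs at least 1
    for (std::size_t k = 0; k < t.size(); k++) {
        if (k != 0) total += t[k].size * (t[k].size - 1);  // 2 per shared character
        total += t[k].flag * (t[k].flag - 1) / 2;          // +1 per identical pair
    }
    return total;
}
```

Searching a sibling list costs at most 62 steps per character, so the sketch trades some speed for memory that grows only with the nodes actually created.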