Problem link
Problem statement:
Biologists have finally invented techniques for repairing DNA that contains segments causing various inherited diseases. For the sake of simplicity, a DNA is represented as a string containing the characters 'A', 'G', 'C' and 'T'. The repairing technique is simply to change some characters to eliminate all segments causing diseases. For example, we can repair the DNA "AAGCAG" to "AGGCAC" to eliminate the initial disease-causing segments "AAG", "AGC" and "CAG" by changing two characters. Note that the repaired DNA can still contain only the characters 'A', 'G', 'C' and 'T'.
You are to help the biologists repair a DNA by changing the least number of characters.
Input
The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), which is the number of DNA segments causing inherited diseases.
The following N lines give N non-empty strings of length not greater than 20 containing only characters in "AGCT", which are the DNA segments causing inherited diseases.
The last line of the test case is a non-empty string of length not greater than 1000 containing only characters in "AGCT", which is the DNA to be repaired.
The last test case is followed by a line containing a single zero.
Output
For each test case, print a line containing the test case number (beginning with 1) followed by the number of characters which need to be changed. If it's impossible to repair the given DNA, print -1.
Sample Input
2
AAA
AAG
AAAG
2
A
TG
TGAATG
4
A
G
C
T
AGT
0
Sample Output
Case 1: 1
Case 2: 4
Case 3: -1
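Summary:
Given a set of forbidden DNA patterns and an original string, find the minimum number of characters that must be changed so that the string no longer contains any forbidden pattern as a substring, or report -1 if this is impossible.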
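To make the task concrete, here is a minimal brute-force sketch that tries every string over "AGCT" of the same length and keeps the cheapest one that contains no forbidden segment. The function name bruteForceRepairs is hypothetical and not part of the original solution. It examines 4^len candidates, so it is only meant as a reference for very short inputs such as the samples.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical reference helper (not from the original post):
// enumerate all 4^len candidate strings and return the minimum number of
// changed positions among candidates with no forbidden substring, or -1.
int bruteForceRepairs(const std::vector<std::string>& bad, const std::string& dna) {
    const std::string alphabet = "AGCT";
    const int len = static_cast<int>(dna.size());
    assert(len <= 12 && "brute force is only intended for tiny inputs");

    long long total = 1;
    for (int i = 0; i < len; ++i) total *= 4;

    int best = -1;
    std::string cand(len, 'A');
    for (long long mask = 0; mask < total; ++mask) {
        // Decode mask as a base-4 number, one digit per position.
        long long m = mask;
        int changes = 0;
        for (int i = 0; i < len; ++i) {
            cand[i] = alphabet[m % 4];
            m /= 4;
            if (cand[i] != dna[i]) ++changes;
        }
        // Reject the candidate if any forbidden segment occurs in it.
        bool ok = true;
        for (const std::string& p : bad) {
            if (cand.find(p) != std::string::npos) { ok = false; break; }
        }
        if (ok && (best == -1 || changes < best)) best = changes;
    }
    return best;
}
```

On the three sample cases this returns 1, 4 and -1. With DNA strings of length up to 1000 the enumeration is hopeless, which is why a smarter method is needed.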
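Analysis:
Since the task involves matching against multiple patterns at once, we solve it with an Aho-Corasick (AC) automaton.
Consider the following DP: let $dp[i][j]$ be the minimum number of replacements needed so that, after processing the first $i$ characters of the DNA string, we are at node $j$ of the Trie without ever having entered a forbidden node. A node is forbidden if a pattern ends there or at any node reachable from it through fail links. When we step from node $j$ along character $k$ to node $next[j][k]$, the cost increases by 1 exactly when $k$ differs from the character $s_i$ of the DNA string at (0-indexed) position $i$; for each state we keep the minimum over all ways of reaching it.
This gives the state transition equation, applied only when $next[j][k]$ is not forbidden:
$$dp[i+1][next[j][k]] = \min\bigl(dp[i+1][next[j][k]],\ dp[i][j] + [k \ne s_i]\bigr)$$
where $[k \ne s_i]$ is 1 if $k$ differs from $s_i$ and 0 otherwise. Initially $dp[0][root] = 0$ and every other entry is infinite. The answer is $\min_j dp[len][j]$, or -1 if that minimum is still infinite.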
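The recurrence can be sketched compactly as a function that returns the answer instead of printing it. This is an illustrative variant, not the author's code: the names Automaton, code, buildAutomaton and minRepairs are hypothetical, and it uses two rolling rows instead of a full two-dimensional table, which is an assumption that works because row $i+1$ depends only on row $i$.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical names throughout; a sketch of the DP on an Aho-Corasick automaton.
struct Automaton {
    std::vector<std::array<int, 4>> next; // goto table, completed with fail links
    std::vector<int> fail;                // failure links
    std::vector<bool> bad;                // true if a pattern ends at this state or a suffix of it
};

int code(char c) {
    switch (c) { case 'A': return 0; case 'T': return 1; case 'C': return 2; case 'G': return 3; }
    assert(false && "input must contain only A, G, C, T");
    return -1;
}

Automaton buildAutomaton(const std::vector<std::string>& patterns) {
    Automaton a;
    auto newNode = [&a]() {
        std::array<int, 4> row;
        row.fill(-1);
        a.next.push_back(row);
        a.fail.push_back(0);
        a.bad.push_back(false);
        return static_cast<int>(a.next.size()) - 1;
    };
    newNode(); // state 0 is the root
    for (const std::string& p : patterns) {
        int cur = 0;
        for (char ch : p) {
            int c = code(ch);
            if (a.next[cur][c] == -1) {
                int created = newNode(); // allocate first: push_back may reallocate
                a.next[cur][c] = created;
            }
            cur = a.next[cur][c];
        }
        a.bad[cur] = true;
    }
    std::queue<int> q;
    for (int c = 0; c < 4; ++c) {
        int v = a.next[0][c];
        if (v == -1) a.next[0][c] = 0;
        else { a.fail[v] = 0; q.push(v); }
    }
    while (!q.empty()) {
        int u = q.front(); q.pop();
        if (a.bad[a.fail[u]]) a.bad[u] = true; // inherit "forbidden" from the fail target
        for (int c = 0; c < 4; ++c) {
            int v = a.next[u][c];
            if (v == -1) a.next[u][c] = a.next[a.fail[u]][c];
            else { a.fail[v] = a.next[a.fail[u]][c]; q.push(v); }
        }
    }
    return a;
}

// Returns the minimum number of changed characters, or -1 if repair is impossible.
int minRepairs(const std::vector<std::string>& patterns, const std::string& text) {
    const int INF = 1000000000;
    Automaton a = buildAutomaton(patterns);
    const int m = static_cast<int>(a.next.size());
    std::vector<int> cur(m, INF), nxt(m, INF);
    cur[0] = 0;
    for (char ch : text) {
        std::fill(nxt.begin(), nxt.end(), INF);
        const int want = code(ch);
        for (int j = 0; j < m; ++j) {
            if (cur[j] == INF) continue;
            for (int k = 0; k < 4; ++k) {
                int to = a.next[j][k];
                if (a.bad[to]) continue; // never enter a forbidden state
                nxt[to] = std::min(nxt[to], cur[j] + (k != want ? 1 : 0));
            }
        }
        cur.swap(nxt);
    }
    int best = *std::min_element(cur.begin(), cur.end());
    return best == INF ? -1 : best;
}
```

Called on the three sample cases, minRepairs yields 1, 4 and -1. The rolling rows reduce memory from (length × states) to (2 × states) integers; the running time stays proportional to length × states × 4.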
Code:
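#include <bits/stdc++.h>
#define maxn 1100   // enough for at most 50*20+1 Trie nodes and 1000 text characters
using namespace std;
int dp[maxn][maxn]; // dp[i][j]: min changes for the first i characters, ending at node j
char s[maxn];       // DNA string to repair
char st[maxn];      // buffer for one forbidden pattern
int tot=0;          // test case counter
const int INF=0x3f3f3f3f;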
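struct Trie{
    // next: goto table over {A,T,C,G}; fail: failure links; End: nonzero if the state is forbidden
    int next[maxn][4],fail[maxn],id,root,End[maxn];
    // Allocate a fresh node with no children.
    int newnode(){
        for(int i=0;i<4;i++){
            next[id][i]=-1;
        }
        End[id]=0;
        return id++;
    }
    // Reset the automaton before each test case.
    void inti(){
        id=0;
        root=newnode();
    }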
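    // Map a nucleotide to an index in [0,4).
    int get_char(char str){
        if(str=='A') return 0;
        if(str=='T') return 1;
        if(str=='C') return 2;
        if(str=='G') return 3;
        return -1; // not reached for valid input; avoids falling off a non-void function
    }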
    // Insert one forbidden pattern and mark its last node.
    void Insert(char *str){
        int len=strlen(str);
        int now=root;
        for(int i=0;i<len;i++){
            int c=get_char(str[i]);
            if(next[now][c]==-1){
                next[now][c]=newnode();
            }
            now=next[now][c];
        }
        End[now]=1;
    }
    // Compute fail links by BFS and complete the goto table so every state has 4 transitions.
    void build(){
        fail[root]=root;
        queue<int> que;
        for(int i=0;i<4;++i){
            if(next[root][i]==-1)
                next[root][i]=root;
            else{
                fail[next[root][i]]=root;
                que.push(next[root][i]);
            }
        }
        while(!que.empty()){
            int now=que.front();
            que.pop();
            // a state is forbidden if a suffix of it (reached via fail) ends a pattern
            if(End[fail[now]]) End[now]=1;
            for(int i=0;i<4;++i){
                if(next[now][i]==-1)
                    next[now][i]=next[fail[now]][i];
                else{
                    fail[next[now][i]]=next[fail[now]][i];
                    que.push(next[now][i]);
                }
            }
        }
    }
    // DP over (prefix length, automaton state); prints the answer for one test case.
    void solve(char *str){
        int len=strlen(str);
        for(int i=0;i<=len;i++){
            for(int j=0;j<id;j++){
                dp[i][j]=INF;
            }
        }
        dp[0][root]=0;
        for(int i=0;i<len;i++){
            int cur=get_char(str[i]);
            for(int j=0;j<id;j++){
                if(dp[i][j]==INF) continue;
                for(int k=0;k<4;k++){
                    int to=next[j][k];
                    if(End[to]) continue; // never enter a forbidden state
                    // keeping the original character is free; changing it costs 1
                    int tmp=dp[i][j]+(cur==k?0:1);
                    dp[i+1][to]=min(dp[i+1][to],tmp);
                }
            }
        }
        int res=INF;
        for(int i=0;i<id;i++){
            res=min(dp[len][i],res);
        }
        printf("Case %d: ",++tot);
        if(res==INF) puts("-1");
        else printf("%d\n",res);
    }
}ac;
int main()
{
    int n;
    // Read test cases until the terminating 0 (or end of input).
    while(scanf("%d",&n)==1&&n){
        ac.inti();
        for(int i=0;i<n;i++){
            scanf("%s",st);
            ac.Insert(st);
        }
        scanf("%s",s);
        ac.build();
        ac.solve(s);
    }
    return 0;
}