• 字符串哈希+kmp题


    9.7

    Crazy Search(字符串哈希)

    Many people like to solve hard puzzles some of which may lead them to madness. One such puzzle could be finding a hidden prime number in a given text. Such number could be the number of different substrings of a given size that exist in the text. As you soon will discover, you really need the help of a computer and a good algorithm to solve such a puzzle.
    Your task is to write a program that given the size, N, of the substring, the number of different characters that may occur in the text, NC, and the text itself, determines the number of different substrings of size N that appear in the text.

    As an example, consider N=3, NC=4 and the text "daababac". The different substrings of size 3 that can be found in this text are: "daa"; "aab"; "aba"; "bab"; "bac". Therefore, the answer should be 5.

    Input

    The first line of input consists of two numbers, N and NC, separated by exactly one space. This is followed by the text where the search takes place. You may assume that the maximum number of substrings formed by the possible set of characters does not exceed 16 Millions.

    Output

    The program should output just an integer corresponding to the number of different substrings of size N found in the given text.

    Sample Input

    3 4
    daababac

    Sample Output

    5

    题意:就是给你n(字符长)m(字符种类),然后给你长的字符串,让你查找出以n为长度的不同的字符串有多少个
    #include<iostream>
    #include<cstdio>
    #include<cstring>
    using namespace std;
    const int maxn=16e6;
    bool hash[maxn];
    char word[26];
    int main()
    {
     int n,nc;
     string s;
     cin>>n>>nc>>s;
     memset(hash,0,sizeof(hash));
     memset(word,0,sizeof(word));
     int k=0;
     
     for(int i=0;i<s.length();i++)
     {
      if(word[s[i]-'a']==0)
         word[s[i]-'a']=k++;
      if(k==nc)break;
     }
     
     int len=s.size()-n+1;//daababac三个三个一组则有六组
     int ans=len;//最好的情况,都不相同
     
      //哈希值是唯一的,这里用hash标记
     for(int i=0;i<len;i++)
     {
      int num=0;
      for(int j=i;j<i+n;j++)
        num=num*nc+word[s[j]-'a'];
      num%=maxn;
      if(hash[num])
         ans--;
      else
        hash[num]=1;
     }
     cout<<ans<<endl;
     return 0;
    }
    
    

     9.8

    String  Problem(字符串最大最小表示)

    Give you a string with length N, you can generate N strings by left shifts. For example let consider the string “SKYLONG”, we can generate seven strings:
    String Rank
    SKYLONG 1
    KYLONGS 2
    YLONGSK 3
    LONGSKY 4
    ONGSKYL 5
    NGSKYLO 6
    GSKYLON 7
    and lexicographically first of them is GSKYLON, lexicographically last is YLONGSK, both of them appear only once.
      Your task is easy, calculate the lexicographically fisrt string’s Rank (if there are multiple answers, choose the smallest one), its times, lexicographically last string’s Rank (if there are multiple answers, choose the smallest one), and its times also.

    Input  Each line contains one line the string S with length N (N <= 1000000) formed by lower case letters.OutputOutput four integers separated by one space, lexicographically fisrt string’s Rank (if there are multiple answers, choose the smallest one), the string’s times in the N generated strings, lexicographically last string’s Rank (if there are multiple answers, choose the smallest one), and its times also.Sample Input

    abcder
    aaaaaa
    ababab

    Sample Output

    1 1 6 1
    1 6 1 6
    1 3 2 3

    题解:这道题就是输出所给字符串的最大最小表示的起始位置和数量!
    这道题是kmp以及最大最小表示法的应用,数量我们可以用kmp中求循环节的函数,然后结合最大最小表示法模板
    #include<bits/stdc++.h>
    using namespace std;
    const int maxn=1e6+10;
    char s[maxn];
    int ne[maxn];
    void kmpnext(char s[],int len)
    {
        int i=0,j=-1;
        ne[0]=-1;
        while(i<len)
        {
            if(j==-1||s[i]==s[j])
            {
                i++,j++;
                ne[i]=j;
            }
            else
              j=ne[j];
        }
    }
    int mmstring(int flag,char q[])//最大最小表示法模板
    {
        int len=strlen(q);
        int i=0,j=1,k=0;
        while(i<len&&j<len&&k<len)
        {
            int t=q[(i+k)%len]-q[(j+k)%len];
            if(!t)
               k++;
            else
            {
                if(flag==0)
                {
                    if(t>0)i=i+k+1;
                    else  j=j+k+1;
                }
                else
                {
                    if(t>0)j=j+k+1;
                    else  i=i+k+1;
                }
                if(i==j)  j++;
                k=0;
            }
        }
        if(i<j)return i;
        else  return j;
    }
    int main()
    {
        int minn,maxx;
        while(~scanf("%s",s))
        {
            int len=strlen(s);
            kmpnext(s,len);
            int t=1;
            if(len%(len-ne[len])==0)
               t=len/(len-ne[len]);
            minn=mmstring(0,s);
            maxx=mmstring(1,s);
            printf("%d %d %d %d
    ",minn+1,t,maxx+1,t);
        }
        return 0;
    }
    9.9
    Period(kmp模板)
    For each prefix of a given string S with N characters (each character has an ASCII code between 97 and 126, inclusive), we want to know whether the prefix is a periodic string. That is, for each i (2 <= i <= N) we want to know the largest K > 1 (if there is one) such that the prefix of S with length i can be written as A K ,that is A concatenated K times, for some string A. Of course, we also want to know the period K.
    Input
    The input consists of several test cases. Each test case consists of two lines. The first one contains N (2 <= N <= 1 000 000) – the size of the string S.The second line contains the string S. The input file ends with a line, having the
    number zero on it.
    
    Output
    For each test case, output "Test case #" and the consecutive test case number on a single line; then, for each prefix with length i that has a period K > 1, output the prefix size i and the period K separated by a single space; the prefix sizes must be in increasing order. Print a blank line after each test case.
    
    Sample Input
    3
    aaa
    12
    aabaabaabaab
    0
    Sample Output
    Test case #1
    2 2
    3 3
    
    Test case #2
    2 2
    6 2
    9 3
    12 4
    题解:这道题是kmp模板题,给定字符串,依次查找,判断有循环的位置以及循环个数,主要是用存循环节的函数
    #include<iostream>
    using namespace std;
    const int maxn=1e6+10;
    int ne[maxn]; 
    char s[maxn];
    void kmpnext(char s[],int n)
    {
        int i=0,j=-1;
        ne[0]=-1;
        while(i<n)
        {
            if(j==-1||s[i]==s[j])
            {
                i++,j++;
                int t=i/(i-j);
                if(i%(i-j)==0&&t>1)
                   cout<<i<<" "<<t<<endl;
                if(s[i]!=s[j])ne[i]=j;
                else
                  ne[i]=ne[j];
            }
            else
              j=ne[j];
        }
    }
    int main()
    {
        int x=1,n;
        while(cin>>n&&n)
        {
            cin>>s;
            cout<<"Test case #"<<x<<endl;
            kmpnext(s,n);
            x++;
            cout<<endl;
        }
        return 0;
    }
    9。10
    string matching(扩展kmp)
    String matching is a common type of problem in computer science. One string matching problem is as following:

    Given a string s[0len1]s[0…len−1], please calculate the length of the longest common prefix of s[ilen1]s[i…len−1] and s[0len1]s[0…len−1] for each i>0i>0.

    I believe everyone can do it by brute force.
    The pseudo code of the brute force approach is as the following:



    We are wondering, for any given string, what is the number of compare operations invoked if we use the above algorithm. Please tell us the answer before we attempt to run this algorithm.
    Input
    The first line contains an integer TT, denoting the number of test cases.
    Each test case contains one string in a line consisting of printable ASCII
    characters except space.
    1T301≤T≤30
    * string length 106≤106 for every string
    OutputFor each test, print an integer
    in one line indicating the number of compare operations invoked if we run the
    algorithm in the statement against the input string.
    Sample Input
    3
    _Happy_New_Year_
    ywwyww
    zjczzzjczjczzzjc
    Sample Output
    17
    7
    32
    题意:
    字符串匹配是计算机科学中常见的问题类型。一个字符串匹配问题如下:给定字符串s [0 ... len-1],请计算
    每个i的s [i ... len-1]和s [0 ... len-1]的最长公共前缀的长度> 0。
    我相信每个人都可以通过蛮力来做到这一点。蛮力方法的伪代码如下:我们想知道,
    对于任何给定的字符串,如果我们使用上述算法,调用的比较操作的数量是多少。在我们尝试运行此算法之前,请告诉我们答案

    这道题很懵,看了许多博客,也还是迷迷糊糊的的,扩展kmp,没搞懂,,,,
    • next[i]: T[i]...T[m - 1]与 T 的最长相同前缀长度;
    • extend[i]: S[i]...S[n - 1]与 T 的最长相同前缀长度。
    
    
    #include<cstdio>
    #include<iostream>
    #include<cstring>
    using namespace std;
    const int maxn=1e6+10;
    int Next[maxn];
    int expand[maxn];
    char s[maxn];
    void GNext(char t[],int le,int Next[])
    {
        int a=0,b=0;
        Next[0]=le;
        for(int i=1;i<le;i++)
        {
            if(i>=b||i+Next[i-a]>=b)
            {
                if(i>=b)
                   b=i;
                while(b<le&&t[b]==t[b-i])
                   b++;
                Next[i]=b-i;
                a=i; 
            }
            else
               Next[i]=Next[i-a];
        }
    }
    void Gexpand(char s[],char t[],int expand[],int Next[])
    {
        int a=0,b=0;
        int len1=strlen(s);
        int len2=strlen(t);
        GNext(t,len2,Next);
        for(int i=0;i<len1;i++)
        {
            if(i>=b||i+Next[i-a]>=b)
            {
                if(i>=b)  b=i;
                while(b<len1&&b-i<len2&&s[b]==t[b-i])
                   b++;
                expand[i]=b-i;
                a=i;
            }
            else
               expand[i]=Next[i-a];
        }
    }
    int main()
    {
        int t;
        scanf("%d",&t);
        while(t--)
        {
            scanf("%s",s);
            Gexpand(s+1,s,expand,Next);
            int len=strlen(s);
            len--;
            long long ans=0;
            for(int i=0;i<len;i++)
            {
                if(i+expand[i]<len)
                   ans+=expand[i]+1;
                else
                  ans+=expand[i];
            }
            printf("%lld
    ",ans);
        }
        return 0;
    }
    
    

     总结:这周除了一些思维题,就是和kmp相关算法的题了,最大最小+kmp,扩展kmp,在学习过程中,也看了一些kmp讲解的视频,也搞懂了next存的前后缀,循环节,稍简单的还能做,但一和其他算法联结,就蒙圈了,看题解,但有些地方还是比较晦涩难懂,手动模拟,看视频,有些部分仍旧不能很好的连接!

    
    
    
    
     
  • 相关阅读:
    NFS挂载报错
    关于git的reset指令说明-soft、mixed、hard
    nginx关于限制请求数和连接数
    Jenkins初级使用过程中的异常处理(1)
    Jenkins的初级应用(2)-Invoke Phing targets
    Jenkins的初级应用(1)-Publish Over SSH
    解决Jenkins安装的时区问题
    用Ubuntu快速安装Jenkins
    TIME_WAIT状态过多的排查
    在linux环境下用中文查询数据库
  • 原文地址:https://www.cnblogs.com/ylrwj/p/11478327.html
Copyright © 2020-2023  润新知