• Regular Number: string matching with the Shift-And algorithm

    Using regular expressions to define a numeric string is very common. Generally, the expression has the following shape:
    (0|9|7) (5|6) (2) (4|5) 
    The regular expression above matches 4 digits: the first is one of 0, 9 and 7; the second is one of 5 and 6; the third is 2; and the fourth is one of 4 and 5. It matches 0525, but it does not match 9634.
    Now, given a regular expression of this form and a long string of digits, find all substrings of the long string that match the regular expression.
    The input contains a single set of test data. The first line is a positive integer N (1 ≤ N ≤ 1000), the number of positions in the regular expression. In each of the next N lines, the first integer of the i-th line is a_i (1 ≤ a_i ≤ 10), the number of digits that may appear at the i-th position of the regular expression; it is followed by those a_i digits. The last line is a numeric string whose length is not more than 5 * 10^6.
    Output all substrings that match the regular expression, one substring per line.
    Sample Input
    3 0 9 7
    2 5 7
    2 2 5
    2 4 5
    Sample Output


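    D records the match state of the pattern's prefixes: bit j of D is set when the first j+1 positions of the pattern match the text ending at the current character. The Shift-And algorithm also needs an auxiliary table B. B is a dictionary whose keys are the characters of the problem's alphabet and whose values are n-bit unsigned integers that record at which positions of the pattern T each character occurs.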


    2, Shift-And implementation
    Code for the Shift-And matching process:
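A minimal sketch of the matching loop for an ordinary string pattern, assuming the pattern has between 1 and 64 characters so that D fits in a single machine word. The function name `shiftAndSearch` and the 256-entry table are illustrative choices, not taken from the original post. Each step applies D = ((D << 1) | 1) & B[c] for the current text character c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the start index of every occurrence of `pattern` in `text`.
// Assumption: 1 <= pattern.size() <= 64, so the state fits in one uint64_t.
std::vector<std::size_t> shiftAndSearch(const std::string& pattern,
                                        const std::string& text) {
    const std::size_t n = pattern.size();
    assert(n >= 1 && n <= 64);

    // B[c] has bit j set when pattern[j] == c.
    std::vector<std::uint64_t> B(256, 0);
    for (std::size_t j = 0; j < n; ++j)
        B[static_cast<unsigned char>(pattern[j])] |= std::uint64_t{1} << j;

    const std::uint64_t accept = std::uint64_t{1} << (n - 1);
    std::uint64_t D = 0;  // bit j set: pattern[0..j] matches the text ending here
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Extend every partial match by one character and start a new one at bit 0.
        D = ((D << 1) | 1) & B[static_cast<unsigned char>(text[i])];
        if (D & accept)
            hits.push_back(i + 1 - n);  // a full match ends at index i
    }
    return hits;
}
```

For example, `shiftAndSearch("aba", "ababa")` reports the overlapping matches starting at indices 0 and 2.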


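    Because a bitwise operation acts on all bits of a machine word in parallel, each loop iteration takes constant time, so the code segment above runs in O(m), where m is the length of the text. This holds as long as the pattern fits in one machine word; with a longer bitset, each step costs a number of word operations proportional to the bitset size.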

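    3, Auxiliary table B
    The text above did not say how to obtain the auxiliary table B. It is simple: just record the positions at which each character occurs in the pattern T.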
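For the Regular Number problem, each pattern position accepts a set of digits rather than a single character, so a digit may set several bits. A minimal sketch of building B under that reading; `buildTable`, `Mask` and `kMaxLen` are hypothetical names, and each position is passed as a string of its accepted digits.

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kMaxLen = 1009;  // assumed bound, mirrors 1 <= N <= 1000
using Mask = std::bitset<kMaxLen>;

// classes[i] lists the digits accepted at position i, e.g. "097" for (0|9|7).
std::array<Mask, 10> buildTable(const std::vector<std::string>& classes) {
    std::array<Mask, 10> B{};  // every bit starts at zero
    for (std::size_t i = 0; i < classes.size(); ++i)
        for (char c : classes[i])
            B[c - '0'].set(i);  // digit c may appear at position i
    return B;
}
```

With the expression (0|9|7) (5|6) (2) (4|5), `buildTable({"097", "56", "2", "45"})` sets bits 1 and 3 of B[5], because 5 is allowed at the second and fourth positions.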

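    #include <cstdio>
    #include <cstring>
    #include <bitset>
    using namespace std;
    typedef long long LL;
    typedef unsigned long long ULL;
    const int MAXN = 5e6 + 9;
    #define L 1009
    #define INF 1000000009
    #define eps 0.00000001
    #define MOD 1000
    // B[c]: bit i is set when digit c is allowed at position i of the expression.
    // D: bit j is set when positions 0..j match the text ending at the current character.
    bitset<1009> B[256], D;
    char str[MAXN];
    int main()
    {
        int n, tmp, t;
        scanf("%d", &n);
        for (int i = 0; i < n; i++)
        {
            scanf("%d", &tmp);
            while (tmp--)
            {
                scanf("%d", &t);
                B[t].set(i); // record that digit t may appear at position i
            }
        }
        scanf("%s", str); // the long numeric string
        int l = strlen(str);
        for (int i = 0; i < l; i++)
        {
            // shift in a fresh candidate at bit 0, then keep only states consistent with str[i]
            D = (D << 1).set(0) & B[str[i] - '0'];
            if (D[n - 1])
            {
                // a full match ends at i: cut the string temporarily to print it
                char ch = str[i + 1];
                str[i + 1] = '\0';
                puts(str + i - n + 1);
                str[i + 1] = ch;
            }
        }
        return 0;
    }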
  • Original article: https://www.cnblogs.com/joeylee97/p/7373330.html