牛客网暑期ACM多校训练营（第三场） E Sort String 哈希处理字符串（模板）

链接：https://www.nowcoder.com/acm/contest/141/E
来源：牛客网

Eddy likes to play with string which is a sequence of characters. One day, Eddy has played with a string S for a long time and wonders how could make it more enjoyable. Eddy comes up with following procedure:

1. For each i in [0,|S|-1], let S_i be the substring of S starting from i-th character to the end followed by the substring of first i characters of S. Index of string starts from 0.
2. Group up all the S_i. S_i and S_j will be the same group if and only if S_i=S_j.
3. For each group, let L_j be the list of index i in non-decreasing order of S_i in this group.
4. Sort all the L_j by lexicographical order.

Eddy can't find any efficient way to compute the final result. As one of his best friend, you come to help him compute the answer!

输入描述:

Input contains only one line consisting of a string S.

1≤ |S|≤ 10

⁶


S only contains lowercase English letters(i.e.

$\texttt{"a-z"}$

).

输出描述:

First, output one line containing an integer K indicating the number of lists.
For each following K lines, output each list in lexicographical order.
For each list, output its length followed by the indexes in it separated by a single space.

示例1

输入

复制

abab

输出

复制

2
2 0 2
2 1 3

示例2

输入

复制

deadbeef

输出

复制

8
1 0
1 1
1 2
1 3
1 4
1 5
1 6
1 7

题意：根据原字符串构造新串，对于原字符串从0~|S|-1，i从0开始，从i到最后的子串放到从0到i-1子串的前面。对于这些新构造的子串，从0开始编号，将相同的新字符串们归为一组，输出一共有多少组，每组有多少个新字符串，以及新字符串的编号。
分析：按照题目的意思一步步来
　　（1）对于第一种条件，我们需要将字符串分成n个子串，从i=0开始，将i后面的子串放到前面去，这样生成n个子串
　　（2）然后对于每个子串我们需要将相同的子串放在一个集合内，这样产生m个集合
　　（3）每个集合内的子串按照他在原来的主串中的切割位置按递增排序
　　（4）将集合按照其内的切割位置按照字典序排序
然后我们来分析做法
首先是暴力做法：
　　（1）直接记录每个子串，用string自带的substr函数（看源码这个函数的时间复杂度为O（n））
　　（2）我们用map记录下每个字符串出现的次数，则map里的每个字符串就有了自己对应的集合
　　（3）在用map记录时，我们可以用个vector数组记录下每个字符串切割的位置
　　（4）将vector存在一个结构体里面，然后将vector里的值转成string，通过比较string排序
如果这样暴力做的话很明显会时间超限，内存超限（当n为10^6时，map和结构体内的每个string存了10^6长的字符串会超内存限制）
我们肯定不能通过存字符串来进行操作，于是我们想到了用哈希处理字符串，然后存下每个字符串的哈希值，注意此时要保证的是每个字符串的哈希值独一无二（如果你有时间可以计算下10^6个字符转化成整数最大的值），我是通过几次wa得出用哈希取余的值要大于10^12
用哈希处理字符串后我们可以得到一个哈希的数组，然后我们可以用过一次sort排序将哈希值（字符串的值）按从小到大排序，这样不仅相同的字符串会在一起而且可以得到一个有序的哈希序列
接下来我们直接遍历哈希数组，将相同的值加入同一个集合内（我用了一个数组存集合），判断的时候因为相同的在一起，所以如果后面等于前面i可以一直加，用一个数组保存i加了多少次，那么这个数组的每个值就对应了每个集合的长度，这样方便了我们以后遍历集合
中间还有些细节操作看代码把

#include <map>
#include <set>
#include <stack>
#include <cmath>
#include <queue>
#include <cstdio>
#include <vector>
#include <string>
#include <cstring>
#include <iomanip>
#include <iostream>
#include <algorithm>
#define debug(a) cout << #a << " " << a << endl
using namespace std;
typedef long long ll;
const ll maxn = 1e6+10;
const ll mod = 1000000000000023;
char s[maxn];
ll n, k, ans;
ll a[maxn], len[maxn], vis[maxn];
struct node {
    ll sum, pos;
}h[maxn];
bool cmp( node p, node q ) {
    if( p.sum == q.sum ) {
        return p.pos < q.pos;
    }
    return p.sum < q.sum;
}
int main() {
    ios::sync_with_stdio(0),cin.tie(0),cout.tie(0);
    scanf("%s",s);
    for( ll i = 0; i < maxn; i ++ ) {
        a[i] = len[i] = vis[i] = h[i].sum = h[i].pos = 0;
    }
    ll n = strlen(s);
    ll t = 1;
    for( ll i = 0; i < n; i ++ ) {
        h[0].sum = ( h[0].sum*37 + s[i]-'?' ) % mod;
        if( i ) {
            t = 37*t%mod;
        }
    }
    for( ll i = 1; i < n; i ++ ) {
        h[i].sum = ( (h[i-1].sum-t*(s[i-1]-'?')%mod+mod)*37+s[i-1]-'?' )%mod;
        h[i].pos = i;
    }
    sort( h, h+n, cmp );
    k = 0;
    for( ll i = 0; i < n; i ++ ) {
        t = i;
        while( t+1<n && h[t+1].sum==h[t].sum ) {
            t ++;
        }
        a[++k] = i;
        len[k] = t-i+1;
        vis[h[i].pos] = k;
        i = t;
    }
    ans = 0;
    for( ll i = 0; i < n; i ++ ) {
        if( vis[i] ) {
            ans ++;
        }
    };
    printf("%lld
",ans);
    for( ll i = 0; i < n; i ++ ) {
        if( vis[i] ) {
            printf("%lld",len[vis[i]]);
            for( ll j = a[vis[i]]; j < a[vis[i]]+len[vis[i]]; j ++ ) {
                printf(" %lld",h[j].pos);
            }
            printf("
");
        }
    }
    return 0;
}

彼时当年少，莫负好时光。

相关阅读:
Linux第七周学习总结——可执行程序的装载
 《深入理解计算机系统》第七章读书笔记
 《Linux内核设计与实现》第三章读书笔记
 《Linux内核设计与实现》第十八章读书笔记
 Linux第六周学习总结——进程额管理和进程的创建
 Linux第五周学习总结——扒开系统调用的三层皮（下
 《Linux内核设计与实现》第五章读书笔记
 #Linux第四周学习总结——扒开系统调用的三层皮（上）
《Linux内核设计与实现》第一二章读书笔记
 Linux第三周学习总结——构造一个简单的Linux系统MenuOS
原文地址：https://www.cnblogs.com/l609929321/p/9376715.html