SPOJ - DISUBSTR Distinct Substrings (后缀数组)

时间:2023-03-09 12:55:22
SPOJ - DISUBSTR  Distinct Substrings  (后缀数组)

Given a string, we need to find the total number of its distinct substrings.

Input

T- number of test cases. T<=20;
Each test case consists of one string, whose length is <= 1000

Output

For each test case output one number saying the number of distinct substrings.

Example

Sample Input:
2
CCCCC
ABABA

Sample Output:
5
9

Explanation for the testcase with string ABABA: 
len=1 : A,B
len=2 : AB,BA
len=3 : ABA,BAB
len=4 : ABAB,BABA
len=5 : ABABA
Thus, total number of distinct substrings is 9.

题意:

为字符串的子串个数

思路:

使用后缀数组解决。

按sa遍历后缀数组,每一个后缀的贡献即为n-sa[i]-lcp[i];

这里的lcp就是你们所说的height

#include<iostream>
#include<algorithm>
#include<vector>
#include<stack>
#include<queue>
#include<map>
#include<set>
#include<cstdio>
#include<cstring>
#include<cmath>
#include<ctime> #define fuck(x) cout<<#x<<" = "<<x<<endl;
#define debug(a, x) cout<<#a<<"["<<x<<"] = "<<a[x]<<endl;
#define ls (t<<1)
#define rs ((t<<1)|1)
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
const int maxn = ;
const int maxm = ;
const int inf = 0x3f3f3f3f;
const ll Inf = ;
const int mod = ;
const double eps = 1e-;
const double pi = acos(-); char s[maxn];
int len,Rank[maxn],sa[maxn],k,tmp[maxn];
bool compare_sa(int i, int j) {
if (Rank[i] != Rank[j]) { return Rank[i] < Rank[j]; }
//如果以i开始,长度为k的字符串的长度,已经超出了字符串尾,那么就赋值为-1
//这是因为,在前面所有数据相同的情况下,字符串短的字典序小.
int ri = i + k <= len ? Rank[i + k] : -inf;
int rj = j + k <= len ? Rank[j + k] : -inf;
return ri < rj;
} void construct_sa() {
//初始的RANK为字符的ASCII码
for (int i = ; i <= len; i++) {
sa[i] = i;
Rank[i] = i < len ? s[i] : -inf;
}
for (k = ; k <= len; k *= ) {
sort(sa, sa + len + , compare_sa);
tmp[sa[]] = ;
//全新版本的RANK,tmp用来计算新的rank
//将字典序最小的后缀rank计为0
//sa之中表示的后缀都是有序的,所以将下一个后缀与前一个后缀比较,如果大于前一个后缀,rank就比前一个加一.
//否则就和前一个相等.
for (int i = ; i <= len; i++) {
tmp[sa[i]] = tmp[sa[i - ]] + (compare_sa(sa[i - ], sa[i]) ? : );
}
for (int i = ; i <= len; i++) {
Rank[i] = tmp[i]; }
}
}
int lcp[maxn]; void construct_lcp(){
// for(int i=0;i<=n;i++){Rank[sa[i]]=i;} int h=;
lcp[]=;
for(int i=;i<len;i++){//i为后缀数组起始位置
int j = sa[Rank[i]-];//获取当前后缀的前一个后缀(排序后)
if(h>)h--;
for(;j+h<len&&i+h<len;h++){
if(s[j+h]!=s[i+h])break;
}
lcp[Rank[i]]=h;
}
} int main() {
int T;
scanf("%d",&T);
while (T--){
scanf("%s",s);
len = strlen(s);
construct_sa();
construct_lcp(); int ans=;
for(int i=;i<=len;i++){
ans+=(len-sa[i]-lcp[i]);
}
printf("%d\n",ans);
} return ;
}