HDU 2222 Keywords Search (AC automaton, a classic)

Date: 2021-08-12 19:59:06

Original Problem

Keywords Search

Time Limit: 1000MS Memory Limit: 131072KB 64bit IO Format: %I64d & %I64u

Description

In the modern time, Search engine came into the life of everybody like Google, Baidu, etc.
Wiskey also wants to bring this feature to his image retrieval system.
Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched.
To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match.

Input

First line will contain one integer means how many cases will follow by.
Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000)
Each keyword will only contains characters 'a'-'z', and the length will be not longer than 50.
The last line is the description, and the length will be not longer than 1000000.

Output

Print how many keywords are contained in the description.

Sample Input

1
5
she
he
say
shr
her

yasherhs 


Sample Output

3


Problem Summary


Given several pattern strings (the keywords), output how many of them occur in the main string (the description).
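
To make the counting rule concrete, here is a minimal brute-force sketch. It is not the method used in this post, and the helper name `count_keywords_naive` is made up for illustration. It assumes that a keyword listed twice is counted twice, which matches how the `sum` counter behaves in the code further down.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): counts how many keywords
// occur at least once in text. A keyword listed twice is counted twice.
int count_keywords_naive(const std::vector<std::string>& keywords,
                         const std::string& text)
{
    int cnt = 0;
    for (const std::string& k : keywords)
    {
        assert(!k.empty()); // the problem never supplies an empty keyword
        if (text.find(k) != std::string::npos)
        {
            cnt++;
        }
    }
    return cnt;
}
```

On the sample, `count_keywords_naive({"she","he","say","shr","her"}, "yasherhs")` returns 3, since only she, he and her occur. Each keyword triggers its own scan of the description, so with up to 10000 keywords and a description of up to 1000000 characters this approach is likely too slow for the 1000MS limit, which is why the AC automaton is used instead.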


Concepts and Algorithms


AC automaton (Aho-Corasick algorithm). Tutorial: click to open the link

Code

#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;
const int maxn=1e6+5;
char str2[55];
char str1[maxn];

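struct node
{
    node *Next[26];
    // Failure pointer: the node spelling the longest proper suffix of this
    // node's string that is also a path from root; root if there is none
    node *fail;
    // Number of keywords that end at this node
    int sum;
    node()
    {
        sum=0;
        fail=NULL;
        for(int i=0;i<26;i++)
        {
            Next[i]=NULL;
        }
    }
};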
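node* root;
// Array used as the BFS queue; the trie has at most 10000*50+1 nodes, fewer than maxn
node *q[maxn];
// Build the trie: insert one keyword
void Insert(char *s)
{
    node *p=root;
    for(int i=0;s[i];i++)
    {
        int x=s[i]-'a';
        if(p->Next[x]==NULL)
        {
            p->Next[x]=new node;
        }
        p=p->Next[x];
    }
    p->sum++;
}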
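// Build the fail pointers with a BFS over the trie
void bulid_fail_pointer()
{
    int head=0;
    int tail=1;
    q[head]=root;
    // temp is the node just taken from the queue.
    // p walks up temp's chain of fail pointers, looking for a node that
    // has a child on the same letter i; that child becomes the fail
    // target of temp->Next[i].
    node *p,*temp;
    while(head<tail)
    {
        temp=q[head++];
        for(int i=0;i<26;i++)
        {
            if(temp->Next[i])
            {
                if(temp==root)
                {
                    // Nodes on the first level can only fail to root
                    temp->Next[i]->fail=root;
                }
                else
                {
                    p=temp->fail;
                    // root->fail is NULL, so the walk stops after root
                    while(p)
                    {
                        // Found a node on the fail chain with a child on letter i
                        if(p->Next[i])
                        {
                            temp->Next[i]->fail=p->Next[i];
                            break;
                        }
                        p=p->fail;
                    }
                    // Nothing found: fail to root
                    if(p==NULL)
                    {
                        temp->Next[i]->fail=root;
                    }
                }
                // Enqueue the child
                q[tail++]=temp->Next[i];
            }
        }
    }
}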

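// Scan the text and count the keywords that occur in it
int ac_automation(char *ch)
{
    int cnt=0;
    node *p=root;
    int len=strlen(ch);
    for(int i=0;i<len;i++)
    {
        int x=ch[i]-'a';
        // No child on x here: follow fail pointers until one exists or we reach root
        while(!p->Next[x]&&p!=root)
        {
            p=p->fail;
        }
        p=p->Next[x];
        // p is NULL only when even root has no child on x: restart from root
        if(!p)
        {
            p=root;
        }
        node *temp=p;
        // Count every keyword ending at the current position:
        // p itself and all nodes reachable from it through fail pointers
        while(temp!=root)
        {
            if(temp->sum>=0)
            {
                cnt+=temp->sum;
                // Mark as counted so each keyword is added only once
                temp->sum=-1;
            }
            else
            {
                // Already counted, and so is the rest of this fail chain
                break;
            }
            // Move to the node the fail pointer refers to
            temp=temp->fail;
        }
    }
    return cnt;
}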

void Clear(node* a) // Free the trie recursively, children first
{
    if(a==NULL) return;
    else
    {
        for(int i=0;i<26;i++)
        {
            Clear(a->Next[i]);
        }
    }
    delete (a);
}

int main()
{
    //freopen("in.txt","r",stdin);
    int T,n;
    scanf("%d",&T);
    while(T--)
    {
        root=new node;
        scanf("%d",&n);
        while(n--)
        {
            scanf("%s",str2);
            Insert(str2);
        }
        scanf("%s",str1);
        bulid_fail_pointer();
        printf("%d\n",ac_automation(str1));
        Clear(root);
    }
    return 0;
}
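
The program above allocates every trie node with new and frees the trie after each test case. As an alternative, the same idea can be sketched with array indices instead of pointers. This is only an illustrative variant, not the code from the referenced article: the struct `AcAutomaton` and its member names are invented here, and `build` additionally fills in the missing transitions during the BFS (a technique often called a trie graph), so the matching loop needs no inner while loop over fail pointers.

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical array-based variant; all names are illustrative.
struct AcAutomaton
{
    std::vector<std::array<int, 26>> next; // next[u][c]; 0 means root / no child yet
    std::vector<int> fail;                 // failure link of each node
    std::vector<int> sum;                  // keywords ending here, -1 once counted

    AcAutomaton() { new_node(); }          // node 0 is the root

    int new_node()
    {
        std::array<int, 26> row;
        row.fill(0);
        next.push_back(row);
        fail.push_back(0);
        sum.push_back(0);
        return (int)next.size() - 1;
    }

    // Must be called before build(): build() overwrites the empty transitions.
    void insert(const std::string& s)
    {
        int u = 0;
        for (char ch : s)
        {
            assert(ch >= 'a' && ch <= 'z');
            int c = ch - 'a';
            if (next[u][c] == 0)
            {
                int v = new_node();        // may reallocate, so store the index first
                next[u][c] = v;
            }
            u = next[u][c];
        }
        sum[u]++;
    }

    // BFS: set fail links and redirect every missing transition.
    void build()
    {
        std::queue<int> q;
        for (int c = 0; c < 26; c++)
            if (next[0][c]) q.push(next[0][c]); // first level fails to root (0)
        while (!q.empty())
        {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; c++)
            {
                int v = next[u][c];
                if (v) { fail[v] = next[fail[u]][c]; q.push(v); }
                else next[u][c] = next[fail[u]][c];
            }
        }
    }

    // One-shot query: counted nodes are marked with -1, as in the code above.
    int query(const std::string& text)
    {
        int cnt = 0, u = 0;
        for (char ch : text)
        {
            assert(ch >= 'a' && ch <= 'z');
            u = next[u][ch - 'a'];
            for (int t = u; t != 0 && sum[t] >= 0; t = fail[t])
            {
                cnt += sum[t];
                sum[t] = -1;
            }
        }
        return cnt;
    }
};
```

Usage follows the same order as main above: construct one `AcAutomaton` per test case, call `insert` for every keyword, then `build`, then `query` on the description. Because the vectors release their memory when the object goes out of scope, no counterpart to Clear is needed in this sketch.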

The code is based on an article by CSDN blogger creatorx (linked above); many thanks.