Feeling like learning AC Automation

Yesterday I learned AC Automation, I feel a little touched. Write down that there is a reference for the same problems I encounter in the future.
AC Automation is suitable for multi-pattern string matching problem. The main feature of its algorithm is that it can get all the answers by walking the parent string only once. It is an offline algorithm.The core of the algorithm is to transform the dictionary tree. Individually, AC automaton belongs to a kind of data structure. Dictionary tree is edged by the relationship between prefix and suffix between the same pattern string and different pattern strings to achieve the purpose of traversing the parent string without duplication. The disadvantage is the same as the number of dictionaries. It is easy to cause too large data.Memory overflow.
That's what I think about algorithms, and here's what I think about code

void insert(string a)
{
	int u=0;
	for(int i=0;i<a.size();i++)
	{
		int v=a[i]-'a';
		if(!tree[u][v])
		tree[u][v]=++cnt;
		u=tree[u][v];
	}
	book[u]++;
}

Simple dictionary tree insertion, not to mention much;

void getfail()
{
	queue<int> que;
	for(int i=0;i<26;i++)
	if(tree[0][i])
	que.push(tree[0][i]);
	while(!que.empty())
	{
		int now=que.front();
		que.pop();
		for(int i=0;i<26;i++)
		{
			if(tree[now][i])
			{
				fail[tree[now][i]]=tree[fail[now]][i];
				que.push(tree[now][i]);
			}
			else
			tree[now][i]=tree[fail[now]][i];
		}
	}
}

I think the essence of this operation is

tree[now][i]=tree[fail[now]][i];

If the fail array has a virtual edge, this establishes a real edge, paving the way for the state transition.This step also turns the dictionary tree into a dictionary diagram, which may be looped.

int getans(string a)
{
	int ans=0;
	int now=0;
	for(int i=0;i<a.size();i++)
	{
		now=tree[now][a[i]-'a'];
		for(int j=now;j&&book[j]!=-1;j=fail[j])
		{
			ans+=book[j];
			book[j]=-1;
		}
	}
	return ans;
}

The operation to get the answer: the now pointer points to where you are currently. From this function, it is also clear that the tree array is used for state transition, and the fail array is only used to expand the answer, without affecting the current state.
Here is the complete code. It is important to note that in general AC automata take ten card time, string, cincout is best not to use even if iOS is added:: sync_with_stdio (false) may also time out (something).

#include <iostream>
#include <cstring>
#include <queue>
#define MAX 1000010
using namespace std;
int tree[MAX][26]; //Dictionary Tree
int book[MAX];     //Tag Array
int fail[MAX];     //Add (virtual) edge array
int cnt;
void init()
{
	cnt=0;
	memset(fail,0,sizeof(fail));
	memset(book,0,sizeof(book));
	memset(tree,0,sizeof(tree));
}

void insert(string a)
{
	int u=0;
	for(int i=0;i<a.size();i++)
	{
		int v=a[i]-'a';
		if(!tree[u][v])
		tree[u][v]=++cnt;
		u=tree[u][v];
	}
	book[u]++;
}

void getfail()
{
	queue<int> que;
	for(int i=0;i<26;i++)
	if(tree[0][i])
	que.push(tree[0][i]);
	while(!que.empty())
	{
		int now=que.front();
		que.pop();
		for(int i=0;i<26;i++)
		{
			if(tree[now][i])
			{
				fail[tree[now][i]]=tree[fail[now]][i];
				que.push(tree[now][i]);
			}
			else
			tree[now][i]=tree[fail[now]][i];
		}
	}
}

int getans(string a)
{
	int ans=0;
	int now=0;
	for(int i=0;i<a.size();i++)
	{
		now=tree[now][a[i]-'a'];
		for(int j=now;j&&book[j]!=-1;j=fail[j])
		{
			ans+=book[j];
			book[j]=-1;
		}
	}
	return ans;
}


int main()
{
	init();
	string a;
	while(cin>>a,a!="end")
	insert(a);
	cin>>a;
	fail[0]=0;
	getfail();
	cout<<getans(a);
	return 0;
}

It is a very important ability to use data structure actively. The precondition is to master the principle of data structure construction and the functions of its members.
There's still a long way to go. Go on!

Keywords: iOS

Added by burn1337 on Tue, 27 Aug 2019 05:15:23 +0300