Yesterday I learned AC Automation, I feel a little touched. Write down that there is a reference for the same problems I encounter in the future.
AC Automation is suitable for multi-pattern string matching problem. The main feature of its algorithm is that it can get all the answers by walking the parent string only once. It is an offline algorithm.The core of the algorithm is to transform the dictionary tree. Individually, AC automaton belongs to a kind of data structure. Dictionary tree is edged by the relationship between prefix and suffix between the same pattern string and different pattern strings to achieve the purpose of traversing the parent string without duplication. The disadvantage is the same as the number of dictionaries. It is easy to cause too large data.Memory overflow.
That's what I think about algorithms, and here's what I think about code
void insert(string a) { int u=0; for(int i=0;i<a.size();i++) { int v=a[i]-'a'; if(!tree[u][v]) tree[u][v]=++cnt; u=tree[u][v]; } book[u]++; }
Simple dictionary tree insertion, not to mention much;
void getfail() { queue<int> que; for(int i=0;i<26;i++) if(tree[0][i]) que.push(tree[0][i]); while(!que.empty()) { int now=que.front(); que.pop(); for(int i=0;i<26;i++) { if(tree[now][i]) { fail[tree[now][i]]=tree[fail[now]][i]; que.push(tree[now][i]); } else tree[now][i]=tree[fail[now]][i]; } } }
I think the essence of this operation is
tree[now][i]=tree[fail[now]][i];
If the fail array has a virtual edge, this establishes a real edge, paving the way for the state transition.This step also turns the dictionary tree into a dictionary diagram, which may be looped.
int getans(string a) { int ans=0; int now=0; for(int i=0;i<a.size();i++) { now=tree[now][a[i]-'a']; for(int j=now;j&&book[j]!=-1;j=fail[j]) { ans+=book[j]; book[j]=-1; } } return ans; }
The operation to get the answer: the now pointer points to where you are currently. From this function, it is also clear that the tree array is used for state transition, and the fail array is only used to expand the answer, without affecting the current state.
Here is the complete code. It is important to note that in general AC automata take ten card time, string, cincout is best not to use even if iOS is added:: sync_with_stdio (false) may also time out (something).
#include <iostream> #include <cstring> #include <queue> #define MAX 1000010 using namespace std; int tree[MAX][26]; //Dictionary Tree int book[MAX]; //Tag Array int fail[MAX]; //Add (virtual) edge array int cnt; void init() { cnt=0; memset(fail,0,sizeof(fail)); memset(book,0,sizeof(book)); memset(tree,0,sizeof(tree)); } void insert(string a) { int u=0; for(int i=0;i<a.size();i++) { int v=a[i]-'a'; if(!tree[u][v]) tree[u][v]=++cnt; u=tree[u][v]; } book[u]++; } void getfail() { queue<int> que; for(int i=0;i<26;i++) if(tree[0][i]) que.push(tree[0][i]); while(!que.empty()) { int now=que.front(); que.pop(); for(int i=0;i<26;i++) { if(tree[now][i]) { fail[tree[now][i]]=tree[fail[now]][i]; que.push(tree[now][i]); } else tree[now][i]=tree[fail[now]][i]; } } } int getans(string a) { int ans=0; int now=0; for(int i=0;i<a.size();i++) { now=tree[now][a[i]-'a']; for(int j=now;j&&book[j]!=-1;j=fail[j]) { ans+=book[j]; book[j]=-1; } } return ans; } int main() { init(); string a; while(cin>>a,a!="end") insert(a); cin>>a; fail[0]=0; getfail(); cout<<getans(a); return 0; }
It is a very important ability to use data structure actively. The precondition is to master the principle of data structure construction and the functions of its members.
There's still a long way to go. Go on!