Several solutions to the minimum cyclic node [original]

For any information, human beings always have an impulse to find its most essential composition. For example, for all numbers, we will study prime numbers because prime numbers can not be decomposed, so any integer can be written in the form of prime factor multiplication. For strings, it seems irregular, but due to grammatical reasons, in fact, many strings use few types of characters, that is, the frequency of 26 letters in the alphabet is different. So humans began to study the minimum cyclic section, that is, whether a string is spliced by a cyclic section string. Let's look at the following example:

Pku2406 Power Strings

Find out how many repeated substrings a string consists of. For example, ababab is connected by three ABS, so the answer is 3. For example, abcd is connected by one abcd, so the answer is 1

Format

Input

Multiple groups of data, with "." representing the end of the test, the length of the string given by each group of data is < = 1e6

Output

Such as title

sample input

abcd

aaaa

ababab

sample output

Solution 1:

For this problem, let the read string exist in the character array s, and let its length be len

So you can enumerate the length of the loop section to be i, that is, the first i characters of the character array form the loop section, and then you can verify it. Since character comparison is involved here, hash is used.

#include<iostream>
#include<cstdio>
#include<string>
#include<cstring>
using namespace std;
typedef unsigned long long LL;
const LL base=131;
const int N=1000010;
int n;
LL power[N],sum[N];
bool check(LL v,int k)  //Judge whether s[1]~s[k] is a circular section
{
    for(register int i=1;i+k-1<=n;i+=k){
        if(v!=sum[i+k-1]-sum[i-1]*power[k]) return 0;
    }
    return 1;
}
int main()
{
    power[0]=1;
    for(register int i=1;i<=N-10;++i) //hash preparation
        power[i]=power[i-1]*base;
    char s[N];
    while(scanf("%s",s+1)){
        if(s[1]=='.')break;
        n=strlen(s+1);
        sum[0]=0;
        for(register int i=1;i<=n;++i) sum[i]=sum[i-1]*base+LL(s[i]);
        for(register int i=1;i<=n;++i){
            if(n%i)continue;
            LL expect=sum[i];
            if(check(expect,i)){
                printf("%d\n",n/i);
                break;
            }   
        }
    }
    return 0;
}

Solution 2:

In the above method, we set the length of the loop section as i, of course, i must be the divisor of len. So the whole string is divided into len/i. Then compare the past one by one. Dare to guess, can we not compare so many times?

Let's draw a diagram. The string s is divided as follows:

In order to distinguish, these paragraphs are marked with different colors.

If the first paragraph is the cyclic section we want, we will copy s once and shift i bit to the right

If the paragraph a2-a5 is equal to the following paragraph a1-a4, we can see that

A2=a1,a3=a2,a4=a3,a5=a4.

So the cycle section is A1

After analyzing this property, we only need a comparison between characters to know whether a prefix of the string is a cyclic section of the whole string.

#include<bits/stdc++.h>
using namespace std;
typedef unsigned long long ull;
char a[2000000];
int len=0;
ull sum[2000000],power[2000000];
ull get(int l,int r)
{
    return sum[r]-sum[l-1]*power[r-l+1];
}
int main()
{
    while(true)
    {
        scanf("%s",a+1);
        len=strlen(a+1);
        if(a[1]=='.'&&len==1)break;
        memset(sum,0,sizeof(sum));
        memset(power,0,sizeof(power));
        for(int i=1;i<=len;i++)
             sum[i]=sum[i-1]*193+ull(a[i])+1;
        power[0]=1;
        for(int i=1;i<=len;i++)
             power[i]=power[i-1]*193;
        for(int i=1;i<=len;i++) //Specifies the length of the loop section
        {
            if(len%i!=0)continue;
            else
            {
                ull a1=get(1,len-i),a2=get(i+1,len);
              //Note: take the prefix with length len-i to see if it is equal to the suffix with length len-i
                if(a1==a2)
                {
                    printf("%d\n",len/i);//Get the number of cyclic sections
                    break;
                }
            }
        }
    }
     return 0;
}

Sol3:

In problem solution 2, the number of comparisons is reduced, and it seems that there is no optimization. We turn our attention to the element of the length of the circular joint. In the previous methods, we only require the length i of the loop section to be the divisor of the total length len, so the number of segments divided is k=len/i, without considering the composition of the read string. Obviously, we can count the number of occurrences of each letter in the string. We might as well set it as sum1... Sum26. When we divide the whole string into k segments according to the loop section, we "divide" these letters into k segments, so K is gcd(len,sum1,sum2.... sum26) at most. If the detection is unsuccessful, it should also be the divisor of gcd(len,sum1,sum2.... sum26), So far, we have more accurately constrained the K range, and the code is omitted.

Added by troy_mccormick on Tue, 30 Nov 2021 00:58:19 +0200

Programming VIP

Several solutions to the minimum cyclic node [original]

Popular Keywords