\(\mathscr{Description}\)
Link.
given the string \ (S \), find the starting subscript of the minimum representation of each prefix of \ (S \) (if there are multiple, take the smallest).
\(|S|\le3\times10^6\).
\(\mathscr{Solution}\)
notice the obvious fact that for a prefix \ (S[:i] \) and two starting subscripts \ (p,q \), if there is \ (s [P: I] < s [Q: I] \), there is \ (s [P: J] < s [Q: J] \) in all \ (J > I \). In other words, the final answer \ (r_i \) must meet \ (r_i\in B_i=\arg\min_{p\le i}\{S[p:i] \} \) and \ (B_{i+1}\subseteq B_i\cup\{i+1 \} \), so we can get \ (B_{i+1} \) by removing the bad starting position in \ (B_i\cup\{i+1 \} \).
Unfortunately, \ (|B_m|=\mathcal O(m) \) needs to be further streamlined. This prompts us to reflect on whether "taking the minimum of \ (S[p:i] \) is a necessary and sufficient condition for \ (p \) to become \ (r_i \)". The answer is No. Consider the suffixes \ (S[p:i] \) and \ (S[q:i] ~ (p < q) \). If \ (p,q\in B_i \), there is \ (S[p:p+i-q+1]=S[q:i] \), that is, \ (S[q:i] \) is a border of \ (S[p:i] \). From the relevant properties of border, it is natural to think of the case of studying \ (|S[q:i] | > \ frac{|S[p:i] |}{2} \), when \ (S[p:i] \) has a period \ (T=|S[p:i]|-|S[q:i] | \). Take \ (S[r:i]=S[i-T+1:i] \), which can explain that \ (S[p:i] \) is not greater than \ (S[q:i] \) and \ (S[r:i] \) at the same time. As shown in the figure:
if we want to make \ (S[p:i] < s [Q: I] \), we need to add a character \ (s {I + 1} = x \) (red dot) to the suffix to make \ (x > y \) (blue dot). However, once there is \ (y < x \), we can make the last cycle \ (S[r:i] \) take the red dot all the way to the beginning of \ (S[p:i] \) and get \ (S[r:i + 1] < S[p:i + 1] \), so the conclusion is tenable. \(\square\)
therefore, it is maintained that there is no new set \ (B_i'\subseteq B_i \) satisfying the above \ (p,q \) relationship. Obviously, \ (| B_m'|=\mathcal O(\log m) \), so the total maintenance process is \ (\ mathcal O(n\log n) \).
in addition, when finding the answer \ (r_i \), it is still necessary to enumerate each subscript in \ (B_i '\). Using the property of prefix equality, it is found that we only need to complete the quick comparison between \ (S \) substring and \ (S \) prefix, so we can achieve \ (\ mathcal O(1) \) with Z-function. To sum up, the final complexity is \ (\ mathcal O(n\log n) \).
at this time, God wants to ask, ha, why don't you write \ (\ mathcal O(n) \)?
looking at this problem again, we naturally associate it with Lyndon Word, which describes "the minimum representation is equal to itself", and Duval algorithm provides a \ (\ mathcal O(n) \) idea to investigate Lyndon Word, so we can try to transform the problem to Lyndon. Let the Lyndon of string \ (s=S[:i] \) be decomposed into
Where \ (w_1 > w_2 > \ cdots > w_m \). Let's first re describe the key conclusion in the practice of \ (\ mathcal O(n\log n) \): obviously, the suffix of the minimum representation is at the beginning of a Lyndon Word. If \ (w_i^kw_{i+1}^{k_{i+1}}\cdots w_m^{k_m} \) is taken out, we can assert: \ (k=k_i \).
now put on Duval and set \ (s=s'u^ku '\), where \ (u^k \) is Lyndon Word and \ (U' \) is the prefix of \ (U \) that has not been extended. The previous \ (s' \) is not important. Let's study it by remembering \ (t=u^ku '\), and consider adding the character \ (C = s {| s | + 1} \):
- \(t_{|u '| + 1} < C \), the answer \ (p_{|s|+1} \) is obviously \ (t_1 \) in the subscript corresponding to the original string;
- \(t_{|u '| + 1} > C \), considering Duval's process, we must first classify a prefix cycle as Lyndon, and then update \ (p_{|s|+1} \);
- \(t_{|u'|+1}=c \), continue to coincide with the circular section, and two possible optimal solutions: \ (p=i \) and \ (p \) take the optimal starting position of \ (U \) at the corresponding position in \ (U' \). In the second case, although \ (U \) is Lyndon Word, the prefix of \ (S \) may be less than the suffix of \ (U \). Use the above Z-function method to support \ (\ mathcal O(1) \) comparison.
finally, almost like Duval, we solved the problem in \ (\ mathcal O(n) \) time.
\(\mathscr{Code}\)
- \(\mathcal O(n\log n)\):
/*+Rainybunny+*/ #include <bits/stdc++.h> #define rep(i, l, r) for (int i = l, rep##i = r; i <= rep##i; ++i) #define per(i, r, l) for (int i = r, per##i = l; i >= per##i; --i) inline void wint(const int x) { if (9 < x) wint(x / 10); putchar(x % 10 ^ '0'); } const int MAXN = 3e6, MAXG = 100; int n, z[MAXN + 5]; char s[MAXN + 5]; int gcnt, good[MAXG + 5]; // indices that may be the answer. inline void calcZ() { z[1] = n; for (int i = 2, l = 0, r = 0; i <= n; ++i) { if (i <= r) z[i] = std::min(z[i - l + 1], r - i + 1); while (i + z[i] <= n && s[i + z[i]] == s[1 + z[i]]) ++z[i]; if (i + z[i] - 1 > r) r = i + z[l = i] - 1; } } inline void adapt(const int k) { auto compare = [&](const int u, const int v)->int { return u < v ? (s[u + k - v] == s[k] ? 0 : s[u + k - v] < s[k] ? -1 : 1) : (s[k] == s[v + k - u] ? 0 : s[k] < s[v + k - u] ? -1 : 1); }; static int tmp[MAXG + 5]; rep (i, 1, gcnt) tmp[i] = good[i]; int ocnt = gcnt, pst = good[gcnt]; good[gcnt = 1] = pst; per (i, ocnt - 1, 1) { if (int t = compare(pst, tmp[i]); t > 0) { good[gcnt = 1] = pst = tmp[i]; } else if (!t) { pst = tmp[i]; while (gcnt && k - good[gcnt] + 1 << 1 > k - pst + 1) --gcnt; good[++gcnt] = pst; } } std::reverse(good + 1, good + gcnt + 1); } inline int best(const int k) { // compare S[l:r] with S[:r-l+1]. auto compare = [](const int l, const int r)->int { if (z[l] >= r - l + 1) return 0; return s[l + z[l]] < s[z[l] + 1] ? -1 : 1; }; int ans = good[1]; rep (i, 2, gcnt) { int cur = good[i]; if (int f1 = compare(ans + k - cur + 1, k); f1 > 0) ans = cur; else if (!f1) { int f2 = compare(cur - ans + 1, cur - 1); // cur <> ans. if (f2 < 0) ans = cur; } } return ans; } int main() { scanf("%s", s + 1), n = strlen(s + 1), calcZ(); rep (i, 1, n) { good[++gcnt] = i, adapt(i); wint(best(i)), putchar(i < n ? ' ' : '\n'); } return 0; }
- \(\ mathcal O(n) \) (currently the optimal solution of the valley):
/*+Rainybunny+*/ #include <bits/stdc++.h> #define rep(i, l, r) for (int i = l, rep##i = r; i <= rep##i; ++i) #define per(i, r, l) for (int i = r, per##i = l; i >= per##i; --i) inline void wint(const int x) { if (9 < x) wint(x / 10); putchar(x % 10 ^ '0'); } const int MAXN = 3e6; int n, z[MAXN + 5], ans[MAXN + 5]; char s[MAXN + 5]; inline void calcZ() { z[1] = n; for (int i = 2, l = 0, r = 0; i <= n; ++i) { if (i <= r) z[i] = std::min(z[i - l + 1], r - i + 1); while (i + z[i] <= n && s[i + z[i]] == s[1 + z[i]]) ++z[i]; if (i + z[i] - 1 > r) r = i + z[l = i] - 1; } } inline void duval() { auto compare = [](const int u, const int v, const int k)->int { int p = u + k - v + 1, q = k - p + 2; if (z[p] < k - p + 1) { return s[p + z[p]] < s[1 + z[p]] ? -1 : 1; } else { return z[q] < u ? s[q + z[q]] < s[1 + z[q]] ? 1 : -1 : 0; } }; for (int i = 1; i <= n;) { int j = i + 1, k = i; if (!ans[i]) ans[i] = i; while (j <= n && s[k] <= s[j]) { if (s[k] < s[j]) { if (!ans[j]) ans[j] = i; k = i; } else { if (!ans[j]) { if (ans[k] < i) ans[j] = i; else { ans[j] = compare(i, j - k + ans[k], j) <= 0 ? i : j - k + ans[k]; } } ++k; } ++j; } i += (j - i) / (j - k) * (j - k); } } int main() { n = fread(s + 1, 1, MAXN + 3, stdin); while (s[n] < 'a' || 'z' < s[n]) --n; calcZ(), duval(); rep (i, 1, n) wint(ans[i]), putchar(i < n ? ' ' : '\n'); return 0; }