Design and implementation of mode algorithm

Problem Description:

  • Condition: given a multiple set s Containing n elements, the number of times each element appears in S is called the multiplicity of the element. The element with the largest multiplicity in the multiple set s is called mode.
    For example, S={1, 2, 2, 3, 5}. The mode of multiple set S is 2 and its multiplicity is 3.
  • Note that this multiple set is ordered, and the sorting algorithm is not examined
  • Task: for a given multiple set s composed of n natural numbers, program to calculate the mode and multiplicity of S. It cannot be traversed completely at once.

data input

The input data consists of a file named input TXT text file. Number of elements in multiple sets S on line 1 of the file n; In the next N lines, each line has a natural number.

Result output

At the end of the program, the calculation results are output to the file output Txt. The output file has two lines, the first line gives the number of digits, and the second line is the number of duplicates.

Input and output samples

input.txt

	6
	1
	2
	2
	2
	3
	5

output.txt

	2
	3

Train of thought analysis:

  • First of all, traversal must not work. The time complexity is O (n). Combined with the divide and conquer idea of rigid learning, the large problem is decomposed into small problems.
  • Common examples of divide and conquer algorithms are used to think, merge and sort, divide the overall long into short ones, and then divide the problem scale when merging and growing. The multiplication of large numbers is also the scale of the problem. Similarly, can this problem be considered from the perspective of the scale of the problem?
  • For reference, through

Implementation code

#include <iostream>
#include <fstream>
#include <string>

using namespace std;
const int MAX_SIZE = 1000;


/*
	Function: times is to return the number of times the number corresponding to the current index appears in the array num
	Parameter: num: array for recursion
		 size: The actual size of the element participating in the comparison
		 start,end: Is the starting paragraph of the left and right indexes involved in the comparison
		 index: Is the index that gets the occurrence element
		 up: Upper bound on the number of occurrences of an element
		 down: Is the lower bound of the element
*/
void getTimes(int* num, int size, int& times, int index, int& up, int& down) {
	up = index - 1, down = index + 1;
	while (up >= 0) {
		//Traverse the left side of the target number
		if (num[up] == num[index]) {
			times++;
			up--;
		}

		//Develop launch conditions
		if (num[up] != num[index]) {
			break;
		}
	}

	while (down < size)
	{
		//Traverse the right side of the target number
		if (num[down] == num[index]) {
			times++;
			down++;
		}

		//Develop launch conditions
		if (num[down] != num[index]) {
			break;
		}
	}

	return;
}

/*
	Description: the process of recursion
	Parameter: num: array for recursion
		  size: The actual size of the element participating in the comparison
		  start,end: Is the starting paragraph of the left and right indexes involved in the comparison
*/
int getDivide(int* num, int size, int start, int end, int& value) {
	while (start < end) {

		//Count the current median times, and determine the starting point and end point of the corresponding comparison on the left and right sides
		int middle = (start + end) / 2, up = 0, down = 0, times = 1, compare = 0;//Times is the number of times the number corresponding to the middle appears. up is the lower limit on the left and down is the upper limit on the right
		int left = 0, right = 0;//Left and right here are used to save the maximum number of occurrences on the left and right sides
		
		getTimes(num, size, times, middle, up, down);
		value = num[middle];

		//Determine whether recursion to the left and right sides is required
		if (times <= (up - start) + 1) {
			//Recursion of the left half
			left = getDivide(num, size, start, up, compare);
			if (left > times) {
				times = left;
				value = compare;
			}
		}

		//Determine whether recursion on the right is required
		if (times < (end - down) + 1) {
			//The right half is recursive
			right = getDivide(num, size, down, end, compare);
			//If the value of the right half is greater than times, replace it
			if (right > times) {
				times = right;
				value = compare;
			}
		}

		return times;

	}
}

/*
	Description: an external function called recursively. num is the original array and the actual length of the size array
	Parameter: num is the original array and the size of the size array
	Note: the last element of the array is used to hold the value with the most occurrences
*/
int RecursionOuter(int* num, int size) {
	int value = 0, times = 0;
	if (size == 0)
	{
		return 0;
	}
	else if (size == 1)
	{
		return 1;
	}
	else {
		times = getDivide(num, size - 1, 0, size - 2, value);
		num[size - 1] = value;
		return times;
	}


}


int main() {
	//Read in file
	ifstream infile;
	infile.open("data.txt");
	//File exception
	
	if (!infile.is_open()) {
		cerr << "file cannot open!" << endl;
		return -1;
	}
	
	//Read in each line and use line to save the temporary string
	//index is used to save the corresponding array
	//Num is the corresponding array. The first item is the number of numbers in the file, followed by specific numbers
	string  line;
	int index = 0, nums[MAX_SIZE];
	while (!infile.eof()) {
		getline(infile, line);
		nums[index] = atoi(line.c_str());
		index++;
	}
	infile.close();

	//Data is to save the actual data
	//Size corresponds to the size of the array
	int data[MAX_SIZE] = { 0 },size = nums[0];
	//Cout < < the number actually saved to it is < < endl;
	for (int i = 1; i < index;i ++) {
		data[i-1] = nums[i];
	}
	//cout << "final result" << endl;

	cout << "Multiplicity:"<<RecursionOuter(data, size + 1) << endl;
	cout<<"Value:" << data[size] << endl;
}
Analysis and summary:
  • Looking at my code, I think I'm rubbish. My God, how can I write so long and complex? Modify it yourself, brother! I'm really uncomfortable. I'll show you someone else's. It's concise and more pleasing to the eye than a sister! Rare. I saw it for the first time. It's awesome to write C + + so simple.
    Original link
#include<cstdio>
#include<cstring>
#include<iostream>
#include<algorithm>



using namespace std;
typedef long long ll;
typedef pair<int,int>pi;

const int maxn=1e5+100;
const int N=1e6+100;
const int M=1e5+100;
const int mod=1e9+7;

int n,m,T;
int a[N];
int ans=0; //Multiplicity of mode
int idx=0; //Subscript of mode


void split(int l,int r) //Divide and conquer algorithm
{
    if(l>r)return;

    int ll=l; //Record the original l position
    int rr=r; //Record the original r position

    int mid=(l+r)>>1;

    for(; l<mid&&a[l]!=a[mid]; l++); //Find the leftmost mode

    for(; r>mid&&a[r]!=a[mid]; r--); //Find the most right mode
    
    //After two for loops, the mode number is r-l+1

    if(ans<=r-l+1) //Update answer
    {
        if(ans==r-l+1)
        {
            idx=min(mid,idx);
        }
        else
            idx=mid;
        ans=r-l+1;
    }

    if(l-1-ll+1>=ans) //prune
        split(ll,l-1); // Divide and conquer the left part

    if(rr-r-1+1>=ans) //prune
        split(r+1,rr); // Divide and conquer the right part

}



void solve()
{
    scanf("%d",&n);
    for(int i=0; i<n; i++)
        scanf("%d",a+i);

    sort(a,a+n); //Sort first to facilitate the statistics of modes

    int l=0;
    int r=n-1;

    split(l,r);

    printf("%d %d\n",a[idx],ans);

}


int main()
{
#ifdef LOCAL
    freopen("in.txt","r",stdin);
    freopen("out.txt","w",stdout);
#endif
    solve();
}
Learn from
  • About file input and output streams, this is great. It's much easier to import the contents of the file directly into the console!
 freopen("in.txt","r",stdin);
 freopen("out.txt","w",stdout);
  • For those that may be used in recursion, use global variables to name them, and avoid confusing references and pointers, making yourself a little masked. The author sets the multiplicity and the corresponding value as global variables
  • It's much more convenient to use pair control to return two values. Unlike me, it's too cumbersome to return two values with the last value of the array and the original return value
  • If you want to use the algorithm package, don't write everything yourself. It's so hard

Analysis and summary

  • Most of it is to rewrite this by yourself, because there are many places that are not clear, which is really a problem.

Keywords: Algorithm

Added by heropage on Sat, 19 Feb 2022 15:54:36 +0200