Frequentist school and Bayesian school
Speaking of probability and statistics, we have to mention the frequentist school and the Bayesian school, two schools of thought that evolved from different understandings of probability.
Frequentist school
- Core idea: the parameter to be estimated is a fixed value. Although it is unknown, it does not change as the sample changes; it is the sample data that is randomly generated. Therefore, as the amount of sample data tends to infinity, the computed frequency converges to the probability. The main focus is on studying the sample space and analyzing the distribution of the samples.
- Extended application: maximum likelihood estimation (MLE), sketched below.
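As a minimal illustration of the frequentist view (this example is not from the original text): model coin flips as Bernoulli trials with a fixed, unknown parameter p. The MLE of p is simply the observed frequency k/n, which converges to the true value as the sample grows.

```python
import random

random.seed(0)
TRUE_P = 0.3  # the fixed, unknown parameter of the frequentist view

# For n Bernoulli trials with k successes, the MLE of p is the
# observed frequency k / n, which approaches TRUE_P as n grows.
for n in [10, 100, 10_000, 1_000_000]:
    k = sum(random.random() < TRUE_P for _ in range(n))
    print(f'n = {n:>9}: MLE p_hat = {k / n:.4f}')
```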
Bayesian school
- Core idea: the parameters to be estimated are random variables, while the samples are fixed; the focus is mainly on the distribution of the parameters. In the Bayesian school, the parameters are random variables that change with the sample information, so the Bayesian school puts forward a fixed mode of thinking: prior distribution + sample information ⇒ posterior distribution.
- Extended application: maximum a posteriori estimation (MAP), sketched below.
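A minimal sketch of the Bayesian counterpart (my example, assuming a Beta prior, which is conjugate to the Bernoulli likelihood): after observing k successes in n trials, the posterior is Beta(α + k, β + n − k), and the MAP estimate is its mode.

```python
def map_estimate(k, n, alpha=2, beta=2):
    """MAP estimate of a Bernoulli parameter under an assumed
    Beta(alpha, beta) prior: the mode of the
    Beta(alpha + k, beta + n - k) posterior."""
    return (k + alpha - 1) / (n + alpha + beta - 2)

# With little data the prior pulls the estimate toward 0.5;
# with more data it approaches the MLE k / n.
print(map_estimate(k=2, n=3))      # 0.6, while the MLE would be 0.667
print(map_estimate(k=200, n=300))  # ~0.666, close to the MLE
```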
Bayesian formula
Assume that the prior probability of A is P(A), the prior probability of B is P(B), the posterior probability of A given B is P(A|B), and the posterior probability of B given A is P(B|A). Then, since both sides equal the joint probability $P(A \cap B)$,

$$P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

By rearranging:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
where A represents a prediction result and B represents a set of observed data; P(A) is the prior probability, that is, the probability of A before B is observed; P(A|B) is the posterior probability, that is, the probability of A after B is observed; P(B|A) is the likelihood function; and P(B) is the model evidence.
The formula can be understood as: posterior probability = prior probability × adjustment factor. In the formula above, the posterior probability is $P(A \mid B)$, the prior probability is $P(A)$, and the adjustment factor is $\frac{P(B \mid A)}{P(B)}$.
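A small worked example of the formula (the numbers are illustrative, not from the original article): suppose a condition has a 1% prior, a test detects it 99% of the time, and healthy cases still test positive 5% of the time.

```python
# Bayes' formula on illustrative numbers: P(A|B) = P(B|A) * P(A) / P(B)
p_a = 0.01              # prior P(A): 1% of the population has the condition
p_b_given_a = 0.99      # likelihood P(B|A): test detects a true case
p_b_given_not_a = 0.05  # false-positive rate P(B|not A)

# model evidence P(B) by the law of total probability
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # ~0.167: a positive test lifts the prior from 1% to ~17%
```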
Naive Bayes classifier
From Bayes' theorem, a simple but effective classification algorithm can be derived: the naive Bayes classifier. Its basic idea is: given the input features, calculate the probability of each possible category and take the category with the largest probability as the prediction. From this idea, it is clear that naive Bayes is well suited to discrete data. Its mathematical description is given below.
Let $x = (x_1, x_2, \ldots, x_n)$ be the input, where $x_j$ is the $j$-th feature attribute of $x$, and let the category take values in $\{c_1, c_2, \ldots, c_k\}$. Assuming that the feature attributes are conditionally independent of each other given the category, which category should be predicted for $x$?

According to the idea of the Bayesian classifier, it should be the category with the greatest posterior probability, that is,

$$y = \arg\max_{c_i} P(c_i \mid x)$$

According to Bayes' theorem and the independence assumption, the probability of each category is:

$$P(c_i \mid x) = \frac{P(c_i)\prod_{j=1}^{n} P(x_j \mid c_i)}{P(x)}$$

It can be seen that the denominator $P(x)$ is independent of $c_i$, so the probabilities of different categories can be compared using the numerators alone.
Implementation of a naive Bayes classifier in Python
```python
import pandas as pd


def load_data(path, sep=',', encoding='utf-8'):
    '''Read data with a header row and return a DataFrame.'''
    filetype = path.split('.')[-1]
    if filetype in ['csv', 'txt']:
        return pd.read_csv(path, sep=sep, encoding=encoding)
    elif filetype == 'xlsx':
        return pd.read_excel(path)
    raise ValueError(f'unsupported file type: {filetype}')


def cal_prob(data, col, res):
    '''Calculate the occurrence frequency of value res in column col.'''
    count_all = len(data[col])
    count_res = len(data[data[col] == res])
    return count_res / count_all


def cal_prio_prob(data, label):
    '''Calculate the prior probability of each category.
    Returns {res1: prob1, res2: prob2, ...}'''
    prio_prob = {}
    for res in data[label].unique():
        prio_prob[res] = cal_prob(data, label, res)
    return prio_prob


def cal_likelihood_prob(data, label, sample):
    '''Calculate the likelihood of the sample under each category by
    multiplying per-feature frequencies (conditional independence).
    Returns {res1: prob1, res2: prob2, ...}'''
    likelihood_prob = {}
    for res in data[label].unique():
        data_p = data[data[label] == res]
        prob = 1
        for col in data:
            if col != label:
                prob *= cal_prob(data_p, col, sample[col])
        likelihood_prob[res] = prob
    return likelihood_prob


def bayes_classifier(path, label, sample):
    '''Compare the categories and output the most likely one.'''
    data = load_data(path)
    prio_prob = cal_prio_prob(data, label)
    likelihood_prob = cal_likelihood_prob(data, label, sample)
    best_class, best_prob = None, 0
    for c in prio_prob:
        # numerator of Bayes' theorem; the evidence P(x) is omitted
        # because it is the same for every category
        prob = prio_prob[c] * likelihood_prob[c]
        print(f'score for {c}: {prob}')
        if prob > best_prob:
            best_class, best_prob = c, prob
    print(f'given {sample}, the predicted {label} is {best_class}')


if __name__ == '__main__':
    path = 'weather.csv'
    label = 'PlayTennis'
    sample = {'Outlook': 'Sunny', 'Temperature': 'Cool',
              'Humidity': 'High', 'Wind': 'Strong'}
    bayes_classifier(path, label, sample)
```
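The script assumes a weather.csv in the classic Play Tennis format, one column per feature plus the PlayTennis label; the rows below are illustrative of that layout, not data from the original article:

```text
Outlook,Temperature,Humidity,Wind,PlayTennis
Sunny,Hot,High,Weak,No
Overcast,Hot,High,Weak,Yes
Rain,Mild,High,Weak,Yes
Sunny,Cool,Normal,Weak,Yes
```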