Artificial intelligence ② -- logical regression and iris set

Catalogue

 

I Manual push to achieve logistic regression gradient descent

II Classification of iris data set by logistic regression

     1. Introduction to iris data set

     2. What is a linear classifier

     3. Main steps of designing linear classifier

     4. Rough definition of logistic regression

     5. In logistic regression model sklearn Use of

     6. Implementation of linear multi classification

     7. Related code

Zhengwen

I Manual push to achieve logistic regression gradient descent (picture below)

II Classification of iris data set by logistic regression

       1. Introduction to iris dataset

Iris data set is a classic data set for entry. Iris data set is a commonly used classification experimental data set, which was collected and sorted by Fisher,1936. Iris, also known as iris flower data set, is a kind of data set for multivariate analysis. Iris iris data set is} a classic data set. There are 150 records in 3 categories, 50 data in each category, and each record has four signs.            2. Introduction to linear classifier

Linear classifier uses a "hyperplane" to separate positive and negative samples, such as:

(1) the positive and negative samples on the two-dimensional plane are classified by a straight line;
   (2) positive and negative samples in three-dimensional space are classified by a plane;
   (3) positive and negative samples in N-dimensional space are classified by a hyperplane;
   common linear classifiers include LR, Bayesian classification, single-layer perceptron, linear regression, SVM (linear kernel), etc

III Main steps of designing linear classifier

         ①. Collect a set of samples with category marks X={x1,x2,..., xN}
         ②. A criterion function J is determined as needed, its value reflects the performance of the classifier, and its extreme value solution corresponds to the "best" decision
         ③. The extremum solutions w * and w0 * of criterion function J are obtained by optimization technology, so as to determine the discriminant function and complete the classifier design
         ④. Obtain the linear discriminant function g(x)=wT+w0 or g(x)=a*Ty. For the unknown sample x, calculate g(x) and judge its category

IV Rough definition of logistic regression
Logistic Regression is used to deal with the regression problem in which the dependent variable is classified variable. The common problems are dichotomous and binomial distribution problems. It can also deal with multi classification problems (actually a probability model)

V Use of Sklearn in logistic regression model

        1). Import model

from sklearn.linear_model import LogisticRegression  #Import logistic regression model 

       2). fit() training} note: call the method of fit(x,y) to train the model, where x is the attribute of the data and Y is the type

clf = LogisticRegression()
print(clf)
clf.fit(train_feature,label)

      3). Prediction (prediction with trained model)

predict['label'] = clf.predict(predict_feature)

Vi Implementation of linear multi classification

      1). Linear classification using Jupyter Notebook (Anaconda)

      2). Multi classification linear coding

① import the required function library

#Import the required function library
import numpy as np
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
import matplotlib as mpl
from sklearn import datasets
from sklearn import preprocessing
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

② obtain iris data set

# Get the required dataset
iris=datasets.load_iris()
#There are four columns of data in each row, and each column is mapped to a feature_ Corresponding value in names
X=iris.data
print(X)
#The classification result value corresponding to each row of data (that is, the label value of each row of data) is [0,1,2]
Y=iris.target
print(Y)

③ data processing

#Normalization processing
X = StandardScaler().fit_transform(X)
print(X)

④ training model (using fit() function for training)

lr = LogisticRegression()   # Logistic regression model
lr.fit(X, Y)        # Calculate the regression parameters according to the data [x,y]

⑤ drawing graph (scatter diagram)

N, M = 500, 500     # How many values are sampled horizontally and vertically
x1_min, x1_max = X[:, 0].min(), X[:, 0].max()   # Range of column 0
x2_min, x2_max = X[:, 1].min(), X[:, 1].max()   # Range of column 1
t1 = np.linspace(x1_min, x1_max, N)
t2 = np.linspace(x2_min, x2_max, M)
x1, x2 = np.meshgrid(t1, t2)                    # Generate mesh sample points
x_test = np.stack((x1.flat, x2.flat), axis=1)   # Test point

cm_light = mpl.colors.ListedColormap(['#77E0A0', '#FF8080', '#A0A0FF'])
cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])
y_hat = lr.predict(x_test)       # Estimate
y_hat = y_hat.reshape(x1.shape)                 # Make it the same as the entered shape
plt.pcolormesh(x1, x2, y_hat, cmap=cm_light)     # Display of predicted values
plt.scatter(X[:, 0], X[:, 1], c=Y.ravel(), edgecolors='k', s=50, cmap=cm_dark)    
plt.xlabel('petal length')
plt.ylabel('petal width')
plt.xlim(x1_min, x1_max)
plt.ylim(x2_min, x2_max)
plt.grid()
plt.show()

⑥ prediction model

y_hat = lr.predict(X)
Y = Y.reshape(-1)
result = y_hat == Y
print(y_hat)
print(result)
acc = np.mean(result)
print('Accuracy: %.2f%%' % (100 * acc))

⑦ complete code and result display

Code line

#Import related packages
import numpy as np
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
import matplotlib as mpl
from sklearn import datasets
from sklearn import preprocessing
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# Get the required dataset
iris = datasets.load_iris()
#There are four columns of data in each row, and each column is mapped to a feature_ Corresponding value in names
X = iris.data
print(X)
#The classification result value corresponding to each row of data (that is, the label value of each row of data) is [0,1,2]
Y = iris.target
print(Y)
#Normalization processing
X = StandardScaler().fit_transform(X)
print(X)
lr = LogisticRegression()   # Logistic regression model
lr.fit(X, Y)        # Calculate the regression parameters according to the data [x,y]
N, M = 500, 500     # How many values are sampled horizontally and vertically
x1_min, x1_max = X[:, 0].min(), X[:, 0].max()   # Range of column 0
x2_min, x2_max = X[:, 1].min(), X[:, 1].max()   # Range of column 1
t1 = np.linspace(x1_min, x1_max, N)
t2 = np.linspace(x2_min, x2_max, M)
x1, x2 = np.meshgrid(t1, t2)                    # Generate mesh sample points
x_test = np.stack((x1.flat, x2.flat), axis=1)   # Test point
cm_light = mpl.colors.ListedColormap(['#77E0A0', '#FF8080', '#A0A0FF'])
cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])
y_hat = lr.predict(x_test)       # Estimate
y_hat = y_hat.reshape(x1.shape)                 # Make it the same as the entered shape
plt.pcolormesh(x1, x2, y_hat, cmap=cm_light)     # Display of predicted values
plt.scatter(X[:, 0], X[:, 1], c=Y.ravel(), edgecolors='k', s=50, cmap=cm_dark)
plt.xlabel('petal length')
plt.ylabel('petal width')
plt.xlim(x1_min, x1_max)
plt.ylim(x2_min, x2_max)
plt.grid()
plt.show()
y_hat = lr.predict(X)
Y = Y.reshape(-1)
result = y_hat == Y
print(y_hat)
print(result)
acc = np.mean(result)
print('Accuracy: %.2f%%' % (100 * acc))

Result display

 

Machine learning -- linear multi classification of iris data set_ Harriet's blog - CSDN blog

            

     

Keywords: Algorithm Machine Learning AI logistic regressive

Added by goltoof on Sun, 16 Jan 2022 16:57:05 +0200