Summary of Andrew Ng's machine learning code and related knowledge points -- ex2 (1. Logistic regression)

View data

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data=pd.read_csv("code/ex2-logistic regression/ex2data1.txt",names=['Exam 1', 'Exam 2', 'Admitted'])
data.head()

positive=data[data["Admitted"].isin([1])]
negative=data[data["Admitted"].isin([0])]
fig,ax=plt.subplots(figsize=[12,8])
ax.scatter(positive['Exam 1'], positive['Exam 2'], s=50, c='b', marker='o', label='positive')
ax.scatter(negative['Exam 1'], negative['Exam 2'], s=50, c='r', marker='x', label='negative')
ax.legend()
ax.set_xlabel('Exam 1 Score')
ax.set_ylabel('Exam 2 Score')
plt.show()

A note about pandas.isin():
isin() accepts a list and returns, for each element in the column, whether that element is in the list.
For example:

>>> df
          A         B         C         D  E
0 -0.018330  2.093506 -0.086293 -2.150479  a
1  0.104931 -0.271810 -0.054599  0.361612  a
2  0.590216  0.218049  0.157213  0.643540  c
3 -0.254449 -0.593278 -0.150455 -0.244485  b
>>> df.E.isin(['a','c'])
0     True
1     True
2     True
3    False
Name: E, dtype: bool
(The isin() explanation and example above are adapted from: https://blog.csdn.net/lzw2016/article/details/80472649)

After data visualization: [figure: scatter plot of Exam 1 vs. Exam 2 scores, admitted vs. not admitted]

Sigmoid function

def sigmoid(z):
    return 1/(1+np.exp(-z))
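
As a quick sanity check (not part of the original exercise, and assuming the sigmoid function and numpy import defined above), the function should map large negative inputs toward 0, zero to exactly 0.5, and large positive inputs toward 1:

print(sigmoid(np.array([-10.0, 0.0, 10.0])))
# roughly [4.54e-05, 0.5, 0.99995] -- all values lie strictly between 0 and 1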

Cost function

For the linear regression model, the cost we defined is the sum of the squared errors of the model over all training examples.
But for the logistic regression model, substituting the hypothesis h(x) = sigmoid(theta^T x) into that squared-error cost gives a non-convex function with many local minima, which prevents gradient descent from finding the global minimum.
So we redefine the cost function as

    J(theta) = (1/m) * sum_{i=1..m} [ -y(i)*log(h(x(i))) - (1 - y(i))*log(1 - h(x(i))) ]

which is exactly what the code below computes:

data.insert(0,"ones",1)  # add a column of ones (x0 = 1) so that X has one column per parameter in theta
cols=data.shape[1]
X=data.iloc[:,0:cols-1]
Y=data.iloc[:,cols-1:cols]
X=np.array(X.values)
Y=np.array(Y.values)
theta=np.zeros(3)
def cost(theta,X,Y):
    theta=np.matrix(theta)
    z=np.dot(X,theta.T)
    m=len(X)
    cost=1/m*np.sum(np.multiply(-Y,np.log(sigmoid(z)))-np.multiply((1-Y),np.log(1-sigmoid(z))))
    return cost

cost(theta,X,Y)
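
A small check that is easy to verify by hand (not shown in the original post): with theta initialized to all zeros, sigmoid(0) = 0.5 for every example, so the cost reduces to -log(0.5) = log(2), independent of the data:

# expected initial cost with theta = zeros
assert abs(cost(theta, X, Y) - np.log(2)) < 1e-6   # log(2) ≈ 0.6931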

theta = np.matrix(theta) turns theta from a 1-D array of shape (3,) into a 2-D matrix of shape (1, 3), which is why the cost function multiplies X by theta.T.
The former is one-dimensional, the latter is two-dimensional.
For example:
np.sum(a 2-D array, axis=1) gives a one-dimensional result, while
np.sum(a 2-D array, axis=1, keepdims=True) keeps the result two-dimensional.
See also: the difference between a Python list, a NumPy array and a NumPy matrix.
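
A minimal sketch (my own, not from the original post) illustrating these shape differences:

theta = np.zeros(3)
print(theta.shape)                  # (3,)  -> 1-D array
print(np.matrix(theta).shape)       # (1, 3) -> 2-D matrix, hence the theta.T in cost()

A = np.arange(6).reshape(2, 3)      # a small 2-D array
print(np.sum(A, axis=1).shape)                  # (2,)   -> 1-D
print(np.sum(A, axis=1, keepdims=True).shape)   # (2, 1) -> still 2-D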

Gradient descent

def gradient(theta,X,Y):
    theta = np.matrix(theta)
    X = np.matrix(X)
    Y = np.matrix(Y)
    parameters=int(theta.ravel().shape[1])
    grads=np.zeros(parameters)
    z=np.dot(X,theta.T)
    error=sigmoid(z)-Y
    for i in range(parameters):
        term=np.multiply(error,X[:,i])
        grads[i]=np.sum(term)/len(X)
    return grads
gradient(theta,X,Y)                     
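
For reference, the same gradient can be computed without the explicit loop. This vectorized form is my own sketch (not part of the original exercise) and should return the same values as gradient(theta, X, Y):

def gradient_vectorized(theta, X, Y):
    # grad_j = (1/m) * sum_i (sigmoid(x_i . theta) - y_i) * x_ij, for all j at once
    theta = np.asarray(theta).reshape(-1)   # make sure theta is a flat 1-D array
    error = sigmoid(X @ theta) - Y.ravel()  # shape (m,)
    return X.T @ error / len(X)             # shape (n,)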


Note that we are not actually performing gradient descent in this function; we only compute a single gradient step. In the original exercise, the Octave function "fminunc" is used to find the optimal parameters, given functions that compute the cost and the gradient. Since we are using Python, we can do the same with SciPy's "optimize" module.

import scipy.optimize as opt
res = opt.minimize(fun=cost, x0=theta, args=(X, Y), method='Newton-CG', jac=gradient)
print(res)


About scipy.optimize:

scipy.optimize.minimize(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None)
  • fun: the objective function to minimize
  • x0: initial guess for the parameters
  • args: extra arguments passed to the objective function
  • method: the optimization method to use (here 'Newton-CG')
  • jac: a function that returns the gradient of the objective (here our gradient function)
  • bounds: bounds on the variables (only for the L-BFGS-B, TNC, SLSQP and
    trust-constr methods)
  • constraints: constraints definition (only for COBYLA, SLSQP and
    trust-constr)
    ......
    scipy.optimize.minimize

Prediction

def predict(theta, X):
    z=np.dot(X,theta.T)
    probs = sigmoid(z)
    return [1 if x >= 0.5 else 0 for x in probs]
theta_min = np.matrix(res.x)
predictions = predict(theta_min, X)
correct = [1 if ((a == 1 and b == 1) or (a == 0 and b == 0)) else 0 for (a, b) in zip(predictions, Y)]
accuracy = sum(correct) / len(correct) * 100   # fraction of correct predictions, as a percentage
print('accuracy = {0:.0f}%'.format(accuracy))

accuracy = 89%
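
As a cross-check (a NumPy-only sketch, not part of the original post), the same number can be obtained in one line; it should agree with the 89% above:

acc = np.mean(np.array(predictions) == Y.ravel()) * 100   # fraction of matching labels, in percent
print('accuracy = {0:.0f}%'.format(acc))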

Drawing decision boundaries

coef = -(res.x / res.x[2])  # boundary: theta0 + theta1*x1 + theta2*x2 = 0  =>  x2 = coef[0] + coef[1]*x1
print(coef)

x = np.arange(130, step=0.1)
y = coef[0] + coef[1]*x
data.describe()  # find the range of x and y

coef[0] is the intercept of the boundary line, about 125
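
Equivalently (a sketch using the same fitted parameters, with the x-range taken from data.describe() as an assumption), the boundary can be computed directly from res.x: it is the set of points where theta^T x = 0, i.e. where the predicted probability is exactly 0.5:

theta_opt = res.x
x1 = np.linspace(30, 100, 100)                              # roughly the observed range of Exam 1 scores
x2 = -(theta_opt[0] + theta_opt[1] * x1) / theta_opt[2]     # solve theta0 + theta1*x1 + theta2*x2 = 0
# plotting x2 against x1 gives the same grey line as the plot below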

fig, ax = plt.subplots(figsize=(12,8))
ax.scatter(positive['Exam 1'], positive['Exam 2'], s=50, c='b', marker='o', label='Admitted')
ax.scatter(negative['Exam 1'], negative['Exam 2'], s=50, c='r', marker='x', label='Not Admitted')
ax.plot(x,y,'grey')
ax.legend()
ax.set_xlabel('Exam 1 Score')
ax.set_ylabel('Exam 2 Score')
plt.show()
