What is a perceptron
I won't reproduce the formula here; there are plenty of write-ups on the Internet:
(1) Dr. Li Hang's book "Statistical Learning Methods" is recommended
(2) Or: https://www.jianshu.com/p/c91087e6e1ea
The second article is short, but it covers the original (primal) form of the perceptron well. If you want to learn more about the dual form of the perceptron, you can look up the material yourself.
Both recommendations use data with two features for classification, so the data points and the separating hyperplane can be drawn in a two-dimensional coordinate system.
My code does the same. It is still quite extensible; you can modify it yourself to handle higher-dimensional data.
But beyond three features, you will no longer be able to visualize all of the dimensions at once.
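For context, in the two-feature case used below, the model from the references is the standard perceptron: a point is classified by the sign of a linear function, and the decision boundary is the line where that function equals zero.

```latex
f(x) = \operatorname{sign}(w_1 x_1 + w_2 x_2 + b)
\qquad \text{decision boundary: } w_1 x_1 + w_2 x_2 + b = 0
```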
-------------------------
It is strongly recommended to read the material carefully and then work through the code yourself!!! Don't just swallow it whole. I thought I understood the perceptron before, but I felt pretty stupid once I actually sat down to write the code.
Think about how to update the parameters, how to store the data, and how to use numpy matrices.
- In the initialization function, the parameters are stored in a 3x1 matrix; the last entry is the bias b
- When creating the numpy array, a floating-point dtype must be declared explicitly. Otherwise, when the learning rate is not 1, the parameter updates are silently rounded to integers and the iteration never finishes.
- The collected data should also be declared as floating-point, otherwise numpy stores it as integer int32 (a short demonstration of this pitfall follows the snippet below)
```python
def __init__(self):
    self.learn_rate = 0.5  # Feel free to change this
    self.w_b = np.array([[0], [0], [0]], dtype='float32')  # The three parameters (w1, w2, b) in one matrix
    self.t_data = None
    # Because the three parameters are updated together (the book's formula adds
    # y times (x1, x2, 1) to the parameter matrix), we also keep a copy of the
    # data whose last column is all 1
    self.t_data_c = None
```
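To see why the dtype matters, here is a minimal standalone sketch (not part of the class) of what happens when the array is left as the default integer type:

```python
import numpy as np

# Without an explicit dtype, numpy infers an integer array here,
# and any update smaller than 1 is silently truncated to 0.
w_int = np.array([[0], [0], [0]])
w_int[0][0] += 0.5
print(w_int[0][0])    # 0  -> the parameters never move when learn_rate = 0.5

w_float = np.array([[0], [0], [0]], dtype='float32')
w_float[0][0] += 0.5
print(w_float[0][0])  # 0.5 -> updates work as intended
```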
- Data collection: the last value of each sample is 1 or -1, serving as the class label, and each sample is stored as one row of the matrix
- Once you understand how the parameters are updated, you will see that w1 and w2 are each incremented (+=) by y times the corresponding feature, scaled by the learning rate
- b is incremented by y directly (again scaled by the learning rate)
- To make the update convenient, we keep two train_data matrices. In the second one every y is replaced by 1; when a misclassified point is found, that row is used to update the parameters (the update rule is recapped right after this list)
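For reference, the standard primal-form update the bullets describe, written with the augmented row (x1, x2, 1) that the code stores in t_data_c:

```latex
\text{misclassified: } y\,(w_1 x_1 + w_2 x_2 + b) \le 0
\qquad
(w_1,\, w_2,\, b) \leftarrow (w_1,\, w_2,\, b) + \eta\, y\,(x_1,\, x_2,\, 1)
```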
```python
def collect_data(self):
    collect_1 = []
    collect_2 = []
    while True:
        try:
            # Exception handling is used to end the input; I couldn't think of a better way
            data = map(float, input("Input data x1 x2 y (space separated) / enter any letter to finish: ").split(' '))
            data = list(data)
            # Copy first: otherwise both lists would hold the same list object,
            # and changing data[2] below would also change the row stored in collect_1
            collect_1.append(copy.copy(data))
            data[2] = 1
            collect_2.append(data)
        except ValueError:
            print("Data collection finished!")
            break
    self.t_data = np.array(collect_1, dtype='float32')
    self.t_data_c = np.array(collect_2, dtype='float32')
```
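As a concrete (made-up) example, if the user typed "3 3 1", "4 3 1", "1 1 -1" and then any letter to stop, collect_data would end up with the following two matrices:

```python
import numpy as np

t_data = np.array([[3, 3, 1],
                   [4, 3, 1],
                   [1, 1, -1]], dtype='float32')   # original labels, used for the misclassification test
t_data_c = np.array([[3, 3, 1],
                     [4, 3, 1],
                     [1, 1, 1]], dtype='float32')  # last column forced to 1, used for the parameter update
```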
- Parameter update iteration
- The end goal is for the number of misclassified points to be 0, so every time a misclassification is detected, mistake += 1; once a full pass over the data leaves it at 0, we jump out of the while loop
```python
def gradient_descent(self):
    print("Start iteration...")
    number = 0
    while True:
        number += 1
        # Count the misclassified points on each pass; when there are none, the iteration ends
        mistake = 0
        for line, line_c in zip(self.t_data, self.t_data_c):
            # line is used to test for misclassification, line_c to update the parameter matrix
            if line[2] * np.dot(line_c, self.w_b)[0] <= 0:
                line_c = line_c * line[2] * self.learn_rate  # the update described in the notes above
                print(line_c)
                mistake += 1
                self.w_b[0][0] += line_c[0]
                self.w_b[1][0] += line_c[1]
                self.w_b[2][0] += line_c[2]
        if mistake == 0:
            break
        print('Iteration {}\nParameters w_b: {}'.format(number, self.w_b))
        print('-----------------')
    print("Iteration complete!")
    print('Learning rate: {}, iterations: {}'.format(self.learn_rate, number))
```
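A quick way to try the training loop without the interactive input (just a sketch: it assumes the Perceptron class below and fills t_data / t_data_c directly with the made-up points from above instead of calling collect_data):

```python
import numpy as np

p = Perceptron()
# Skip collect_data and plug the example points in directly
p.t_data = np.array([[3, 3, 1], [4, 3, 1], [1, 1, -1]], dtype='float32')
p.t_data_c = np.array([[3, 3, 1], [4, 3, 1], [1, 1, 1]], dtype='float32')
p.gradient_descent()
print(p.w_b)  # the learned (w1, w2, b), one value per row
```

The three element-wise updates on w_b could also be collapsed into a single vectorized line, self.w_b += line_c.reshape(3, 1), but the explicit version makes the per-parameter update easier to follow.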
- Data visualization
- Note that when writing the equation of the straight line, the coefficient of y (which is really x2, i.e. w2) may be 0, and since we cannot divide by 0, that case has to be handled separately
```python
if not self.w_b[1][0]:
    # If the coefficient of x2 is 0, solving for x2 would divide by zero,
    # so draw the vertical line x1 = -b / w1 instead
    x = -1 * self.w_b[2][0] / self.w_b[0][0]
    plt.axvline(x, color='g')
else:
    x = np.linspace(-10, 10, 10)
    y = -1 * self.w_b[0][0] / self.w_b[1][0] * x + -1 * self.w_b[2][0] / self.w_b[1][0]
    plt.plot(x, y, color='g')
```
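The slope and intercept used above come from solving the hyperplane equation for x2; the vertical-line branch is the w2 = 0 case:

```latex
w_1 x_1 + w_2 x_2 + b = 0
\;\Longrightarrow\;
x_2 = -\frac{w_1}{w_2}\, x_1 - \frac{b}{w_2}
\qquad
(\text{if } w_2 = 0:\; x_1 = -\tfrac{b}{w_1})
```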
That's about it.
-------------------------
The complete code is as follows
```python
import numpy as np
import copy
import matplotlib.pyplot as plt


# The numpy arrays must be created with a floating-point dtype. Otherwise, when the
# learning rate is not 1, the parameter updates get truncated to integers and the
# iteration never finishes. The same applies to the collected data, which would
# otherwise be stored as int32.
class Perceptron:
    def __init__(self):
        self.learn_rate = 0.5  # Feel free to change this
        self.w_b = np.array([[0], [0], [0]], dtype='float32')  # The three parameters (w1, w2, b) in one matrix
        self.t_data = None
        # Because the three parameters are updated together (the book's formula adds
        # y times (x1, x2, 1) to the parameter matrix), we also keep a copy of the
        # data whose last column is all 1
        self.t_data_c = None

    def collect_data(self):
        collect_1 = []
        collect_2 = []
        while True:
            try:
                # Exception handling is used to end the input; I couldn't think of a better way
                data = map(float, input("Input data x1 x2 y (space separated) / enter any letter to finish: ").split(' '))
                data = list(data)
                # Copy first: otherwise both lists would hold the same list object,
                # and changing data[2] below would also change the row stored in collect_1
                collect_1.append(copy.copy(data))
                data[2] = 1
                collect_2.append(data)
            except ValueError:
                print("Data collection finished!")
                break
        self.t_data = np.array(collect_1, dtype='float32')
        self.t_data_c = np.array(collect_2, dtype='float32')

    def gradient_descent(self):
        print("Start iteration...")
        number = 0
        while True:
            number += 1
            # Count the misclassified points on each pass; when there are none, the iteration ends
            mistake = 0
            for line, line_c in zip(self.t_data, self.t_data_c):
                # line is used to test for misclassification, line_c to update the parameter matrix
                if line[2] * np.dot(line_c, self.w_b)[0] <= 0:
                    line_c = line_c * line[2] * self.learn_rate  # the update described in the notes above
                    print(line_c)
                    mistake += 1
                    self.w_b[0][0] += line_c[0]
                    self.w_b[1][0] += line_c[1]
                    self.w_b[2][0] += line_c[2]
            if mistake == 0:
                break
            print('Iteration {}\nParameters w_b: {}'.format(number, self.w_b))
            print('-----------------')
        print("Iteration complete!")
        print('Learning rate: {}, iterations: {}'.format(self.learn_rate, number))

    def visualize(self):
        # The y below is not the label y; it is really x2, the second feature
        plt.figure(figsize=(8, 4))
        x = 0
        y = 0
        if not self.w_b[1][0]:
            # If the coefficient of x2 is 0, solving for x2 would divide by zero,
            # so draw the vertical line x1 = -b / w1 instead
            x = -1 * self.w_b[2][0] / self.w_b[0][0]
            plt.axvline(x, color='g')
        else:
            x = np.linspace(-10, 10, 10)
            y = -1 * self.w_b[0][0] / self.w_b[1][0] * x + -1 * self.w_b[2][0] / self.w_b[1][0]
            plt.plot(x, y, color='g')
        for i in self.t_data:
            if i[2] == 1:
                plt.scatter(i[0], i[1], c='r', s=5)
            else:
                plt.scatter(i[0], i[1], c='b', s=5)
        plt.xlim(-10, 10)
        plt.ylim(-10, 10)
        plt.xlabel('x(1)')
        plt.ylabel('x(2)')
        plt.show()


p = Perceptron()
p.collect_data()
p.gradient_descent()
p.visualize()
```