# Machine learning: the perceptron algorithm (no ML libraries, pure Python code)

### What is a perceptron?

I won't reproduce the formulas here; there are plenty of write-ups online.

(1) Recommended: Dr. Li Hang's *Statistical Learning Methods*
(2) Or this article: https://www.jianshu.com/p/c91087e6e1ea

The second article is short, but it covers the original (primal) form of the perceptron well. If you want to learn about the dual form of the perceptron, you can look it up yourself.

Both recommendations classify data with two features, so the data points and the separating hyperplane can be drawn in a two-dimensional coordinate system.
My code does the same. It is still quite extensible: you can modify it yourself to handle higher-dimensional data, but beyond three features you can no longer visualize all the dimensions at once.

-------------------------

I strongly recommend reading carefully and then writing the code yourself!!! Don't just skim it: I thought I understood the algorithm, and felt pretty foolish when I actually tried to write the code.
Think about how the parameters are updated, how the data is stored, and how to use NumPy arrays.

1. Initialization: the parameter array holds three values, the last of which is the bias b
• When creating the NumPy array, the dtype must be declared as floating-point. Otherwise, when the learning rate is not 1, the parameter updates get truncated to integers and the iteration can never finish.
• The collected data should also be declared floating-point, otherwise it silently becomes an integer type such as int32.
```
    def __init__(self):
        self.learn_rate = 0.5  # Feel free to change this
        self.w_b = np.array([0., 0., 0.], dtype='float32')  # The three parameters (w1, w2, b) in one array
        self.t_data = None
        # The book's update rule adds lr*y*(x1, x2, 1) to (w1, w2, b) all at once,
        # so we also keep a copy of the data whose y column is replaced by 1
        self.t_data_c = None
```
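As a quick check of the dtype point above, here is a standalone sketch (not part of the class) of what goes wrong with an integer-typed parameter array: recent NumPy versions refuse an in-place float update outright, while very old versions silently truncated it, which is the rounding problem described above.

```python
import numpy as np

w_flt = np.array([0, 0, 0], dtype='float32')
w_flt += 0.5 * np.array([3.0, 3.0, 1.0])      # fine: stays float32
print(w_flt)                                  # [1.5 1.5 0.5]

w_int = np.array([0, 0, 0])                   # integer dtype by default
try:
    w_int += 0.5 * np.array([3.0, 3.0, 1.0])  # in-place float update into an int array
except TypeError as e:
    print("integer array rejected the update:", e)
```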
2. Data collection: the last element of each sample is 1 or -1, the class label, and each sample is stored as one row of a matrix
• Once you understand the update rule, you will see that w1 and w2 are each incremented by the learning rate times y times the corresponding x
• while b is incremented by the learning rate times y directly
• To make this update convenient, we keep two training-data matrices; in the second one every y is replaced by 1, and it is used for the update whenever a point is misclassified
```
    def collect_data(self):
        collect_1 = []
        collect_2 = []
        while True:
            try:  # Use exception handling to end the input loop; I couldn't think of a better way
                data = map(float, input("Input data x1 x2 y (space-separated)"
                                        " / enter any letter to finish: ").split(' '))
                data = list(data)
                collect_1.append(copy.copy(data))  # Without the copy, the stored row and data would point at the same object, so the next line would change it too
                data[-1] = 1
                collect_2.append(data)
            except ValueError:
                print("Data collection finished!")
                break
        self.t_data = np.array(collect_1, dtype='float32')
        self.t_data_c = np.array(collect_2, dtype='float32')
```
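The `copy.copy` call matters: appending the list directly stores a reference, so overwriting `data[-1]` afterwards would also change the row already saved in the first list. A minimal sketch of the aliasing (with a made-up sample point):

```python
import copy

data = [3.0, 3.0, -1.0]
rows = []
rows.append(data)        # no copy: rows[0] is the same list object as data
data[-1] = 1
print(rows[0])           # [3.0, 3.0, 1] -- the stored row changed too

data = [3.0, 3.0, -1.0]
rows_safe = []
rows_safe.append(copy.copy(data))  # shallow copy: an independent list
data[-1] = 1
print(rows_safe[0])      # [3.0, 3.0, -1.0] -- unchanged
```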
3. Parameter update iteration
• The end goal is zero misclassified points, so `mistake += 1` is executed for every misclassification detected; once a full pass finds none, we break out of the while loop
```
    def gradient_descent(self):
        print("Start iteration...")
        number = 0
        while True:
            number += 1
            # Count the misclassified points on each pass; when there are none, the iteration is done
            mistake = 0
            # line is used to test for misclassification; line_c is used to update the parameter array
            for line, line_c in zip(self.t_data, self.t_data_c):
                if line[-1] * np.dot(line_c, self.w_b) <= 0:  # y * (w1*x1 + w2*x2 + b) <= 0
                    line_c = line_c * line[-1] * self.learn_rate
                    mistake += 1
                    self.w_b += line_c  # updates w1, w2 and b together
            if mistake == 0:
                break
            print('Iteration {}\nparameters w_b: {}'.format(number, self.w_b))
            print('-----------------')
        print("Iteration complete!")
        print('Learning rate: {}, iterations: {}'.format(self.learn_rate, number))
```
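To make the update concrete, here is one hand-traced step outside the class, using a made-up point (3, 3) with label +1 and the same learning rate of 0.5, with the parameters starting at zero:

```python
import numpy as np

learn_rate = 0.5
w_b = np.array([0., 0., 0.], dtype='float32')      # (w1, w2, b)

line = np.array([3., 3., 1.], dtype='float32')     # (x1, x2, y)
line_c = np.array([3., 3., 1.], dtype='float32')   # y replaced by 1

# y * (w1*x1 + w2*x2 + b) is 0 here, so the point counts as misclassified
print(line[-1] * np.dot(line_c, w_b))              # 0.0

# one update: (w1, w2, b) += lr * y * (x1, x2, 1)
w_b += line_c * line[-1] * learn_rate
print(w_b)                                         # [1.5 1.5 0.5]
```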
4. Data visualization
• Note that when writing the line equation, the coefficient of y (which is actually x2, i.e. w2) may be 0, and the divisor cannot be 0, so that case has to be handled separately
```
        if not self.w_b[1]:  # When testing I found w2 can be 0, which would mean dividing by 0 when drawing, so draw a vertical line instead
            x = -1 * self.w_b[2] / self.w_b[0]
            plt.axvline(x, color='g')
        else:
            x = np.linspace(-10, 10, 10)
            y = -1 * self.w_b[0] / self.w_b[1] * x + -1 * self.w_b[2] / self.w_b[1]
            plt.plot(x, y, color='g')
```

That's about it.

-------------------------

The complete code is as follows

```
import numpy as np
import copy
import matplotlib.pyplot as plt

# When creating a NumPy array, declare a floating-point dtype. Otherwise, when the
# learning rate is not 1, the parameter updates get truncated and the iteration never finishes.
# The collected data should be floating-point too, or it silently becomes an integer type.

class Perceptron:
    def __init__(self):
        self.learn_rate = 0.5  # Feel free to change this
        self.w_b = np.array([0., 0., 0.], dtype='float32')  # The three parameters (w1, w2, b) in one array
        self.t_data = None
        # The book's update rule adds lr*y*(x1, x2, 1) to (w1, w2, b) all at once,
        # so we also keep a copy of the data whose y column is replaced by 1
        self.t_data_c = None

    def collect_data(self):
        collect_1 = []
        collect_2 = []
        while True:
            try:  # Use exception handling to end the input loop; I couldn't think of a better way
                data = map(float, input("Input data x1 x2 y (space-separated)"
                                        " / enter any letter to finish: ").split(' '))
                data = list(data)
                collect_1.append(copy.copy(data))  # Without the copy, both lists would point at the same object
                data[-1] = 1
                collect_2.append(data)
            except ValueError:
                print("Data collection finished!")
                break
        self.t_data = np.array(collect_1, dtype='float32')
        self.t_data_c = np.array(collect_2, dtype='float32')

    def gradient_descent(self):
        print("Start iteration...")
        number = 0
        while True:
            number += 1
            # Count the misclassified points on each pass; when there are none, the iteration is done
            mistake = 0
            # line is used to test for misclassification; line_c is used to update the parameter array
            for line, line_c in zip(self.t_data, self.t_data_c):
                if line[-1] * np.dot(line_c, self.w_b) <= 0:  # y * (w1*x1 + w2*x2 + b) <= 0
                    line_c = line_c * line[-1] * self.learn_rate  # The update rule from note 2 above
                    mistake += 1
                    self.w_b += line_c
            if mistake == 0:
                break
            print('Iteration {}\nparameters w_b: {}'.format(number, self.w_b))
            print('-----------------')
        print("Iteration complete!")
        print('Learning rate: {}, iterations: {}'.format(self.learn_rate, number))

    def visualize(self):  # The y below is not the label y; it is actually x2
        plt.figure(figsize=(8, 4))
        if not self.w_b[1]:  # w2 == 0 would mean dividing by 0, so draw a vertical line instead
            x = -1 * self.w_b[2] / self.w_b[0]
            plt.axvline(x, color='g')
        else:
            x = np.linspace(-10, 10, 10)
            y = -1 * self.w_b[0] / self.w_b[1] * x + -1 * self.w_b[2] / self.w_b[1]
            plt.plot(x, y, color='g')
        for i in self.t_data:
            if i[-1] == 1:
                plt.scatter(i[0], i[1], c='r', s=5)
            else:
                plt.scatter(i[0], i[1], c='b', s=5)
        plt.xlim(-10, 10)
        plt.ylim(-10, 10)
        plt.xlabel('x(1)')
        plt.ylabel('x(2)')
        plt.show()

p = Perceptron()
p.collect_data()
p.gradient_descent()
p.visualize()
```
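If you want to try the training loop without typing points interactively, here is a sketch of the same loop with the data hardcoded. The three points are the classic example from Li Hang's book, two positive points (3, 3) and (4, 3) and one negative point (1, 1); with learning rate 0.5 this run converges to w1 = w2 = 0.5, b = -1.5.

```python
import numpy as np

learn_rate = 0.5
w_b = np.array([0., 0., 0.], dtype='float32')

# rows are (x1, x2, y)
t_data = np.array([[3., 3., 1.],
                   [4., 3., 1.],
                   [1., 1., -1.]], dtype='float32')
t_data_c = t_data.copy()
t_data_c[:, -1] = 1.0          # y column replaced by 1 for the update trick

while True:
    mistake = 0
    for line, line_c in zip(t_data, t_data_c):
        if line[-1] * np.dot(line_c, w_b) <= 0:
            mistake += 1
            w_b += line_c * line[-1] * learn_rate
    if mistake == 0:
        break

print(w_b)   # [ 0.5  0.5 -1.5]
```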