Good courses deserve a wider audience: see the AI video list from shangxuetang. Clicking any entry there gives a Baidu Netdisk download link for the whole course series, including videos, code, and materials, all free and of high quality. Of course, plenty of material is shared these days; MOOCs, blogs, forums, and the like make all kinds of knowledge easy to find. How far we go is up to us. I hope I can keep at it!

Gradient descent method

For background, see the Jianshu article "Gradient Descent Explained in Simple Terms, and Its Implementation".

Batch gradient descent

· Initialize w with random values.

· Iterate in the direction of the negative gradient, so that each updated w makes the loss function J(w) smaller.

· If w has only up to a few hundred dimensions, the solution can also be computed directly (for example via SVD); gradient descent is generally used when w has more than a few hundred dimensions.
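The direct route mentioned in the last bullet can be sketched as follows. This is a minimal example, not the author's code: it assumes the same kind of synthetic linear data used in this post, a fixed random seed for reproducibility, and NumPy's `np.linalg.lstsq` and `np.linalg.pinv`, both of which rely on an SVD internally. The names `theta_svd` and `theta_pinv` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: an assumption, only for reproducibility

# Synthetic data: y = 4 + 3x + Gaussian noise
X = 2 * rng.random((100, 1))
y = 4 + 3 * X + rng.standard_normal((100, 1))
X_b = np.c_[np.ones((100, 1)), X]  # prepend a column of ones for the intercept

# Least-squares solution of min ||X_b @ theta - y||^2 (SVD-based solver)
theta_svd, residuals, rank, singular_values = np.linalg.lstsq(X_b, y, rcond=None)

# Same solution via the Moore-Penrose pseudoinverse, also computed from the SVD
theta_pinv = np.linalg.pinv(X_b) @ y

print(theta_svd.ravel())   # should land near the true values 4 and 3
print(theta_pinv.ravel())
```

With only two parameters the direct solve is instant; the cost of the decomposition grows quickly with the number of features, which is why the iterative method is preferred for large w.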

```python
# Batch gradient descent
import numpy as np

# Synthetic data: y = 4 + 3x + Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]  # prepend a column of ones for the intercept

learning_rate = 0.1  # learning rate; step size = learning rate x gradient
# Number of iterations. Usually no convergence threshold is set;
# only the hyperparameters and the iteration count are fixed.
n_iterations = 1000
m = 100  # number of samples

theta = np.random.randn(2, 1)  # initialize the parameters theta (w0, ..., wn)
count = 0  # iteration counter

for iteration in range(n_iterations):
    count += 1
    # compute the gradient over the full batch
    gradients = 1/m * X_b.T.dot(X_b.dot(theta) - y)
    # update theta
    theta = theta - learning_rate * gradients
    # print(count, theta)

print(count, theta)
```
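For reference, the gradient line in the code above is consistent with the following loss and update rule. The post does not state the loss explicitly, so the 1/(2m) scaling is an inference from the 1/m factor in the code; here θ stands for `theta`, η for `learning_rate`, and X_b for the data matrix with its column of ones.

```latex
J(\theta) = \frac{1}{2m}\,\lVert X_b\theta - y \rVert^{2}, \qquad
\nabla_\theta J(\theta) = \frac{1}{m}\, X_b^{\top}\,(X_b\theta - y), \qquad
\theta \leftarrow \theta - \eta\, \nabla_\theta J(\theta)
```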

Stochastic gradient descent

· Stochastic gradient descent is usually the preferred choice.

· Stochastic gradient descent can sometimes jump out of a local minimum.

```python
import numpy as np

X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

n_epochs = 500
t0, t1 = 5, 50  # learning-schedule hyperparameters
m = 100


def learning_schedule(t):
    # learning rate that decays as the update count t grows
    return t0 / (t + t1)


# Random initialization of the parameters
theta = np.random.randn(2, 1)

for epoch in range(n_epochs):
    for i in range(m):
        # pick one sample at random and compute its gradient
        random_index = np.random.randint(m)
        xi = X_b[random_index:random_index+1]
        yi = y[random_index:random_index+1]
        gradients = 2 * xi.T.dot(xi.dot(theta) - yi)
        learning_rate = learning_schedule(epoch * m + i)
        theta = theta - learning_rate * gradients

print(theta)
```
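To see how the step size behaves in the code above, `learning_schedule` can be evaluated at a few update counts. This is a small sketch that reuses the constants from the example; the variable names `first_step`, `after_one_epoch`, and `last_step` are illustrative and not part of the original.

```python
t0, t1 = 5, 50          # schedule constants from the SGD example
m, n_epochs = 100, 500  # samples per epoch and number of epochs


def learning_schedule(t):
    """Step size after t parameter updates; decays roughly like 1/t."""
    return t0 / (t + t1)


first_step = learning_schedule(0)                          # 5 / 50
after_one_epoch = learning_schedule(m)                     # 5 / 150
last_step = learning_schedule((n_epochs - 1) * m + m - 1)  # 5 / 50049

print(first_step, after_one_epoch, last_step)
```

The early steps are large, so the noisy updates move quickly across the loss surface; the later steps are tiny, which lets theta settle instead of bouncing around the minimum.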