# Gradient descent, overfitting, and normalization

Good courses deserve to be shared: the AI video list from Shangxuetang. Click any entry and you will find a Baidu Netdisk download link for the whole course series, including video, code, and materials, all free and high quality. Of course, there is plenty of sharing these days: MOOCs, blogs, forums, and so on make it easy to find all kinds of knowledge. How far we get depends on ourselves. I hope I can keep it up!

See also this Jianshu article: an accessible introduction to gradient descent and its implementation.

· initialize w, i.e. give w a random initial value

· iterate in the direction of the negative gradient; each update to w makes the loss function J(w) smaller

· if w has only a few hundred dimensions, the closed-form solution (e.g. via SVD) can still be computed directly; gradient descent is generally used when w has more than a few hundred dimensions

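The update rule described by the bullets above can be sketched on a toy one-dimensional loss. The loss J(w) = (w - 3)^2 and its minimizer are made up purely for illustration; its gradient is 2(w - 3):

```
import numpy as np

# Minimal sketch of the update rule: w <- w - learning_rate * gradient.
# J(w) = (w - 3)^2 is a made-up loss whose gradient is 2 * (w - 3).
w = np.random.randn()      # step 1: random initial value for w
learning_rate = 0.1

for _ in range(100):
    gradient = 2 * (w - 3)             # dJ/dw at the current w
    w = w - learning_rate * gradient   # step 2: move against the gradient

print(w)   # converges toward the minimizer w = 3
```

Each step shrinks the distance to the minimum by a constant factor here, which is why a fixed number of iterations is enough in this sketch.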

```
# Batch gradient descent
import numpy as np

# Make up some data
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]   # prepend the bias column x0 = 1

learning_rate = 0.1   # learning rate; step size = learning rate x gradient
n_iterations = 1000   # generally no convergence threshold is set, only the hyperparameters and the number of iterations
m = 100               # number of samples

theta = np.random.randn(2, 1)   # initialize parameters theta = (w0, ..., wn)
count = 0

for iteration in range(n_iterations):
    count += 1
    # gradient of the MSE loss over the whole batch
    gradients = 2 / m * X_b.T.dot(X_b.dot(theta) - y)
    # iteratively update theta
    theta = theta - learning_rate * gradients
    # print(count, theta)

print(count, theta)
```
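Since this problem is small, the batch-gradient-descent result can be sanity-checked against the closed-form least-squares solution mentioned earlier (NumPy's `lstsq` computes it via SVD internally). The fixed seed here is just an assumption for reproducibility:

```
import numpy as np

np.random.seed(0)   # assumed seed, only so the run is reproducible
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

# Closed-form least-squares fit (solved via SVD inside lstsq)
theta_best, *_ = np.linalg.lstsq(X_b, y, rcond=None)
print(theta_best)   # close to [[4.], [3.]] up to noise
```

Gradient descent should converge to (approximately) the same theta; the closed form is simply cheaper while the dimensionality stays in the hundreds.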

· stochastic gradient descent can sometimes jump out of a local minimum

```
import numpy as np

X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]

n_epochs = 500
t0, t1 = 5, 50   # learning-schedule hyperparameters
m = 100

# gradually decrease the learning rate as training progresses
def learning_schedule(t):
    return t0 / (t + t1)

# randomly initialize the parameter values
theta = np.random.randn(2, 1)

for epoch in range(n_epochs):
    for i in range(m):
        # pick one sample at random for each update
        random_index = np.random.randint(m)
        xi = X_b[random_index:random_index+1]
        yi = y[random_index:random_index+1]