Gradient Descent and Linear Regression

For a cost function of the form:
loss = \sum_i (\hat{y} - y_i)^2
loss = \sum_i (w \cdot x_i + b - y_i)^2
The most commonly used cost function is the mean squared error (MSE):
loss = \frac{1}{2N} \sum_i^N (w \cdot x_i + b - y_i)^2
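As a quick illustration of this formula, the MSE can be evaluated directly with numpy. The sketch below is not part of the original article; mse_loss and the toy arrays x, y are placeholder names:

```python
import numpy as np

def mse_loss(w, b, x, y):
    """MSE loss with the 1/(2N) factor, as defined above (illustrative helper)."""
    residuals = w * x + b - y                    # per-point prediction error
    return np.sum(residuals ** 2) / (2 * len(x))

# toy data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2 * x + 1
print(mse_loss(2.0, 1.0, x, y))  # perfect fit   -> 0.0
print(mse_loss(1.5, 0.0, x, y))  # imperfect fit -> 1.6875
```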
1. Gradient Descent
Before studying how gradient descent works, we first need a few mathematical concepts: the derivative, the partial derivative, the directional derivative, and the gradient. For the gradient, we simply state the key conclusion:
The basic idea of the gradient descent algorithm:
Since, at any point in the variable space, a function has its largest rate of change along the direction of its gradient, we can reduce the value of the cost function during optimization by moving along the negative gradient direction.
The gradient descent update rules:
w' = w - lr \cdot \frac{\partial loss}{\partial w}
b' = b - lr \cdot \frac{\partial loss}{\partial b}
where lr is the learning rate.
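For the MSE loss above, the two partial derivatives follow directly from the chain rule; writing them out makes the connection to the code below explicit:

\frac{\partial loss}{\partial w} = \frac{1}{N} \sum_i^N (w \cdot x_i + b - y_i) \cdot x_i
\frac{\partial loss}{\partial b} = \frac{1}{N} \sum_i^N (w \cdot x_i + b - y_i)

Note that the implementation below drops the 1/2 factor from the loss (compute_error returns the plain average of squared errors), so its gradients carry a factor of 2/N instead of 1/N; the difference is only a constant scale that is absorbed into the learning rate.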
"""" 用一元線性回歸解決回歸問(wèn)題: y = wx + b """ import numpy as np import matplotlib.pyplot as plt # 畫圖正常顯示中文 plt.rcParams['font.sans-serif'] = ['SimHei'] # 用來(lái)正常顯示中文標(biāo)簽 plt.rcParams['axes.unicode_minus'] = False # 用來(lái)正常顯示負(fù)號(hào)def compute_error(b, w, points):totalError = 0for i in range(0, len(points)):x = points[i, 0]y = points[i, 1]# computer mean-squared-errortotalError += (y - (w * x + b)) ** 2# average loss for each pointreturn totalError / float(len(points))def step_gradient(b_current, w_current, points, learningRate):b_gradient = 0w_gradient = 0N = float(len(points))for i in range(0, len(points)):x = points[i, 0]y = points[i, 1]# 計(jì)算梯度 grad_b = 2(wx+b-y) grad_w = 2(wx+b-y)*xb_gradient += (2 / N) * ((w_current * x + b_current) - y)w_gradient += (2 / N) * x * ((w_current * x + b_current) - y)# update w'new_b = b_current - (learningRate * b_gradient)new_w = w_current - (learningRate * w_gradient)return [new_b, new_w]def gradient_descent_runner(points, starting_b, starting_w, learning_rate, num_iterations):b = starting_bw = starting_w# update for several timesfor i in range(num_iterations):b, w = step_gradient(b, w, np.array(points), learning_rate)return [b, w]def plot_scatter(data):x_data = data[:, 0]y_data = data[:, 1]plt.scatter(x_data, y_data)plt.title("訓(xùn)練數(shù)據(jù)集散點(diǎn)分布")plt.xlabel("自變量:x")plt.ylabel("因變量:y")plt.savefig("scatter.png")# plt.show()def plot_result(data, w, b):x_data = data[:, 0]y_data = data[:, 1]plt.scatter(x_data, y_data, c='b')plt.plot(x_data, w * x_data + b, 'r')plt.title("訓(xùn)練擬合結(jié)果")plt.xlabel("自變量:x")plt.ylabel("因變量:y")plt.savefig("result.png")# plt.show()def run():# numpy讀取CSV文件points = np.genfromtxt("data.csv", delimiter=",")# 繪制數(shù)據(jù)散點(diǎn)圖plot_scatter(points)# 設(shè)置學(xué)習(xí)率learning_rate = 0.0001# 權(quán)值初始化initial_b = 0initial_w = 0# 迭代次數(shù)num_iterations = 1000print("Starting b = {0}, w = {1}, error = {2}".format(initial_b, initial_w,compute_error(initial_b, initial_w, points)))print("Running...")[b, w] = gradient_descent_runner(points, initial_b, initial_w, learning_rate, num_iterations)print("After {0} iterations b = {1}, w = {2}, error = {3}".format(num_iterations, b, w, compute_error(b, w, points)))plot_result(points, w, b)if __name__ == '__main__':run()運(yùn)行結(jié)果:
Starting b = 0, w = 0, error = 5565.107834483211
Running...
After 1000 iterations b = 0.08893651993741346, w = 1.4777440851894448, error = 112.61481011613473
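As a side note, the per-point loops in step_gradient can also be written with vectorized numpy operations. The sketch below is not the original author's code; it assumes the same two-column points array and the loss without the 1/2 factor, so a single step should match the loop-based version:

```python
import numpy as np

def step_gradient_vectorized(b, w, points, lr):
    """One gradient-descent step using whole-array operations (illustrative)."""
    x, y = points[:, 0], points[:, 1]
    residuals = w * x + b - y                # prediction errors for all points
    b_grad = 2.0 * residuals.mean()          # (2/N) * sum(w*x + b - y)
    w_grad = 2.0 * (residuals * x).mean()    # (2/N) * sum(x * (w*x + b - y))
    return b - lr * b_grad, w - lr * w_grad
```

Dropped into gradient_descent_runner in place of step_gradient, it should produce the same w and b while avoiding the Python-level loop.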
The script also saves two figures: the scatter plot of the training data (scatter.png) and the fitted line over the data (result.png).
Summary

Starting from w = 0 and b = 0 with a learning rate of 0.0001, gradient descent reduces the MSE from about 5565 to about 112.6 over 1000 iterations, ending with a fitted line of roughly y = 1.48x + 0.089.