Deep Learning (2): Regression Problems
- I. Problem Statement and Analysis
- 1. Machine Learning
- 2. Continuous Prediction
- 3. Linear Equation
- 4. With Noise?
- 5. Find $w'$, $b'$
- 6. Gradient Descent
- II. Regression in Practice
- 1. Steps
- 2. Step1: Compute Loss
- 3. Step2: Compute Gradient and update
- 4. Step3: Set $w = w'$ and loop
- 5. Code
I. Problem Statement and Analysis
1. Machine Learning
- make decisions
- going left/right → discrete
- increase/decrease → continuous
2. Continuous Prediction
- $f_\theta: x \to y$
- $x$: input data
- $f(x)$: prediction
- $y$: real data (ground-truth)
3. Linear Equation
- y=w*x+b
- 1.567=w*1+b
- 3.043=w*2+b
→ Closed-Form Solution
- w=1.477
- b=0.089
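A quick sketch of what "closed-form solution" means here, solving the two exact equations as a 2×2 linear system with NumPy (an illustration, not part of the original post's code):

```python
import numpy as np

# Solve  1.567 = w*1 + b  and  3.043 = w*2 + b  as A @ [w, b] = y.
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])      # each row is [x_i, 1]
y = np.array([1.567, 3.043])

w, b = np.linalg.solve(A, y)
print(w, b)                     # close to the w and b quoted above
```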
4. With Noise?
- y = w*x + b + ε
- ε ~ N(0, 1)
- 1.567 = w*1 + b + ε
- 3.043 = w*2 + b + ε
- 4.519 = w*3 + b + ε
- …
→ Y = (WX + b)
For Example
- w?
- b?
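To make the noisy setting concrete, here is a minimal data-generation sketch, assuming the true parameters $w = 1.477$, $b = 0.089$ quoted above and Gaussian noise:

```python
import numpy as np

# Generate noisy observations y = w*x + b + eps with eps ~ N(0, 1)
# (assumed true parameters; produces a [100, 2] dataset of the same shape as data.csv).
rng = np.random.default_rng(0)
w_true, b_true = 1.477, 0.089

x = rng.uniform(0, 100, size=100)      # 100 input values
eps = rng.normal(0.0, 1.0, size=100)   # Gaussian noise
y = w_true * x + b_true + eps          # noisy targets

points = np.stack([x, y], axis=1)      # shape [100, 2]
```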
5. Find $w'$, $b'$
- $(WX + b - Y)^2$
- $loss = \sum_i (w \cdot x_i + b - y_i)^2$
- Minimize $loss$
- $w' \cdot x + b' \to y$
6. Gradient Descent
(1) 1-D
$w' = w - lr \cdot \frac{dy}{dw}$
$x' = x - 0.005 \cdot \frac{dy}{dx}$
As can be seen, the derivative of a function always points in the direction in which the function value increases. Therefore, to find the minimum of the $loss$ function we have to move in the opposite direction of the derivative, i.e. take steps of $-lr \cdot \frac{dy}{dw}$; the factor $lr$ (the learning rate) is introduced to keep each step from growing too large.
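A minimal 1-D sketch of this update rule, assuming a toy function $y = (x - 2)^2$ (chosen only for illustration, not the function from the original figure):

```python
# 1-D gradient descent on the toy function y = (x - 2)^2 (assumed example).
def dy_dx(x):
    return 2.0 * (x - 2.0)        # derivative of (x - 2)^2

x = 10.0                          # arbitrary starting point
lr = 0.005                        # learning rate, as in the text
for _ in range(2000):
    x = x - lr * dy_dx(x)         # x' = x - lr * dy/dx

print(x)                          # converges toward the minimum at x = 2
```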
(2) 2-D
Find $w'$, $b'$
- $loss = \sum_i (w \cdot x_i + b - y_i)^2$
- Take the partial derivatives of the loss with respect to $w$ and $b$ separately, then step against each partial derivative (see the vectorized sketch after this list), i.e.:
- $w' = w - lr \cdot \frac{\partial loss}{\partial w}$
- $b' = b - lr \cdot \frac{\partial loss}{\partial b}$
- $w' \cdot x + b' \to y$
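A compact vectorized version of one such update (a sketch using NumPy; the full code at the end of this post does the same thing with an explicit loop):

```python
import numpy as np

# One gradient-descent step for loss = mean((w*x + b - y)^2):
#   grad_w = 2 * mean((w*x + b - y) * x),  grad_b = 2 * mean(w*x + b - y)
def step(w, b, x, y, lr):
    err = w * x + b - y
    grad_w = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    return w - lr * grad_w, b - lr * grad_b
```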
(Figures: the learning process and the loss surface.)
II. Regression in Practice
1. Steps
(1) Compute the loss function from the (randomly) initialized $w$, $b$ and the data $x$, $y$;
(2) Compute the gradient from the current values of $w$, $b$, $x$, $y$;
(3) Apply the update, assign $w'$ back to $w$, and repeat this loop;
(4) The final $w'$ and $b'$ are taken as the model's parameters.
2. Step1: Compute Loss
There are 100 points in total, each with two dimensions, so the dataset has shape $[100, 2]$ and is laid out as $[(x_0, y_0), (x_1, y_1), \dots, (x_{99}, y_{99})]$. The loss function is therefore:
$loss = (w_0 x_0 + b_0 - y_0)^2 + (w_0 x_1 + b_0 - y_1)^2 + \dots + (w_0 x_{99} + b_0 - y_{99})^2$
that is:
$loss = \sum_i (w \cdot x_i + b - y_i)^2$
The initial values are set to $w_0 = b_0 = 0$.
(1) The initial values of b and w are both 0; points holds the 100 data points passed in, read from data.csv;
(2) len(points) is the number of points passed in, i.e. 100, so range(0, len(points)) loops i from 0 to 99;
(3) x = points[i, 0] takes element 0 of the i-th point, i.e. its first value, equivalent to p[i][0]; likewise, y = points[i, 1] takes element 1 of the i-th point, i.e. its second value, equivalent to p[i][1];
(4) totalError is the accumulated loss; dividing it by len(points) gives the average loss per point (the function described here is shown right below and again in the full listing in the Code section).
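For reference, this is the loss computation that points (1)–(4) describe, excerpted from the full listing below:

```python
def compute_error_for_line_given_points(b, w, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]                      # first element of the i-th point
        y = points[i, 1]                      # second element of the i-th point
        totalError += (y - (w * x + b)) ** 2  # accumulate squared error
    return totalError / float(len(points))   # average loss per point
```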
3. Step2: Compute Gradient and update
$loss_0 = (w x_0 + b - y_0)^2$
$\frac{\partial loss_0}{\partial w} = 2 (w x_0 + b - y_0) x_0$
$\frac{\partial loss}{\partial w} = 2 \sum_i (w x_i + b - y_i) x_i$
$\frac{\partial loss}{\partial b} = 2 \sum_i (w x_i + b - y_i)$
$w' = w - lr \cdot \frac{\partial loss}{\partial w}$
$b' = b - lr \cdot \frac{\partial loss}{\partial b}$
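In code, one gradient-and-update step looks like this (excerpted from the full listing below; the gradients are averaged over the $N$ points, which only rescales the effective learning rate):

```python
def step_gradient(b_current, w_current, points, learningRate):
    b_gradient = 0
    w_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # grad_b = 2/N * (wx + b - y)
        b_gradient += (2 / N) * ((w_current * x + b_current) - y)
        # grad_w = 2/N * (wx + b - y) * x
        w_gradient += (2 / N) * x * ((w_current * x + b_current) - y)
    # w' = w - lr * grad_w,  b' = b - lr * grad_b
    new_b = b_current - (learningRate * b_gradient)
    new_w = w_current - (learningRate * w_gradient)
    return [new_b, new_w]
```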
4. Step3: Set $w = w'$ and loop
$w \leftarrow w'$
$b \leftarrow b'$
Once the final values of $w$ and $b$ have been computed, they can be plugged into the model to make predictions:
$w' x + b' \to predict$
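The outer loop that repeats this assignment (excerpted from the full listing below):

```python
def gradient_descent_runner(points, starting_b, starting_w, learning_rate, num_iterations):
    b = starting_b
    w = starting_w
    for i in range(num_iterations):
        # w <- w', b <- b' on every iteration
        b, w = step_gradient(b, w, np.array(points), learning_rate)
    return [b, w]
```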
5. Code
```python
import numpy as np


# y = wx + b
def compute_error_for_line_given_points(b, w, points):
    totalError = 0
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # compute mean-squared-error
        totalError += (y - (w * x + b)) ** 2
    # average loss for each point
    return totalError / float(len(points))


def step_gradient(b_current, w_current, points, learningRate):
    b_gradient = 0
    w_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        # grad_b = 2(wx+b-y)
        b_gradient += (2 / N) * ((w_current * x + b_current) - y)
        # grad_w = 2(wx+b-y)*x
        w_gradient += (2 / N) * x * ((w_current * x + b_current) - y)
    # update w'
    new_b = b_current - (learningRate * b_gradient)
    new_w = w_current - (learningRate * w_gradient)
    return [new_b, new_w]


def gradient_descent_runner(points, starting_b, starting_w, learning_rate, num_iterations):
    b = starting_b
    w = starting_w
    # update for several times
    for i in range(num_iterations):
        b, w = step_gradient(b, w, np.array(points), learning_rate)
    return [b, w]


def run():
    points = np.genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_w = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, w = {1}, error = {2}"
          .format(initial_b, initial_w,
                  compute_error_for_line_given_points(initial_b, initial_w, points)))
    print("Running...")
    [b, w] = gradient_descent_runner(points, initial_b, initial_w, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, w = {2}, error = {3}"
          .format(num_iterations, b, w,
                  compute_error_for_line_given_points(b, w, points)))


if __name__ == '__main__':
    run()
```

The output is as follows:
As the output shows, at $w = 0$, $b = 0$ the loss is $error \approx 5565.11$;
after 1000 iterations, $w \approx 1.48$, $b \approx 0.09$, and the loss is $error \approx 112.61$, far smaller than the initial loss.