4.7 Program Example: Algorithm Diagnosis (Machine Learning Notes, Prof. Andrew Ng, Stanford)

Published: 2025/4/5 | 豆豆

Program Example: Algorithm Diagnosis

We have a dataset relating the water flowing out of a dam to the change in its water level. We first split it into a training set, a cross-validation set, and a test set:

```python
# coding: utf-8
# algorithm_analysis/diagnose.py
"""Algorithm diagnosis
"""
import linear_regression
import numpy as np
from scipy.io import loadmat
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures

data = loadmat('data/water.mat')

#####
# Dataset split
#####
# Training set
X = np.mat(data['X'])
# Add a bias column to X
X = np.concatenate((np.ones((X.shape[0], 1)), X), axis=1)
y = np.mat(data['y'])

# Cross-validation set
Xval = np.mat(data['Xval'])
Xval = np.concatenate((np.ones((Xval.shape[0], 1)), Xval), axis=1)
yval = np.mat(data['yval'])

# Test set
Xtest = np.mat(data['Xtest'])
Xtest = np.concatenate((np.ones((Xtest.shape[0], 1)), Xtest), axis=1)
ytest = np.mat(data['ytest'])
```
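The bias step above is repeated for each of the three sets. As a minimal sketch, it could be factored into a helper. The name `add_bias` and the sample values are hypothetical and not part of the original script, and the helper works on plain `ndarray`s rather than `np.mat`:

```python
import numpy as np


def add_bias(features):
    """Return a copy of `features` with a leading column of ones (hypothetical helper)."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        # Treat a 1-D input as a single feature column.
        features = features.reshape(-1, 1)
    ones = np.ones((features.shape[0], 1))
    return np.hstack((ones, features))


# Made-up stand-in for data['X']: three water-level readings.
X_demo = add_bias([-15.9, -29.2, 36.2])
print(X_demo.shape)  # (3, 2)
```

With such a helper, each set would be prepared with a single call, for example `add_bias(data['Xval'])`.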

Next, we use the training set to fit a linear regression line, and we observe how the learning curve changes with the number of training samples $m$:

```python
# algorithm_analysis/diagnose.py
def diagnoseLR():
    """Linear regression diagnosis"""
    initTheta = np.mat(np.ones((X.shape[1], 1)))
    result, timeConsumed = linear_regression.gradient(
        X, y, rate=0.001, maxLoop=5000, epsilon=0.1, initTheta=initTheta)
    theta, errors = result

    # Plot the fitted result
    Xmin = X[:, 1].min()
    Xmax = X[:, 1].max()
    ymax = y[:, 0].max()
    ymin = y[:, 0].min()
    fitX = np.mat(np.linspace(Xmin, Xmax, 20).reshape(-1, 1))
    fitX = np.concatenate((np.ones((fitX.shape[0], 1)), fitX), axis=1)
    h = fitX * theta
    plt.xlim(Xmin, Xmax)
    plt.ylim(ymin, ymax)
    # Plot the training samples
    plt.scatter(X[:, 1].flatten().A[0], y[:, 0].flatten().A[0],
                marker='x', color='r', linewidth=2)
    # Plot the fitted line
    plt.plot(fitX[:, 1], h, color='b')
    plt.xlabel('Change in water level(x)')
    plt.ylabel('Water flowing out of the dam(y)')
    plt.show()

    # Plot the learning curve against the number of training samples
    m, n = X.shape
    trainErrors = np.zeros((1, m))
    valErrors = np.zeros((1, m))
    for i in range(m):
        # Train on the first i+1 samples only
        Xtrain = X[0:i+1]
        ytrain = y[0:i+1]
        res, timeConsumed = linear_regression.gradient(
            Xtrain, ytrain, rate=0.001, maxLoop=5000, epsilon=0.1)
        theta, errors = res
        trainErrors[0, i] = errors[-1]
        # Validation error is always measured on the full validation set
        valErrors[0, i] = linear_regression.J(theta, Xval, yval)

    plt.plot(np.arange(1, m+1).ravel(), trainErrors.ravel(),
             color='b', label='Training Error')
    plt.plot(np.arange(1, m+1).ravel(), valErrors.ravel(),
             color='g', label='Validation Error')
    plt.title('Learning curve for linear regression')
    plt.xlabel('Number of training examples')
    plt.ylabel('Error')
    plt.legend()
    plt.show()
```
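The `linear_regression` module is not shown here, so the learning-curve loop cannot be run on its own. The following is a minimal, self-contained sketch of the same idea under stated assumptions. It swaps gradient descent for a least-squares solve (`np.linalg.lstsq`), uses the usual squared-error cost $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}(x^{(i)}\theta - y^{(i)})^2$, and runs on synthetic data in place of `water.mat`. All function names and the data generator are hypothetical.

```python
import numpy as np


def cost(theta, X, y):
    """Unregularized squared-error cost: sum of squared residuals / (2m)."""
    residual = X @ theta - y
    return float(np.sum(residual ** 2)) / (2 * X.shape[0])


def fit_least_squares(X, y):
    """Least-squares fit; stands in for the gradient descent used in the text."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def learning_curve(X, y, X_val, y_val):
    """Training and validation error when fitting on the first 1..m samples."""
    m = X.shape[0]
    train_errors = np.zeros(m)
    val_errors = np.zeros(m)
    for i in range(m):
        X_sub, y_sub = X[:i + 1], y[:i + 1]
        theta = fit_least_squares(X_sub, y_sub)
        train_errors[i] = cost(theta, X_sub, y_sub)
        # Validation error always uses the full validation set.
        val_errors[i] = cost(theta, X_val, y_val)
    return train_errors, val_errors


# Synthetic, curved data (an assumption; not the dam dataset).
rng = np.random.default_rng(0)


def make_set(n):
    x = rng.uniform(-50, 40, size=(n, 1))
    y = 0.02 * x ** 2 + 0.5 * x + 10 + rng.normal(0, 1, size=(n, 1))
    return np.hstack((np.ones((n, 1)), x)), y


X_train, y_train = make_set(12)
X_cv, y_cv = make_set(21)
train_errors, val_errors = learning_curve(X_train, y_train, X_cv, y_cv)
```

With one or two samples a straight line fits the training points exactly, so the training error starts near zero and grows as samples are added. When a straight line is fitted to curved data, both errors tend to stay high as $m$ grows, which is the pattern associated with high bias.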

From the learning curve, we judge that the algorithm suffers from **high bias**, that is, the linear model underfits the data. We therefore use polynomial regression to improve the fit:

```python
# algorithm_analysis/diagnose.py
def diagnosePR():
    """Polynomial regression diagnosis"""
    # Polynomial regression
    poly = PolynomialFeatures(degree=8)
    # `dataset` avoids shadowing the module-level `data` dictionary
    XX, XXval, XXtest = [
        linear_regression.normalize(np.mat(poly.fit_transform(dataset[:, 1:])))
        for dataset in [X, Xval, Xtest]]
    initTheta = np.mat(np.ones((XX.shape[1], 1)))
    # Kept as in the original listing; note that 1.0 and 0.003 each appear
    # twice (0 and 0.03 may have been intended).
    theLambdas = [1.0, 0.001, 0.003, 0.01, 0.003, 0.1, 0.3, 1.0, 3.0, 10.0]
    numTheLambdas = len(theLambdas)
    trainErrors = np.zeros((1, numTheLambdas))
    valErrors = np.zeros((1, numTheLambdas))
    thetas = []
    for idx, theLambda in enumerate(theLambdas):
        res, timeConsumed = linear_regression.gradient(
            XX, y, rate=0.3, maxLoop=500, epsilon=0.01,
            theLambda=theLambda, initTheta=initTheta)
        theta, errors = res
        thetas.append(theta)
        trainErrors[0, idx] = errors[-1]
        valErrors[0, idx] = linear_regression.J(
            theta, XXval, yval, theLambda=theLambda)

    # Pick the lambda with the smallest validation error
    bestLambda = theLambdas[np.argmin(valErrors)]
    theta = thetas[np.argmin(valErrors)]
    error = np.min(valErrors)

    # Plot the errors for each lambda (the x axis is the index in theLambdas)
    plt.plot(np.arange(1, numTheLambdas + 1).ravel(),
             trainErrors.ravel(), color='b', label='Training Error')
    plt.plot(np.arange(1, numTheLambdas + 1).ravel(),
             valErrors.ravel(), color='g', label='Validation Error')
    plt.title('Learning curve for polynomial regression')
    plt.xlabel('lambda')
    plt.ylabel('Error')
    plt.legend()
    plt.show()

    # Plot the fitted curve
    fitX = np.mat(np.linspace(-60, 45).reshape(-1, 1))
    fitX = np.concatenate((np.ones((fitX.shape[0], 1)), fitX), axis=1)
    fitXX = linear_regression.normalize(np.mat(poly.fit_transform(fitX[:, 1:])))
    h = fitXX * theta
    plt.title('Polynomial regression learning curve(lambda=%.3f) \n validation error=%.3f'
              % (bestLambda, error))
    plt.scatter(X[:, 1].ravel(), y[:, 0].flatten().A[0],
                marker='x', color='r', linewidth=3)
    plt.plot(fitX[:, 1], h, color='b')
    plt.show()
```

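Because polynomial regression can overfit, we also apply regularization and plot the training and validation errors for different values of the regularization parameter $\lambda$.

This shows that the cross-validation error is smallest at $\lambda = 0.001$, and we plot the fitted curve for that value.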
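The selection of $\lambda$ can be sketched without the `linear_regression` module, under several assumptions that differ from the script. The sketch solves the regularized normal equation $(X^TX + \lambda L)\theta = X^Ty$, where $L$ is the identity with a zero for the bias term, instead of running gradient descent. It scales every set with the training-set mean and standard deviation, and it measures validation error without the penalty term, as the course recommends. The data are synthetic, the candidate list is illustrative, and all names are hypothetical.

```python
import numpy as np


def poly_features(x, degree):
    """Columns x, x^2, ..., x^degree for a 1-D input."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    return np.hstack([x ** p for p in range(1, degree + 1)])


def cost(theta, X, y):
    """Unregularized squared-error cost, used here for validation error."""
    residual = X @ theta - y
    return float(np.sum(residual ** 2)) / (2 * X.shape[0])


def fit_ridge(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y; the bias term is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def select_lambda(X, y, X_val, y_val, lambdas):
    """Return the lambda with the lowest validation error, plus all errors."""
    val_errors = [cost(fit_ridge(X, y, lam), X_val, y_val) for lam in lambdas]
    return lambdas[int(np.argmin(val_errors))], val_errors


# Synthetic data (an assumption; not the dam dataset).
rng = np.random.default_rng(1)


def make_set(n):
    x = rng.uniform(-50, 40, size=n)
    y = 0.02 * x ** 2 + 0.5 * x + 10 + rng.normal(0, 2, size=n)
    return x, y.reshape(-1, 1)


x_train, y_train = make_set(30)
x_cv, y_cv = make_set(30)

# Scaling statistics come from the training set only.
raw_train = poly_features(x_train, 8)
mu, sigma = raw_train.mean(axis=0), raw_train.std(axis=0)


def design(x):
    """Scaled degree-8 polynomial features with a leading bias column."""
    scaled = (poly_features(x, 8) - mu) / sigma
    return np.hstack((np.ones((scaled.shape[0], 1)), scaled))


lambdas = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
best_lambda, val_errors = select_lambda(
    design(x_train), y_train, design(x_cv), y_cv, lambdas)
print(best_lambda)
```

The value printed here depends on the synthetic data and says nothing about the $\lambda = 0.001$ result reported for the dam dataset. Once $\lambda$ is chosen, the held-out test set is the natural place to report a final error, since the validation set has already been used for selection.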
