DL/DNN: Training a custom MultiLayerNet (5*100, ReLU, four optimizers: SGD/Momentum/AdaGrad/Adam) on the MNIST dataset and comparing the performance of the different methods
Contents
Output Results
Design Approach
Core Code
Output Results
===========iteration:0===========
SGD:2.289282108880558
Momentum:2.2858501933777964
AdaGrad:2.135969407893337
Adam:2.2214629551644443
===========iteration:100===========
SGD:1.549948593098733
Momentum:0.2630614409487161
AdaGrad:0.1280980906681204
Adam:0.21268580798960957
===========iteration:200===========
SGD:0.7668413651485669
Momentum:0.19974263379725932
AdaGrad:0.0688320187945635
Adam:0.12737004371824456
===========iteration:300===========
SGD:0.46630711328743457
Momentum:0.17680542175883507
AdaGrad:0.0580940990397764
Adam:0.12930303058268838
===========iteration:400===========
SGD:0.34526365067568743
Momentum:0.08914404106297127
AdaGrad:0.038093353912494965
Adam:0.06415424083978832
===========iteration:500===========
SGD:0.3588584559967853
Momentum:0.1299949652623088
AdaGrad:0.040978421988412894
Adam:0.058780880102566074
===========iteration:600===========
SGD:0.38273120367667224
Momentum:0.14074766142608885
AdaGrad:0.08641723451090685
Adam:0.11339321858037713
===========iteration:700===========
SGD:0.381094901742027
Momentum:0.1566582072807326
AdaGrad:0.08844650332208387
Adam:0.10485802139218811
===========iteration:800===========
SGD:0.25722603754213674
Momentum:0.07897119725740888
AdaGrad:0.04960128385990466
Adam:0.0835996553542796
===========iteration:900===========
SGD:0.33273148769731326
Momentum:0.19612162874621766
AdaGrad:0.03441995281224886
Adam:0.12248261979926914
===========iteration:1000===========
SGD:0.26394416793465253
Momentum:0.10157776537129978
AdaGrad:0.04761303979039287
Adam:0.046994040537976525
===========iteration:1100===========
SGD:0.23894569840123672
Momentum:0.09093030644899333
AdaGrad:0.07018006635107976
Adam:0.07879622117292093
===========iteration:1200===========
SGD:0.24382935069334477
Momentum:0.08324889705863456
AdaGrad:0.04484659272127939
Adam:0.0719509559060747
===========iteration:1300===========
SGD:0.21307958354960485
Momentum:0.07030166296163001
AdaGrad:0.022552468995955182
Adam:0.049860815437560935
===========iteration:1400===========
SGD:0.3110486414209358
Momentum:0.13117004626934742
AdaGrad:0.07351569965620054
Adam:0.09723751626189574
===========iteration:1500===========
SGD:0.2087589466947655
Momentum:0.09088929766254576
AdaGrad:0.027825434320282873
Adam:0.06352715244823183
===========iteration:1600===========
SGD:0.12783635178644553
Momentum:0.053366262737818
AdaGrad:0.012093087503155344
Adam:0.021385013278486315
===========iteration:1700===========
SGD:0.21476134194349975
Momentum:0.08453161462373757
AdaGrad:0.054955557126319256
Adam:0.035257261368372185
===========iteration:1800===========
SGD:0.3415964018415049
Momentum:0.13866704706781385
AdaGrad:0.04585298765046911
Adam:0.06437669858445684
===========iteration:1900===========
SGD:0.13530674587479818
Momentum:0.03958142222010819
AdaGrad:0.019096102635470277
Adam:0.02185864115092371
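Reading across the log, SGD's loss falls the most slowly (still in the 0.13-0.34 range after 1000+ iterations), Momentum sits in between, and AdaGrad and Adam drop below 0.1 within the first few hundred iterations. Below is a minimal plotting sketch for visualizing this comparison; it assumes the train_loss dictionary built by the training loop in the Core Code section below and that matplotlib is installed (neither the smoothing window nor the marker choices come from the original post).

import numpy as np
import matplotlib.pyplot as plt

def moving_average(x, window=50):
    # Smooth a noisy per-batch loss curve with a sliding-window mean.
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x), kernel, mode="valid")

markers = {"SGD": "o", "Momentum": "x", "AdaGrad": "s", "Adam": "D"}
for key in ("SGD", "Momentum", "AdaGrad", "Adam"):
    y = moving_average(train_loss[key])   # train_loss is built by the training loop in Core Code
    plt.plot(np.arange(len(y)), y, marker=markers[key], markevery=100, label=key)
plt.xlabel("iterations")
plt.ylabel("loss")
plt.ylim(0, 1)
plt.legend()
plt.show()

Smoothing with a short moving average makes the ranking of the four optimizers easier to see than the raw per-batch losses printed above.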
Design Approach
Core Code
import numpy as np

# T1. SGD: plain stochastic gradient descent, params <- params - lr * grad
class SGD:
    '……'
    def update(self, params, grads):
        for key in params.keys():
            params[key] -= self.lr * grads[key]

# T2. Momentum: keeps a velocity v that accumulates past gradients
class Momentum:
    '……'
    def update(self, params, grads):
        if self.v is None:
            self.v = {}
            for key, val in params.items():
                self.v[key] = np.zeros_like(val)
        for key in params.keys():
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]

# T3. AdaGrad: scales the learning rate by the accumulated squared gradients h
class AdaGrad:
    '……'
    def update(self, params, grads):
        if self.h is None:
            self.h = {}
            for key, val in params.items():
                self.h[key] = np.zeros_like(val)
        for key in params.keys():
            self.h[key] += grads[key] * grads[key]
            params[key] -= self.lr * grads[key] / (np.sqrt(self.h[key]) + 1e-7)

# T4. Adam: combines momentum-style first moments (m) with AdaGrad-style second moments (v)
class Adam:
    '……'
    def update(self, params, grads):
        if self.m is None:
            self.m, self.v = {}, {}
            for key, val in params.items():
                self.m[key] = np.zeros_like(val)
                self.v[key] = np.zeros_like(val)
        self.iter += 1
        lr_t = self.lr * np.sqrt(1.0 - self.beta2**self.iter) / (1.0 - self.beta1**self.iter)
        for key in params.keys():
            self.m[key] += (1 - self.beta1) * (grads[key] - self.m[key])
            self.v[key] += (1 - self.beta2) * (grads[key]**2 - self.v[key])
            params[key] -= lr_t * self.m[key] / (np.sqrt(self.v[key]) + 1e-7)

# Build one network per optimizer and train them all on the same mini-batches.
networks = {}
train_loss = {}
for key in optimizers.keys():
    networks[key] = MultiLayerNet(input_size=784, hidden_size_list=[10, 10, 10, 10], output_size=10)
    train_loss[key] = []

for i in range(max_iterations):
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    for key in optimizers.keys():
        grads = networks[key].gradient(x_batch, t_batch)
        optimizers[key].update(networks[key].params, grads)
        loss = networks[key].loss(x_batch, t_batch)
        train_loss[key].append(loss)

    if i % 100 == 0:
        print("===========" + "iteration:" + str(i) + "===========")
        for key in optimizers.keys():
            loss = networks[key].loss(x_batch, t_batch)
            print(key + ":" + str(loss))
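The '……' placeholders above elide the optimizer constructors, and the training loop also assumes that optimizers, the MNIST arrays (x_train, t_train), train_size, batch_size, and max_iterations have already been set up. The sketch below fills in those missing pieces under assumptions: conventional hyperparameters for this kind of comparison (lr=0.01 for SGD/Momentum/AdaGrad, lr=0.001 with beta1=0.9 and beta2=0.999 for Adam, batch size 128) and load_mnist/MultiLayerNet helpers imported from module paths that are not stated in the original post.

# Assumed helper imports; the module paths are an assumption, not taken from the post.
from dataset.mnist import load_mnist
from common.multi_layer_net import MultiLayerNet

# A sketch of what the '……' constructor placeholders stand for. In a runnable
# script each __init__ sits inside the corresponding class together with the
# update() method shown in Core Code; the default values are assumptions.
class SGD:
    def __init__(self, lr=0.01):
        self.lr = lr

class Momentum:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.v = None      # per-parameter velocity, created lazily in update()

class AdaGrad:
    def __init__(self, lr=0.01):
        self.lr = lr
        self.h = None      # accumulated squared gradients

class Adam:
    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999):
        self.lr = lr
        self.beta1 = beta1
        self.beta2 = beta2
        self.iter = 0
        self.m = None      # first-moment estimate
        self.v = None      # second-moment estimate

# Data, hyperparameters, and the optimizers dictionary used by the training loop.
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True)
train_size = x_train.shape[0]
batch_size = 128          # assumed batch size
max_iterations = 2000     # consistent with the log above, which ends at iteration 1900

optimizers = {"SGD": SGD(), "Momentum": Momentum(), "AdaGrad": AdaGrad(), "Adam": Adam()}

With these pieces in place, running the training loop prints the per-100-iteration losses shown in the Output Results section.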