Andrew Ng Deep Learning Course: Course 2, Week 3 Programming Assignment (PyTorch Implementation)
Table of Contents
- Disclaimer
- I. Problem Description
- II. Implementation
- 1. Loading the dataset
- 2. Using mini-batches
- 3. Building the neural network with PyTorch
- 3.1 Wrapping the model with torch.nn
- 3.2 Defining the optimizer and the loss function
- 4. Complete code
Disclaimer

This blog post is simply a record of my notes and programming experience while studying deep learning. Most of the code is adapted from the post 【中文】【吳恩達課后編程作業】Course 2 - 改善深層神經網絡 - 第三周作業, whose implementation I reproduced. That post uses TensorFlow, but in my day-to-day work I mainly use PyTorch, so I completed this assignment with the PyTorch framework. There may therefore still be some problems in the code or the write-up; please bear with me (my earlier posts were also largely based on that author's work). The complete code below has been uploaded to Baidu Netdisk, extraction code: gp3h.

Before starting the assignment, please make sure you have a working PyTorch environment. I ran the code on a server with GPU acceleration, but a CPU-only build of PyTorch will also work, just more slowly.
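If you want to confirm whether your installation can actually use the GPU, a quick sanity check like the one below (my own addition, not part of the assignment code) prints which device PyTorch will use:

```python
import torch

# True only if a CUDA-capable GPU and a CUDA build of PyTorch are available
print(torch.cuda.is_available())

# The same device-selection pattern is used in the complete code later in this post
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)  # e.g. cuda:0 on the server, cpu otherwise
```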
I. Problem Description

This week's assignment is a multi-class classification problem using a softmax output layer: train a neural network to recognize the digit indicated by the hand gesture in a picture, roughly as shown below:

II. Implementation

1. Loading the dataset

We can plot an example from the dataset with matplotlib to see what the images look like:
```python
from tf_utils import load_dataset
import matplotlib.pyplot as plt

X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

index = 12
plt.imshow(X_train_orig[index])
plt.show()
```

The image is shown below:
From the code above we can see that X_train_orig has shape (1080, 64, 64, 3). As we know from earlier assignments, (64, 64, 3) describes one image, and 1080 is the number of examples in the training set. Next we flatten each image into a column vector and normalize the pixel values:

```python
def data_processing():
    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
    X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T / 255
    X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T / 255
    return X_train_flatten, Y_train_orig, X_test_flatten, Y_test_orig, classes
```

After this preprocessing the training set has shape (12288, 1080), where 12288 = 64 x 64 x 3; the shape of the label arrays is discussed in more detail below.
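To make the shapes concrete, here is a quick check of my own (it assumes load_dataset behaves like the course's tf_utils version, where the labels are stored as a row vector of integer classes 0-5):

```python
# Hypothetical shape check, assuming the course's tf_utils load_dataset
X_train_flatten, Y_train_orig, X_test_flatten, Y_test_orig, classes = data_processing()

print(X_train_flatten.shape)  # expected: (12288, 1080)
print(Y_train_orig.shape)     # expected: (1, 1080) -- one integer label (0-5) per training example
print(X_test_flatten.shape)   # expected: (12288, 120)
print(Y_test_orig.shape)      # expected: (1, 120)
```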
2. Using mini-batches

Earlier programming assignments already covered mini-batches in detail, and they are not the focus of this assignment, so I will just include the code that splits the data into mini-batches without further explanation:
```python
import math
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """
    Creates a list of random minibatches from (X, Y)

    Arguments:
    X -- input data, of shape (input size, number of examples)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    mini_batch_size -- size of the mini-batches, integer
    seed -- this is only for the purpose of grading, so that your "random" minibatches are the same as ours.

    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    m = X.shape[1]  # number of training examples
    mini_batches = []
    np.random.seed(seed)

    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation].reshape((Y.shape[0], m))

    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m / mini_batch_size)  # number of full mini-batches of size mini_batch_size
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[:, k * mini_batch_size: k * mini_batch_size + mini_batch_size]
        mini_batch_Y = shuffled_Y[:, k * mini_batch_size: k * mini_batch_size + mini_batch_size]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size: m]
        mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size: m]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)

    return mini_batches
```

3. Building the neural network with PyTorch
The network we need to build is: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX. Compared with the previous assignments, the only changes are that the output layer's activation becomes softmax and, since this is a six-class problem, the output layer now has 6 neurons.

3.1 Wrapping the model with torch.nn
```python
import torch

class Model(torch.nn.Module):
    def __init__(self, N_in, h1, h2, D_out):
        super(Model, self).__init__()
        self.linear1 = torch.nn.Linear(N_in, h1)
        self.relu1 = torch.nn.ReLU()
        self.linear2 = torch.nn.Linear(h1, h2)
        self.relu2 = torch.nn.ReLU()
        self.linear3 = torch.nn.Linear(h2, D_out)
        self.model = torch.nn.Sequential(self.linear1, self.relu1, self.linear2, self.relu2, self.linear3)

    def forward(self, x):
        return self.model(x)
```

We define the layers the assignment requires and pass them into torch.nn.Sequential one by one; the order in which they are passed determines the order of computation, so be careful not to get it wrong.

We also define a forward function. As you can see, using PyTorch dramatically reduces the amount of code needed for forward propagation.
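As a quick sanity check of the model definition (my own sketch; the layer sizes are the same ones used in the complete code below), you can instantiate the model and push a random batch through it:

```python
import torch

# Layer sizes match the full training script: 12288 -> 25 -> 12 -> 6
model = Model(12288, 25, 12, 6)

x = torch.randn(4, 12288)   # a fake batch of 4 flattened 64x64x3 images
out = model(x)              # calling the module runs forward()
print(out.shape)            # torch.Size([4, 6]) -- one row of 6 class scores per example
```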
3.2 Defining the optimizer and the loss function

```python
optimizer = torch.optim.Adam(m.model.parameters(), lr=learning_rate)
loss_fn = torch.nn.CrossEntropyLoss()
```

For the optimizer we use Adam, taken directly from the torch.optim package; remember to pass the network's parameters and the learning rate into it.

For the loss we use the cross-entropy function. The mathematics behind cross-entropy was covered in the course videos, so I will not repeat it here, but when using PyTorch's built-in cross-entropy loss you need to pay attention to the form of its inputs.
After forward propagation the output has shape (n, 6), where n is the number of input samples and the 6 values in each row are that sample's scores for the six classes, which should be easy to understand.

When computing the loss we pass the predictions y_pred and the true labels y into the loss function. y_pred has shape (n, 6), while y has shape (n,): that's right, the true labels stay a 1-D vector of class indices, and CrossEntropyLoss consumes them directly, effectively doing the one-hot conversion to shape (n, 6) internally. In the TensorFlow framework, by contrast, the loss function does not do this conversion for us, and we have to one-hot encode the labels ourselves.

One more point worth noting: CrossEntropyLoss also applies the softmax step internally, so there is no need to define a softmax layer in the forward pass.
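To make the expected input shapes concrete, here is a tiny toy example of my own (made-up numbers, not data from the assignment) showing how CrossEntropyLoss is called and how probabilities and predicted classes can be recovered afterwards:

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss()

# Raw scores (logits) from the last Linear layer: 3 samples x 6 classes
y_pred = torch.randn(3, 6)

# Integer class labels, one per sample: shape (3,), dtype long
y = torch.tensor([0, 5, 2])

loss = loss_fn(y_pred, y)   # softmax + negative log-likelihood happen inside
print(loss.item())

# To get probabilities or predicted classes for evaluation:
probs = torch.softmax(y_pred, dim=1)   # shape (3, 6), each row sums to 1
preds = torch.argmax(y_pred, dim=1)    # shape (3,), predicted class indices
print(probs)
print(preds)
```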
4. Complete code
```python
import torch

num = torch.cuda.device_count()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot
from model import Model


def data_processing():
    X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
    X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T / 255
    X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T / 255
    return X_train_flatten, Y_train_orig, X_test_flatten, Y_test_orig, classes


if __name__ == "__main__":
    # Load the data and move it to the GPU (if available) as float32 tensors
    X_train_flatten, Y_train, X_test_flatten, Y_test, classes = data_processing()
    X_train_flatten = torch.from_numpy(X_train_flatten).to(torch.float32).to(device)
    Y_train = torch.from_numpy(Y_train).to(torch.float32).to(device)
    X_test_flatten = torch.from_numpy(X_test_flatten).to(torch.float32).to(device)
    Y_test = torch.from_numpy(Y_test).to(torch.float32).to(device)

    # Network sizes: 12288 -> 25 -> 12 -> 6
    D_in, h1, h2, D_out = 12288, 25, 12, 6
    m = Model(D_in, h1, h2, D_out)
    m.to(device)

    epoch_num = 1500
    learning_rate = 0.0001
    minibatch_size = 32
    seed = 3
    costs = []

    optimizer = torch.optim.Adam(m.model.parameters(), lr=learning_rate)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(epoch_num):
        epoch_cost = 0
        num_minibatches = int(X_train_flatten.size()[1] / minibatch_size)
        minibatches = random_mini_batches(X_train_flatten, Y_train, minibatch_size, seed)
        for minibatch in minibatches:
            (minibatch_X, minibatch_Y) = minibatch
            # Transpose so each row is one example: (batch, 12288) in, (batch, 6) logits out
            y_pred = m.forward(minibatch_X.T)
            # Labels become a 1-D long vector of class indices, as CrossEntropyLoss expects
            y = minibatch_Y.T
            y = y.view(-1)
            loss = loss_fn(y_pred, y.long())
            epoch_cost = epoch_cost + loss.item()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        epoch_cost = epoch_cost / (num_minibatches + 1)
        if epoch % 5 == 0:
            costs.append(epoch_cost)
        # whether to print:
        if epoch % 100 == 0:
            print("epoch = " + str(epoch) + " epoch_cost = " + str(epoch_cost))
```

The loss values during training:
```
epoch = 0 epoch_cost = 1.8013256788253784
epoch = 100 epoch_cost = 0.8971561684327967
epoch = 200 epoch_cost = 0.6031410886960871
epoch = 300 epoch_cost = 0.396172211450689
epoch = 400 epoch_cost = 0.2640543882461155
epoch = 500 epoch_cost = 0.17116783581235828
epoch = 600 epoch_cost = 0.10572761395836577
epoch = 700 epoch_cost = 0.060585571726893675
epoch = 800 epoch_cost = 0.03220567786518265
epoch = 900 epoch_cost = 0.01613416599438471
epoch = 1000 epoch_cost = 0.007416377563084311
epoch = 1100 epoch_cost = 0.0030659845283748034
epoch = 1200 epoch_cost = 0.0027029767036711905
epoch = 1300 epoch_cost = 0.0013640667637125315
epoch = 1400 epoch_cost = 0.0005838543190346921
```
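After the training loop finishes, one would normally also report train and test accuracy. The script above stops at the loss, so here is a hedged sketch of my own (not from the original post) showing how accuracy could be computed with the variables defined in the script; append it to the end of the `if __name__ == "__main__":` block:

```python
    # My own addition: evaluate accuracy on the training and test sets after training
    with torch.no_grad():
        train_pred = m(X_train_flatten.T)                 # (1080, 6) logits
        train_labels = Y_train.T.view(-1).long()          # (1080,) class indices
        train_acc = (torch.argmax(train_pred, dim=1) == train_labels).float().mean()

        test_pred = m(X_test_flatten.T)                   # (120, 6) logits
        test_labels = Y_test.T.view(-1).long()            # (120,) class indices
        test_acc = (torch.argmax(test_pred, dim=1) == test_labels).float().mean()

    print("train accuracy = " + str(train_acc.item()))
    print("test accuracy = " + str(test_acc.item()))
```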