2. Deep Learning Exercise: Logistic Regression with a Neural Network mindset
This article is excerpted from the programming assignments of Andrew Ng's Deep Learning Specialization; many thanks to the course.
Course link: https://www.deeplearning.ai/deep-learning-specialization/
You will learn to:
- Build the general architecture of a learning algorithm, including:
- Initializing parameters
- Calculating the cost function and its gradient
- Using an optimization algorithm (gradient descent)
- Gather all three functions above into a main model function, in the right order.
Contents
1 - Packages
2 - Overview of the Problem set
3 - General Architecture of the learning algorithm (important)
4 - Building the parts of our algorithm
4.1 - Helper functions
4.2 - Initializing parameters
4.3 - Forward and Backward propagation (important)
4.4 - Optimization
4.5 - Predict
5 - Merge all functions into a model
1 - Packages
First, let's run the cell below to import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- h5py is a common package to interact with a dataset that is stored on an H5 file.
- matplotlib is a famous library to plot graphs in Python.
- PIL and scipy are used here to test your model with your own picture at the end.
2 - Overview of the Problem set
Problem Statement: You are given a dataset ("data.h5") containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).
You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.
Let's get more familiar with the dataset. Load the data by running the following code.
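The notebook's loader is not reproduced here; a minimal sketch with h5py might look like the following. The key names (`train_set_x`, `train_set_y`, `list_classes`) and the single-file layout are assumptions based on the cat/non-cat dataset, not the notebook's exact code:

```python
import h5py
import numpy as np

def load_dataset(path="data.h5"):
    """Load images, labels, and class names from an H5 file (sketch).

    Assumed layout: 'train_set_x' holds images of shape
    (m, num_px, num_px, 3), 'train_set_y' holds the 0/1 labels,
    and 'list_classes' holds the class names.
    """
    with h5py.File(path, "r") as f:
        train_set_x_orig = np.array(f["train_set_x"][:])
        # Reshape labels to a (1, m) row vector, the convention used below
        train_set_y_orig = np.array(f["train_set_y"][:]).reshape(1, -1)
        classes = np.array(f["list_classes"][:])
    return train_set_x_orig, train_set_y_orig, classes
```

The real assignment ships separate train/test H5 files, so a loader like this would be called once per split.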
Many software bugs in deep learning come from having matrix/vector dimensions that don't fit. If you can keep your matrix/vector dimensions straight you will go a long way toward eliminating many bugs.
Exercise: Find the values for:
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)
Remember that train_set_x_orig is a numpy array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].
```python
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
```

For convenience, you should now reshape images of shape (num_px, num_px, 3) into a numpy array of shape (num_px * num_px * 3, 1). After this, our training (and test) dataset is a numpy array where each column represents a flattened image. There should be m_train (respectively m_test) columns.
Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px * num_px * 3, 1).
A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b * c * d, a) is to use:
```python
X_flatten = X.reshape(X.shape[0], -1).T   # X.T is the transpose of X

train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
```

Let's standardize our dataset.
```python
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
```

**What you need to remember:**
Common steps for pre-processing a new dataset are:
- Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, ...)
- Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)
- "Standardize" the data
3 - General Architecture of the learning algorithm (important)
You will build a logistic regression classifier, using a neural network mindset. The following figure explains why logistic regression is actually a very simple neural network!
Mathematical expression of the algorithm:
For one example $x^{(i)}$:

$$z^{(i)} = w^T x^{(i)} + b$$

$$\hat{y}^{(i)} = a^{(i)} = \sigma(z^{(i)})$$

$$\mathcal{L}(a^{(i)}, y^{(i)}) = -y^{(i)} \log(a^{(i)}) - (1 - y^{(i)}) \log(1 - a^{(i)})$$

The cost is then computed by summing over all training examples:

$$J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(a^{(i)}, y^{(i)})$$
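To make the formulas above concrete, here is a quick numeric check for a single example; all the numbers (`w`, `b`, `x`, `y`) are made up for illustration:

```python
import numpy as np

# Arbitrary parameters and one example with two features
w = np.array([[0.5], [-0.25]])   # weights, shape (2, 1)
b = 0.1                          # bias
x = np.array([[1.0], [2.0]])     # one example x^(i)
y = 1                            # its label y^(i)

z = (np.dot(w.T, x) + b).item()  # z = w^T x + b = 0.5*1 - 0.25*2 + 0.1 = 0.1
a = 1 / (1 + np.exp(-z))         # a = sigmoid(z) ≈ 0.525
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))  # ≈ 0.644
```

Note that with y = 1 the second term of the loss vanishes, so the loss is just $-\log(a^{(i)})$.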
Key steps: In this exercise, you will carry out the following steps:
- Initialize the parameters of the model
- Learn the parameters for the model by minimizing the cost $J$
- Use the learned parameters to make predictions (on the test set)
- Analyse the results and conclude
4 - Building the parts of our algorithm
The main steps for building a Neural Network are:
- Calculate current loss (forward propagation)
- Calculate current gradient (backward propagation)
- Update parameters (gradient descent)
You often build 1-3 separately and integrate them into one function we call?model().
4.1 - Helper functions
Exercise: Using your code from "Python Basics", implement sigmoid().
```python
def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    s = 1 / (1 + np.exp(-z))
    return s
```

4.2 - Initializing parameters
```python
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b
```

4.3 - Forward and Backward propagation (important)
Now that your parameters are initialized, you can do the "forward" and "backward" propagation steps for learning the parameters.
Exercise: Implement a function propagate() that computes the cost function and its gradient.
Hints:
Forward Propagation:
- You get X
- You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \ldots, a^{(m)})$
- You calculate the cost function: $J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)}) \right]$

Here are the two formulas you will be using:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T$$

$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)})$$
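Those hints translate directly into code. Here is one possible propagate() implementation following the formulas (a sketch, not necessarily the notebook's reference solution; sigmoid is repeated so the snippet is self-contained):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    """
    Compute the cost and its gradients for logistic regression.

    Arguments:
    w -- weights, shape (n, 1); b -- bias, a scalar
    X -- data of shape (n, m); Y -- labels of shape (1, m)

    Returns:
    grads -- dict with "dw" (shape of w) and "db" (scalar)
    cost -- negative log-likelihood cost for logistic regression
    """
    m = X.shape[1]

    # Forward propagation: activations, shape (1, m)
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

    # Backward propagation: the two gradient formulas above
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m

    grads = {"dw": dw, "db": db}
    return grads, np.squeeze(cost)
```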
4.4 - Optimization
Exercise: Write down the optimization function. The goal is to learn $w$ and $b$ by minimizing the cost function $J$. For a parameter $\theta$, the update rule is $\theta = \theta - \alpha \, d\theta$, where $\alpha$ is the learning rate.
```python
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
        1) Calculate the cost and the gradient for the current parameters. Use propagate().
        2) Update the parameters using gradient descent rule for w and b.
    """
    costs = []

    for i in range(num_iterations):
        # Cost and gradient calculation (≈ 1-4 lines of code)
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # Update rule (≈ 2 lines of code)
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}

    return params, grads, costs
```

4.5 - Predict
Exercise: The previous function will output the learned w and b. We are able to use w and b to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:
1. Calculate $\hat{Y} = A = \sigma(w^T X + b)$
2. Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), storing the predictions in a vector `Y_prediction`. If you wish, you can use an `if`/`else` statement in a `for` loop (though there is also a way to vectorize this).
```python
def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0

    assert(Y_prediction.shape == (1, m))

    return Y_prediction
```

**What to remember:** You've implemented several functions that:
- Initialize (w,b)
- Optimize the loss iteratively to learn parameters (w,b):
- computing the cost and its gradient
- updating the parameters using gradient descent
- Use the learned (w,b) to predict the labels for a given set of examples
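As noted in the exercise, the thresholding loop in predict() can also be vectorized. One way is a single boolean comparison over the whole activation row (`predict_vectorized` is an illustrative name, not from the notebook):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_vectorized(w, b, X):
    # Threshold all m activations at once instead of looping over columns
    A = sigmoid(np.dot(w.T, X) + b)          # shape (1, m)
    return (A > 0.5).astype(float)           # 1.0 where A > 0.5, else 0.0
```

The comparison `A > 0.5` produces a boolean array of the same shape, which `astype(float)` converts to the 0/1 predictions.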
5 - Merge all functions into a model
You will now see how the overall model is structured by putting all the building blocks (functions implemented in the previous parts) together, in the right order.
Exercise: Implement the model function. Use the following notation:
- Y_prediction_test for your predictions on the test set
- Y_prediction_train for your predictions on the train set
- w, costs, grads for the outputs of optimize()
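Assuming the helper functions behave as in the previous sections, one way to merge them into model() might look like the following sketch. The helpers are repeated here in condensed form (docstrings and assertions omitted) so the snippet is self-contained; it is not necessarily the notebook's reference solution:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def initialize_with_zeros(dim):
    return np.zeros((dim, 1)), 0.0

def propagate(w, b, X, Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    grads = {"dw": np.dot(X, (A - Y).T) / m, "db": np.sum(A - Y) / m}
    return grads, cost

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - learning_rate * grads["dw"]
        b = b - learning_rate * grads["db"]
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after iteration %i: %f" % (i, cost))
    return {"w": w, "b": b}, grads, costs

def predict(w, b, X):
    return (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(float)

def model(X_train, Y_train, X_test, Y_test,
          num_iterations=2000, learning_rate=0.5, print_cost=False):
    # 1) initialize, 2) optimize, 3) predict -- in that order
    w, b = initialize_with_zeros(X_train.shape[0])
    params, grads, costs = optimize(w, b, X_train, Y_train,
                                    num_iterations, learning_rate, print_cost)
    w, b = params["w"], params["b"]
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    return {"costs": costs,
            "Y_prediction_test": Y_prediction_test,
            "Y_prediction_train": Y_prediction_train,
            "w": w, "b": b}
```

On a toy one-feature dataset this converges in a couple thousand iterations; on the cat images the assignment uses num_iterations=2000 and learning_rate=0.005.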