Dialogue Systems (2): Plain Neural Networks
Principle
Flow
- $x$: input data, shape $(20, 5)$
- $w_1$: first-layer weights, shape $(5, 3)$
- $w_2$: second-layer weights, shape $(3, 2)$
- $a_1$: matrix product, shape $(20, 3)$
- $h_1$: after the activation function, shape $(20, 3)$
- $a_2$: matrix product, shape $(20, 2)$
- $h_2$: after the activation function, shape $(20, 2)$
Forward propagation
Starting from the input $x$:

$a_1 = x \cdot w_1$

$h_1 = \mathrm{sigmoid}(a_1)$

$a_2 = h_1 \cdot w_2$

$h_2 = \mathrm{sigmoid}(a_2)$
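A quick NumPy sanity check of these shapes (the variable names and dimensions follow the list above; this sketch is not part of the original post):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.random.rand(20, 5)    # input data
w1 = np.random.rand(5, 3)    # first-layer weights
w2 = np.random.rand(3, 2)    # second-layer weights

a1 = np.dot(x, w1)           # (20, 5) @ (5, 3) -> (20, 3)
h1 = sigmoid(a1)             # (20, 3)
a2 = np.dot(h1, w2)          # (20, 3) @ (3, 2) -> (20, 2)
h2 = sigmoid(a2)             # (20, 2)

assert h2.shape == (20, 2)
```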
Derivation
Loss function (log loss): $\displaystyle J=-\frac{1}{m}\sum\left(y\log\hat{y}+(1-y)\log(1-\hat{y})\right)$
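The loss translates directly to NumPy; a minimal sketch (the `eps` clipping is my addition to guard against $\log(0)$, it is not part of the formula above):

```python
import numpy as np

def log_loss(y, y_hat, eps=1e-12):
    # J = -(1/m) * sum(y*log(y_hat) + (1-y)*log(1-y_hat))
    y_hat = np.clip(y_hat, eps, 1 - eps)  # avoid log(0); not in the original formula
    m = y.shape[0]
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m
```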
$\displaystyle\frac{\partial{J}}{\partial{w_2}}=\frac{\partial{J}}{\partial{h_2}}\cdot\frac{\partial{h_2}}{\partial{a_2}}\cdot\frac{\partial{a_2}}{\partial{w_2}}$

$\displaystyle\frac{\partial{J}}{\partial{w_1}}=\frac{\partial{J}}{\partial{h_2}}\cdot\frac{\partial{h_2}}{\partial{a_2}}\cdot\frac{\partial{a_2}}{\partial{h_1}}\cdot\frac{\partial{h_1}}{\partial{a_1}}\cdot\frac{\partial{a_1}}{\partial{w_1}}$
The shared part (the first two partial derivatives) is $\displaystyle\frac{\partial{J}}{\partial{h_2}}\cdot\frac{\partial{h_2}}{\partial{a_2}}$.
$\displaystyle\frac{\partial{J}}{\partial{h_2}}=-\frac{1}{m}\cdot\frac{y-h_2}{h_2(1-h_2)}$

$\displaystyle\frac{\partial{h_2}}{\partial{a_2}}=h_2(1-h_2)$

$\displaystyle\frac{\partial{a_2}}{\partial{w_2}}=h_1$

$\displaystyle\frac{\partial{a_2}}{\partial{h_1}}=w_2$

$\displaystyle\frac{\partial{h_1}}{\partial{a_1}}=h_1(1-h_1)$

$\displaystyle\frac{\partial{a_1}}{\partial{w_1}}=x$
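Note that multiplying the first two factors gives $\frac{\partial{J}}{\partial{a_2}}=\frac{h_2-y}{m}$, since the $h_2(1-h_2)$ terms cancel. A sketch of the resulting gradient computation in NumPy, wiring together the formulas above (my own arrangement, assuming $y$ has the same shape as $h_2$; not the post's original code):

```python
import numpy as np

def backward(x, y, w2, h1, h2):
    m = x.shape[0]
    # dJ/dh2 * dh2/da2 simplifies to (h2 - y) / m: the h2*(1-h2) factors cancel
    delta2 = (h2 - y) / m                           # shape (20, 2)
    grad_w2 = np.dot(h1.T, delta2)                  # (3, 20) @ (20, 2) -> (3, 2)
    # route the error back through w2, then through the first sigmoid
    delta1 = np.dot(delta2, w2.T) * h1 * (1 - h1)   # shape (20, 3)
    grad_w1 = np.dot(x.T, delta1)                   # (5, 20) @ (20, 3) -> (5, 3)
    return grad_w1, grad_w2
```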
Code
NumPy implementation
```python
import numpy as np

# dimensions: 5 input features, 3 hidden units, 2 output units
train_x_dim = 5
sample_1_num = 10
sample_0_num = 10
weight1_dim = 3
weight2_dim = 2

# two classes of random samples: positives in [0, 1), negatives in [0, 10)
train_x_1 = np.random.rand(sample_1_num, train_x_dim)
train_x_0 = np.random.rand(sample_0_num, train_x_dim) * 10

train_y_1 = np.ones(sample_1_num)
train_y_0 = np.zeros(sample_0_num)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derv(x):
    # derivative of the sigmoid, for use in backpropagation
    return sigmoid(x) * (1 - sigmoid(x))

# forward pass (run here on the positive samples only)
weight1 = np.random.rand(train_x_dim, weight1_dim)
a1 = np.dot(train_x_1, weight1)
h1 = sigmoid(a1)

weight2 = np.random.rand(weight1_dim, weight2_dim)
a2 = np.dot(h1, weight2)
h2 = sigmoid(a2)
```
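The original stops after one forward pass; a hypothetical continuation that stacks both classes and runs plain gradient descent using the `backward` sketch from the derivation section (the learning rate, step count, and label broadcasting are my assumptions):

```python
# Hypothetical training loop; builds on the variables defined above.
train_x = np.vstack([train_x_1, train_x_0])                       # (20, 5)
train_y = np.concatenate([train_y_1, train_y_0]).reshape(-1, 1)   # (20, 1)
train_y = np.repeat(train_y, weight2_dim, axis=1)                 # match h2's (20, 2) shape

lr = 0.1
for step in range(100):
    a1 = np.dot(train_x, weight1)
    h1 = sigmoid(a1)
    a2 = np.dot(h1, weight2)
    h2 = sigmoid(a2)
    grad_w1, grad_w2 = backward(train_x, train_y, weight2, h1, h2)
    weight1 -= lr * grad_w1
    weight2 -= lr * grad_w2
```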
TensorFlow implementation

```python
import tensorflow as tf
from tensorflow import keras

# load data
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# build model: flatten 28x28 images, one hidden ReLU layer, softmax output
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax),
])

# compile model ('adam' replaces the original tf.train.AdamOptimizer(), a TF 1.x API)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train model
model.fit(train_images, train_labels, epochs=5)

# evaluate
test_loss, test_acc = model.evaluate(test_images, test_labels)
```
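A possible follow-up, not in the original, that uses the trained model to predict on the test set:

```python
import numpy as np

predictions = model.predict(test_images)   # (10000, 10) class probabilities
print("predicted class for first test image:", np.argmax(predictions[0]))
print("test accuracy:", test_acc)
```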