A Simple Explanation of Recurrent Neural Networks (RNN)
A recurrent neural network (RNN) is a type of neural network designed specifically to deal with sequential data. What makes an RNN so powerful is the fact that it takes into account not just the current input but also the previous inputs, which allows it to memorize what happened previously. To get a better intuition for RNNs, let's take the example of text classification. For this task we could use a classic machine learning algorithm like naive Bayes, but the problem with that algorithm is that it treats a sentence as a set of independent words, looking only at the frequency of each word, without worrying about the composition or order of the words in the sentence, which makes a huge difference to its meaning. Unlike those classic algorithms, an RNN works well on sequence data because it takes word i as input and combines it with the output produced for word i-1; the same thing is then applied to word i+1. This is why it is called a recurrent neural network: the network applies the same operations to every word i of the sentence.
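In code terms, this recurrence is just a loop that carries a hidden state from one word to the next. Here is a minimal conceptual sketch in Python; `step` and its inputs are placeholders for illustration, not part of any real library:

```python
# Conceptual sketch: the same step function is applied at every position,
# and the hidden state carries information about previous words forward.
def run_rnn(step, words, initial_state):
    state = initial_state
    outputs = []
    for word in words:
        # word i is combined with the state built from words 0..i-1
        state, output = step(state, word)
        outputs.append(output)
    return outputs
```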
By now you might be thinking: enough blah blah, show us how they work. And that's exactly what I'll do in the next part:
How RNN works:
In order to understand how an RNN works under the hood, let's take the example of an NLP application, named entity recognition, a technique used to detect names in a sentence:
In the examples above, for each training instance (sentence) we map each word to an output: if the word is a name (John, Ellen …), we map it to 1; otherwise, we map it to 0. So, to train an RNN on sentences to recognize the names within them, the RNN architecture would look something like this:
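To make that labeling concrete, here is a minimal Python sketch; the sentence and the name lexicon are made up for illustration:

```python
# Hypothetical training instance: label each word 1 if it is a name, else 0.
sentence = ["john", "and", "ellen", "went", "home"]
names = {"john", "ellen"}  # assumed name lexicon, for illustration only

labels = [1 if word in names else 0 for word in sentence]
print(list(zip(sentence, labels)))
# [('john', 1), ('and', 0), ('ellen', 1), ('went', 0), ('home', 0)]
```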
Forward propagation:
For this training example, we have 5 words, which means 5 steps, so for each step t we calculate a and ŷ using the shared weights Wa, Wx, Wy and biases ba, by:
And generally, the equations would be:
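In the standard formulation, assuming a tanh activation for the hidden state and a sigmoid output for this binary labeling task, they read as follows (a⟨0⟩ is typically initialized to zeros):

```latex
a^{\langle t \rangle} = \tanh\left(W_a \, a^{\langle t-1 \rangle} + W_x \, x^{\langle t \rangle} + b_a\right)
\qquad
\hat{y}^{\langle t \rangle} = \sigma\left(W_y \, a^{\langle t \rangle} + b_y\right)
```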
Then, we calculate the cost function to represent the relation between the real output y and the predicted output ŷ at each time step t:
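A common choice for a binary output like this one is the per-step cross-entropy cost; assuming the sigmoid output above, it takes the form:

```latex
\mathcal{L}^{\langle t \rangle}\left(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}\right)
= - \, y^{\langle t \rangle} \log \hat{y}^{\langle t \rangle}
  - \left(1 - y^{\langle t \rangle}\right) \log\left(1 - \hat{y}^{\langle t \rangle}\right)
```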
Now, we sum the cost function over every word to calculate the loss function:
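That is, for a sentence of T words:

```latex
\mathcal{L} = \sum_{t=1}^{T} \mathcal{L}^{\langle t \rangle}\left(\hat{y}^{\langle t \rangle}, y^{\langle t \rangle}\right)
```

Putting the forward pass and the loss together, here is a minimal NumPy sketch; the shapes, the one-hot inputs, and the tanh/sigmoid choices are illustrative assumptions, not the only possible design:

```python
import numpy as np

def forward(xs, ys, Wa, Wx, Wy, ba, by):
    """One forward pass over a sentence. xs is a list of one-hot word
    vectors of shape (n_x, 1); ys is the list of 0/1 name labels.
    Returns the total loss and the per-word predictions."""
    a = np.zeros((Wa.shape[0], 1))  # a<0>: initial hidden state
    loss, preds = 0.0, []
    for x, y in zip(xs, ys):
        a = np.tanh(Wa @ a + Wx @ x + ba)             # hidden state update
        y_hat = 1.0 / (1.0 + np.exp(-(Wy @ a + by)))  # sigmoid output
        loss += (-(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))).item()
        preds.append(y_hat.item())
    return loss, preds
```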
Backpropagation:
Backpropagation is like going back in time to compute the derivatives of the loss function with respect to the parameters Wa, Wx, Wy, ba, by, using the chain rule to simplify the calculus. After getting the derivatives, we update the parameters using gradient descent:
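For each shared parameter, the usual gradient descent update applies, with α denoting the learning rate:

```latex
\theta \leftarrow \theta - \alpha \, \frac{\partial \mathcal{L}}{\partial \theta},
\qquad \theta \in \{W_a, W_x, W_y, b_a, b_y\}
```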
After multiple iterations over several training examples, we'd be able to minimize the loss function, and the predicted output would converge to the real output. Thus, we can use the optimized weights to detect names in future sentences.
Translated from: https://medium.com/swlh/simple-explanation-of-recurrent-neural-network-rnn-1285749cc363