Deep Learning Summary: a caveat when training RNNs: in PyTorch you must set hidden = hidden.data before each batch, otherwise the backpropagated gradient traverses all previous timesteps
pytorch每一個(gè)batch訓(xùn)練之前需要把hidden = hidden.data,否者反向傳播的梯度會(huì)遍歷以前的timestep
TensorFlow likewise feeds the updated new_state back in, but there is no explicit detach operation; presumably TensorFlow's own mechanism only backpropagates the gradient through the timesteps of the current batch by default:
```python
for e in range(epochs):
    # Train network
    new_state = sess.run(model.initial_state)
    loss = 0
    for x, y in get_batches(encoded, batch_size, num_steps):
        counter += 1
        start = time.time()
        feed = {model.inputs: x,
                model.targets: y,
                model.keep_prob: keep_prob,
                model.initial_state: new_state}
        batch_loss, new_state, _ = sess.run([model.loss, model.final_state, model.optimizer],
                                            feed_dict=feed)
```

In PyTorch, by contrast, you need hidden = hidden.data before each batch; otherwise the backpropagated gradient traverses all previous timesteps. Because autograd builds the graph automatically, you have to pull the state out of the graph yourself, which amounts to detaching it, so the backward gradient stops right there.
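To illustrate why the detach is needed, the following sketch (a toy nn.RNN and random data, invented for this example) carries the hidden state across batches without detaching it; on the second batch, loss.backward() tries to propagate through the first batch's graph, whose saved buffers were already freed, and PyTorch typically raises a RuntimeError about backwarding through the graph a second time:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
opt = torch.optim.SGD(rnn.parameters(), lr=0.01)

hidden = None
for step in range(2):
    x = torch.randn(4, 5, 1)          # (batch, seq_len, input_size)
    out, hidden = rnn(x, hidden)      # hidden still carries the old graph
    loss = out.pow(2).mean()

    opt.zero_grad()
    loss.backward()                   # step 0: fine; step 1: RuntimeError
    opt.step()
    # Fix: hidden = hidden.detach()  (or hidden.data) before the next batch
```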
```python
# train the RNN
def train(rnn, n_steps, print_every):
    # initialize the hidden state
    hidden = None

    for batch_i, step in enumerate(range(n_steps)):
        # defining the training data
        time_steps = np.linspace(step * np.pi, (step + 1) * np.pi, seq_length + 1)
        data = np.sin(time_steps)
        data.resize((seq_length + 1, 1))  # input_size=1

        x = data[:-1]
        y = data[1:]

        # convert data into Tensors
        x_tensor = torch.Tensor(x).unsqueeze(0)  # unsqueeze gives a 1, batch_size dimension
        y_tensor = torch.Tensor(y)

        # outputs from the rnn
        prediction, hidden = rnn(x_tensor, hidden)

        ## Representing Memory ##
        # make a new variable for hidden and detach the hidden state from its history
        # this way, we don't backpropagate through the entire history
        hidden = hidden.data

        # calculate the loss
        loss = criterion(prediction, y_tensor)
        # zero gradients
        optimizer.zero_grad()
        # perform backprop and update weights
        loss.backward()
        optimizer.step()

        # display loss and predictions
        if batch_i % print_every == 0:
            print('Loss: ', loss.item())
            plt.plot(time_steps[1:], x, 'r.')  # input
            plt.plot(time_steps[1:], prediction.data.numpy().flatten(), 'b.')  # predictions
            plt.show()

    return rnn
```
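The train() function above assumes that rnn, criterion, optimizer, and seq_length already exist. A possible setup that is shape-compatible with it is sketched below; the RNN module here is a guess at the kind of model the post uses (a one-layer vanilla RNN with a linear readout), not code from the original:

```python
import numpy as np
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

class RNN(nn.Module):
    # one-layer vanilla RNN followed by a linear readout
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x, hidden):
        # x: (batch, seq_len, input_size)
        out, hidden = self.rnn(x, hidden)
        out = out.view(-1, self.hidden_dim)   # flatten to (batch*seq_len, hidden_dim)
        return self.fc(out), hidden

seq_length = 20
rnn = RNN(input_size=1, output_size=1, hidden_dim=32, n_layers=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.01)

trained_rnn = train(rnn, n_steps=75, print_every=15)
```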