Deep Learning Summary: a caveat when training RNNs: in PyTorch you must set hidden = hidden.data before each batch, otherwise the backpropagated gradient traverses all previous timesteps
pytorch每一個(gè)batch訓(xùn)練之前需要把hidden = hidden.data,否者反向傳播的梯度會(huì)遍歷以前的timestep
TensorFlow likewise feeds the updated new_state back in, but there is no explicit detach operation; presumably TensorFlow's own mechanism only backpropagates the gradient through the timesteps of the current batch by default:
```python
for e in range(epochs):
    # Train network
    new_state = sess.run(model.initial_state)
    loss = 0
    for x, y in get_batches(encoded, batch_size, num_steps):
        counter += 1
        start = time.time()
        feed = {model.inputs: x,
                model.targets: y,
                model.keep_prob: keep_prob,
                model.initial_state: new_state}
        batch_loss, new_state, _ = sess.run([model.loss, model.final_state, model.optimizer],
                                            feed_dict=feed)
```

In PyTorch, by contrast, you need hidden = hidden.data before each batch; otherwise the backpropagated gradient traverses all previous timesteps. Because autograd builds the graph automatically, you have to pull the state out of the graph yourself, which amounts to detaching it, so the backward gradient stops right there.
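To illustrate why the detach is needed, the following sketch (a toy nn.RNN and random data, invented for this example) carries the hidden state across batches without detaching it; on the second batch, loss.backward() tries to propagate through the first batch's graph, whose saved buffers were already freed, and PyTorch typically raises a RuntimeError about backwarding through the graph a second time:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=16, batch_first=True)
opt = torch.optim.SGD(rnn.parameters(), lr=0.01)

hidden = None
for step in range(2):
    x = torch.randn(4, 5, 1)          # (batch, seq_len, input_size)
    out, hidden = rnn(x, hidden)      # hidden still carries the old graph
    loss = out.pow(2).mean()

    opt.zero_grad()
    loss.backward()                   # step 0: fine; step 1: RuntimeError
    opt.step()
    # Fix: hidden = hidden.detach()  (or hidden.data) before the next batch
```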
```python
# train the RNN
def train(rnn, n_steps, print_every):
    # initialize the hidden state
    hidden = None

    for batch_i, step in enumerate(range(n_steps)):
        # defining the training data
        time_steps = np.linspace(step * np.pi, (step + 1) * np.pi, seq_length + 1)
        data = np.sin(time_steps)
        data.resize((seq_length + 1, 1))  # input_size=1

        x = data[:-1]
        y = data[1:]

        # convert data into Tensors
        x_tensor = torch.Tensor(x).unsqueeze(0)  # unsqueeze gives a 1, batch_size dimension
        y_tensor = torch.Tensor(y)

        # outputs from the rnn
        prediction, hidden = rnn(x_tensor, hidden)

        ## Representing Memory ##
        # make a new variable for hidden and detach the hidden state from its history
        # this way, we don't backpropagate through the entire history
        hidden = hidden.data

        # calculate the loss
        loss = criterion(prediction, y_tensor)
        # zero gradients
        optimizer.zero_grad()
        # perform backprop and update weights
        loss.backward()
        optimizer.step()

        # display loss and predictions
        if batch_i % print_every == 0:
            print('Loss: ', loss.item())
            plt.plot(time_steps[1:], x, 'r.')  # input
            plt.plot(time_steps[1:], prediction.data.numpy().flatten(), 'b.')  # predictions
            plt.show()

    return rnn
```
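The train() function above assumes that rnn, criterion, optimizer, and seq_length already exist. A possible setup that is shape-compatible with it is sketched below; the RNN module here is a guess at the kind of model the post uses (a one-layer vanilla RNN with a linear readout), not code from the original:

```python
import numpy as np
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

class RNN(nn.Module):
    # one-layer vanilla RNN followed by a linear readout
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x, hidden):
        # x: (batch, seq_len, input_size)
        out, hidden = self.rnn(x, hidden)
        out = out.view(-1, self.hidden_dim)   # flatten to (batch*seq_len, hidden_dim)
        return self.fc(out), hidden

seq_length = 20
rnn = RNN(input_size=1, output_size=1, hidden_dim=32, n_layers=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.01)

trained_rnn = train(rnn, n_steps=75, print_every=15)
```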