PyTorch深度学习实践 (PyTorch Deep Learning Practice) — course notes
These notes are updated as the course progresses; completed on 2020-08-14.
Reference course: 劉二大人《PyTorch深度學習實踐》 (Liu Er, "PyTorch Deep Learning Practice").
Contents: (1) Course Overview (2) Linear Model (3) Gradient Descent (4) Backpropagation (5) Linear Regression with PyTorch (6) Logistic Regression (7) Classification with Multi-dimensional Inputs (8) Loading Datasets (9) Multi-class Classification (10) Convolutional Neural Networks (11) Recurrent Neural Networks
(1) Course Overview
This part covers forward propagation on a computational graph (evaluating the expression) and backpropagation (differentiating the expression); no further elaboration is needed.
(2) Linear Model
The machine-learning workflow: if the target value y is known during training, the learning is supervised. The data are split into a training set (used for learning) and a test set (used for evaluation). To guard against overfitting during training, part of the training set is further split off as a development (validation) set.
The hypothesis function and the cost function were already recorded in the Andrew Ng course notes; a small sketch of the linear model and its cost follows.
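A minimal sketch of the linear model ŷ = x·w and its mean-squared-error cost, sweeping candidate weights by brute force (the weight range and step are my own choices, not from the notes):

import numpy as np

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def forward(x, w):
    return x * w

def cost(w):
    # mean squared error of the hypothesis y = x * w on the training set
    return sum((forward(x, w) - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)

for w in np.arange(0.0, 4.1, 0.1):
    print('w = %.1f, cost = %.4f' % (w, cost(w)))   # cost is minimal near w = 2.0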
(3) Gradient Descent
Core update rule: w ← w − α·∂cost/∂w. Gradient descent is essentially a greedy method; because the cost function here is convex, the greedy choice is guaranteed to find the optimum, i.e. a local minimum is also the global minimum. The learning rate α must not be too large, otherwise the iteration may fail to converge.
A gradient-descent example (you only really learn this by writing it yourself):
def compute(x):
    return w * x

def cost(xs, ys):
    # mean squared error over the training set
    total = 0
    for x, y in zip(xs, ys):
        total += (compute(x) - y) ** 2
    return total / len(xs)

def gradient(xs, ys):
    # analytic gradient of the MSE cost with respect to w
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (w * x - y)
    return grad / len(xs)

w = 1.0
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print('f(4) before training:' + str(compute(4)))
for i in range(1000):
    tempCost = cost(xs, ys)
    tempGradient = gradient(xs, ys)
    w -= 0.01 * tempGradient          # update rule: w <- w - alpha * dcost/dw
    print('cost:' + str(tempCost) + '\t' + 'w:' + str(w))
print('f(4) after training:' + str(compute(4)))
Stochastic gradient descent (SGD): update using a single sample instead of the mean over all samples. In batch gradient descent the gradients for x(i) and x(i+1) can be computed in parallel (only the mean is needed), whereas in SGD the weight w is updated between consecutive samples, so the computations depend on each other and cannot be parallelized. Compared with batch gradient descent, SGD often reaches better solutions but costs more time, so in practice a compromise is used: mini-batch stochastic gradient descent. A minimal SGD variant of the example above is sketched below.
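A minimal stochastic-gradient-descent version of the example above (my own sketch, not part of the original notes); the weight is updated after every single sample:

def compute(x):
    return w * x

def loss(x, y):
    # loss on a single sample
    return (compute(x) - y) ** 2

def gradient(x, y):
    # gradient of the single-sample loss with respect to w
    return 2 * x * (w * x - y)

w = 1.0
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for epoch in range(100):
    for x, y in zip(xs, ys):
        w -= 0.01 * gradient(x, y)   # update after every sample
    print('epoch:', epoch + 1, 'w:', w, 'loss:', loss(1.0, 2.0))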
(4) Backpropagation
Intuition: forward propagation computes from the inputs toward the outputs; backpropagation computes gradients step by step from the output back toward the inputs (essentially the chain rule of differentiation). Once dL/dx is obtained, x is itself the output of the previous node, so the gradient keeps propagating backward.
The figure (not reproduced here) shows a two-layer neural network, with a nonlinear function applied to each layer's output (here the sigmoid function, introduced in the Andrew Ng course).
Building the computational graph with PyTorch: in PyTorch the Tensor class is the basic unit; here it mainly stores the weight w (data) and the derivative of the loss with respect to w, dLoss/dw (grad). Note that grad is itself a Tensor. Operations on Tensors build a computational graph, which is released once backpropagation completes. When only the numeric value of x's gradient is needed for an update (without extending the graph), write it as x.grad.data.
import torch

def compute(x):
    return w * x

def loss(x, y):
    return (compute(x) - y) ** 2

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])
w.requires_grad = True          # track operations on w so the computational graph is built

for i in range(100):
    for x, y in zip(x_data, y_data):
        lossResult = loss(x, y)
        lossResult.backward()               # backpropagate; fills w.grad and frees the graph
        w.data -= 0.01 * w.grad.data        # update with the raw value so no new graph is built
        w.grad.data.zero_()                 # clear the gradient before the next sample
    print('Time:' + str(i + 1) + '\tw:' + str(w.data) + '\tresult:' + str(compute(4).data))
(5) Linear Regression with PyTorch
Design steps:
- Prepare the dataset
- Design the model that computes ŷ (i.e. build the computational graph)
- Construct the loss function and the optimizer with the PyTorch API
- Train: forward pass to compute the loss, backward pass to compute the gradients, then update the weights
The course also shows a piece of Python syntax for passing a variable number of arguments (the slide is not reproduced here); a small sketch follows.
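Assuming the slide refers to Python's *args/**kwargs syntax for a variable number of arguments (an assumption on my part), a minimal illustration:

def func(*args, **kwargs):
    # args collects extra positional arguments into a tuple,
    # kwargs collects extra keyword arguments into a dict
    print(args)     # (1, 2, 3)
    print(kwargs)   # {'x': 4, 'y': 5}

func(1, 2, 3, x=4, y=5)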
Example:
import torch

x_data = torch.Tensor([[1.0], [2.0], [3.0]])
y_data = torch.Tensor([[2.0], [4.0], [6.0]])

class LinearModule(torch.nn.Module):
    def __init__(self):
        super(LinearModule, self).__init__()
        self.linear = torch.nn.Linear(1, 1)   # one input feature, one output

    def forward(self, x):
        y_pred = self.linear(x)
        return y_pred

model = LinearModule()
# size_average=False sums the squared errors (newer PyTorch spells this reduction='sum')
criterion = torch.nn.MSELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(1000):
    y_pred = model(x_data)            # forward: compute the predictions
    loss = criterion(y_pred, y_data)  # forward: compute the loss
    print(epoch, loss.data)
    optimizer.zero_grad()             # clear old gradients
    loss.backward()                   # backward: compute gradients
    optimizer.step()                  # update the weights

print(model.linear.weight.item())
print(model.linear.bias.item())
(6) Logistic Regression
Classification: logistic regression is a classification method. A classifier outputs the probability of each class and takes the most probable one as the prediction. In the earlier example, the output changes from a score to the probability of passing the exam. The logistic function (written σ) maps the real-valued output into [0, 1]. The loss function changes accordingly, to the binary cross entropy −(y·log(ŷ) + (1−y)·log(1−ŷ)).
import torch

class LogisticModel(torch.nn.Module):
    def __init__(self):
        super(LogisticModel, self).__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        # apply the sigmoid to the linear output (torch.sigmoid in newer PyTorch)
        y_pred = torch.nn.functional.sigmoid(self.linear(x))
        return y_pred

# binary cross entropy loss
criterion = torch.nn.BCELoss(size_average=False)
(7) Classification with Multi-dimensional Inputs
Logistic model with multi-dimensional input: each sample is a vector, and the output ŷ of a single sample is still a probability (ŷ ∈ [0, 1]). Compared with one-dimensional input, w*x changes from a scalar multiplication to a vector inner product. Merging the N per-sample equations into a single matrix operation exploits vectorization to speed up training.
When applied to a vector, the sigmoid function is computed element-wise.
A layered model example:
import torch
import numpy as np

# each row: 8 feature columns followed by 1 label column
xy = np.loadtxt('diabetes.csv.gz', delimiter=',', dtype=np.float32)
x_data = torch.from_numpy(xy[:, :-1])     # all columns except the last
y_data = torch.from_numpy(xy[:, [-1]])    # the last column, kept two-dimensional

class MultiDimensionLogisticModel(torch.nn.Module):
    def __init__(self):
        super(MultiDimensionLogisticModel, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear1(x))
        x = self.sigmoid(self.linear2(x))
        x = self.sigmoid(self.linear3(x))
        return x

model = MultiDimensionLogisticModel()
criterion = torch.nn.BCELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(1000):
    y_pred = model(x_data)
    loss = criterion(y_pred, y_data)
    print(epoch, loss.data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
(8) Loading Datasets
When using DataLoader for input, define an additional class that inherits from Dataset (an abstract class) and implements the __init__(), __getitem__() and __len__() methods.
Definitions:
- Epoch: one forward and backward pass over all samples.
- Batch Size: the number of samples per forward/backward pass.
- Iteration: the number of mini-batches processed; e.g. with 10,000 samples and a batch size of 1,000, one epoch takes 10 iterations.
Training under mini-batch follows this loop structure:
for epoch in range(training_epochs):
    for i in range(total_batch):
        ...  # train on one mini-batch
Dataset and DataLoader: DataLoader iterates in units of batches, with the batch size fixed when the loader is created. Dataset returns one sample at a time; DataLoader returns one batch of samples at a time.
Example:
import torch
import numpy as np
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

class DiabetesDataset(Dataset):
    def __init__(self, filepath):
        xy = np.loadtxt(filepath, delimiter=',', dtype=np.float32)
        self.x_data = torch.from_numpy(xy[:, :-1])
        self.y_data = torch.from_numpy(xy[:, [-1]])
        self.len = xy.shape[0]

    def __getitem__(self, index):
        # return one sample
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len

dataset = DiabetesDataset('diabetes.csv.gz')
train_loader = DataLoader(dataset=dataset, batch_size=32, shuffle=True, num_workers=2)

class MultiDimensionLogisticModel(torch.nn.Module):
    def __init__(self):
        super(MultiDimensionLogisticModel, self).__init__()
        self.linear1 = torch.nn.Linear(8, 6)
        self.linear2 = torch.nn.Linear(6, 4)
        self.linear3 = torch.nn.Linear(4, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x):
        x = self.sigmoid(self.linear1(x))
        x = self.sigmoid(self.linear2(x))
        x = self.sigmoid(self.linear3(x))
        return x

model = MultiDimensionLogisticModel()
criterion = torch.nn.BCELoss(size_average=False)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    for i, data in enumerate(train_loader, 0):
        x, y = data                       # one mini-batch of inputs and labels
        y_pred = model(x)
        loss = criterion(y_pred, y)
        print(epoch, i, loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
(9) Multi-class Classification
Introducing a Softmax layer: unlike binary classification, a multi-class classifier must output a probability for every class. With the binary approach (an independent sigmoid per class) the probabilities would not necessarily sum to 1. The fix is to replace the final Sigmoid layer with a Softmax layer, which computes softmax(z)_i = exp(z_i) / Σ_j exp(z_j), so the outputs are non-negative and sum to 1.
Concrete ways to apply the Softmax layer: ① compute it directly with numpy:
import numpy as np

y = np.array([1, 0, 0])                  # one-hot label
z = np.array([0.2, 0.1, -0.1])           # raw scores (logits)
y_pred = np.exp(z) / np.exp(z).sum()     # softmax
loss = (-y * np.log(y_pred)).sum()       # cross entropy against the one-hot label
② Use the NLLLoss loss function: it takes the data after Softmax and Log (i.e. log-probabilities) together with the label indices and outputs the loss; a small sketch follows.
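A small sketch of option ② (my own illustration of the PyTorch API, not code from the notes): apply LogSoftmax to the scores, then hand the log-probabilities and the class indices to NLLLoss:

import torch

z = torch.Tensor([[0.2, 0.1, -0.1]])                     # raw scores for one sample
log_probs = torch.nn.functional.log_softmax(z, dim=1)    # Softmax followed by Log
target = torch.LongTensor([0])                           # class index of the sample
loss = torch.nn.NLLLoss()(log_probs, target)
print(loss)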
③ Use the CrossEntropyLoss (cross-entropy) function, which combines the LogSoftmax and NLLLoss steps:
import torch

y = torch.LongTensor([2, 0, 1])    # class indices for three samples
# two candidate sets of raw scores (logits); CrossEntropyLoss applies softmax internally
z_1 = torch.Tensor([[0.1, 0.2, 0.9], [1.1, 0.1, 0.2], [0.2, 2.1, 0.1]])
z_2 = torch.Tensor([[0.9, 0.2, 0.1], [0.1, 0.1, 0.5], [0.2, 0.1, 0.7]])
criterion = torch.nn.CrossEntropyLoss()
print(criterion(z_1, y), criterion(z_2, y))   # z_1 matches the labels better, so its loss is smaller
Image classification with PyTorch: ① Data input. Images are usually read in as w×h×c; for convenience PyTorch converts them to c×w×h.
from torchvision import transforms   # import assumed; the original snippet omits it

transform = transforms.Compose([
    transforms.ToTensor(),                        # w x h x c in [0, 255] -> c x w x h in [0, 1]
    transforms.Normalize((0.1307,), (0.3081,))    # MNIST mean and standard deviation
])
② Model design: view reshapes the N inputs of shape 1×28×28 into N rows of 784 values each. Note that the last layer is not passed through an activation (CrossEntropyLoss already includes the softmax); a sketch of such a model follows.
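The model itself appears only as a slide in the course; a minimal sketch of a fully connected classifier of this shape — the hidden-layer sizes are my own assumptions, only the 784-dimensional flattened input and the un-activated 10-class output follow the text:

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.l1 = torch.nn.Linear(784, 512)
        self.l2 = torch.nn.Linear(512, 128)
        self.l3 = torch.nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 784)          # flatten N x 1 x 28 x 28 into N x 784
        x = F.relu(self.l1(x))
        x = F.relu(self.l2(x))
        return self.l3(x)            # no activation here; CrossEntropyLoss handles the softmax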
(10) Convolutional Neural Networks
Image storage formats. RGB (raster): the input image has shape 3*w*h, the channels are red/green/blue, and each element takes values 0–255. Vector graphics: store the drawing instructions instead of pixels, so the image does not pixelate when scaled up.
Convolution on images: convolving an image essentially operates on all channels of a local patch of the image at a time. The operation changes the image's c, w and h.
Single-channel convolution: slide the kernel over the input, multiply element-wise and sum, producing one element of the output.
Multi-channel convolution: essentially the sum of per-channel single-channel convolutions. Note that the kernel's channel count must equal the input's channel count. This turns an n-channel input into a 1-channel output.
To obtain an m-channel output, prepare m such kernels; the weights then form a four-dimensional tensor of shape m*n*k_width*k_height.
Convolution example:
import torch

in_channels, out_channels = 5, 10
width, height = 100, 100
kernel_size = 3
batch_size = 1

# input shape: (batch, channels, width, height)
input = torch.randn(batch_size, in_channels, width, height)
conv_layer = torch.nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size)
output = conv_layer(input)   # shape (1, 10, 98, 98): no padding, so width and height shrink by 2
Padding in convolutional layers: pad around the input (usually with zeros) to obtain an output of the desired shape.
import torch

input = [3, 4, 6, 5, 7,
         2, 4, 6, 8, 2,
         1, 6, 7, 8, 4,
         9, 7, 4, 6, 2,
         3, 7, 5, 4, 1]
input = torch.Tensor(input).view(1, 1, 5, 5)   # batch=1, channel=1, 5x5
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
# set the kernel weights by hand so the output is reproducible
kernel = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9]).view(1, 1, 3, 3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)   # padding=1 keeps the 5x5 size
Stride: changing padding=1 to stride=2 in the code above yields a 2*2 output; a short continuation of the example is sketched below.
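A sketch of the stride variant, reusing the input and kernel from the padding example above (my own continuation):

# same 5x5 input and hand-set 3x3 kernel as above, but stride=2 and no padding
conv_layer = torch.nn.Conv2d(1, 1, kernel_size=3, stride=2, bias=False)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output.shape)   # torch.Size([1, 1, 2, 2])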
Max Pooling layer: a MaxPooling with kernel_size=2 halves both the rows and the columns of the input. The operation is independent of the channel dimension and does not change the number of channels.
import torch

input = [3, 4, 6, 5,
         2, 4, 6, 8,
         1, 6, 7, 8,
         9, 7, 4, 6]
input = torch.Tensor(input).view(1, 1, 4, 4)
maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)   # takes the max of each 2x2 block -> shape (1, 1, 2, 2)
A convolutional neural network example. Note: after the last Conv–Pooling–ReLU block, the features are flattened and fed into the linear layer.
import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.linear = torch.nn.Linear(320, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = self.pooling(F.relu(self.conv1(x)))   # 1x28x28 -> 10x12x12
        x = self.pooling(F.relu(self.conv2(x)))   # 10x12x12 -> 20x4x4
        x = x.view(batch_size, -1)                # flatten to 320 features
        x = self.linear(x)                        # no activation on the last layer
        return x

model = Net()
Using the GPU for computation:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)   # move the model's parameters and buffers to the GPU
# inside the training loop, move every mini-batch to the same device
input, target = input.to(device), target.to(device)
1x1 convolution: essentially fuses the information of the corresponding elements across channels. Its purpose is to change only the number of channels while keeping the width and height, thereby reducing the amount of computation; a small sketch follows.
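A small illustration of this point (the channel counts are my own choices): a 1x1 convolution changes only the channel count, so inserting one before an expensive convolution cuts the amount of computation while the output shape stays the same:

import torch

x = torch.randn(1, 192, 28, 28)

# direct 5x5 convolution on all 192 channels
direct = torch.nn.Conv2d(192, 32, kernel_size=5, padding=2)

# 1x1 convolution first reduces 192 channels to 16, then the 5x5 convolution is much cheaper
reduced = torch.nn.Sequential(
    torch.nn.Conv2d(192, 16, kernel_size=1),
    torch.nn.Conv2d(16, 32, kernel_size=5, padding=2),
)

print(direct(x).shape, reduced(x).shape)   # both: torch.Size([1, 32, 28, 28])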
Inception Module: apply several different convolutions to the same input, concatenate the results, and let training decide which path works best. The different paths may only change the number of channels, not the width and height (because the results must be concatenated). The Concatenate operation joins the different convolution outputs along the channel dimension.
# Fragments from inside an Inception module (self is the module, x its input):

# pooling branch: average pooling followed by a 1x1 convolution
self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)
branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
branch_pool = self.branch_pool(branch_pool)

# 1x1 branch
self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
branch1x1 = self.branch1x1(x)

# 5x5 branch: a 1x1 convolution first, then 5x5 with padding=2 to keep width and height
self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5)

# 3x3 branch: a 1x1 convolution, then two 3x3 convolutions with padding=1
self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
branch3x3 = self.branch3x3_1(x)
branch3x3 = self.branch3x3_2(branch3x3)
branch3x3 = self.branch3x3_3(branch3x3)

# concatenate the four branch outputs along the channel dimension: 16 + 24 + 24 + 24 = 88 channels
outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
return torch.cat(outputs, dim=1)
Full example:
import torch
from torch import nn
import torch.nn.functional as F

class InceptionA(nn.Module):
    def __init__(self, in_channels):
        super(InceptionA, self).__init__()
        self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2)
        self.branch3x3_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3x3_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
        self.branch3x3_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1)
        self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)
        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)
        branch3x3 = self.branch3x3_1(x)
        branch3x3 = self.branch3x3_2(branch3x3)
        branch3x3 = self.branch3x3_3(branch3x3)
        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)
        outputs = [branch1x1, branch5x5, branch3x3, branch_pool]
        return torch.cat(outputs, dim=1)   # 16 + 24 + 24 + 24 = 88 output channels

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(88, 20, kernel_size=5)   # 88 = channels produced by InceptionA
        self.incep1 = InceptionA(in_channels=10)
        self.incep2 = InceptionA(in_channels=20)
        self.mp = nn.MaxPool2d(2)
        self.linear = nn.Linear(1408, 10)               # 88 channels x 4 x 4 after the second block

    def forward(self, x):
        in_size = x.size(0)
        x = F.relu(self.mp(self.conv1(x)))
        x = self.incep1(x)
        x = F.relu(self.mp(self.conv2(x)))
        x = self.incep2(x)
        x = x.view(in_size, -1)
        x = self.linear(x)
        return x
Residual Block: it does not change the input's c, w, h. Example: a block that applies two convolutions and adds the input to the result.
from torch import nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channel):
        super(ResidualBlock, self).__init__()
        # kernel_size=3 with padding=1 keeps width and height unchanged
        self.conv1 = nn.Conv2d(channel, channel, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channel, channel, kernel_size=3, padding=1)

    def forward(self, x):
        y = F.relu(self.conv1(x))
        y = self.conv2(y)
        y = F.relu(x + y)   # the skip connection: add the input before the final ReLU
        return y

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)
        self.rb1 = ResidualBlock(16)
        self.rb2 = ResidualBlock(32)
        self.mp = nn.MaxPool2d(2)
        self.linear = nn.Linear(512, 10)   # 32 channels x 4 x 4

    def forward(self, x):
        in_size = x.size(0)
        x = self.mp(F.relu(self.conv1(x)))
        x = self.rb1(x)
        x = self.mp(F.relu(self.conv2(x)))
        x = self.rb2(x)
        x = x.view(in_size, -1)
        x = self.linear(x)
        return x
(11) Recurrent Neural Networks
RNNs handle inputs with sequential structure in which earlier items influence later ones, e.g. predicting future weather from the temperature, air pressure and weather of the past few days.
RNNCell: essentially a linear layer; the same cell is reused in a loop over the sequence. Constructing an RNNCell requires inputSize and hiddenSize.
When using it, be clear about the input/output shapes. With batchSize=1, seqLen=3, inputSize=4, hiddenSize=2: input.shape=(batchSize, inputSize), output.shape=(batchSize, hiddenSize), dataset.shape=(seqLen, batchSize, inputSize).
import torch

batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2

cell = torch.nn.RNNCell(input_size=input_size, hidden_size=hidden_size)

# the whole sequence: (seqLen, batchSize, inputSize)
dataset = torch.randn(seq_len, batch_size, input_size)
# the initial hidden state: (batchSize, hiddenSize)
hidden = torch.zeros(batch_size, hidden_size)
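The snippet above stops before actually running the cell; a minimal completion (my own sketch, following the shapes listed above) that feeds the sequence through the cell one step at a time:

for idx, input in enumerate(dataset):
    # input: (batchSize, inputSize); hidden: (batchSize, hiddenSize)
    hidden = cell(input, hidden)
    print('step', idx, 'hidden:', hidden)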
RNN
input.shape=(seqLen, batchSize, inputSize) h0.shape=(numLayers, batchSize, hiddenSize) output.shape=(seqLen, batchSize, hiddenSize) hn.shape=(numLayers, batchSize, hiddenSize)
import torch

batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2
num_layers = 1

cell = torch.nn.RNN(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers)

inputs = torch.randn(seq_len, batch_size, input_size)        # (seqLen, batchSize, inputSize)
hidden = torch.zeros(num_layers, batch_size, hidden_size)    # (numLayers, batchSize, hiddenSize)
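Again the snippet stops before the forward call; a minimal completion (my own sketch):

out, hidden = cell(inputs, hidden)
print(out.shape)      # torch.Size([3, 1, 2]) -> (seqLen, batchSize, hiddenSize)
print(hidden.shape)   # torch.Size([1, 1, 2]) -> (numLayers, batchSize, hiddenSize)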
Example: string transformation with RNNCell. The course figure (not reproduced here) shows the overall structure: one cell invocation per input character.
Convert the string "hello" into "ohlol". First, one-hot encode each character of the input to obtain the RNNCell inputs; then inputSize=4 and seqLen=5.
import torch

input_size = 4
hidden_size = 4
batch_size = 1

idx2char = ['e', 'h', 'l', 'o']
x_data = [1, 0, 2, 2, 3]            # "hello"
y_data = [3, 1, 2, 3, 2]            # "ohlol"

one_hot_lookup = [[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]]
x_one_hot = [one_hot_lookup[x] for x in x_data]

inputs = torch.Tensor(x_one_hot).view(-1, batch_size, input_size)   # (seqLen, batchSize, inputSize)
labels = torch.LongTensor(y_data).view(-1, 1)                       # one label per step

class Model(torch.nn.Module):
    def __init__(self, input_size, hidden_size, batch_size):
        super(Model, self).__init__()
        self.batch_size = batch_size
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.rnncell = torch.nn.RNNCell(input_size=self.input_size, hidden_size=self.hidden_size)

    def forward(self, input, hidden):
        hidden = self.rnncell(input, hidden)
        return hidden

    def init_hidden(self):
        return torch.zeros(self.batch_size, self.hidden_size)

net = Model(input_size, hidden_size, batch_size)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.1)

for epoch in range(15):
    loss = 0
    optimizer.zero_grad()
    hidden = net.init_hidden()
    for input, label in zip(inputs, labels):
        hidden = net(input, hidden)
        loss += criterion(hidden, label)   # accumulate the loss over the whole sequence
    loss.backward()
    optimizer.step()
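idx2char is defined above but never used; a short sketch (my own addition) of how the trained cell's predictions can be decoded back into characters:

hidden = net.init_hidden()
predicted = []
for input in inputs:
    hidden = net(input, hidden)
    _, idx = hidden.max(dim=1)            # most probable character at this step
    predicted.append(idx2char[idx.item()])
print('Predicted string:', ''.join(predicted))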
Example: the same string transformation using RNN instead of RNNCell:
import torch

input_size = 4
hidden_size = 4
num_layers = 1
batch_size = 1
seq_len = 5

idx2char = ['e', 'h', 'l', 'o']
x_data = [1, 0, 2, 2, 3]            # "hello"
y_data = [3, 1, 2, 3, 2]            # "ohlol"

one_hot_lookup = [[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]]
x_one_hot = [one_hot_lookup[x] for x in x_data]

inputs = torch.Tensor(x_one_hot).view(seq_len, batch_size, input_size)
labels = torch.LongTensor(y_data)

class Model(torch.nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, batch_size, seq_len):
        super(Model, self).__init__()
        self.input_size = input_size
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.batch_size = batch_size
        self.seq_len = seq_len
        self.rnn = torch.nn.RNN(self.input_size, self.hidden_size, self.num_layers)

    def forward(self, input):
        hidden = torch.zeros(self.num_layers, self.batch_size, self.hidden_size)
        out, _ = self.rnn(input, hidden)
        return out.view(-1, self.hidden_size)   # (seqLen * batchSize, hiddenSize)

net = Model(input_size, hidden_size, num_layers, batch_size, seq_len)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)

for epoch in range(15):
    optimizer.zero_grad()
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
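As above, the predicted string can be read off by taking the argmax of each output row and mapping it through idx2char (my own sketch):

_, idx = net(inputs).max(dim=1)
print('Predicted string:', ''.join([idx2char[i.item()] for i in idx]))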