DCGANs (Deep Convolutional Generative Adversarial Networks): Paper Notes and a PyTorch Implementation
Contents
- INTRODUCTION
- Approach and Model Architecture
- Specific architectural changes
- PyTorch implementation
- A DCGAN that generates a chosen digit
INTRODUCTION
GANs have one big problem: training is very unstable, and the trained model may well generate complete garbage. As the paper puts it:
GANs have been known to be unstable to train, often resulting in generators that produce nonsensical outputs.
Contributions of the paper
- It proposes and evaluates a set of constraints on the architecture of convolutional GANs that make them more stable to train; this class of networks is named Deep Convolutional GANs (DCGANs).
We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN).
- It uses the trained discriminator for image classification tasks, with results competitive with other unsupervised learning algorithms.
We use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms.
- It visualizes the filters learned by the GANs and shows empirically that specific filters have learned to draw specific objects.
We visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects.
- It shows that the generator has interesting vector arithmetic properties that allow easy manipulation of many semantic qualities of the generated samples.
We show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.
Approach and Model Architecture
Historically, attempts to scale up GANs by using CNNs to model images were remarkably unsuccessful; this is what motivated the authors of LAPGANs to develop their approach. We, too, ran into many difficulties when trying to improve GANs with CNN architectures. However, after extensive model exploration we identified a family of architectures that trains stably across a range of datasets and allows training higher-resolution and deeper generative models.
- First, replace all deterministic spatial pooling functions (such as max pooling) with strided convolutions, letting the generator and the discriminator each learn their own spatial up- and downsampling (see the shape check right after this list).
- Second, the trend toward eliminating fully connected layers on top of convolutional features. The strongest example is global average pooling, which has been used in state-of-the-art image classification models; it increases model stability but hurts convergence speed. A good middle ground: the first layer of the GAN takes the noise distribution Z as input, and the result is reshaped into a 4-dimensional tensor that seeds the start of the convolution stack.
- The paper's generator figure illustrates this process. The upsampling operation it uses is often (incorrectly) called "deconvolution"; the correct name is fractionally-strided convolution. In the discriminator, the output of the last convolution layer is fed into a sigmoid.
- Third, Batch Normalization, which stabilizes learning by normalizing the input of each unit to zero mean and unit variance. This counteracts poor initialization and helps gradients flow into deeper models. Batchnorm must not be applied to the generator's output layer or the discriminator's input layer, however: applying it directly to all layers results in sample oscillation and model instability.
- ReLU is used throughout the generator except for the output layer, which uses Tanh. We observed that a bounded activation let the model learn more quickly to saturate and cover the color space of the training distribution.
- In the discriminator, we found LeakyReLU to work better. The implementation below starts from the original GAN and applies exactly these modifications.
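To make the first guideline concrete, here is a quick shape check (my own sketch, not from the paper): a stride-2 convolution learns the downsampling that pooling would otherwise do, and a stride-2 transposed (fractionally-strided) convolution learns the matching upsampling.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 14, 14)  # a 14x14 feature map with 64 channels

# Discriminator side: stride-2 convolution = learned downsampling
down = nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1)
print(down(x).shape)            # torch.Size([1, 128, 7, 7])

# Generator side: stride-2 transposed convolution = learned upsampling
# (often mislabeled "deconvolution"; properly "fractionally-strided")
up = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)
print(up(down(x)).shape)        # torch.Size([1, 64, 14, 14])
```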
具體的修改細節(jié)
PyTorch implementation
The output of the code below is honestly not great, for one main reason: I set EPOCH to 1, because training is painfully slow. If you set it higher, the results get much better.
There is a second problem: all ten digit classes are thrown together and trained on directly, with no class separation at all. No wonder the Tsinghua guys proposed Triple-GAN! Hunting for features in a pile of samples that do not even belong to one class is bound to work badly.
The code below is still worth studying, though: with one small operation on the dataset it turns into something that behaves properly (the all-classes version is just too inefficient). A rewritten DCGAN with a single generation target follows further down.
```python
import torch
import torch.nn as nn
import torchvision
import torch.utils.data as Data
import matplotlib.pyplot as plt

# Hyper Parameters
EPOCH = 1              # how many passes over the full training set
BATCH_SIZE = 50
LR = 0.0002            # learning rate
DOWNLOAD_MNIST = True  # skipped automatically if the data is already there
len_Z = 100            # random input channels for the Generator
g_hidden_channal = 64
d_hidden_channal = 64
image_channal = 1      # MNIST is grayscale, so a single channel

# MNIST handwritten digits
train_data = torchvision.datasets.MNIST(
    root='./mnist/',    # where to save / load the data
    train=True,         # this is the training data
    # ToTensor converts PIL.Image / numpy.ndarray to a FloatTensor
    # (C x H x W) normalized to [0.0, 1.0].
    # NOTE: [0, 1] does not match the generator's Tanh range [-1, 1];
    # see the Normalize remark earlier in the post.
    transform=torchvision.transforms.ToTensor(),
    download=DOWNLOAD_MNIST,
)

test_data = torchvision.datasets.MNIST(
    root='./mnist/',
    train=False
)

# yields BATCH_SIZE images per step; each image is 28x28
train_loader = Data.DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True
)


class Generator(nn.Module):
    def __init__(self, len_Z, hidden_channal, output_channal):
        super(Generator, self).__init__()
        self.layer1 = nn.Sequential(
            nn.ConvTranspose2d(
                in_channels=len_Z,
                out_channels=hidden_channal * 4,
                kernel_size=4,
            ),
            nn.BatchNorm2d(hidden_channal * 4),
            nn.ReLU()
        )
        # -> [BATCH, hidden_channal * 4, 4, 4]
        self.layer2 = nn.Sequential(
            nn.ConvTranspose2d(
                in_channels=hidden_channal * 4,
                out_channels=hidden_channal * 2,
                kernel_size=3,  # chosen so the generated image ends up 28x28
                stride=2,
                padding=1
            ),
            nn.BatchNorm2d(hidden_channal * 2),
            nn.ReLU()
        )
        self.layer3 = nn.Sequential(
            nn.ConvTranspose2d(
                in_channels=hidden_channal * 2,
                out_channels=hidden_channal,
                kernel_size=4,
                stride=2,
                padding=1
            ),
            nn.BatchNorm2d(hidden_channal),
            nn.ReLU()
        )
        # No batchnorm on the output layer, and Tanh as its activation,
        # exactly as the paper's guidelines prescribe.
        self.layer4 = nn.Sequential(
            nn.ConvTranspose2d(
                in_channels=hidden_channal,
                out_channels=output_channal,
                kernel_size=4,
                stride=2,
                padding=1
            ),
            nn.Tanh()
        )

    def forward(self, x):
        # [50, 100, 1, 1]
        out = self.layer1(x)
        # [50, 256, 4, 4]
        out = self.layer2(out)
        # [50, 128, 7, 7]
        out = self.layer3(out)
        # [50, 64, 14, 14]
        out = self.layer4(out)
        # [50, 1, 28, 28]
        return out

# # Test Generator
# G = Generator(len_Z, g_hidden_channal, image_channal)
# data = torch.randn((BATCH_SIZE, len_Z, 1, 1))
# print(G(data).shape)


class Discriminator(nn.Module):
    def __init__(self, input_channal, hidden_channal):
        super(Discriminator, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(
                in_channels=input_channal,
                out_channels=hidden_channal,
                kernel_size=4,
                stride=2,
                padding=1
            ),
            nn.BatchNorm2d(hidden_channal),
            nn.LeakyReLU(0.2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(
                in_channels=hidden_channal,
                out_channels=hidden_channal * 2,
                kernel_size=4,
                stride=2,
                padding=1
            ),
            nn.BatchNorm2d(hidden_channal * 2),
            nn.LeakyReLU(0.2)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(
                in_channels=hidden_channal * 2,
                out_channels=hidden_channal * 4,
                kernel_size=3,
                stride=2,
                padding=1
            ),
            nn.BatchNorm2d(hidden_channal * 4),
            nn.LeakyReLU(0.2)
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(
                in_channels=hidden_channal * 4,
                out_channels=1,
                kernel_size=4,
                stride=1,
                padding=0
            ),
            nn.Sigmoid()
        )
        # output: [BATCH, 1, 1, 1]

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        return out

# # Test Discriminator
# D = Discriminator(1, d_hidden_channal)
# data = torch.randn((BATCH_SIZE, 1, 28, 28))
# print(D(data).shape)


G = Generator(len_Z, g_hidden_channal, image_channal)
D = Discriminator(image_channal, d_hidden_channal)

# loss & optimizer
criterion = nn.BCELoss()
optimD = torch.optim.Adam(D.parameters(), lr=LR)
optimG = torch.optim.Adam(G.parameters(), lr=LR)

label_Real = torch.ones(BATCH_SIZE)   # targets for real images
label_Fake = torch.zeros(BATCH_SIZE)  # targets for generated images

for epoch in range(EPOCH):
    for step, (images, imagesLabel) in enumerate(train_loader):
        G_ideas = torch.randn((BATCH_SIZE, len_Z, 1, 1))  # noise batch Z
        G_paintings = G(G_ideas)                          # fake images

        # ---- train D: push D(real) -> 1 and D(fake) -> 0 ----
        # detach() keeps the D loss from backpropagating into G
        prob_real = torch.squeeze(D(images))
        prob_fake = torch.squeeze(D(G_paintings.detach()))
        errD = criterion(prob_real, label_Real) + criterion(prob_fake, label_Fake)
        optimD.zero_grad()
        errD.backward()
        optimD.step()

        # ---- train G: fool the updated D into predicting "real" ----
        prob_fake = torch.squeeze(D(G_paintings))
        errG = criterion(prob_fake, label_Real)
        optimG.zero_grad()
        errG.backward()
        optimG.step()

        if (step + 1) % 100 == 0:
            picture = torch.squeeze(G_paintings[0]).detach().numpy()
            plt.imshow(picture, cmap=plt.cm.gray_r)
            plt.show()
```

A DCGAN that generates a chosen digit
This builds on another post of mine:
- MNIST: selecting a training set restricted to a specific digit
- Combine the code above with the code from that post, and you get the version below.
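As an aside, if you would rather not subclass `MNIST` the way the linked post does, recent torchvision versions expose the labels as `targets`, and a `torch.utils.data.Subset` does the filtering in a few lines (a hedged alternative sketch, not the code this post actually uses):

```python
import torch
import torchvision

mnist = torchvision.datasets.MNIST(
    root='./mnist/', train=True,
    transform=torchvision.transforms.ToTensor(), download=True)

targetNum = 1  # keep only the digit "1"
idx = (mnist.targets == targetNum).nonzero(as_tuple=True)[0]
ones_only = torch.utils.data.Subset(mnist, idx.tolist())
print(len(ones_only))  # number of '1' samples in the training set
```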
The evolution of the generator during training:
```python
import torch
import torch.nn as nn
import torchvision
import torch.utils.data as Data
import matplotlib.pyplot as plt

# Hyper Parameters
EPOCH = 10              # how many passes over the (filtered) training set
BATCH_SIZE = 50
LR = 0.0002             # learning rate
DOWNLOAD_MNIST = False  # skipped automatically if the data is already there
len_Z = 100             # random input channels for the Generator
g_hidden_channal = 64
d_hidden_channal = 64
image_channal = 1       # MNIST is grayscale, so a single channel


# MNIST subclass that keeps only the samples labeled targetNum.
# train_data / train_labels are the attribute names used by older
# torchvision releases; newer versions call them data / targets.
class myMNIST(torchvision.datasets.MNIST):
    def __init__(self, root, train=True, transform=None,
                 target_transform=None, download=False, targetNum=None):
        super(myMNIST, self).__init__(
            root,
            train=train,
            transform=transform,
            target_transform=target_transform,
            download=download
        )
        if targetNum:
            self.train_data = self.train_data[self.train_labels == targetNum]
            # truncate to a whole number of batches
            self.train_data = self.train_data[
                :int(self.__len__() / BATCH_SIZE) * BATCH_SIZE]
            self.train_labels = self.train_labels[
                self.train_labels == targetNum][
                :int(self.__len__() / BATCH_SIZE) * BATCH_SIZE]

    def __len__(self):
        if self.train:
            return self.train_data.shape[0]
        else:
            return 10000


train_data = myMNIST(
    root='./mnist/',    # where to save / load the data
    train=True,         # this is the training data
    transform=torchvision.transforms.ToTensor(),
    download=DOWNLOAD_MNIST,
    targetNum=1         # train on the digit "1" only
)
print(len(train_data))
# print(train_data.shape)

# yields BATCH_SIZE images per step; each image is 28x28
train_loader = Data.DataLoader(
    dataset=train_data,
    batch_size=BATCH_SIZE,
    shuffle=True
)

# The Generator and Discriminator classes are exactly the same as in the
# previous listing and are omitted here.

G = Generator(len_Z, g_hidden_channal, image_channal)
D = Discriminator(image_channal, d_hidden_channal)

# loss & optimizer
criterion = nn.BCELoss()
optimD = torch.optim.Adam(D.parameters(), lr=LR)
optimG = torch.optim.Adam(G.parameters(), lr=LR)

label_Real = torch.ones(BATCH_SIZE)
label_Fake = torch.zeros(BATCH_SIZE)

for epoch in range(EPOCH):
    for step, (images, imagesLabel) in enumerate(train_loader):
        G_ideas = torch.randn((BATCH_SIZE, len_Z, 1, 1))
        G_paintings = G(G_ideas)

        # ---- train D: push D(real) -> 1 and D(fake) -> 0 ----
        prob_real = torch.squeeze(D(images))
        prob_fake = torch.squeeze(D(G_paintings.detach()))
        errD = criterion(prob_real, label_Real) + criterion(prob_fake, label_Fake)
        optimD.zero_grad()
        errD.backward()
        optimD.step()

        # ---- train G: fool the updated D into predicting "real" ----
        prob_fake = torch.squeeze(D(G_paintings))
        errG = criterion(prob_fake, label_Real)
        optimG.zero_grad()
        errG.backward()
        optimG.step()

    # show one generated sample at the end of each epoch
    picture = torch.squeeze(G_paintings[0]).detach().numpy()
    plt.imshow(picture, cmap=plt.cm.gray_r)
    plt.show()
```

Summary
With the DCGAN constraints in place (strided convolutions instead of pooling, batchnorm in the right places, ReLU/LeakyReLU activations, and a Tanh output layer), training is far more stable than with a vanilla GAN, and narrowing the training set down to a single digit visibly improves what this small model can generate.
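A final practical note: to keep a record of the generator's evolution without clicking through a `plt.show()` window for every snapshot, one option is to save an image grid from a fixed noise batch at the end of each epoch in the loop above. This sketch uses `torchvision.utils.save_image` and is my suggestion, not part of the original code:

```python
import torch
from torchvision.utils import save_image

fixed_Z = torch.randn(64, len_Z, 1, 1)  # reuse the same noise every epoch

# At the end of each epoch (inside `for epoch in range(EPOCH):`):
with torch.no_grad():
    fake = G(fixed_Z)
# Tanh outputs lie in [-1, 1]; normalize=True rescales the grid to [0, 1].
save_image(fake, 'epoch_%03d.png' % epoch, nrow=8, normalize=True)
```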