Deep Learning Notes: Image Style Transfer in PyTorch (not based on autoencoders or domain-adversarial methods)
Table of Contents
- Paper link:
- Main idea:
- PyTorch implementation:
- Computing the content loss:
- Computing the style loss:
- Computing the total loss:
- Training process:
Paper link:
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf
The authors attack this problem with what is essentially a classical optimization method wearing a deep-learning coat. The technique itself is not one to emulate today, but the idea behind it is brilliant.
Main idea:
1. Find a way to separate style features from content features.
2. How to get content features: the authors take a shortcut and directly use the conv4_2 feature map of VGG19 as the content features.
3. How to get style features: here the shortcut is ingenious. Style is taken to be the correlation between feature maps, so the authors compute style correlations from the feature maps of five VGG19 layers: conv1_1, conv2_1, conv3_1, conv4_1, and conv5_1. They also found that different layers affect style differently (earlier layers give fine-grained texture, later layers coarser strokes), so each of the five layers' losses is given its own weight.
4. So what is the objective? The new image should match the content image in content and the style image in style: $L_{total} = \alpha L_{content} + \beta L_{style}$. The result must first resemble the content and only then take on the style, so the content term should carry more weight. (The full loss definitions from the paper follow this list.)
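For reference, the loss definitions from the paper, where $F^l$ and $P^l$ are the feature maps of the generated and content images at layer $l$, $G^l$ and $A^l$ are the Gram matrices of the generated and style images, $N_l$ and $M_l$ are the number of feature maps and their spatial size, and $w_l$ are the per-layer weights:

$$
\begin{aligned}
L_{content} &= \tfrac{1}{2} \sum_{i,j} \bigl(F^l_{ij} - P^l_{ij}\bigr)^2, \qquad
G^l_{ij} = \sum_k F^l_{ik} F^l_{jk}, \\
L_{style} &= \sum_l \frac{w_l}{4 N_l^2 M_l^2} \sum_{i,j} \bigl(G^l_{ij} - A^l_{ij}\bigr)^2, \qquad
L_{total} = \alpha L_{content} + \beta L_{style}.
\end{aligned}
$$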
PyTorch implementation:
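All the snippets below need a frozen VGG19 and a helper that collects activations at the layers named above. Here is a minimal sketch of that setup; the get_features helper and its layer-index mapping are my assumptions (the torchvision indices shown correspond to the named VGG19 conv layers), not code from the paper:

```python
import torch
import torch.optim as optim
from torchvision import models

# pretrained VGG19 feature extractor with frozen weights
vgg = models.vgg19(pretrained=True).features
for param in vgg.parameters():
    param.requires_grad_(False)

def get_features(image, model):
    """Run an image through the model and collect activations
    at the content and style layers (hypothetical helper)."""
    # torchvision layer indices -> the paper's layer names
    layers = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
              '19': 'conv4_1', '21': 'conv4_2',  # conv4_2 = content layer
              '28': 'conv5_1'}
    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features
```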
Computing the content loss:
As mentioned above, the authors simply take the conv4_2 feature map of VGG19 as the content features, so the content loss is just the difference between the two feature maps, as sketched below.
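A minimal sketch, reusing the get_features helper assumed above; content and target are preprocessed image tensors:

```python
# activations of the content and target images at the chosen layers
content_features = get_features(content, vgg)
target_features = get_features(target, vgg)

# MSE between the conv4_2 activations (the paper uses a sum of squared
# differences; the mean only changes the loss by a constant factor)
content_loss = torch.mean((target_features['conv4_2'] -
                           content_features['conv4_2'])**2)
```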
Computing the style loss:
```python
def gram_matrix(tensor):
    """Calculate the Gram matrix of a given tensor.

    Gram matrix: https://en.wikipedia.org/wiki/Gramian_matrix
    """
    # get the batch_size, depth, height, and width of the tensor
    _, d, h, w = tensor.size()
    # reshape so we're multiplying the features for each channel
    tensor = tensor.view(d, h * w)
    # calculate the gram matrix
    gram = torch.mm(tensor, tensor.t())
    return gram
```
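The style targets are computed once, before training. A sketch, assuming style is the preprocessed style-image tensor; the specific layer weights below are illustrative values (earlier layers, which carry finer texture, are weighted more heavily):

```python
# activations of the style image, computed once
style_features = get_features(style, vgg)
# Gram matrix of the style image at each layer
style_grams = {layer: gram_matrix(style_features[layer])
               for layer in style_features}
# per-layer style weights (illustrative values)
style_weights = {'conv1_1': 1.0, 'conv2_1': 0.75, 'conv3_1': 0.2,
                 'conv4_1': 0.2, 'conv5_1': 0.2}
```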
Computing the total loss:
```python
content_weight = 1     # alpha
style_weight = 1e6     # beta

# calculate the *total* loss
total_loss = content_weight * content_loss + style_weight * style_loss
```

The huge beta does not contradict point 4 above: each layer's style loss is normalized by d * h * w and ends up several orders of magnitude smaller than the content loss, so a large multiplier is needed just to bring the two terms onto the same scale.

Training process:
One might ask: since the VGG weights are frozen, what are the trainable parameters? The trainable parameter is the new image itself. The loss above is backpropagated to adjust the image's pixel values directly, which is a bit different from the usual training setup.
The new image evolves gradually from the content image: copy the content image and mark the copy as trainable, e.g.:
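A one-line sketch, assuming content is the content-image tensor and device is the torch device in use:

```python
# start from a copy of the content image and optimize its pixels directly
target = content.clone().to(device).requires_grad_(True)
```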
The training loop looks roughly like this:
```python
# for displaying the target image, intermittently
show_every = 400

# iteration hyperparameters
optimizer = optim.Adam([target], lr=0.003)
steps = 2000  # decide how many iterations to update your image (5000)

for ii in range(1, steps + 1):
    # get the features from your target image
    target_features = get_features(target, vgg)

    # the content loss
    content_loss = torch.mean((target_features['conv4_2'] -
                               content_features['conv4_2'])**2)

    # the style loss
    # initialize the style loss to 0
    style_loss = 0
    # then add to it for each layer's gram matrix loss
    for layer in style_weights:
        # get the "target" style representation for the layer
        target_feature = target_features[layer]
        target_gram = gram_matrix(target_feature)
        _, d, h, w = target_feature.shape
        # get the "style" style representation
        style_gram = style_grams[layer]
        # the style loss for one layer, weighted appropriately
        layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram)**2)
        # add to the style loss
        style_loss += layer_style_loss / (d * h * w)

    # calculate the *total* loss
    total_loss = content_weight * content_loss + style_weight * style_loss

    # update your target image
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```