
A Roundup of 2015 Deep Learning Papers

Published: 2025/7/25 | pytorch | 豆豆


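Leading researchers in computer vision and image processing, in China and abroad, take pride in publishing at the three top conferences (ICCV, CVPR and ECCV). The influence of these venues far exceeds that of ordinary SCI journal papers, and the papers they accept set the direction of future research. CVPR is the main computer vision conference and can be thought of as the Olympics of computer vision research. Today I will start by collecting the highlights of CVPR 2015 (which is already enough to digest for quite a while).
The accepted papers of CVPR 2015 are listed at:
http://www.cv-foundation.org/openaccess/CVPR2015.py

Let's go through them one by one. There is bound to be a paper here that you need!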

CNN Architectures

Papers on CNN network architectures:
1. Hypercolumns for Object Segmentation and Fine-Grained Localization
Authors: Bharath Hariharan, Pablo Arbeláez, Ross Girshick, Jitendra Malik

2. Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection
Authors: George Papandreou, Iasonas Kokkinos, Pierre-André Savalle

3. Going Deeper With Convolutions
Authors: Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
Note: I recommend this paper. It adopts the idea from "Network in Network" of replacing the fully-connected layer with a global average pooling layer. If you have read it, feel free to message me so we can compare notes.
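
To make the global average pooling idea in "Going Deeper With Convolutions" concrete, here is a minimal sketch in plain Python. It is not the paper's code: nested lists stand in for framework tensors, and the toy feature maps, the weights and the sizes in the parameter count are all values I made up for illustration.

```python
# Sketch only: plain Python lists stand in for framework tensors.

def global_average_pool(feature_maps):
    """Collapse each H x W feature map to its mean, giving one value per channel."""
    pooled = []
    for channel in feature_maps:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        pooled.append(total / count)
    return pooled


def linear(vector, weights, bias):
    """Plain linear classifier applied to the pooled C-dimensional vector."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


# Hypothetical toy input: 2 channels, each 2 x 2.
features = [
    [[1.0, 3.0], [5.0, 7.0]],   # channel 0, mean 4.0
    [[2.0, 2.0], [2.0, 2.0]],   # channel 1, mean 2.0
]
pooled = global_average_pool(features)

# Hypothetical weights for 3 classes.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
bias = [0.0, 0.0, 1.0]
scores = linear(pooled, weights, bias)

# Parameter count for illustrative sizes: 1024 channels, 7 x 7 maps, 1000 classes.
C, H, W, K = 1024, 7, 7, 1000
fc_on_flattened = C * H * W * K   # dense layer applied to the flattened maps
fc_after_gap = C * K              # dense layer applied after global average pooling

print(pooled, scores, fc_on_flattened // fc_after_gap)
```

With these made-up sizes, pooling first shrinks the classifier's weight matrix by a factor of H x W, which is the main practical appeal of the trick.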

4. Improving Object Detection With Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
Authors: Yuting Zhang, Kihyuk Sohn, Ruben Villegas, Gang Pan, Honglak Lee

5. Deep Neural Networks Are Easily Fooled: High Confidence Predictions for Unrecognizable Images
Authors: Anh Nguyen, Jason Yosinski, Jeff Clune

Action and Event Recognition

1. Deeply Learned Attributes for Crowded Scene Understanding
Authors: Jing Shao, Kai Kang, Chen Change Loy, Xiaogang Wang

2. Modeling Video Evolution for Action Recognition
Authors: Basura Fernando, Efstratios Gavves, José Oramas M., Amir Ghodrati, Tinne Tuytelaars

3. Joint Inference of Groups, Events and Human Roles in Aerial Videos
Authors: Tianmin Shu, Dan Xie, Brandon Rothrock, Sinisa Todorovic, Song Chun Zhu

Segmentation in Images and Video

1. Causal Video Object Segmentation From Persistence of Occlusions
Authors: Brian Taylor, Vasiliy Karasev, Stefano Soatto

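2. Fully Convolutional Networks for Semantic Segmentation
Authors: Jonathan Long, Evan Shelhamer, Trevor Darrell
Note: The paper treats fully connected layers as convolutional layers, so they too output feature maps. Compared with models such as Hypercolumns/HED, this means more layers of a pretrained model (VGG16, AlexNet, etc.) can be transferred. However, because the network is purely convolutional, the points of a feature map carry no distinguishing position information. Compare the claim made in Hypercolumns: a nose-like point in the upper half of the image can be labelled as a pedestrian pixel, but the same point in the lower half should be labelled as background. So position information is probably important and worth taking into account. This may be a trade-off between speed and performance.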
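
The "fully connected layer as convolution" idea can be checked by hand on a tiny case. The sketch below is my own illustration in plain Python, not code from the paper: the weights and inputs are made-up integers, and `fully_connected` and `conv_valid` are hypothetical helper names. It reshapes each fully connected unit into a kernel the size of the original input and then slides it over a larger input.

```python
def fully_connected(x, weights):
    """x: C x H x W input; weights: one C x H x W weight block per output unit."""
    return [sum(w[c][i][j] * x[c][i][j]
                for c in range(len(x))
                for i in range(len(x[0]))
                for j in range(len(x[0][0])))
            for w in weights]


def conv_valid(x, kernels):
    """'Valid' cross-correlation with stride 1. Returns K x H_out x W_out."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(kernels[0][0]), len(kernels[0][0][0])
    out = []
    for k in kernels:
        plane = []
        for top in range(H - kh + 1):
            row = []
            for left in range(W - kw + 1):
                row.append(sum(k[c][i][j] * x[c][top + i][left + j]
                               for c in range(C)
                               for i in range(kh)
                               for j in range(kw)))
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical layer: 2 output units over a 1-channel 2 x 2 input.
weights = [
    [[[1, 0], [0, 1]]],   # unit 0
    [[[1, 1], [1, 1]]],   # unit 1
]

crop = [[[1, 2], [3, 4]]]                       # the size the layer was built for
larger = [[[1, 2, 0], [3, 4, 1], [0, 1, 2]]]    # a bigger image, same top-left crop

fc_out = fully_connected(crop, weights)   # one score per unit
same_size = conv_valid(crop, weights)     # 1 x 1 map per unit, same numbers
dense = conv_valid(larger, weights)       # 2 x 2 map per unit

print(fc_out, same_size, dense)
```

On the original input size the convolution reproduces the fully connected output exactly, and on the larger input it yields a score at every location. The same weights are applied at each position, which is exactly why the resulting map has no notion of where in the image a point sits.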

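3. Is Object Localization for Free? - Weakly-Supervised Learning With Convolutional Neural Networks
Note: A paper on weakly supervised object detection. First, treating the fc layers as conv layers is the same idea as in the paper above. It also regards the feature map just before the final max pooling as containing class localization information. However, judging from the results in Section 5, "Does adding object-level supervision help classification", the results are good but this physical interpretation may be incomplete.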
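
As a rough picture of what "the feature map before the final max pooling contains localization information" means, here is a small sketch in plain Python. It is an illustration under my own assumptions, not the paper's implementation: the class names and score values are invented, and `global_max_pool` is a hypothetical helper.

```python
def global_max_pool(score_map):
    """Return (max score, (row, col) of that max) for one class score map."""
    best, best_pos = score_map[0][0], (0, 0)
    for r, row in enumerate(score_map):
        for c, value in enumerate(row):
            if value > best:
                best, best_pos = value, (r, c)
    return best, best_pos


# Hypothetical per-class score maps produced by a fully convolutional network.
score_maps = {
    "dog": [[0.1, 0.2, 0.1],
            [0.3, 2.5, 0.4],
            [0.0, 0.2, 0.1]],
    "cat": [[-1.0, -0.5, 0.2],
            [-0.3, 0.1, 0.0],
            [-0.8, -0.2, 0.6]],
}

# Image-level score per class, plus the map cell that produced it.
results = {name: global_max_pool(m) for name, m in score_maps.items()}
predicted = max(results, key=lambda name: results[name][0])

print(predicted, results[predicted])
```

Training only needs the image-level score, yet the cell that wins the max pooling gives a coarse hint of where the object is. Turning that cell into pixel coordinates depends on the network's stride, which this sketch leaves out.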

4. Shape-Tailored Local Descriptors and Their Application to Segmentation and Tracking
Authors: Naeemullah Khan, Marei Algarni, Anthony Yezzi, Ganesh Sundaramoorthi

5. Deep Filter Banks for Texture Recognition and Segmentation
Authors: Mircea Cimpoi, Subhransu Maji, Andrea Vedaldi

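6. Deeply Learned Face Representations Are Sparse, Selective, and Robust
Authors: Yi Sun, Xiaogang Wang, Xiaoou Tang
Note: This is DeepID2+ from the DeepID series. Its improvements over DeepID2 are a larger network (more feature maps) and a fully connected layer with supervision attached to every layer. The highlight is the later analysis of neuron behaviour, which finds three properties: 1. moderate sparsity maximizes discriminative power and suits binarization; 2. selectivity to identity and attributes; 3. robustness to occlusion. None of these was imposed as a constraint during training, explicitly or implicitly; the CNN learned them on its own.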
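
To give a feel for why moderately sparse activations suit binarization, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the activation vectors are invented, the threshold of zero is my choice, and comparing codes by Hamming distance is just one simple way to use them.

```python
def binarize(activations, threshold=0.0):
    """Keep only whether each neuron fired: 1 if above the threshold, else 0."""
    return [1 if a > threshold else 0 for a in activations]


def hamming(code_a, code_b):
    """Number of positions where two binary codes disagree."""
    return sum(x != y for x, y in zip(code_a, code_b))


# Hypothetical activations for two images of person A and one image of person B.
person_a_img1 = [0.9, 0.0, 1.2, 0.0, 0.3, 0.0]
person_a_img2 = [0.7, 0.0, 0.8, 0.1, 0.0, 0.0]
person_b_img1 = [0.0, 1.1, 0.0, 0.6, 0.0, 0.9]

a1 = binarize(person_a_img1)
a2 = binarize(person_a_img2)
b1 = binarize(person_b_img1)

active_fraction = sum(a1) / len(a1)   # share of neurons that fired
same_person = hamming(a1, a2)
different_person = hamming(a1, b1)

print(active_fraction, same_person, different_person)
```

In this toy case about half of the neurons fire, and the pattern of which ones fire already separates the two identities even after the magnitudes are thrown away.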

Image and Video Processing and Restoration

1. Fast and Flexible Convolutional Sparse Coding
Authors: Felix Heide, Wolfgang Heidrich, Gordon Wetzstein

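2. What do 15,000 Object Categories Tell Us About Classifying and Localizing Actions?
Authors: Mihir Jain, Jan C. van Gemert, Cees G. M. Snoek
Note: Object classification helps action recognition. This is the first paper to explore the topic. It is a deep and still open area, so keep an eye on it and consider staking a claim early.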

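3. Hypercolumns for Object Segmentation and Fine-Grained Localization
Authors: Bharath Hariharan, Pablo Arbeláez, Ross Girshick, Jitendra Malik
Note: A great idea! With earlier CNNs or R-CNN, we always used the last layer for the class label and the second-to-last layer as the feature. The authors of this paper thought of using information from every layer. For any given pixel, the units above it in each layer are either activated or not, so the authors stack the activations from every layer into one feature vector and use it for fine-grained object detection.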
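
The hypercolumn for one pixel can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' code: the three toy layers are invented, `sample_nearest` and `hypercolumn` are hypothetical helper names, and nearest-neighbour lookup is swapped in for simplicity where a smoother interpolation would be the more usual choice.

```python
def sample_nearest(fmap, row, col, out_h, out_w):
    """Read the value a coarse H x W map would have at (row, col) of an out_h x out_w grid."""
    in_h, in_w = len(fmap), len(fmap[0])
    return fmap[row * in_h // out_h][col * in_w // out_w]


def hypercolumn(layers, row, col, out_h, out_w):
    """Concatenate, for one pixel, the activation of every channel of every layer."""
    column = []
    for layer in layers:        # each layer is a list of channels (C x H x W)
        for fmap in layer:
            column.append(sample_nearest(fmap, row, col, out_h, out_w))
    return column


# Hypothetical network with three layers of decreasing resolution.
layer1 = [[[r * 4 + c for c in range(4)] for r in range(4)]]   # 1 channel, 4 x 4
layer2 = [[[1, 2], [3, 4]], [[10, 20], [30, 40]]]              # 2 channels, 2 x 2
layer3 = [[[7]]]                                               # 1 channel, 1 x 1
layers = [layer1, layer2, layer3]

col_a = hypercolumn(layers, 3, 1, 4, 4)   # pixel in the bottom-left region
col_b = hypercolumn(layers, 0, 3, 4, 4)   # pixel in the top-right corner

print(col_a, col_b)
```

Each pixel ends up with one value per channel across all layers, so the early entries carry fine spatial detail while the later ones carry coarse, more semantic context.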

3D Models and Images

1. The Stitched Puppet: A Graphical Model of 3D Human Shape and Pose
Authors: Silvia Zuffi, Michael J. Black

2. 3D Shape Estimation From 2D Landmarks: A Convex Relaxation Approach
Authors: Xiaowei Zhou, Spyridon Leonardos, Xiaoyan Hu, Kostas Daniilidis

Images and Language

The papers in this category deserve a careful read; they are very helpful for broadening your thinking.

1. Show and Tell: A Neural Image Caption Generator
Authors: Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

2. Deep Visual-Semantic Alignments for Generating Image Descriptions
Authors: Andrej Karpathy, Li Fei-Fei

3. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description
Authors: Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell

4. Becoming the Expert - Interactive Multi-Class Machine Teaching
Authors: Edward Johns, Oisin Mac Aodha, Gabriel J. Brostow

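Other

Reference 1: Improvements to CNNs (the latest papers of 2015):
http://blog.csdn.net/u010402786/article/details/50499864
The four papers in that post are also worth reading, and one of them already appears above. Be sure to download them and read them yourself.
Reference 2: another blogger's post that also organizes the CVPR papers:
http://blog.csdn.net/jwh_bupt/article/details/46916653

Many of the papers above do not yet have notes on their core ideas; I will add them gradually. 2016-01-20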
