[Academic] CVPR 2021 Accepted Papers Roundup: 100+ Papers Across 22 Topics | Continuously Updated

Published: 2025/3/12


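Source: 極市平臺

Overview

The CVPR 2021 decisions are out. This post collects the newly accepted papers, with links to related articles and code.

The list is continuously updated on GitHub; stars and forks are welcome:
https://github.com/extreme-assistant/CVPR2021-Paper-Code-Interpretation/blob/master/CVPR2021.md

Official site: http://cvpr2021.thecvf.com
Conference dates: June 19-25, 2021
Acceptance notification date: February 28, 2021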

1. CVPR 2021 accepted papers and code, organized by topic

Categories:

1. Detection
Image Object Detection
Video Object Detection
3D Object Detection
Activity Detection
Anomaly Detection
2. Image Segmentation
Panoptic Segmentation
Semantic Segmentation
Instance Segmentation
Matting
3. Image Processing
Image Restoration/Super Resolution
Image Shadow Removal
Image Denoising/Deblurring/Deraining/Dehazing
Image Editing
Image Translation
4. Face
5. Object Tracking
6. Re-Identification
7. Medical Imaging
8. GAN/Generative/Adversarial
9. Estimation
Human Pose Estimation
Flow/Pose/Motion Estimation
Depth Estimation
10. 3D Vision
3D Point Cloud
3D Reconstruction
11. Neural Network Structure
Transformer
Graph Neural Networks (GNN)
12. Neural Architecture Search (NAS)
13. Data Processing
Data Augmentation
Normalization/Regularization (Batch Normalization)
Image Clustering
14. Model Compression
Knowledge Distillation
15. Model Evaluation
16. Datasets
17. Active Learning
18. Few-shot/Zero-shot Learning
19. Continual Learning/Life-long Learning
20. Visual Reasoning
21. Transfer Learning/Domain Adaptation
22. Contrastive Learning
Uncategorized

Detection

Image Object Detection

[7] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
paper：https://arxiv.org/abs/2103.01903

[6] General Instance Distillation for Object Detection
paper：https://arxiv.org/abs/2103.02340

[5] Instance Localization for Self-supervised Detection Pretraining
paper：https://arxiv.org/pdf/2102.08318.pdf
code：https://github.com/limbo0000/InstanceLoc

[4] Multiple Instance Active Learning for Object Detection
paper：https://github.com/yuantn/MIAL/raw/master/paper.pdf
code：https://github.com/yuantn/MIAL

[3] Towards Open World Object Detection
paper：Towards Open World Object Detection
code：https://github.com/JosephKJ/OWOD

[2] Positive-Unlabeled Data Purification in the Wild for Object Detection

[1] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
paper：https://arxiv.org/pdf/2011.09094.pdf
Commentary: Unsupervised pre-training for detectors: https://www.zhihu.com/question/432321109/answer/1606004872

Video Object Detection

[3] Depth from Camera Motion and Object Detection
paper：https://arxiv.org/abs/2103.01468

[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
paper：https://arxiv.org/abs/2103.01353
project：http://rl.uni-freiburg.de/research/multimodal-distill

[1] Dogfight: Detecting Drones from Drone Videos

3D Object Detection

[2] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection
paper：https://arxiv.org/pdf/2012.04355.pdf
code：https://github.com/THU17cyz/3DIoUMatch
project：https://thu17cyz.github.io/3DIoUMatch/
video：https://youtu.be/nuARjhkQN2U

[1] Categorical Depth Distribution Network for Monocular 3D Object Detection
paper：https://arxiv.org/abs/2103.01100

Activity Detection

[1] Coarse-Fine Networks for Temporal Activity Detection in Videos
paper：https://arxiv.org/abs/2103.01302

Anomaly Detection

[1] Multiresolution Knowledge Distillation for Anomaly Detection
paper：https://arxiv.org/abs/2011.11108

Image Segmentation

[2] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
paper：https://arxiv.org/abs/2012.06166
code：https://github.com/mboudiaf/RePRI-for-Few-Shot-Segmentation

[1] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

Panoptic Segmentation

[2] Cross-View Regularization for Domain Adaptive Panoptic Segmentation
paper：https://arxiv.org/abs/2103.02584

[1] 4D Panoptic LiDAR Segmentation
paper：https://arxiv.org/abs/2102.12472

Semantic Segmentation

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
paper：https://arxiv.org/abs/2009.03137
code：https://github.com/QingyongHu/SensatUrban

[1] PLOP: Learning without Forgetting for Continual Semantic Segmentation
paper：https://arxiv.org/abs/2011.11390

Instance Segmentation

[1] End-to-End Video Instance Segmentation with Transformers
paper：https://arxiv.org/abs/2011.14503

Matting

[1] Real-Time High Resolution Background Matting
paper：https://arxiv.org/abs/2012.07810
code：https://github.com/PeterL1n/BackgroundMattingV2
project：https://grail.cs.washington.edu/projects/background-matting-v2/
video：https://youtu.be/oMfPTeYDF9g

Estimation

Human Pose Estimation

[2] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
paper：https://arxiv.org/abs/2011.13607

Flow/Pose/Motion Estimation

[3] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation
paper：http://arxiv.org/abs/2102.12145
code：https://github.com/THU-DA-6D-Pose-Group/GDR-Net

[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments
paper：https://arxiv.org/abs/2012.04746
project：https://ai.stanford.edu/~hewang/

[1] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
paper：https://arxiv.org/pdf/2101.06605.pdf
code：https://github.com/huangjh-pub/multibody-sync

Depth Estimation

Image Processing

Image Restoration/Super Resolution

[3] Multi-Stage Progressive Image Restoration
paper：https://arxiv.org/abs/2102.02808
code：https://github.com/swz30/MPRNet

[2] Data-Free Knowledge Distillation For Image Super-Resolution (a super-resolution version of the DAFL algorithm)

[1] AdderSR: Towards Energy Efficient Image Super-Resolution (applies adder networks to image super-resolution)
paper：https://arxiv.org/pdf/2009.08891.pdf
code：https://github.com/huawei-noah/AdderNet
Commentary: Huawei open-sources its adder neural network

Image Shadow Removal

[1] Auto-Exposure Fusion for Single-Image Shadow Removal
paper：https://arxiv.org/abs/2103.01255
code：https://github.com/tsingqguo/exposure-fusion-shadow-removal

Image Denoising/Deblurring/Deraining/Dehazing

[1] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects
paper：https://arxiv.org/abs/2012.00595
code：https://github.com/rozumden/DeFMO
video：https://www.youtube.com/watch?v=pmAynZvaaQ4

Image Editing

[1] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Image Translation

[2] Image-to-image Translation via Hierarchical Style Disentanglement
paper：https://arxiv.org/abs/2103.01456
code：https://github.com/imlixinyang/HiSD

[1] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
paper：https://arxiv.org/abs/2008.00951
code：https://github.com/eladrich/pixel2style2pixel
project：https://eladrich.github.io/pixel2style2pixel/

Face

[5] Cross Modal Focal Loss for RGBD Face Anti-Spoofing
paper：https://arxiv.org/abs/2103.00948

[4] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
paper：https://arxiv.org/abs/2103.01520
code：https://github.com/Hzzone/MTLFace

[3] Multi-attentional Deepfake Detection
paper：https://arxiv.org/abs/2103.02406

[2] Image-to-image Translation via Hierarchical Style Disentanglement
paper：https://arxiv.org/abs/2103.01456
code：https://github.com/imlixinyang/HiSD

[1] A 3D GAN for Improved Large-pose Facial Recognition
paper：https://arxiv.org/pdf/2012.10545.pdf

Object Tracking

[4] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[3] Track to Detect and Segment: An Online Multi-Object Tracker
project：https://jialianwu.com/projects/TraDeS.html
video：https://www.youtube.com/watch?v=oGNtSFHRZJA

[2] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking
paper：https://arxiv.org/abs/2012.02337

[1] Rotation Equivariant Siamese Networks for Tracking
paper：https://arxiv.org/abs/2012.13078

Re-Identification

[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification
paper：https://arxiv.org/abs/2011.14670

Medical Imaging

[4] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
paper：https://arxiv.org/abs/2103.02148
code：https://github.com/guopengf/FLMRCM

[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

[2] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies
paper：https://arxiv.org/abs/2012.04872

[1] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization
paper：https://arxiv.org/abs/2012.07947

Neural Architecture Search (NAS)

[3] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
paper：https://arxiv.org/pdf/2011.09011.pdf

[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)
paper：https://arxiv.org/pdf/1910.01523.pdf

[1] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)
paper：https://arxiv.org/pdf/2005.14446.pdf

GAN/Generative/Adversarial

[5] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes
paper：https://arxiv.org/abs/2102.06696
code：http://github.com/mshahbazi72/cGANTransfer

[4] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

[3] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs
paper：https://arxiv.org/pdf/2011.14107.pdf

[2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
paper：https://arxiv.org/abs/2008.00951
code：https://github.com/eladrich/pixel2style2pixel
project：https://eladrich.github.io/pixel2style2pixel/

[1] A 3D GAN for Improved Large-pose Facial Recognition
paper：https://arxiv.org/pdf/2012.10545.pdf

3D Vision

[2] A Deep Emulator for Secondary Motion of 3D Characters
paper：https://arxiv.org/abs/2103.01261

[1] 3D CNNs with Adaptive Temporal Feature Resolutions
paper：https://arxiv.org/abs/2011.08652

3D Point Cloud

[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
paper：https://arxiv.org/abs/2009.03137
code：https://github.com/QingyongHu/SensatUrban

[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
paper：https://t.co/xIAWVGQeB2?amp=1
code：https://github.com/QingyongHu/SpinNet

[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization
paper：https://arxiv.org/pdf/2101.06605.pdf
code：https://github.com/huangjh-pub/multibody-sync

[3] Diffusion Probabilistic Models for 3D Point Cloud Generation
paper：https://arxiv.org/abs/2103.01458
code：https://github.com/luost26/diffusion-point-cloud

[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
paper：https://arxiv.org/abs/2103.02535

[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap
paper：https://arxiv.org/pdf/2011.13005.pdf
code：https://github.com/ShengyuH/OverlapPredator
project：https://overlappredator.github.io/

3D Reconstruction

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers
paper：https://arxiv.org/abs/2011.13607

Model Compression

[2] Manifold Regularized Dynamic Network Pruning (accounts for sample-complexity and network-complexity constraints during dynamic pruning)

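[1] Learning Student Networks in the Wild (model compression and acceleration without the original training data)
paper：https://arxiv.org/pdf/1904.01186.pdf
code：https://github.com/huawei-noah/DAFL
Commentary: Huawei Noah's Ark Lab proposes a data-free network compression technique: https://zhuanlan.zhihu.com/p/81277796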

Knowledge Distillation

[3] General Instance Distillation for Object Detection
paper:https://arxiv.org/abs/2103.02340

[2] Multiresolution Knowledge Distillation for Anomaly Detection
paper:https://arxiv.org/abs/2011.11108

[1] Distilling Object Detectors via Decoupled Features (distillation that separates foreground from background)

Neural Network Structure

[3] Rethinking Channel Dimensions for Efficient Model Design
paper:https://arxiv.org/abs/2007.00992
code:https://github.com/clovaai/rexnet

[2] Inverting the Inherence of Convolution for Visual Recognition

[1] RepVGG: Making VGG-style ConvNets Great Again
paper:https://arxiv.org/abs/2101.03697
code:https://github.com/megvii-model/RepVGG
Commentary: RepVGG, a minimalist architecture with SOTA performance that makes VGG-style models great again: https://zhuanlan.zhihu.com/p/344324470

Transformer

[3] Transformer Interpretability Beyond Attention Visualization
paper:https://arxiv.org/pdf/2012.09838.pdf
code:https://github.com/hila-chefer/Transformer-Explainability

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
paper:https://arxiv.org/pdf/2011.09094.pdf
Commentary: Unsupervised pre-training for detectors: https://www.zhihu.com/question/432321109/answer/1606004872

[1] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)
paper:https://arxiv.org/pdf/2012.00364.pdf

Graph Neural Networks (GNN)

[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology
paper:https://arxiv.org/pdf/2011.12646.pdf

[1] Sequential Graph Convolutional Network for Active Learning
paper:https://arxiv.org/pdf/2006.10219.pdf

Data Processing

Data Augmentation

[1] KeepAugment: A Simple Information-Preserving Data Augmentation
paper:https://arxiv.org/pdf/2011.11778.pdf

Normalization/Regularization (Batch Normalization)

[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning
paper:https://arxiv.org/abs/2103.02193
code:https://github.com/SHI-Labs/Semi-Supervised-Transfer-Learning

[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification
paper:https://arxiv.org/abs/2011.14670

[1] Representative Batch Normalization with Feature Calibration

Image Clustering

[2] Improving Unsupervised Image Clustering With Robust Learning
paper:https://arxiv.org/abs/2012.11150
code:https://github.com/deu30303/RUC

[1] Reconsidering Representation Alignment for Multi-view Clustering

Model Evaluation

[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can a model be evaluated on a test set that has no labels?)
paper:https://arxiv.org/abs/2007.02915
Commentary: https://zhuanlan.zhihu.com/p/328686799

Datasets

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
paper:https://arxiv.org/abs/2009.03137
code:https://github.com/QingyongHu/SensatUrban

[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
paper:https://arxiv.org/abs/2101.05022
code:https://github.com/naver-ai/relabel_imagenet

Active Learning

[3] Vab-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
paper:https://github.com/yuantn/MIAL/raw/master/paper.pdf
code:https://github.com/yuantn/MIAL

[2] Multiple Instance Active Learning for Object Detection
paper:https://github.com/yuantn/MIAL/raw/master/paper.pdf
code:https://github.com/yuantn/MIAL

[1] Sequential Graph Convolutional Network for Active Learning
paper:https://arxiv.org/pdf/2006.10219.pdf

Few-shot/Zero-shot Learning

[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?
paper:https://arxiv.org/abs/2012.06166
code:https://github.com/mboudiaf/RePRI-for-Few-Shot-Segmentation

[4] Counterfactual Zero-Shot and Open-Set Visual Recognition
paper:https://arxiv.org/abs/2103.00887
code:https://github.com/yue-zhongqi/gcm-cf

[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
paper:https://arxiv.org/abs/2103.01903

[2] Few-shot Open-set Recognition by Transformation Consistency

[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
paper:https://arxiv.org/abs/2103.01315

Continual Learning/Life-long Learning

[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

Visual Reasoning

[1] Transformation Driven Visual Reasoning
paper:https://arxiv.org/pdf/2011.13160.pdf
code:https://github.com/hughplay/TVR
project:https://hongxin2019.github.io/TVR/

Transfer Learning/Domain Adaptation

[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning
paper:https://arxiv.org/abs/2012.04324

[3] Domain Generalization via Inference-time Label-Preserving Target Projections
paper:https://arxiv.org/abs/2103.01134

[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing
paper:https://arxiv.org/abs/2103.01786
code:https://github.com/xyvirtualgroup/MetaSCI-CVPR2021

[1] FSDR: Frequency Space Domain Randomization for Domain Generalization
paper:https://arxiv.org/abs/2103.02370

Contrastive Learning

[1] Fine-grained Angular Contrastive Learning with Coarse Labels
paper:https://arxiv.org/abs/2012.03515

Uncategorized

Quantifying Explainers of Graph Neural Networks in Computational Pathology
paper:https://arxiv.org/pdf/2011.12646.pdf

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
paper:http://arxiv.org/abs/2012.09165
project:http://sekunde.github.io/project_efficient
video:http://youtu.be/E70xToZLgs4

Data-Free Model Extraction
paper:https://arxiv.org/abs/2011.14779

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition
paper:https://arxiv.org/pdf/2103.01486.pdf
code:https://github.com/QVPR/Patch-NetVLAD

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations
paper:https://arxiv.org/abs/2011.12854

Multi-Objective Interpolation Training for Robustness to Label Noise
paper:https://arxiv.org/abs/2012.04462
code:https://git.io/JI40X

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
paper:https://arxiv.org/pdf/2101.12059.pdf

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
paper:https://arxiv.org/abs/2012.02206
code:https://github.com/daveredrum/Scan2Cap
project:https://daveredrum.github.io/Scan2Cap/
video:https://youtu.be/AgmIpDbwTCY

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph
paper:https://arxiv.org/abs/2103.01350

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis
paper:https://arxiv.org/abs/2103.02264

PML: Progressive Margin Loss for Long-tailed Age Classification
paper:https://arxiv.org/abs/2103.02140

Diversifying Sample Generation for Data-Free Quantization (topic: image generation)
paper:https://arxiv.org/abs/2103.01049

Domain Generalization via Inference-time Label-Preserving Target Projections
paper:https://arxiv.org/pdf/2103.01134.pdf

DeRF: Decomposed Radiance Fields
project:https://ubc-vision.github.io/derf/

Densely connected multidilated convolutional networks for dense prediction tasks
paper:https://arxiv.org/abs/2011.11844

VirTex: Learning Visual Representations from Textual Annotations (topic: representation learning)
paper:https://arxiv.org/abs/2006.06666
code:https://github.com/kdexd/virtex

Weakly-supervised Grounded Visual Question Answering using Capsules

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
paper:https://arxiv.org/pdf/2012.08512.pdf
code:https://tarun005.github.io/FLAVR/Code
project:https://tarun005.github.io/FLAVR/

Probabilistic Embeddings for Cross-Modal Retrieval
paper:https://arxiv.org/abs/2101.05068

Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

IIRC: Incremental Implicitly-Refined Classification
paper:https://arxiv.org/abs/2012.12477
project:https://chandar-lab.github.io/IIRC/

Fair Attribute Classification through Latent Space De-biasing
paper:https://arxiv.org/abs/2012.01469
code:https://github.com/princetonvisualai/gan-debiasing
project:https://princetonvisualai.github.io/gan-debiasing/

Information-Theoretic Segmentation by Inpainting Error Maximization
paper:https://arxiv.org/abs/2012.07287

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining (topic: video-language learning)

Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling
paper:https://arxiv.org/pdf/2102.06183.pdf
code:https://github.com/jayleicn/ClipBERT

D-NeRF: Neural Radiance Fields for Dynamic Scenes
paper:https://arxiv.org/abs/2011.13961
project:https://www.albertpumarola.com/research/D-NeRF/index.html

Weakly Supervised Learning of Rigid 3D Scene Flow
paper:https://arxiv.org/pdf/2102.08945.pdf
code:https://arxiv.org/pdf/2102.08945.pdf
project:https://3dsceneflow.github.io/
