Notes from the ICML 2018 Paper Reading Seminar
P.S.: brief notes on the papers discussed at our ICML 2018 reading seminar.
2018.7.23
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations. ICML 2018. http://proceedings.mlr.press/v80/wang18d/wang18d.pdf
- Combines a zero-sum game (the formulation that inspired GANs) with inverse reinforcement learning
Learning to Explore via Meta-Policy Gradient. ICML 2018. http://proceedings.mlr.press/v80/xu18d/xu18d.pdf
- A meta-policy gradient method for learning exploration
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/rashid18a/rashid18a.pdf
- Value function factorisation under a monotonicity constraint (see the sketch below)
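The core of QMIX is a mixing network whose weights are forced to be non-negative, so the joint value is monotone in each per-agent value. A minimal sketch of that constraint, assuming a simplified single-hidden-layer mixer (the paper's mixer also produces a state-dependent bias through a second hypernetwork; sizes and names here are illustrative):

```python
# Sketch of QMIX-style monotonic mixing: per-agent Q-values are combined with
# weights generated from the global state and made non-negative via abs(),
# which guarantees dQ_tot/dQ_i >= 0. Layer sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicMixer(nn.Module):
    def __init__(self, n_agents, state_dim, embed_dim=32):
        super().__init__()
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        b, n = agent_qs.shape
        w1 = self.hyper_w1(state).abs().view(b, n, -1)   # non-negative weights
        b1 = self.hyper_b1(state).view(b, 1, -1)
        hidden = F.elu(torch.bmm(agent_qs.view(b, 1, n), w1) + b1)
        w2 = self.hyper_w2(state).abs().view(b, -1, 1)   # non-negative weights
        return torch.bmm(hidden, w2).view(b)             # Q_tot per sample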
2018.7.24
Ray: A Distributed Framework for Emerging AI Applications. arXiv. https://arxiv.org/pdf/1712.05889.pdf
- A distributed computing framework from Berkeley; the presenter did not explain clearly how to deploy it across machines
Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization. ICML 2018. http://proceedings.mlr.press/v80/allen-zhu18a/allen-zhu18a.pdf
- Non-convex optimization
Self-Imitation Learning. ICML 2018. http://proceedings.mlr.press/v80/oh18b/oh18b.pdf
- Self-imitation learning
2018.7.27
Mix & Match - Agent Curricula for Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/czarnecki18a/czarnecki18a.pdf
- Transfer learning applied to reinforcement learning
- The larger k is, the more the model absorbs from the earlier models, and the higher the training complexity (see the sketch after this list)
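The transfer happens by mixing agents of increasing capacity. A minimal sketch of the mixing idea, assuming a discrete action space with illustrative probabilities (the paper additionally tunes the mixing coefficient with population-based training):

```python
# Sketch of the Mix & Match mixture policy: act with
# pi_mm = (1 - alpha) * pi_simple + alpha * pi_complex, annealing alpha from
# 0 to 1 over training, while a distillation term pulls the complex agent
# toward the simple one so knowledge transfers. Numbers are illustrative.
import numpy as np

def mixture_policy(pi_simple, pi_complex, alpha):
    return (1.0 - alpha) * pi_simple + alpha * pi_complex

def distillation_loss(pi_simple, pi_complex, eps=1e-8):
    # KL(pi_simple || pi_complex), summed over actions
    return float(np.sum(pi_simple * np.log((pi_simple + eps) / (pi_complex + eps))))

pi_1 = np.array([0.7, 0.3])   # simple agent's action distribution
pi_2 = np.array([0.4, 0.6])   # complex agent's action distribution
for alpha in (0.0, 0.5, 1.0):
    print(alpha, mixture_policy(pi_1, pi_2, alpha), distillation_loss(pi_1, pi_2))
```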
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings. ICML 2018. http://proceedings.mlr.press/v80/co-reyes18a/co-reyes18a.pdf
- Roughly a VAE applied to hierarchical reinforcement learning
State Abstractions for Lifelong Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/abel18a/abel18a.pdf
- Lifelong reinforcement learning; in effect, tasks become transferable
2018.7.28
Efficient Neural Architecture Search via Parameter Sharing. ICML 2018. http://proceedings.mlr.press/v80/pham18a/pham18a.pdf
- Improves on NAS: for the given neural network modules it builds a DAG and shares parameters across child models; the concrete algorithm deserves further study
- From Google Brain; the insight is good, but the method still feels weak
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dabney18a/dabney18a.pdf
- The paper builds on two threads, quantile regression and distributional RL; worth reading the authors' two earlier papers (see the sketch below)
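A minimal sketch of the two implicit-quantile ingredients, assuming the paper's cosine embedding of sampled quantile fractions and the asymmetric quantile Huber loss (dimension choices are illustrative):

```python
# Sketch of IQN's ingredients: embed sampled quantile fractions tau ~ U(0,1)
# with cosine features, and train with the quantile Huber loss.
import torch
import torch.nn.functional as F

def tau_cosine_features(tau, n=64):
    # phi(tau) before the learned linear layer: cos(pi * i * tau), i = 1..n
    i = torch.arange(1, n + 1, dtype=torch.float32)
    return torch.cos(torch.pi * i * tau.unsqueeze(-1))   # (batch, n)

def quantile_huber_loss(pred, target, tau, kappa=1.0):
    # Penalize under-/over-estimation asymmetrically according to tau.
    u = target - pred
    huber = F.huber_loss(pred, target, reduction="none", delta=kappa)
    weight = torch.abs(tau - (u.detach() < 0).float())
    return (weight * huber / kappa).mean()

tau = torch.rand(8)                       # sampled quantile fractions
print(tau_cosine_features(tau).shape)     # torch.Size([8, 64])
```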
2018.7.29
Bayesian Optimization of Combinatorial Structures. ICML 2018. http://proceedings.mlr.press/v80/baptista18a/baptista18a.pdf
- Did not follow this one; the work itself reports only partial progress
Visualizing and Understanding Atari Agents. ICML 2018. http://proceedings.mlr.press/v80/greydanus18a/greydanus18a.pdf
- Gaussian-blurs a patch of the input and measures how that region affects the Q-values (see the sketch below)
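A minimal sketch of that perturbation-based saliency, assuming a 2-D grayscale frame and a generic `q_values_fn`; the hard circular mask here is a simplification of the paper's Gaussian mask:

```python
# Blur a small region of the frame, re-evaluate the Q-values, and score the
# region by how much the Q-values change (the paper's squared-error score).
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_at(frame, q_values_fn, cx, cy, radius=5, sigma=3.0):
    """Saliency of the patch centred at (cx, cy): change in Q when blurred."""
    blurred = gaussian_filter(frame, sigma=sigma)
    mask = np.zeros_like(frame)
    ys, xs = np.ogrid[:frame.shape[0], :frame.shape[1]]
    mask[(ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2] = 1.0
    perturbed = frame * (1 - mask) + blurred * mask
    q_orig, q_pert = q_values_fn(frame), q_values_fn(perturbed)
    return 0.5 * np.sum((q_orig - q_pert) ** 2)
```

Sliding the patch over the whole frame yields a saliency map of which pixels the agent's value estimate depends on.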
Policy Optimization with Demonstrations. ICML 2018. http://proceedings.mlr.press/v80/kang18a/kang18a.pdf
- Did not pay much attention to this one
2018.7.30
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. ICML 2018. http://proceedings.mlr.press/v80/zhang18n/zhang18n.pdf
- Communication between agents over a network, with a randomly selected subgraph; the theory was worked out first, and the experiments are simply designed
Structured Evolution with Compact Architectures for Scalable Policy Optimization. ICML 2018. http://proceedings.mlr.press/v80/choromanski18a/choromanski18a.pdf
- From Google Brain; the talk presented a pile of matrix machinery and the theory was not explained clearly, but the experiments are thorough
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/icarte18a/icarte18a.pdf
- Uses a finite state machine so that a single interaction with the environment suffices to compute the rewards of multiple tasks (see the sketch below)
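A minimal sketch of a reward machine: a finite state machine whose transitions are labelled with rewards, so one trajectory of environment events can be replayed through several machines (the event names and transition table below are illustrative):

```python
# A reward machine maps (machine state, environment event) -> (next state,
# reward); unmatched events leave the machine state unchanged with reward 0.
class RewardMachine:
    def __init__(self, transitions, initial_state):
        self.transitions = transitions   # {(u, event): (u_next, reward)}
        self.state = initial_state

    def step(self, event):
        """Advance the machine on an environment event; return the reward."""
        self.state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0))
        return reward

# "Get coffee, then deliver it": reward only on completing the sequence.
rm = RewardMachine({("u0", "coffee"): ("u1", 0.0),
                    ("u1", "office"): ("u2", 1.0)}, "u0")
print([rm.step(e) for e in ["office", "coffee", "office"]])  # [0.0, 0.0, 1.0]
```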
2018.7.31
Essentially No Barriers in Neural Network Energy Landscape. ICML 2018. http://proceedings.mlr.press/v80/draxler18a/draxler18a.pdf
- Connects local optima with low-loss paths (see the sketch below)
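A minimal sketch of the quantity the paper studies: the loss barrier along a path between two trained minima. A straight line usually hits a high-loss barrier, while the paper finds (via an AutoNEB-style search) curved paths with essentially no barrier; here we only probe the straight line for contrast, on a toy loss:

```python
import numpy as np

def barrier_along_line(theta_a, theta_b, loss_fn, n_points=11):
    """Height of the loss barrier on the straight line between two minima."""
    losses = [loss_fn((1 - t) * theta_a + t * theta_b)
              for t in np.linspace(0.0, 1.0, n_points)]
    return max(losses) - max(losses[0], losses[-1])

# Toy example: a loss with two minima at -1 and +1 and a bump between them.
loss = lambda w: float((w ** 2 - 1.0) ** 2)
print(barrier_along_line(np.array(-1.0), np.array(1.0), loss))  # 1.0
```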
Time Limits in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/pardo18a/pardo18a.pdf
- Considers the effect of finite episode lengths on value targets (see the sketch below)
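A minimal sketch of the paper's "partial-episode bootstrapping" idea: when an episode ends only because the time limit was hit, the TD target should still bootstrap from the next state's value, since the timeout is not part of the environment dynamics:

```python
# Distinguish a true terminal state from a timeout when building TD targets.
def td_target(reward, next_value, done, timeout, gamma=0.99):
    if done and not timeout:
        return reward                    # true terminal state: no bootstrap
    return reward + gamma * next_value   # timeout (or non-terminal): bootstrap

# Same transition, different targets depending on *why* the episode ended.
print(td_target(1.0, 10.0, done=True, timeout=True))   # 10.9 -> bootstrapped
print(td_target(1.0, 10.0, done=True, timeout=False))  # 1.0  -> terminal
```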
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
- The ICML 2018 best paper; the discussion covered which of the seven attacked adversarial defenses were not broken (covered again in more detail on 2018.8.4)
2018.8.1
Learning with Abandonment. ICML 2018. http://proceedings.mlr.press/v80/schmit18a/schmit18a.pdf
- Applies reinforcement learning to recommender systems, designing a user tolerance threshold theta beyond which the user abandons the platform (see the sketch below)
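A minimal sketch of the abandonment setup as described in the talk: the user has a hidden tolerance theta, and leaves as soon as a recommendation's dissatisfaction exceeds it, ending the reward stream. The user model and all numbers below are illustrative, not the paper's:

```python
# Toy simulation: rewards accumulate only until the first recommendation whose
# dissatisfaction exceeds the user's tolerance theta (then the user abandons).
import random

def simulate_user(recommend, theta, horizon=100):
    """Run recommendations until the user abandons; return total reward."""
    total = 0.0
    for t in range(horizon):
        action = recommend(t)
        dissatisfaction = abs(action - 0.5)   # toy user model
        if dissatisfaction > theta:
            break                             # user abandons: no more reward
        total += 1.0 - dissatisfaction
    return total

random.seed(0)
print(simulate_user(lambda t: random.random(), theta=0.3))
```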
Latent Space Policies for Hierarchical Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/haarnoja18a/haarnoja18a.pdf
- Hierarchical reinforcement learning mainly targets sparse rewards or otherwise complex settings
- The paper's content does not really live up to the "hierarchical reinforcement learning" in its title
Coordinated Exploration in Concurrent Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dimakopoulou18a/dimakopoulou18a.pdf
- Proposes a seed-sampling algorithm and compares it with earlier UCB and Thompson sampling baselines; how the concurrent agents actually coordinate was not explained clearly (see the sketch below)
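A minimal sketch of the seed-sampling intuition in a bandit-style setting (the Gaussian prior and the bandit framing here are illustrative assumptions): each agent draws one random seed up front and commits to the greedy policy under the model perturbed by that seed, a Thompson-sampling-like scheme:

```python
# Different seeds make concurrent agents explore *different* arms, while each
# agent stays self-consistent over time because its seed never changes.
import numpy as np

def seed_sample_agent(seed, prior_means, prior_std=1.0):
    """One agent's arm choice: greedy w.r.t. a seed-fixed posterior sample."""
    rng = np.random.default_rng(seed)
    sampled_means = rng.normal(prior_means, prior_std)
    return int(np.argmax(sampled_means))

prior = np.array([0.2, 0.5, 0.3])
print([seed_sample_agent(s, prior) for s in range(5)])
```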
2018.8.2
Clipped Action Policy Gradient. ICML 2018. http://proceedings.mlr.press/v80/fujita18a/fujita18a.pdf
- Clips actions to [alpha, beta] when computing the policy gradient; the resulting estimator is still unbiased (see the sketch below)
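A minimal sketch of the clipped-action log-likelihood for a 1-D Gaussian policy: since the environment clips actions to [alpha, beta], an action at a boundary should contribute the whole tail probability mass (CDF), not the density, which keeps the estimator unbiased while reducing variance:

```python
# Log-likelihood of a clipped action under a Gaussian policy: boundary actions
# absorb the corresponding tail mass of the distribution.
import torch
from torch.distributions import Normal

def clipped_log_prob(mu, sigma, action, alpha, beta):
    dist = Normal(mu, sigma)
    if action <= alpha:                                   # left tail mass
        return torch.log(dist.cdf(torch.as_tensor(alpha)))
    if action >= beta:                                    # right tail mass
        return torch.log(1.0 - dist.cdf(torch.as_tensor(beta)))
    return dist.log_prob(torch.as_tensor(action))

mu = torch.tensor(0.0, requires_grad=True)
print(clipped_log_prob(mu, torch.tensor(1.0), 1.0, -1.0, 1.0))  # boundary case
```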
An Inference-Based Policy Gradient Method for Learning Options. ICML 2018. http://proceedings.mlr.press/v80/smith18a/smith18a.pdf
- A paper in the hierarchical reinforcement learning area
- The algorithm is similar to "A Laplacian Framework for Option Discovery in Reinforcement Learning" (ICML 2017), and the experiments include a comparison with it
2018.8.3
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control. ICML 2018. http://proceedings.mlr.press/v80/srinivas18b/srinivas18b.pdf
- Motivates a state abstraction; builds a model-based component and combines model-based with model-free learning
Investigating Human Priors for Playing Video Games. ICML 2018. http://proceedings.mlr.press/v80/dubey18a/dubey18a.pdf
2018.8.4
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
- The ICML 2018 best paper
- The earlier gradient-based defenses from ICLR fall into three categories: shattered gradients, stochastic gradients, and gradients that explode or vanish over many iterations. The first and third are attacked by substituting a differentiable function for the defense; the second is attacked by taking an expectation over the randomness (see the sketch below)
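A minimal sketch of the paper's BPDA trick (Backward Pass Differentiable Approximation) against shattered gradients: apply the non-differentiable defense g(x) in the forward pass, but treat it as the identity in the backward pass, so the attacker still receives useful gradients:

```python
# BPDA with the identity approximation: forward runs the real defence,
# backward pretends g(x) = x so gradients flow through to the input.
import torch

class BPDAIdentity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, defence):
        return defence(x)          # e.g. quantization, JPEG, input denoising

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None   # identity approximation of the defence

def through_defence(x, defence):
    return BPDAIdentity.apply(x, defence)
```

For the stochastic-gradient case, the corresponding attack is to average gradients over the defence's randomness (expectation over transformation) rather than to approximate it.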
Addressing Function Approximation Error in Actor-Critic Methods. ICML 2018. http://proceedings.mlr.press/v80/fujimoto18a/fujimoto18a.pdf
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation. ICML 2018. http://proceedings.mlr.press/v80/corneil18a/corneil18a.pdf
- State abstraction, similar to a VAE: abstract the state, reconstruct it, and minimize the loss the authors propose (see the sketch below)
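A minimal sketch of a VAE-style state abstraction, using a generic continuous VAE for illustration (the paper tabulates a discrete latent state, which this sketch does not reproduce):

```python
# Encode the observation to a compact latent, decode it back, and train on
# reconstruction plus a KL regularizer toward the standard normal prior.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateVAE(nn.Module):
    def __init__(self, obs_dim, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(obs_dim, 2 * latent_dim)   # -> (mu, log_var)
        self.dec = nn.Linear(latent_dim, obs_dim)

    def forward(self, obs):
        mu, log_var = self.enc(obs).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterize
        recon = self.dec(z)
        recon_loss = F.mse_loss(recon, obs)
        kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
        return recon_loss + kl
```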
To be continued...