Paper Notes: Graph Attention Networks

Published: 2025/7/14

Graph Attention Networks

2018-02-06 16:52:49

Abstract:

This paper proposes graph attention networks (GATs), a novel architecture that operates on graph-structured data. GATs use masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

Data: graph-structured data.

Method: masked self-attentional layers.

Goal: to address the shortcomings of prior methods based on graph convolutions or their approximations.

Approach: by stacking layers in which nodes are able to attend over their neighborhoods' features, the model can assign different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation or depending on knowing the graph structure upfront.

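Introduction:

Background: CNNs have been widely applied to grid-structured data and perform well on tasks such as object detection, semantic segmentation, and machine translation. Some data, however, does not have this grid-like structure, for example 3D meshes, social networks, telecommunication networks, biological networks, and brain connectomes.

There have been several attempts to combine RNNs with graph-structured data in order to learn representations.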

Currently, there are two common ways to apply convolution to the graph domain:

1. spectral approaches

2. non-spectral approaches (spatial-based methods)

The paper briefly introduces both families and reviews some recent related work.

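The paper then turns to attention mechanisms, which are now widely used in many settings. One of their benefits is that they allow for dealing with variable-sized inputs, focusing on the most relevant parts of the input to make decisions. When attention is used to compute a representation of a single sequence, it is commonly called self-attention or intra-attention. Combined with CNNs or RNNs, self-attention has produced very good results.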

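Inspired by this recent work, the authors propose an attention-based architecture to perform node classification of graph-structured data. The idea is to compute the hidden representation of each node in the graph by attending over its neighbors, following a self-attention strategy. This attention mechanism has several interesting properties:

1. The operation is efficient, since it can be parallelized across node-neighbor pairs.

2. It can be applied to graph nodes of different degrees by assigning arbitrary weights to their neighbors.

3. The model is directly applicable to inductive learning problems, including tasks where the model has to generalize to completely unseen graphs.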

Our approach of sharing a neural network computation across edges is reminiscent of the formulation of relational networks (Santoro et al., 2017), wherein relations between objects (regional features from an image extracted by a convolutional neural network) are aggregated across all object pairs, by employing a shared mechanism.

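The authors run experiments on three datasets and achieve state-of-the-art results, which shows the potential of attention-based models for graphs of arbitrary structure.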

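GAT Architecture:

1. Graph Attentional Layer

The input to the proposed attentional layer is a set of node features, $\mathbf{h} = \{\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_N\}$, $\vec{h}_i \in \mathbb{R}^{F}$, where $N$ is the number of nodes and $F$ is the number of features per node. The layer produces a new set of node features, possibly of a different dimensionality $F'$, as its output: $\mathbf{h}' = \{\vec{h}'_1, \vec{h}'_2, \ldots, \vec{h}'_N\}$, $\vec{h}'_i \in \mathbb{R}^{F'}$.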

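To obtain sufficient expressive power to transform the input features into higher-level features, at least one learnable linear transformation is required. To that end, as an initial step, a shared linear transformation, parameterized by a weight matrix $\mathbf{W} \in \mathbb{R}^{F' \times F}$, is applied to every node. Self-attention is then performed on the nodes: a shared attentional mechanism $a : \mathbb{R}^{F'} \times \mathbb{R}^{F'} \to \mathbb{R}$ computes the attention coefficients

$$e_{ij} = a(\mathbf{W}\vec{h}_i, \mathbf{W}\vec{h}_j) \tag{1}$$

which indicate the importance of node $j$'s features to node $i$. In its most general form, the model allows every node to attend on every other node, dropping all structural information. The graph structure is injected into the mechanism by performing masked attention: $e_{ij}$ is computed only for nodes $j \in N_i$, where $N_i$ is some neighborhood of node $i$ in the graph. In the paper's experiments, these are exactly the first-order neighbors of $i$ (including $i$).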
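The masking step can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the helper names (`linear`, `neighborhoods`, `masked_scores`), the toy graph, and the dot-product scorer standing in for the generic mechanism $a$ are all assumptions made for the example. Self-loops are added so that each node also attends to itself.

```python
# Sketch of masked attention coefficients e_ij (names and data are illustrative).

def linear(W, h):
    """Apply the shared weight matrix W (F' x F) to one feature vector h (length F)."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def neighborhoods(num_nodes, edges):
    """First-order neighborhoods N_i of an undirected graph, with self-loops added."""
    nbrs = {i: {i} for i in range(num_nodes)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return {i: sorted(js) for i, js in nbrs.items()}

def masked_scores(H, W, nbrs, score):
    """Compute e_ij = score(W h_i, W h_j) only for j in N_i (the mask)."""
    Z = [linear(W, h) for h in H]
    return {(i, j): score(Z[i], Z[j]) for i in nbrs for j in nbrs[i]}

def dot(u, v):
    """Placeholder for the attention mechanism a; the paper's choice differs."""
    return sum(x * y for x, y in zip(u, v))

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # N = 3 nodes, F = 2 features
W = [[1.0, 2.0], [0.0, 1.0]]               # F' = 2
edges = [(0, 1), (1, 2)]                   # path graph 0 - 1 - 2

nbrs = neighborhoods(3, edges)
e = masked_scores(H, W, nbrs, dot)
print(nbrs)            # nodes 0 and 2 are not neighbors
print((0, 2) in e)     # no coefficient is computed for that pair
```

Storing the coefficients in a dictionary keyed by `(i, j)` keeps the cost proportional to the number of edges rather than to $N^2$, which is the point of masking.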

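To make the coefficients easily comparable across different nodes, they are normalized across all choices of $j$ using the softmax function:

$$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})} \tag{2}$$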
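As a small worked example of this normalization, the sketch below applies a softmax over the raw scores of a single neighborhood. The function name `neighborhood_softmax` and the numbers are made up for illustration; subtracting the maximum score before exponentiating is a standard numerical-stability trick and is an addition here, not something the notes prescribe.

```python
import math

def neighborhood_softmax(scores):
    """Normalize raw scores {j: e_ij} of one node i into weights {j: alpha_ij}."""
    m = max(scores.values())                       # subtract max for numerical stability
    exps = {j: math.exp(s - m) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: x / total for j, x in exps.items()}

# Two nodes with different degrees and score scales (made-up numbers).
e_node_a = {0: 1.0, 1: 2.0}
e_node_b = {0: 100.0, 1: 101.0, 2: 102.0}

alpha_a = neighborhood_softmax(e_node_a)
alpha_b = neighborhood_softmax(e_node_b)
print(alpha_a, sum(alpha_a.values()))
print(alpha_b, sum(alpha_b.values()))
```

Whatever the degree of the node or the scale of its raw scores, the resulting weights are positive and sum to one, so they can be compared across nodes.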

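In the paper's experiments, the attention mechanism $a$ is a single-layer feedforward neural network, parameterized by a weight vector $\vec{a} \in \mathbb{R}^{2F'}$ and followed by the LeakyReLU nonlinearity. Fully expanded, the coefficients computed by the attention mechanism can be expressed as:

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\vec{a}^T [\mathbf{W}\vec{h}_i \,\|\, \mathbf{W}\vec{h}_j]\right)\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}\left(\vec{a}^T [\mathbf{W}\vec{h}_i \,\|\, \mathbf{W}\vec{h}_k]\right)\right)} \tag{3}$$

where $\cdot^T$ denotes transposition and $\|$ is the concatenation operation.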
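The single-layer scoring function can be sketched as follows, assuming (as in the GAT paper) a LeakyReLU nonlinearity with negative slope 0.2. The function names and numeric values are hypothetical. The second function illustrates a useful identity: because the weight vector acts on a concatenation, the score splits into one dot product per endpoint, so each node's contribution can be computed once and reused across edges.

```python
def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def score_concat(a, zi, zj):
    """e_ij = LeakyReLU(a^T [z_i || z_j]), with z = W h already applied."""
    concat = zi + zj                      # list concatenation plays the role of ||
    return leaky_relu(sum(w * x for w, x in zip(a, concat)))

def score_split(a, zi, zj):
    """Same value, computed as a_left . z_i + a_right . z_j."""
    f = len(zi)
    left = sum(w * x for w, x in zip(a[:f], zi))
    right = sum(w * x for w, x in zip(a[f:], zj))
    return leaky_relu(left + right)

a = [0.5, -1.0, 0.25, 2.0]               # length 2F' with F' = 2 (made-up values)
zi, zj = [1.0, 2.0], [-3.0, 0.5]

e_concat = score_concat(a, zi, zj)
e_split = score_split(a, zi, zj)
print(e_concat, e_split)
```

Both functions return the same score; the split form is only a rearrangement of the same arithmetic.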

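Once obtained, the normalized attention coefficients $\alpha_{ij}$ are used to compute a linear combination of the corresponding features, which (after applying a nonlinearity $\sigma$) gives the final output feature vector of every node:

$$\vec{h}'_i = \sigma\left(\sum_{j \in N_i} \alpha_{ij} \mathbf{W}\vec{h}_j\right) \tag{4}$$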
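The aggregation step for a single node can be sketched as below. This is an illustrative example under stated assumptions: the coefficients and transformed features are made-up numbers, the function name `aggregate` is hypothetical, and ELU is used as one possible choice for the nonlinearity $\sigma$.

```python
import math

def elu(x):
    """One possible choice for the nonlinearity sigma (an assumption here)."""
    return x if x > 0 else math.exp(x) - 1.0

def aggregate(alpha_i, Z, sigma=elu):
    """h'_i = sigma(sum_j alpha_ij * z_j), where z_j = W h_j and alpha_i = {j: alpha_ij}."""
    dim = len(next(iter(Z.values())))
    out = [0.0] * dim
    for j, a_ij in alpha_i.items():
        for d in range(dim):
            out[d] += a_ij * Z[j][d]
    return [sigma(x) for x in out]

# Transformed features z_j = W h_j for the neighbors of node i (made-up values).
Z = {0: [1.0, -2.0], 1: [3.0, 0.0], 2: [-1.0, -4.0]}
alpha_i = {0: 0.5, 1: 0.25, 2: 0.25}       # normalized coefficients, sum to 1

h_i_new = aggregate(alpha_i, Z)
print(h_i_new)
```

Because the coefficients sum to one, the pre-activation output is a convex combination of the neighbors' transformed features.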

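To stabilize the learning process of self-attention, the authors found it beneficial to extend the mechanism to multi-head attention, similarly to "Attention Is All You Need". Specifically, $K$ independent attention mechanisms execute the transformation of Equation (4), and their features are then concatenated, resulting in the following output feature representation:

$$\vec{h}'_i = \Big\Vert_{k=1}^{K} \sigma\left(\sum_{j \in N_i} \alpha_{ij}^{k} \mathbf{W}^{k}\vec{h}_j\right) \tag{5}$$

where $\alpha_{ij}^{k}$ are the normalized attention coefficients computed by the $k$-th attention mechanism $a^k$ and $\mathbf{W}^{k}$ is the corresponding weight matrix. The output therefore has $KF'$ features per node rather than $F'$.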
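A rough sketch of the concatenation variant for one node is given below. All names and values are hypothetical, each head is assumed to have already produced its own normalized coefficients and transformed features, and ReLU is only a stand-in for the nonlinearity $\sigma$.

```python
def weighted_sum(alpha_i, Z):
    """sum_j alpha_ij * z_j for one head, with z_j = W^k h_j."""
    dim = len(next(iter(Z.values())))
    return [sum(alpha_i[j] * Z[j][d] for j in alpha_i) for d in range(dim)]

def relu(x):
    """Stand-in for the nonlinearity sigma (an assumption for this example)."""
    return max(0.0, x)

def multi_head_concat(alphas, Zs, sigma=relu):
    """Apply sigma per head, then concatenate the K head outputs (length K * F')."""
    out = []
    for alpha_i, Z in zip(alphas, Zs):
        out.extend(sigma(x) for x in weighted_sum(alpha_i, Z))
    return out

# K = 2 heads, F' = 2, node i has neighbors {0, 1} (all values made up).
alphas = [{0: 0.5, 1: 0.5}, {0: 0.9, 1: 0.1}]
Zs = [
    {0: [1.0, 2.0], 1: [3.0, -6.0]},     # head 1: z_j = W^1 h_j
    {0: [0.0, 1.0], 1: [10.0, 1.0]},     # head 2: z_j = W^2 h_j
]

h_concat = multi_head_concat(alphas, Zs)
print(len(h_concat), h_concat)
```

With two heads of two features each, the concatenated vector has four entries, which is why the next layer's input width grows with the number of heads.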

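In particular, if multi-head attention is performed on the final (prediction) layer of the network, concatenation is no longer sensible. Instead, the head outputs are averaged and the final nonlinearity (usually a softmax or logistic sigmoid for classification problems) is delayed until after the averaging:

$$\vec{h}'_i = \sigma\left(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in N_i} \alpha_{ij}^{k} \mathbf{W}^{k}\vec{h}_j\right) \tag{6}$$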
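The averaging variant for an output layer can be illustrated with a short sketch. The function names and numbers are made up, each head's weighted sum over neighbors is assumed to be precomputed, and softmax is assumed as the delayed final nonlinearity, as would be typical for classification.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [x / total for x in exps]

def multi_head_average(head_sums, sigma=softmax):
    """Average the K pre-activation head outputs, then apply sigma once."""
    K = len(head_sums)
    dim = len(head_sums[0])
    mean = [sum(h[d] for h in head_sums) / K for d in range(dim)]
    return sigma(mean)

# Pre-activation outputs of K = 3 heads for one node, each of length
# F' = number of classes (values made up).
head_sums = [[2.0, 0.0, -1.0], [1.0, 1.0, 0.0], [3.0, -1.0, -2.0]]

probs = multi_head_average(head_sums)
print(len(probs), probs)
```

Unlike concatenation, averaging keeps the output width at $F'$ regardless of $K$, so the result can be read directly as one score per class.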

[Figure: schematic of the proposed attention weighting mechanism.]
