GraphSAGE: Inductive Representation Learning on Large Graphs
Why GraphSAGE matters:
1. It is one of the most widely used graph neural network models, alongside GCN and GAT.
2. It is an inductive learning method: representations can be generated for nodes never seen during training.
3. Instead of learning a fixed embedding per node as earlier methods do, it learns a set of aggregator functions.
4. It studies several aggregator variants (mean, pooling, LSTM).
5. It is a classic baseline for graph representation learning.
Structure of the paper:
I. Abstract
Surveys the broad range of graph applications and motivates the paper: inductive learning on graphs, where a set of functions is learned that sample a node's neighborhood and aggregate it into a vector representation. The abstract can be summarized as follows:
1. An inductive model is proposed that can produce representations for new nodes and even new graphs.
2. GraphSAGE obtains node representations by learning a set of functions rather than one embedding per node.
3. A node's representation is computed by sampling and aggregating its neighbors' features and concatenating the result with the node's own features.
4. GraphSAGE achieves state-of-the-art results in both transductive and inductive settings.
II. Introduction
Describes the wide applicability of graphs and notes that prior work mostly targets static graphs, whereas GraphSAGE can handle new nodes and even new graphs. It reviews DeepWalk, node2vec, GCN and related algorithms, and positions this paper as one that trains aggregate functions instead.
III. Related Work
Reviews earlier approaches based on random walks, matrix factorization, and graph convolutions.
IV. The GraphSAGE Model
Covers the forward-propagation algorithm, the model parameters, and the aggregator architectures.
The GraphSAGE forward pass is given as Algorithm 1 in the paper. Its core is the inductive step, lines (4) and (5): aggregate the information of all sampled neighbors, then combine the node's own information with the aggregated neighbor information.
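In the paper's notation, lines (4) and (5) of Algorithm 1 at search depth (layer) $k$ are

$$h^k_{\mathcal{N}(v)} \leftarrow \mathrm{AGGREGATE}_k\big(\{h^{k-1}_u,\ \forall u \in \mathcal{N}(v)\}\big)$$
$$h^k_v \leftarrow \sigma\big(W^k \cdot \mathrm{CONCAT}(h^{k-1}_v,\ h^k_{\mathcal{N}(v)})\big)$$

i.e. the previous-layer states of the sampled neighbors are aggregated, concatenated with the node's own previous state, and passed through a linear transformation and nonlinearity.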
Next, the paper presents the training objective (Section 3.2). The model can be trained with supervision or fully unsupervised. The unsupervised objective is the same graph-based loss used by earlier graph embedding algorithms: nodes that are closely related in the graph structure should end up with similar embeddings.
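For reference, the unsupervised graph-based loss applied to an output embedding $z_u$ (Eq. 1 in the paper) is

$$J_{\mathcal{G}}(z_u) = -\log\big(\sigma(z_u^\top z_v)\big) - Q \cdot \mathbb{E}_{v_n \sim P_n(v)}\log\big(\sigma(-z_u^\top z_{v_n})\big),$$

where $v$ is a node that co-occurs with $u$ on a fixed-length random walk, $\sigma$ is the sigmoid, $P_n$ is a negative-sampling distribution, and $Q$ is the number of negative samples: nearby nodes are pulled together while random negatives are pushed apart.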
The paper then discusses several choices for the aggregate function, including Mean, LSTM, and Pooling; the appendix additionally gives a minibatch version of the algorithm.
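As defined in the paper, the two closed-form variants are (the LSTM aggregator simply runs an LSTM over a random permutation of the neighbors):

$$h^k_v \leftarrow \sigma\big(W \cdot \mathrm{MEAN}(\{h^{k-1}_v\} \cup \{h^{k-1}_u,\ \forall u \in \mathcal{N}(v)\})\big) \quad \text{(mean / GCN-style aggregator)}$$
$$\mathrm{AGGREGATE}^{\mathrm{pool}}_k = \max\big(\{\sigma(W_{\mathrm{pool}}\, h^{k}_{u_i} + b),\ \forall u_i \in \mathcal{N}(v)\}\big) \quad \text{(pooling aggregator)}$$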
V. Experiments
Experimental setup, choice of datasets, the inductive learning experiments, parameter analysis, and an analysis of how the different aggregate functions affect the model.
This part mainly describes the experimental parameters and the datasets used, and ends with a comparison of the results.
VI. Theoretical Analysis & Conclusion
Concludes that the proposed GraphSAGE model is inductive and that neighborhood aggregation can use different aggregator functions, and discusses several future directions such as subgraph embeddings and better neighbor-sampling strategies.
Novelties:
1. Inductive learning
2. A study of multiple aggregators
3. The paper also provides some theoretical analysis
Key points:
1. The model architecture
2. Sampling of neighbor nodes
3. The minibatch training scheme
Takeaways:
1. The inductive learning setup
2. The discussion of multiple aggregate functions
3. Minibatch training with neighbor sampling is computationally efficient
4. GCN, GAT, and GraphSAGE as classic baselines
VII. Coding
The Cora dataset used in the paper consists of two files: cora.cites, where each line gives a pair of node IDs connected by an edge, and cora.content, where each line gives a node ID, its 1433-dimensional binary word-feature vector, and its class label.

Example (cora.cites):
35 1033
35 103482
35 103515
35 1050679
35 1103960
35 1103985
35 1109199
35 1112911
35 1113438
35 1113831
35 1114331
35 1117476
35 1119505
35 1119708
35 1120431
35 1123756
35 1125386
35 1127430
35 1127913
.....

Example (cora.content, one line: node ID, 1433 binary features, label; the feature vector is abbreviated here):
31336 0 0 0 0 0 0 0 ... 0 0 0 Neural_Networks
......
""" 加載數據并對數據進行處理 """def load_cora():import numpy as npnum_nodes = 2708num_feats = 1433feat_data = np.zeros((num_nodes, num_feats))labels = np.empty((num_nodes, 1), dtype=np.int64)node_map = {}label_map = {}with open('../cora/cora.content') as fp:for i,line in enumerate(fp):info = line.strip().split()tmp = []for ss in info[1:-1]:tmp.append(float(ss))feat_data[i,:] = tmpnode_map[info[0]] = iif not info[-1] in label_map:label_map[info[-1]] = len(label_map)labels[i] = label_map[info[-1]]from collections import defaultdictadj_lists = defaultdict(set)with open('../cora/cora.cites') as fp:for i,line in enumerate(fp):info = line.strip().split()uid = node_map[info[0]]target_uid = node_map[info[1]]adj_lists[uid].add(target_uid)adj_lists[target_uid].add(uid)return feat_data,labels,adj_lists""" 構建aggregate 函數"""import torch import torch.nn as nn from torch.autograd import Variable import randomclass MeanAggregator(nn.Module):def __init__(self,features,cuda=False,gcn=False):super(MeanAggregator,self).__init__()self.features = featuresself.cuda = cudaself.gcn = gcndef forward(self,nodes,to_neighs,num_sample=10):_set = setif not num_sample is None:_sample = random.samplesamp_neighs = [_set(_sample(to_neigh, num_sample)) if len(to_neigh) >= num_sample else to_neigh for to_neigh in to_neighs]else:sample_neighs = to_neighsif self.gcn:sample_neighs = [samp_neigh + set([nodes[i]]) for i,samp_neigh in enumerate(samp_neighs)]unique_nodes_list = list(set.union(*samp_neighs))unique_nodes = {n:i for i,n in enumerate(unique_nodes_list)}mask = Variable(torch.zeros(len(samp_neighs),len(unique_nodes)))column_indices = [unique_nodes[n] for samp_neigh in samp_neighs for n in samp_neigh]row_indices = [i for i in range(len(samp_neighs)) for j in range(len(samp_neighs[i]))]mask[row_indices,column_indices] = 1if self.cuda:mask = mask.cuda()num_neigh = mask.sum(1,keepdim=True)mask = mask.div(num_neigh)if self.cuda:embed_matrix = self.features(torch.LongTensor(unique_nodes_list).cuda())else:embed_matrix = self.features(torch.LongTensor(unique_nodes_list))to_feats = mask.mm(embed_matrix)return to_feats""" 自身節點和鄰居節點進行聚合 """import torch import torch.nn as nn from torch.nn import init import torch.nn.functional as Fclass Encoder(nn.Module):"""Encodes a node's using 'convolutional' GraphSage approach"""def __init__(self, features, feature_dim, embed_dim, adj_lists, aggregator,num_sample=10,base_model=None, gcn=False, cuda=False, feature_transform=False): super(Encoder, self).__init__()self.features = features# 變換前的hidden_size/維度self.feat_dim = feature_dimself.adj_lists = adj_lists# 即鄰居聚合后的mebeddingself.aggregator = aggregatorself.num_sample = num_sampleif base_model != None:self.base_model = base_modelself.gcn = gcn# 變換后的hidden_size/維度self.embed_dim = embed_dimself.cuda = cudaself.aggregator.cuda = cuda# 矩陣W維度 = 變換后維度 * 變換前維度# 其中gcn表示是否拼接,如果拼接的話由于是"自身向量||鄰居聚合向量", 所以維度為2倍self.weight = nn.Parameter(torch.FloatTensor(embed_dim, self.feat_dim if self.gcn else 2 * self.feat_dim))init.xavier_uniform(self.weight)def forward(self, nodes):"""Generates embeddings for a batch of nodes.nodes -- list of nodes"""neigh_feats = self.aggregator.forward(nodes, [self.adj_lists[int(node)] for node in nodes], self.num_sample)if not self.gcn:if self.cuda:self_feats = self.features(torch.LongTensor(nodes).cuda())else:self_feats = self.features(torch.LongTensor(nodes))# 將自身和聚合鄰居的向量拼接, algorithm 1 line 5的拼接部分combined = torch.cat([self_feats, neigh_feats], dim=1)else:# 只用聚合鄰居的向量來表示,不用自身信息, algorithm 1 line 5的拼接部分combined = neigh_feats# 送入到神經網絡,algorithm 1 
line 5乘以矩陣Wcombined = F.relu(self.weight.mm(combined.t()))# 經過一層GNN layer后的點的embedding,維度為embed_dim * nodesreturn combined""" 定義整體結構 """class SupervisedGraphSage(nn.Module):def __init__(self, num_classes, enc):super(SupervisedGraphSage, self).__init__()# 這里面賦值為enc2(經過兩層GNN)self.enc = encself.xent = nn.CrossEntropyLoss()# 全連接參數矩陣,映射到labels num_classes維度做分類self.weight = nn.Parameter(torch.FloatTensor(num_classes, enc.embed_dim))init.xavier_uniform(self.weight)def forward(self, nodes):# embeds實際是我們兩層GNN后的輸出nodes embeddingembeds = self.enc(nodes)# 最后將nodes * hidden size 映射到 nodes * num_classes(= 7)之后做softmax計算cross entropyscores = self.weight.mm(embeds)return scores.t()def loss(self, nodes, labels):# 錢箱傳播scores = self.forward(nodes)# 定義的cross entropyreturn self.xent(scores, labels.squeeze())""" 訓練模型 """def run_cora():# 隨機數設置seed(種子)np.random.seed(1)random.seed(1)# cora數據集點數num_nodes = 2708# 加載cora數據集, 分別是# feat_data: 特征# labels: 標簽# adj_lists: 鄰接表,dict (key: node, value: neighbors set)feat_data, labels, adj_lists = load_cora()# 設置輸入的input features矩陣X的維度 = 點的數量 * 特征維度features = nn.Embedding(2708, 1433)# 為矩陣X賦值,參數不更新features.weight = nn.Parameter(torch.FloatTensor(feat_data), requires_grad=False)# features.cuda()# 一共兩層GNN layer# 第一層GNN# 以mean的方式聚合鄰居, algorithm 1 line 4agg1 = MeanAggregator(features, cuda=True)# 將自身和聚合鄰居的向量拼接后送入到神經網絡(可選是否只用聚合鄰居的信息來表示), algorithm 1 line 5enc1 = Encoder(features, 1433, 128, adj_lists, agg1, gcn=True, cuda=False)# 第二層GNN# 將第一層的GNN輸出作為輸入傳進去# 這里面.t()表示轉置,是因為Encoder class的輸出維度為embed_dim * nodesagg2 = MeanAggregator(lambda nodes : enc1(nodes).t(), cuda=False)# enc1.embed_dim = 128, 變換后的維度還是128enc2 = Encoder(lambda nodes : enc1(nodes).t(), enc1.embed_dim, 128, adj_lists, agg2,base_model=enc1, gcn=True, cuda=False)# 采樣的鄰居點的數量enc1.num_samples = 5enc2.num_samples = 5# 7分類問題# enc2是經過兩層GNN layer時候得到的 node embedding/featuresgraphsage = SupervisedGraphSage(7, enc2)# graphsage.cuda()# 目的是打亂節點順序rand_indices = np.random.permutation(num_nodes)# 劃分測試集、驗證集、訓練集test = rand_indices[:1000]val = rand_indices[1000:1500]train = list(rand_indices[1500:])# 用SGD的優化,設置學習率optimizer = torch.optim.SGD(filter(lambda p : p.requires_grad, graphsage.parameters()), lr=0.7)# 記錄每個batch訓練時間times = []# 共訓練100個batchfor batch in range(100):# 取256個nodes作為一個batchbatch_nodes = train[:256]# 打亂訓練集的順序,使下次迭代batch隨機random.shuffle(train)# 記錄開始時間start_time = time.time()optimizer.zero_grad()# 這個是SupervisedGraphSage里面定義的cross entropy lossloss = graphsage.loss(batch_nodes, Variable(torch.LongTensor(labels[np.array(batch_nodes)])))# 反向傳播和更新參數loss.backward()optimizer.step()# 記錄結束時間end_time = time.time()times.append(end_time-start_time)# print (batch, loss.data[0])print (batch, loss.data)# 做validationval_output = graphsage.forward(val)# 計算micro F1 scoreprint ("Validation F1:", f1_score(labels[val], val_output.data.numpy().argmax(axis=1), average="micro"))# 計算每個batch的平均訓練時間print ("Average batch time:", np.mean(times)) """ 模型運行結果 """run_cora()0 tensor(1.9649) 1 tensor(1.9406) 2 tensor(1.9115) 3 tensor(1.8925) 4 tensor(1.8731) 5 tensor(1.8354) 6 tensor(1.8018) 7 tensor(1.7535) 8 tensor(1.6938) 9 tensor(1.6029) 10 tensor(1.6312) 11 tensor(1.5248) 12 tensor(1.4800) 13 tensor(1.4503) 14 tensor(1.4162) 15 tensor(1.3210) 16 tensor(1.2243) 17 tensor(1.2255) 18 tensor(1.0978) 19 tensor(1.1330) 20 tensor(0.9534) 21 tensor(0.9112) 22 tensor(0.9170) 23 tensor(0.7924) 24 tensor(0.8008) 25 tensor(0.7142) 26 tensor(0.7839) 27 tensor(0.8878) 28 tensor(1.2177) 29 tensor(0.9943) 30 tensor(0.8073) 31 tensor(0.6588) 32 tensor(0.6254) 33 
tensor(0.5622) 34 tensor(0.5158) 35 tensor(0.4763) 36 tensor(0.5298) 37 tensor(0.5419) 38 tensor(0.5098) 39 tensor(0.4122) 40 tensor(0.4262) 41 tensor(0.4451) 42 tensor(0.4126) 43 tensor(0.4409) 44 tensor(0.3913) 45 tensor(0.4496) 46 tensor(0.4365) 47 tensor(0.4601) 48 tensor(0.4714) 49 tensor(0.4090) 50 tensor(0.4145) 51 tensor(0.3428) 52 tensor(0.3454) 53 tensor(0.3531) 54 tensor(0.3131) 55 tensor(0.2719) 56 tensor(0.3519) 57 tensor(0.3286) 58 tensor(0.3125) 59 tensor(0.2529) 60 tensor(0.3033) 61 tensor(0.2332) 62 tensor(0.3049) 63 tensor(0.3026) 64 tensor(0.3770) 65 tensor(0.3811) 66 tensor(0.3223) 67 tensor(0.2450) 68 tensor(0.2620) 69 tensor(0.2846) 70 tensor(0.2482) 71 tensor(0.3044) 72 tensor(0.4133) 73 tensor(0.3156) 74 tensor(0.4421) 75 tensor(0.2596) 76 tensor(0.2585) 77 tensor(0.2639) 78 tensor(0.2035) 79 tensor(0.2328) 80 tensor(0.1748) 81 tensor(0.1730) 82 tensor(0.1978) 83 tensor(0.1614) 84 tensor(0.1890) 85 tensor(0.1227) 86 tensor(0.1568) 87 tensor(0.1527) 88 tensor(0.2365) 89 tensor(0.2297) 90 tensor(0.1787) 91 tensor(0.1920) 92 tensor(0.1864) 93 tensor(0.1254) 94 tensor(0.1678) 95 tensor(0.1336) 96 tensor(0.1562) 97 tensor(0.2531) 98 tensor(0.2392) 99 tensor(0.2089) Validation F1: 0.864 Average batch time: 0.047979302406311035?