Linkless Link Prediction via Relational Distillation

Date: 2023-11-06 14:36:13


Overview
Notation
LLP
Code

Guo Z., Shiao W., Zhang S., Liu Y., Chawla N. V., Shah N. and Zhao T. Linkless link prediction via relational distillation. ICML, 2023.

Overview

Distill a GNN teacher model into an MLP student model.

Notation

\(G = (\mathcal{V}, \mathcal{E})\), an undirected graph;
\(\mathbf{A} \in \{0, 1\}^{N \times N}\), the adjacency matrix;
\(\mathbf{X} \in \mathbb{R}^{N \times F}\), the node feature matrix;
\(\mathcal{E}^- = (\mathcal{V} \times \mathcal{V}) \setminus \mathcal{E}\), the set of node pairs that are not connected by an edge;
\(\mathbf{H} \in \mathbb{R}^{N \times D}\), the node representations.
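
To make the notation concrete, here is a minimal sketch on a toy graph. The graph itself is made up for illustration, and storing each undirected edge once as `(i, j)` with `i < j` (and skipping self-pairs in \(\mathcal{E}^-\)) is an assumption of this sketch, not something the paper prescribes.

```python
from itertools import combinations

# Toy undirected graph with N = 4 nodes (illustrative data only).
N = 4
edges = {(0, 1), (1, 2), (2, 3)}  # each undirected edge stored once, i < j

# Adjacency matrix A in {0, 1}^{N x N}; symmetric for an undirected graph.
A = [[0] * N for _ in range(N)]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1

# E^-: unordered node pairs without an edge (self-pairs skipped by assumption).
neg_edges = {(i, j) for i, j in combinations(range(N), 2) if (i, j) not in edges}
```

Even on this tiny graph the non-edges are as numerous as the edges; on real sparse graphs \(\mathcal{E}^-\) is far larger than \(\mathcal{E}\), which is why negatives are normally sampled rather than enumerated.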

LLP

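LLP assumes the teacher is a GNN (because the node representations it produces capture structural information well) and aims to transfer that information to the student model through distillation.
The idea is simple. Let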

\[\hat{y}_{ij} = \sigma(\text{Decoder}(\bm{h}_i, \bm{h}_j)) \]
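be the predicted probability that an edge exists between nodes \(v_i\) and \(v_j\). The teacher's and the student's predictions are then pulled closer together in the following two ways.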
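
The prediction step can be sketched as follows for the student side. This is only an illustration: the two-layer MLP without biases, the dot-product decoder, the random weights, and the names `mlp_encode` and `predict_link` are all assumptions of this sketch rather than the paper's architecture.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_encode(x, W1, W2):
    """Two-layer MLP h = W2 * relu(W1 * x); uses node features only, no edges."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hi for w, hi in zip(row, hidden)) for row in W2]


def predict_link(h_i, h_j):
    """y_hat_ij = sigmoid(Decoder(h_i, h_j)), with an assumed dot-product decoder."""
    return sigmoid(sum(a * b for a, b in zip(h_i, h_j)))


rng = random.Random(0)
F, hidden_dim, D = 3, 4, 2  # feature, hidden and embedding sizes (arbitrary)
W1 = [[rng.uniform(-1, 1) for _ in range(F)] for _ in range(hidden_dim)]
W2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(D)]
X = [[rng.uniform(-1, 1) for _ in range(F)] for _ in range(5)]  # 5 toy nodes

H = [mlp_encode(x, W1, W2) for x in X]  # node representations, N x D
y_hat_01 = predict_link(H[0], H[1])     # predicted probability of edge (v_0, v_1)
```

The point of the student is visible in `mlp_encode`: it never touches the adjacency matrix, so inference needs no neighbor aggregation.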
Rank-based Matching:

\[\mathcal{L}_{LLP\_R} = \sum_{v \in \mathcal{V}} \sum_{\hat{y}_{v, i}, \hat{y}_{v, j}} \max(0, -r \cdot (\hat{y}_{v, i} - \hat{y}_{v, j}) + \delta), \]
where

\[r = \left \{ \begin{array}{ll} 1 & \text{ if } y_{v,i}^t - y_{v,j}^t > \delta, \\ -1 & \text{ if } y_{v,i}^t - y_{v,j}^t < -\delta, \\ 0 & \text{ otherwise}. \end{array} \right. \]
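The idea is straightforward: the student is required to reproduce the teacher's ranking of candidate links (up to a margin \(\delta\)), and it is penalized whenever it fails to do so.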
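
A minimal sketch of the rank-based term for a single anchor node \(v\) is given below. It assumes the inner sum runs over unordered pairs of context nodes and takes the scores as plain Python lists; the function name `rank_matching_loss` is hypothetical.

```python
from itertools import combinations


def rank_matching_loss(student, teacher, delta):
    """Margin ranking loss for one anchor node v.

    student[k] and teacher[k] are the predicted link probabilities between v
    and the k-th context node. Each unordered pair (i, j) is counted once,
    which is an assumption of this sketch.
    """
    loss = 0.0
    for i, j in combinations(range(len(student)), 2):
        diff_t = teacher[i] - teacher[j]
        if diff_t > delta:
            r = 1       # teacher clearly ranks i above j
        elif diff_t < -delta:
            r = -1      # teacher clearly ranks j above i
        else:
            r = 0       # teacher is indifferent; the term is the constant delta
        loss += max(0.0, -r * (student[i] - student[j]) + delta)
    return loss


teacher = [0.9, 0.5, 0.1]
agree = rank_matching_loss([0.8, 0.5, 0.2], teacher, delta=0.1)     # same order
disagree = rank_matching_loss([0.2, 0.5, 0.8], teacher, delta=0.1)  # reversed order
```

With these toy numbers the student that keeps the teacher's order incurs no penalty, while the reversed one is penalized on every pair. Note that pairs with \(r = 0\) contribute the constant \(\delta\), which does not affect the gradient.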
Distribution-based Matching:

\[\mathcal{L}_{LLP\_D} = \sum_{v \in \mathcal{V}} \sum_{i \in \mathcal{C}_v} \frac{\exp (y_{v, i}^t / \tau)}{\sum_{j \in \mathcal{C}_v} \exp (y_{v, j}^t / \tau)} \log \frac{\exp (\hat{y}_{v, i} / \tau)}{\sum_{j \in \mathcal{C}_v} \exp (\hat{y}_{v, j} / \tau)}. \]
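This is standard logit distillation. Read as a loss to be minimized, the expression needs a leading minus sign, which makes it the cross-entropy between the teacher's and the student's softmax distributions over \(\mathcal{C}_v\). The set \(\mathcal{C}_v\) has to be sampled to keep the computation manageable. The sampling procedure is:
1. Sample nearby nodes via random walks, denoted \(\mathcal{C}_v^N\);
2. Sample nodes at random, denoted \(\mathcal{C}_v^R\);
3. Finally, \(\mathcal{C}_v = \mathcal{C}_v^N \cup \mathcal{C}_v^R\).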
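
The sampling steps and the distribution-based term can be sketched together as follows. Several things here are assumptions of the sketch: the graph is an adjacency dict, the walk counts and lengths are arbitrary, the random nodes are drawn uniformly, and the loss is written with a leading minus sign (as a cross-entropy) so that lower values mean closer agreement. The helper names are hypothetical.

```python
import math
import random


def sample_context(v, adj, num_walks, walk_len, num_random, rng):
    """Return C_v = C_v^N (random-walk neighbors) united with C_v^R (random nodes)."""
    near = set()
    for _ in range(num_walks):
        cur = v
        for _ in range(walk_len):
            if not adj[cur]:          # dead end: stop this walk
                break
            cur = rng.choice(adj[cur])
            if cur != v:
                near.add(cur)
    others = [u for u in adj if u != v]
    rand = set(rng.sample(others, min(num_random, len(others))))
    return sorted(near | rand)


def softmax(scores, tau):
    m = max(scores)                   # subtract the max for numerical stability
    exps = [math.exp((s - m) / tau) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def distribution_matching_loss(student, teacher, tau):
    """Cross-entropy between teacher and student softmax distributions over C_v."""
    p_t = softmax(teacher, tau)
    p_s = softmax(student, tau)
    return -sum(t * math.log(s) for t, s in zip(p_t, p_s))


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}  # toy graph, node 4 isolated
ctx = sample_context(0, adj, num_walks=2, walk_len=2, num_random=2,
                     rng=random.Random(0))

teacher = [0.9, 0.5, 0.1]
same = distribution_matching_loss(teacher, teacher, tau=1.0)
diff = distribution_matching_loss([0.1, 0.5, 0.9], teacher, tau=1.0)
```

The loss is smallest when the student's distribution equals the teacher's, so `diff` exceeds `same`. Mixing random-walk and random nodes lets the student see both local and global relations of \(v\).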
The final training loss is:

\[\mathcal{L} = \alpha \cdot \mathcal{L}_{sup} + \beta \cdot \mathcal{L}_{LLP\_R} + \gamma \cdot \mathcal{L}_{LLP\_D}. \]
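
As a small illustration of how the three terms combine, the sketch below assumes that \(\mathcal{L}_{sup}\) is a binary cross-entropy over labeled node pairs; the note does not specify its form, so this choice, the weights, and the function names are assumptions.

```python
import math


def bce(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy averaged over labeled node pairs (assumed L_sup)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def total_loss(l_sup, l_rank, l_dist, alpha, beta, gamma):
    """L = alpha * L_sup + beta * L_LLP_R + gamma * L_LLP_D."""
    return alpha * l_sup + beta * l_rank + gamma * l_dist


# One positive and one negative pair, both predicted at 0.5 (toy values).
l_sup = bce([1, 0], [0.5, 0.5])
loss = total_loss(l_sup, l_rank=0.4, l_dist=1.1, alpha=1.0, beta=0.5, gamma=0.5)
```

Setting \(\beta = \gamma = 0\) recovers a plain supervised MLP, which makes the weights a convenient way to ablate each distillation term.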

Code

[official]

From: https://www.cnblogs.com/MTandHJ/p/17812562.html
