Convolutional 2D Knowledge Graph Embeddings

Abstract. Link prediction for knowledge graphs is the task of predicting missing relationships between entities. Previous work on link prediction has focused on shallow, fast models which can scale to large knowledge graphs. However, these models learn less expressive features than deep, multi-layer models, which potentially limits performance. (In short: earlier models are shallow and fast, but less expressive than deep models.) In this work we introduce ConvE, a multi-layer convolutional network model for link prediction, and report state-of-the-art results for several established datasets. We also show that the model is highly parameter efficient, yielding the same performance as DistMult and R-GCN with 8x and 17x fewer parameters. Analysis of our model suggests that it is particularly effective at modelling nodes with high indegree, which are common in highly connected, complex knowledge graphs such as Freebase and YAGO3. In addition, it has been noted that the WN18 and FB15k datasets suffer from test set leakage, due to inverse relations from the training set being present in the test set; however, the extent of this issue has so far not been quantified. We find this problem to be severe: a simple rule-based model can achieve state-of-the-art results on both WN18 and FB15k. To ensure that models are evaluated on datasets where simply exploiting inverse relations cannot yield competitive results, we investigate and validate several commonly used datasets, deriving robust variants where necessary. We then perform experiments on these robust datasets for our own and several previously proposed models, and find that ConvE achieves state-of-the-art Mean Reciprocal Rank across all datasets.

  1. Introduction




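  • Introduces ConvE, a simple and competitive 2D convolutional model for link prediction;
  • Develops a 1-N scoring procedure that speeds up training by 3x and evaluation by 300x;
  • Shows better parameter efficiency: ConvE outperforms DistMult and R-GCN on FB15k-237 with only 1/8 and 1/17 of their parameters, respectively;
  • Shows that the performance gap between the proposed model and shallow models grows in proportion to the complexity of the knowledge graph;
  • Verifies how severe the test set leakage is and proposes improved versions of the affected datasets;
  • Evaluates ConvE against the previous best models; ConvE achieves SOTA results.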

1D vs 2D Convolutions


\[\left( \left[ \begin{matrix} a & a & a \\ \end{matrix} \right];\left[ \begin{matrix} b & b & b \\ \end{matrix} \right] \right)=\left[ \begin{matrix} a & a & a & b & b & b \\ \end{matrix} \right]\]



\[\left( \left[ \begin{matrix} a & a & a \\ a & a & a \\ \end{matrix} \right];\left[ \begin{matrix} b & b & b \\ b & b & b \\ \end{matrix} \right] \right)=\left[ \begin{matrix} a & a & a \\ a & a & a \\ b & b & b \\ b & b & b \\ \end{matrix} \right]\]


\[\left[ \begin{matrix} a & a & a \\ b & b & b \\ a & a & a \\ b & b & b \\ \end{matrix} \right]\]
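The three layouts above (1D concatenation, 2D stacking, and the alternating row pattern) can be reproduced with NumPy. This is only an illustrative sketch: the array sizes, the stand-in values 1 for a and 2 for b, the 2-wide kernel, and the helper `mixed_windows` are all assumptions made here, not something taken from the ConvE code. The helper counts how many kernel positions see entries from both embeddings at once, which is a rough proxy for how many places a convolution can model interactions between a and b.

```python
import numpy as np

# 1D case: two length-3 embeddings; a is filled with 1s, b with 2s (illustrative values).
a1 = np.full(3, 1)
b1 = np.full(3, 2)
concat_1d = np.concatenate([a1, b1])            # shape (6,)

# 2D case: each embedding is a 2x3 matrix, stacked along the row axis.
a2 = np.full((2, 3), 1)
b2 = np.full((2, 3), 2)
stacked_2d = np.concatenate([a2, b2], axis=0)   # shape (4, 3)

# Alternating pattern: interleave the rows of a2 and b2.
interleaved = np.empty((4, 3), dtype=a2.dtype)
interleaved[0::2] = a2
interleaved[1::2] = b2


def mixed_windows(grid, kh, kw):
    """Count kernel positions (stride 1, no padding) whose window
    contains entries from both a and b."""
    rows, cols = grid.shape
    count = 0
    for i in range(rows - kh + 1):
        for j in range(cols - kw + 1):
            window = grid[i:i + kh, j:j + kw]
            if window.min() != window.max():
                count += 1
    return count


print(mixed_windows(concat_1d[None, :], 1, 2))  # 1D concatenation, kernel of width 2
print(mixed_windows(stacked_2d, 2, 2))          # 2D stacking, 2x2 kernel
print(mixed_windows(interleaved, 2, 2))         # alternating rows, 2x2 kernel
```

With these toy sizes the count grows from the 1D layout to the stacked 2D layout, and grows again for the alternating layout, because every adjacent row pair then mixes a and b.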


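Remark 1: the convolution layer is nn.Conv2d(1, 32, (3, 3), bias=True), i.e. 1 input channel, 32 output channels (filters), and a 3x3 kernel with a bias term.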
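To see what this layer implies in numbers, the arithmetic below works out its parameter count and output size in plain Python. The 20x20 input is an assumption chosen for illustration (for example, two embeddings each reshaped to 10x20 and stacked); these notes do not state the actual embedding shape. The helper names are hypothetical, and stride 1 with no padding mirrors the PyTorch defaults for `nn.Conv2d`.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters of a 2D convolution layer."""
    kh, kw = kernel_size
    weights = out_channels * in_channels * kh * kw
    return weights + (out_channels if bias else 0)


def conv2d_output_shape(height, width, kernel_size, stride=1, padding=0):
    """Spatial output size of a 2D convolution (dilation 1)."""
    kh, kw = kernel_size
    out_h = (height + 2 * padding - kh) // stride + 1
    out_w = (width + 2 * padding - kw) // stride + 1
    return out_h, out_w


params = conv2d_param_count(1, 32, (3, 3), bias=True)  # 32*1*3*3 weights + 32 biases
out_h, out_w = conv2d_output_shape(20, 20, (3, 3))     # assumed 20x20 stacked input
flat_features = 32 * out_h * out_w                     # length of the flattened feature maps

print(params, (out_h, out_w), flat_features)
```

Under the assumed input size, `flat_features` is the input dimension the following fully connected projection would need.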


\[{{\psi }_{r}}\left( {{e}_{s}},{{e}_{o}} \right)=f\left( vec\left( f\left( \left[ {{{\bar{e}}}_{s}};{{{\bar{r}}}_{r}} \right]*\omega \right) \right)W \right){{e}_{o}}\tag{1} \]
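Equation (1) can be traced step by step in a small NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: f is taken to be ReLU, the toy sizes (embedding dimension 8, reshaped to 2x4, 3 filters) are made up, and dropout, batch normalisation and bias terms are left out. The function names `conv2d_valid` and `conve_scores` are hypothetical. Multiplying the hidden vector by the whole entity matrix scores every candidate object in one pass, which is the idea behind the 1-N scoring mentioned in the introduction.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def conv2d_valid(image, kernels):
    """Cross-correlate an (H, W) image with (C, kh, kw) kernels, stride 1, no padding.
    This is the operation deep-learning libraries call "convolution"."""
    c, kh, kw = kernels.shape
    h, w = image.shape
    out = np.zeros((c, h - kh + 1, w - kw + 1))
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            window = image[i:i + kh, j:j + kw]
            out[:, i, j] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return out


def conve_scores(e_s, r_r, omega, W, entity_matrix, shape):
    """Score one (subject, relation) pair against every candidate object."""
    # [e_s_bar; r_r_bar]: reshape both embeddings to 2D and stack them along rows.
    stacked = np.concatenate([e_s.reshape(shape), r_r.reshape(shape)], axis=0)
    feature_maps = relu(conv2d_valid(stacked, omega))  # f([e_s_bar; r_r_bar] * omega)
    hidden = relu(feature_maps.reshape(-1) @ W)        # f(vec(...) W)
    return entity_matrix @ hidden                      # dot product with each e_o


rng = np.random.default_rng(0)
k, shape, channels, n_entities = 8, (2, 4), 3, 5   # toy sizes (assumptions)
E = rng.normal(size=(n_entities, k))               # entity embeddings
r = rng.normal(size=k)                             # one relation embedding
omega = rng.normal(size=(channels, 3, 3))          # convolution filters
W = rng.normal(size=(channels * 2 * 2, k))         # stacked image is 4x4, so maps are 2x2
scores = conve_scores(E[0], r, omega, W, E, shape)
print(scores.shape)
```

Each entry of `scores` is a raw score for one candidate object; passing it through a sigmoid would give a probability per candidate.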



\[\mathcal{L}\left( p,t \right)=-\frac{1}{N}\sum\limits_{i}{\left( {{t}_{i}}\cdot \log \left( {{p}_{i}} \right)+\left( 1-{{t}_{i}} \right)\cdot \log \left( 1-{{p}_{i}} \right) \right)}\tag{2} \]
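Equation (2) is the standard binary cross-entropy averaged over N entries. The sketch below evaluates it with NumPy; the example logits and labels are made up, the probabilities are assumed to come from a sigmoid over the scores, and the small `eps` clipping is an added numerical safeguard that is not part of the formula.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce_loss(p, t, eps=1e-12):
    """Binary cross-entropy of Eq. (2): p are predicted probabilities, t are 0/1 labels."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    t = np.asarray(t, dtype=float)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))


logits = np.array([2.0, -1.0, 0.0])   # illustrative scores for three candidate objects
labels = np.array([1.0, 0.0, 0.0])    # 1 where the triple exists, 0 otherwise
loss = bce_loss(sigmoid(logits), labels)
print(loss)
```

A prediction of 0.5 for every entry gives a loss of log 2, which is a handy sanity check for an untrained model.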

From: https://www.cnblogs.com/Wallenda/p/16837251.html
