Paper reading notes: "FAKE OR GENUINE? CONTEXTUALISED TEXT REPRESENTATION FOR FAKE REVIEW DETECTION"

Tags: layers, SVM, GENUINE, neural, CONTEXTUALISED, reviews, FAKE, source paper, model

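I. Problem the paper addresses

The paper tackles fake review detection with ensemble learning. It combines three BERT-derived models (RoBERTa, ALBERT, and XLNet) and weights their outputs to compute the final prediction.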

II. Contribution

The ensemble outperforms each of the three models used on its own for fake review detection.

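III. Model architecture

The input is passed through each of the three networks separately. Their outputs are combined by a weighted sum, and the cross-entropy loss is computed on the combined result.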
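The weighted combination described above can be sketched as follows. This is only an illustration under assumptions: the notes do not say whether the paper weights probabilities or logits, nor what the weights are, so the probabilities, the weights, and the function names below are all hypothetical, and the weights are normalised here for convenience.

```python
import math


def ensemble_probability(probs, weights):
    """Weighted average of per-model fake-review probabilities.

    Assumption: each model outputs a probability in [0, 1]; the weights
    are normalised here so they do not need to sum to 1.
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    return sum(p * w for p, w in zip(probs, weights)) / total


def binary_cross_entropy(p, label, eps=1e-12):
    """Binary cross-entropy for one example (label 1 = fake, 0 = genuine)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Hypothetical outputs of RoBERTa, ALBERT and XLNet for one review.
model_probs = [0.9, 0.6, 0.8]
# Illustrative weights only; these are not the values used in the paper.
model_weights = [0.5, 0.2, 0.3]

p_fake = ensemble_probability(model_probs, model_weights)
loss = binary_cross_entropy(p_fake, label=1)
print(f"ensemble p(fake) = {p_fake:.2f}, loss = {loss:.4f}")
```

With these made-up numbers the ensemble probability is 0.5·0.9 + 0.2·0.6 + 0.3·0.8 = 0.81, and the loss for a review labelled fake is −ln(0.81).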

IV. Parameter settings and datasets

1. Network configuration

RoBERTa: hidden size 768, 12 layers, 125 million parameters, and 12 attention heads.

XLNet: hidden size 768, 12 layers, 110 million parameters, and 12 attention heads.

ALBERT: hidden size 768, 12 layers, 12 attention heads, embedding size 128, and 11 million parameters.

2. Hyperparameters

mini-batch size: 32

epochs: 10

delta = 0

optimiser: AdamW

loss: binary cross-entropy

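3. Datasets

Two datasets are used: OpSpam and Deception. The OpSpam dataset contains 1,600 review texts about 20 hotels in the Chicago area of the United States, of which 800 are fake and 800 are genuine. Label "1" denotes a fake review and label "0" a legitimate one. The reviews come from different sources: the fake reviews were produced through Amazon Mechanical Turk (AMT), and the rest were collected from online review sites such as Yelp, TripAdvisor, and Expedia. The Deception dataset [16] is a gold-standard dataset of 3,032 reviews covering three domains (hotels, doctors, and restaurants). Both datasets contain only the review text, with no metadata. In the paper's experiments, 80% of each dataset is used for training and the remaining 20% for testing. Table 2 of the paper shows the statistics of the two datasets.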
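As a rough illustration of the 80/20 split on OpSpam, here is a minimal sketch. It assumes a class-stratified random split, which the notes do not confirm; the function name, the seed, and the placeholder labels are hypothetical, and no real review text is loaded.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split example indices into train/test while keeping class balance.

    Assumption: the split is random and stratified by label; the notes
    only state the 80/20 proportions.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return train_idx, test_idx


# Placeholder labels shaped like OpSpam: 800 fake (1) and 800 genuine (0).
labels = [1] * 800 + [0] * 800
train_idx, test_idx = stratified_split(labels)

batch_size = 32  # mini-batch size listed in the hyperparameters
steps_per_epoch = -(-len(train_idx) // batch_size)  # ceiling division
print(len(train_idx), len(test_idx), steps_per_epoch)
```

Under these assumptions OpSpam yields 1,280 training and 320 test reviews, so a mini-batch size of 32 gives 40 updates per epoch.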

V. Experimental results

Appendix: baseline models used in fake review detection:

SVM [5]: A model that combines bigram and LIWC features, with an SVM as the classifier.

Source paper: M. Ott, Y. Choi, C. Cardie, and J. T. Hancock, "Finding deceptive opinion spam by any stretch of the imagination"

SVM [19]: A model that combines four-gram and LIWC features, with an SVM as the classifier.

Source paper: L. Cagnina and P. Rosso, "Classification of deceptive opinions using a low dimensionality representation"

SVM [15]: A model that uses unigram features, with an SVM as the classifier.

Source paper: S. Feng, R. Banerjee, and Y. Choi, "Syntactic stylometry for deception detection"
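The three SVM baselines above differ mainly in the n-gram order of their features. The following minimal sketch shows how n-gram counts can be extracted from a review. It assumes simple lowercased whitespace tokenisation and word-level n-grams, neither of which is specified in these notes; the LIWC features and the SVM classifier are not shown, and the example sentence is made up.

```python
from collections import Counter


def ngrams(items, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def ngram_counts(text, n):
    """Count word-level n-grams (assumed tokenisation: lowercase + split)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


# Made-up review text, for illustration only.
review = "the room was clean and the room was quiet"

unigrams = ngram_counts(review, 1)   # features as in the unigram baseline
bigrams = ngram_counts(review, 2)    # features as in the bigram baseline
fourgrams = ngram_counts(review, 4)  # features as in the four-gram baseline

print(bigrams.most_common(2))
```

Because `ngrams` works on any sequence, passing a string instead of a token list would give character-level n-grams.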

SAGE [16]: The Sparse Additive Generative Model (SAGE) is a mix of topic modelling and a generalised additive model.

Source paper: J. Li, M. Ott, C. Cardie, and E. Hovy, "Towards a general rule for identifying deceptive opinion spam"

RCNN [38]: A model that combines recurrent neural networks and convolutional neural networks.

Source paper: S. Lai, L. Xu, K. Liu, and J. Zhao, "Recurrent convolutional neural networks for text classification"

GRNN–CNN [39]: A hybrid fake review detection model that combines a gated recurrent neural network (GRU) and a convolutional neural network.

Source paper: Y. Ren and D. Ji, "Neural networks for deceptive opinion spam detection: an empirical study"

DRI-RCNN [27]: A recurrent convolutional deep neural network model for detecting fake reviews based on word contexts.

Source paper: W. Zhang, Y. Du, T. Yoshida, and Q. Wang, "DRI-RCNN: An approach to deceptive review identification using recurrent convolutional neural network"

BERT-Base Cased [6]: BERT pre-trains a deep bidirectional representation from unlabelled text by conditioning on both left and right context in all layers.
————————————————
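Copyright notice: This is an original article by CSDN blogger "weixin_39877064", released under the CC 4.0 BY-SA license. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/weixin_39877064/article/details/127001784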

From: https://www.cnblogs.com/poemWineTea/p/16721229.html
