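I. Problem Addressed by the Paper
The paper tackles fake review prediction with ensemble learning: it combines three improved BERT variants (RoBERTa, ALBERT, and XLNet) and computes the final result from their outputs using a set of weights.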
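The weighting step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the example probabilities, the weights (0.4, 0.3, 0.3), and the 0.5 decision threshold are all hypothetical, and a plain weighted average is assumed as the combination rule.

```python
def weighted_ensemble(probs_per_model, weights):
    """Weighted average of per-model fake-review probabilities.

    probs_per_model: one list of probabilities per model, all the same length.
    weights: one non-negative weight per model (normalised internally).
    """
    if len(probs_per_model) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_reviews = len(probs_per_model[0])
    combined = []
    for i in range(n_reviews):
        score = sum(w * probs[i] for w, probs in zip(weights, probs_per_model))
        combined.append(score / total)
    return combined


def predict_labels(combined, threshold=0.5):
    """Map combined probabilities to labels: 1 = fake, 0 = genuine."""
    return [1 if p >= threshold else 0 for p in combined]


# Hypothetical per-review probabilities from the three fine-tuned models.
roberta = [0.9, 0.2, 0.6]
albert = [0.8, 0.4, 0.3]
xlnet = [0.7, 0.1, 0.5]
weights = [0.4, 0.3, 0.3]  # illustrative values, not taken from the paper

combined = weighted_ensemble([roberta, albert, xlnet], weights)
labels = predict_labels(combined)
```

In practice the weights would typically be chosen on a validation set; the values here only show the mechanics.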
II. Contributions
III. Model Architecture Diagram
IV. Parameter Settings and Datasets
1. Network composition
RoBERTa: hidden size 768, 12 layers, 12 attention heads, 125 million parameters.
XLNet: hidden size 768, 12 layers, 12 attention heads, 110 million parameters.
ALBERT: hidden size 768, 12 layers, 12 attention heads, embedding size 128, 11 million parameters.
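One reason ALBERT is so much smaller than the other two models is its embedding size of 128: the token embedding is factorised into a small embedding table followed by a projection to the hidden size. (ALBERT also shares parameters across layers, which this sketch does not model.) The arithmetic below is a rough illustration only; the vocabulary size of 30,000 is an assumption not stated in the notes above, and `embedding_params` is a hypothetical helper.

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Count token-embedding parameters.

    Without factorisation: one vocab_size x hidden_size table.
    With factorisation: a vocab_size x embedding_size table plus an
    embedding_size x hidden_size projection.
    """
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size


VOCAB_SIZE = 30000  # assumption: not given in the notes above

full = embedding_params(VOCAB_SIZE, 768)             # unfactorised table
factorised = embedding_params(VOCAB_SIZE, 768, 128)  # ALBERT-style factorisation

print(f"unfactorised: {full:,} parameters")
print(f"factorised:   {factorised:,} parameters")
```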
2. Parameter settings
epochs: 10
delta = 0
optimiser: AdamW
loss: binary cross-entropy
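For reference, the binary cross-entropy loss listed above can be written out in a few lines. This is a plain-Python sketch of the standard formula rather than the training code used in the paper; the function name and the `eps` clamp (added to avoid `log(0)`) are illustrative choices.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over a batch.

    y_true: labels in {0, 1} (1 = fake review, 0 = genuine).
    y_prob: predicted probabilities of the positive class.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Two hypothetical reviews: one fake predicted at 0.9, one genuine predicted at 0.2.
loss = binary_cross_entropy([1, 0], [0.9, 0.2])
print(f"loss = {loss:.4f}")
```

Confident, correct predictions drive the loss toward zero, while confident wrong predictions are penalised heavily.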
3. Datasets
SVM [5]: A model that combines bigram and LIWC features, using SVM as the classifier.
Source: M. Ott, Y. Choi, C. Cardie, and J. T. Hancock, "Finding deceptive opinion spam by any stretch of the imagination"
SVM [19]: A model that combines 4-gram and LIWC features, using SVM as the classifier.
Source: L. Cagnina and P. Rosso, "Classification of deceptive opinions using a low dimensionality representation"
SVM [15]: A model that uses unigram features with SVM as the classifier.
Source: S. Feng, R. Banerjee, and Y. Choi, "Syntactic stylometry for deception detection"
SAGE [16]: The Sparse Additive Generative Model (SAGE), a mix of topic modelling and a generalised additive model.
Source: J. Li, M. Ott, C. Cardie, and E. Hovy, "Towards a general rule for identifying deceptive opinion spam"
RCNN [38]: A model that combines recurrent neural networks and convolutional neural networks.
Source: S. Lai, L. Xu, K. Liu, and J. Zhao, "Recurrent convolutional neural networks for text classification"
GRNN–CNN [39]: A hybrid fake review detection model that combines a gated recurrent neural network (GRNN) and a convolutional neural network.
Source: Y. Ren and D. Ji, "Neural networks for deceptive opinion spam detection: an empirical study"
DRI-RCNN [27]: A deep recurrent convolutional neural network model for detecting fake reviews based on word contexts.
Source: W. Zhang, Y. Du, T. Yoshida, and Q. Wang, "DRI-RCNN: An approach to deceptive review identification using recurrent convolutional neural network"
BERT-Base Cased [6]: BERT pre-trains deep bidirectional representations from unlabelled text by jointly conditioning on both left and right context in all layers.
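Copyright notice: This is an original article by CSDN blogger "weixin_39877064", released under the CC 4.0 BY-SA licence. When reposting, please include a link to the original source and this notice.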