8 Innovative BERT Knowledge Distillation Papers That Have Changed The Landscape of NLP

时间：2023-10-31 13:01:39浏览次数：29

标签：BERT Knowledge NLP text distillation model pruning

Contemporary state-of-the-art NLP models are difficult to be utilized in production. Knowledge distillation offers tools for tackling such issues along with several others, but it has its quirks.

BERT’s inefficient nature has not gone unnoticed. Many researchers have pursued ways to reduce its cost and size. Some of the most active research is in model compression techniques such as smaller architectures (structured pruning), distillation, quantization, and unstructured pruning. A few of the more impactful papers include:

DistilBERT used knowledge distillation to transfer knowledge from a BERT base model to a 6-layer version.
TinyBERT implemented a more complicated distillation
The Lottery Ticket Hypothesis applied magnitude pruning during pre-training of a BERT model to create a sparse architecture
Movement Pruning applied a combination of the magnitude and gradient information to remove redundant parameters while fine-tuning with distillation.

https://towardsdatascience.com/https-medium-com-chaturangarajapakshe-text-classification-with-transformer-models-d370944b50ca

This post is about text classification on problems with a limited sample count.

标签：BERT,Knowledge,NLP,text,distillation,model,pruning
From： https://blog.51cto.com/emanlee/8103775

R语言自然语言处理NLP:情感分析上市公司文本信息知识发现可视化|附代码数据
全文链接：http://tecdat.cn/?p=31702原文出处：拓端数据部落公众号情感分析，就是根据一段文本，分析其表达情感的技术。比较简单的情感分析，能够辨别文本内容是积极的还是消极的（褒义/贬义）；比较复杂的情感分析，能够知道这些文字是否流露出恐惧、生气、狂喜等细致入微的情感。此外，情感的二......
CVer从0入门NLP（一）———词向量与RNN模型
......
论文阅读：DeepKE：A Deep Learning Based Knowledge Extraction Toolkit for Knowledge B
DeepKE，支持数据集和模型的结合来实现非结构化数据中信息的提取。同时提出框架和一系列的组件来实现足够的模块化和可扩展性。项目地址1.Introduction现存的KB是在实体和关系方面是不完备的。常见的一些标志性的应用：Spacy（实体识别）OpenNER（关系提取）OpenIE（信息提取）RESIN（事......
MLP代码模型--NLP方向
训练对于二分类任务，通常使用一个包含两个输出单元的输出层，而不是一个单一的输出单元。这是因为在二分类任务中，每个类别通常对应一个输出单元，一个用于表示类别1（例如正类别），另一个用于表示类别2（例如负类别）预测是......
栩栩如生,音色克隆,Bert-vits2文字转语音打造鬼畜视频实践(Python3.10)
诸公可知目前最牛逼的TTS免费开源项目是哪一个？没错，是Bert-vits2，没有之一。它是在本来已经极其强大的Vits项目中融入了Bert大模型，基本上解决了VITS的语气韵律问题，在效果非常出色的情况下训练的成本开销普通人也完全可以接受。BERT的核心思想是通过在大规模文本语料上进行无监督预......
UniKGQA Unified Retrieval and Reasoning for Solving Multi-hop Question Answering
目录概主要内容代码JiangJ.,ZhouK.,ZhaoW.andWenJ.UniKGQA:Unifiedretrievalandreasoningforsolvingmulti-hopquestionansweringoverknowledgegraph.ICLR,2023.概统一:从知识图谱中检索出相关的子图,并在子图中进行推理.主要内容我们有知识图谱......
Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text
目录概主要内容代码SunH.,DhingraB.,ZaheerM.,MazaitisK.,SalakhutdinovR.andCohenW.W.Opendomainquestionansweringusingearlyfusionofknowledgebasesandtext.EMNLP,2018.概KnowledgeBases+Text的推理.主要内容假设我们有一个不完全的知......
论文阅读：Unifying Large Language Model and Knowledge Graph：A RoadMap
1Introduction大模型和知识图谱结合的综述。简单介绍一下大模型和知识图谱的优缺点：如上所示。本文主要划分为三个模块，分别为：KG-enhancedLLMsLLM-augmentedKGsSynergizedLLM+KG2Background主要介绍了LLM和KG2.1LargeLanguageModel(LLMs)主要依靠transforme......
Makefile knowledge summarization
WildcardThewildcardinmakefileissimilarwithmacroinC/C++,itisn'tsimilarwithwildcardinlinuxshell,soitdoesn'texpendautomatically.object1=*.c//*.cobject2=$(wildcard*.cpp)//main.cppt1.cppt2.cppAutomaticallygene......
论文阅读：Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point M
Point-BERT:Pre-training3DPointCloudTransformerswith MaskedPointModeling摘要我们提出了Point-BERT，一个学习注意力的新范式，将BERT[8]的概念推广到三维点云。受BERT的启发，我们设计了一个掩蔽点建模（MPM）任务来预先训练点云注意力。具体来说，我们首先将点云划分为几个局部的......

8 Innovative BERT Knowledge Distillation Papers That Have Changed The Landscape of NLP

相关文章

赞助商

阅读排行