1 Statistical Model
1.1 One-Hot
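As a minimal sketch of one-hot encoding (the helper name `one_hot` and the toy sentence are illustrative assumptions, not taken from the referenced material), each vocabulary word maps to a vector with a single 1 at that word's index:

```python
def one_hot(tokens):
    """Map every distinct token to a vector with a single 1 at its vocabulary index."""
    vocab = sorted(set(tokens))                      # fixed, reproducible word order
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {}
    for word in vocab:
        vec = [0] * len(vocab)
        vec[index[word]] = 1
        vectors[word] = vec
    return vocab, vectors


vocab, vectors = one_hot("the cat sat on the mat".split())
print(vocab)
print(vectors["cat"])
```

The vector length equals the vocabulary size, and any two distinct words are equally far apart, so this representation carries no notion of similarity.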
1.2 Bag of Words (BOW)
https://web.stanford.edu/class/datasci112/lectures/lecture8.pdf
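A bag-of-words vector can be sketched as a count of each vocabulary word per document, ignoring word order. The function name `bag_of_words`, the whitespace tokenizer, and the two sample documents below are assumptions made for illustration:

```python
from collections import Counter


def bag_of_words(docs):
    """Return the vocabulary and one count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]   # naive whitespace tokenizer
    vocab = sorted({word for doc in tokenized for word in doc})
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        rows.append([counts[word] for word in vocab])    # 0 for absent words
    return vocab, rows


bow_vocab, bow_rows = bag_of_words(["the cat sat", "the cat saw the dog"])
print(bow_vocab)
print(bow_rows)
```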
1.3 N-grams
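N-grams keep some local word order by treating every run of n consecutive tokens as one unit. A minimal sketch, assuming whitespace tokens and a hypothetical helper `ngrams`:

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples (empty if n > len(tokens))."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "new york is big".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
print(bigrams)
print(trigrams)
```

Counting these tuples instead of single words gives a bag-of-n-grams representation.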
1.4 TF-IDF
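TF-IDF has several variants. The sketch below assumes the plain textbook form, tf = count / document length and idf = log(N / df), without smoothing; libraries often use smoothed variants, so their numbers will differ. The function name `tf_idf` and the sample documents are illustrative:

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: score} dict per document using unsmoothed tf * idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears at least once.
    df = Counter(word for doc in tokenized for word in set(doc))
    scores = []
    for doc in tokenized:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores


tfidf_scores = tf_idf(["the cat sat", "the dog barked"])
print(tfidf_scores[0])
```

A word that occurs in every document (here "the") gets an idf of log(1) = 0, so its score vanishes, while words unique to one document are weighted up.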
2 Word Embedding (Neural Network Model)
2.1 Word2Vec
https://projector.tensorflow.org/
Continuous Bag of Words (CBOW)
Skip-Gram
The goal of training is to obtain the word vectors; the prediction task itself is only a means to that end.
The trainable weights are the input weight matrix and the output weight matrix.
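To make the two trainable matrices concrete, here is a minimal pure-Python Skip-Gram sketch. It assumes a full softmax over a toy vocabulary, plain SGD, and hypothetical names (`train_skipgram`, `W_in`, `W_out`); practical implementations usually rely on negative sampling or hierarchical softmax instead.

```python
import math
import random


def train_skipgram(tokens, dim=3, window=1, lr=0.1, epochs=200, seed=0):
    """Train a tiny Skip-Gram model; return vocab, both weight matrices, and losses."""
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    # The two trainable matrices: input (V x dim) and output (dim x V).
    W_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(V)]
    W_out = [[rng.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(dim)]
    # (center, context) index pairs taken from a symmetric window.
    pairs = [(idx[tokens[i]], idx[tokens[j]])
             for i in range(len(tokens))
             for j in range(max(0, i - window), min(len(tokens), i + window + 1))
             if i != j]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for center, context in pairs:
            h = W_in[center][:]  # hidden layer is a copy of the center word's row
            scores = [sum(h[k] * W_out[k][j] for k in range(dim)) for j in range(V)]
            m = max(scores)      # subtract the max for a numerically stable softmax
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            probs = [e / z for e in exps]
            total -= math.log(probs[context])
            # Cross-entropy gradient w.r.t. the scores: probs - one_hot(context).
            err = probs[:]
            err[context] -= 1.0
            for k in range(dim):
                grad_h = sum(err[j] * W_out[k][j] for j in range(V))
                for j in range(V):
                    W_out[k][j] -= lr * err[j] * h[k]
                W_in[center][k] -= lr * grad_h
        losses.append(total / len(pairs))
    return vocab, W_in, W_out, losses


sg_vocab, W_in, W_out, losses = train_skipgram("the cat sat on the mat".split())
word_vectors = {w: W_in[i] for i, w in enumerate(sg_vocab)}
print(word_vectors["cat"])
```

After training, the rows of the input matrix are typically taken as the word vectors. CBOW uses the same two matrices but averages the context rows to predict the center word.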
2.2 GloVe
2.3 FastText
3 BERT
4 SBERT (Sentence Embedding)
References
https://deysusovan93.medium.com/from-traditional-to-modern-a-comprehensive-guide-to-text-representation-techniques-in-nlp-369946f67497
https://github.com/sawyerbutton/NLP-Funda-2023-Spring
https://github.com/sawyerbutton/LM-Funda-2024-Spring/blob/main/示例代码/Lesson3/LM_Lesson3_Embedding_demo.ipynb