SciTech-BigDataAIML-Algorithm-Heuristic

Date: 2024-11-06 18:08:45
Tags: Heuristic, LDA, Algorithm, topics, BigDataAIML, SciTech, Topics

LDA (Latent Dirichlet Allocation) Topic Model

LDA (Latent Dirichlet Allocation) is a topic model
used to automatically discover hidden topics in a large collection of documents.
In NLP and text analysis, LDA is widely applied to text classification, document clustering, and information retrieval.

The core idea of LDA

Each document is treated as a mixture of multiple topics,
and each topic in turn generates a set of terms (words) according to a probability distribution.
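The generative story described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how gensim trains a model: the two toy topics in `TOPIC_WORD`, the prior `alpha`, and the helper names `sample_dirichlet` and `generate_document` are all assumptions made up for this example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, length, rng):
    """Generate one document: pick a topic mixture, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)  # per-document topic proportions
    words = []
    for _ in range(length):
        # Choose a topic index according to the document's topic mixture.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # Choose a word according to that topic's word distribution.
        dist = topic_word[z]
        w = rng.choices(list(dist.keys()), weights=list(dist.values()))[0]
        words.append(w)
    return theta, words

# Hypothetical toy topics (word -> probability); not taken from a trained model.
TOPIC_WORD = [
    {"human": 0.4, "interface": 0.3, "computer": 0.3},
    {"system": 0.4, "response": 0.3, "survey": 0.3},
]

rng = random.Random(0)  # fixed seed so repeated runs give the same draw
theta, doc = generate_document(TOPIC_WORD, alpha=[0.5, 0.5], length=8, rng=rng)
print(theta)
print(doc)
```

Training LDA runs this story in reverse: given only the observed words, it estimates the topic mixtures and the per-topic word distributions that most plausibly produced them.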

from gensim import corpora, models

# Assume the text data is already tokenized
texts = [["human", "interface", "computer"],
         ["survey", "user", "computer", "system", "response"]]

# Build the dictionary and the bag-of-words corpus
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]

# Run LDA topic analysis
lda_model = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
topics = lda_model.print_topics(num_words=3)
for topic in topics:
    print(topic)
(0, '0.229*"computer" + 0.228*"interface" + 0.227*"human"')
(1, '0.178*"computer" + 0.175*"system" + 0.175*"response"')
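To make the dictionary and corpus step in the example above more concrete, here is a minimal pure-Python stand-in for the bag-of-words conversion. It is a simplified sketch, not gensim's implementation: `build_vocab` and `to_bow` are hypothetical helpers, and ids are assigned in first-seen order, which may differ from the ids gensim's `Dictionary` assigns.

```python
from collections import Counter

def build_vocab(texts):
    """Map each distinct token to an integer id, in first-seen order."""
    token2id = {}
    for doc in texts:
        for token in doc:
            # len(token2id) is evaluated before insertion, so new tokens get the next free id.
            token2id.setdefault(token, len(token2id))
    return token2id

def to_bow(doc, token2id):
    """Convert a tokenized document to a sorted list of (token_id, count) pairs."""
    counts = Counter(token2id[t] for t in doc if t in token2id)
    return sorted(counts.items())

texts = [["human", "interface", "computer"],
         ["survey", "user", "computer", "system", "response"]]

token2id = build_vocab(texts)
bow_corpus = [to_bow(doc, token2id) for doc in texts]

print(token2id)
print(bow_corpus)
```

Each document becomes a sparse list of (word id, count) pairs, so word order is discarded; this is the representation the LDA model is trained on.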

From: https://www.cnblogs.com/abaelhe/p/18530732
