from collections import Counter

import jieba
# Read the novel text. The context manager guarantees the file handle is
# closed even if read() raises (the original leaked the handle on error).
# NOTE(review): GB2312 with errors="ignore" silently drops any character
# outside that codec (e.g. GBK/GB18030-only hanzi). Consider the superset
# encoding "gb18030" to avoid losing text — confirm the file's actual encoding.
path = "红楼梦.txt"
with open(path, "r", encoding="GB2312", errors="ignore") as file:
    text = file.read()

# Segment the Chinese text with jieba, then count frequencies with Counter,
# skipping single-character tokens (mostly particles and punctuation).
words = jieba.lcut(text)
counts = Counter(word for word in words if len(word) > 1)

# Print the 20 most frequent words, left-aligned word / right-aligned count.
# most_common(20) returns at most 20 entries, so this cannot raise IndexError
# when the text yields fewer than 20 distinct words (the original loop could).
for word, count in counts.most_common(20):
    print(f"{word:<10}{count:>5}")
# Tags: jieba, word, items, file, counts, 分词 — From: https://www.cnblogs.com/antea/p/17920181.html