Liaozhai (聊斋)
2023-12-28
Liaozhai with the jieba library
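import jieba
print("0217向悦")
# Read the text file
path = "聊斋志异.txt"
file = open(path, "r", encoding="utf-8")
text = file.read()
file.close()
# Segment the text with jieba
words = jieba.lcut(text)
# Count word frequencies
counts = {}
for word in words:
    # Filter out words of length 1
    …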
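The excerpt above is cut off inside the counting loop. As a minimal sketch of how such a loop is commonly finished, assuming the goal is a ranked word-frequency list, the helpers below count tokens and sort them. The names `count_words`, `top_n`, `segment_file`, and the `min_len` parameter are illustrative and do not come from the original post; `jieba.lcut` is the same call the excerpt uses.

```python
def count_words(words, min_len=2):
    """Return a dict mapping each token of at least min_len characters to its count."""
    counts = {}
    for word in words:
        if len(word) < min_len:
            continue  # skip single-character tokens, as the excerpt's comment describes
        counts[word] = counts.get(word, 0) + 1
    return counts


def top_n(counts, n=10):
    """Return the n most frequent (word, count) pairs, highest count first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]


def segment_file(path):
    """Read a UTF-8 text file and segment it with jieba's precise mode."""
    import jieba  # third-party package; imported lazily so the helpers above run without it

    with open(path, "r", encoding="utf-8") as f:
        return jieba.lcut(f.read())


# Hand-made token list standing in for jieba output (not real data from the book).
sample = ["书生", "狐", "道士", "书生", "的", "书生", "道士"]
print(top_n(count_words(sample), 2))
```

With jieba installed and the text file available locally, `top_n(count_words(segment_file("聊斋志异.txt")), 20)` would give the twenty most frequent multi-character words. Using a `with` block also closes the file even if reading fails, which the explicit `file.close()` in the excerpt does not guarantee.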
2023-12-20
Segmenting Liaozhai (《聊斋》) with jieba
import jieba
txt = open("聊斋志异白话简写版.txt", "r", encoding='utf-8').read()
words = jieba.lcut(txt)  # Segment the text in precise mode
counts = {}  # Store each word and its number of occurrences as key-value pairs
for word in words:
    if len(word) == 1:
        continue
    elif …
2023-12-17
jieba word segmentation: Liaozhai
import jieba
excludes = {"不知", "不可", "一日", "不敢", "数日", "以为", "不能", "可以", "不得", "如此", "------------", "三日", "而已", "明日", "其中" …
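The excerpt stops before showing how `excludes` is used. A plausible reading, offered here only as an assumption, is that it acts as a stop-word set whose entries are dropped from the frequency ranking. The sketch below shows that idea with `collections.Counter`; the function name `top_words` and the sample tokens are hypothetical, and `EXCLUDES` holds just the first four entries of the original set because the full set is truncated in the source.

```python
from collections import Counter

# First few entries copied from the excerpt; the complete set is not recoverable.
EXCLUDES = {"不知", "不可", "一日", "不敢"}


def top_words(words, excludes=EXCLUDES, n=10):
    """Rank multi-character tokens by frequency, ignoring anything in excludes."""
    counts = Counter(
        w for w in words
        if len(w) > 1 and w not in excludes
    )
    return counts.most_common(n)


# Hand-made token list standing in for jieba.lcut output (not real data from the book).
sample = ["不知", "公子", "一日", "公子", "夫人", "不可", "曰"]
print(top_words(sample, n=2))
```

Filtering before counting keeps the excluded words out of the `Counter` entirely. Deleting them afterwards, for example with `counts.pop(word, None)`, would give the same ranking.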