Lucene tokenization error: "TokenStream contract violation: close() call missing"

Date: 2022-12-19 17:02:46
Tags: java violation wltea analyzer call org close error tokenStream

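When tokenizing with IKAnalyzer in Lucene, the following error can appear: "TokenStream contract violation: close() call missing". The fix is to call close() on the TokenStream every time you finish consuming it. An analyzer may reuse its token stream, so a stream that was left open is detected the next time the analyzer is asked for one.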

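If the error is instead "java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing", call tokenStream.reset() before the first call to tokenStream.incrementToken(). The cause is that the way a TokenStream must be used changed starting with Lucene 4.6.0: reset() has to be called before incrementToken(). For details see the API documentation at http://lucene.apache.org/core/4_6_0/core/index.html .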
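The order of calls described above can be illustrated with a small stand-alone model. This is only a sketch: ToyTokenStream and ContractDemo are hypothetical classes written for this example, they are not Lucene classes, and the exact place where Lucene itself raises each message may differ. The model splits text on whitespace and refuses to hand out tokens unless reset() was called first.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the consume workflow; NOT Lucene's real TokenStream. */
class ToyTokenStream {
    private enum State { CLOSED, READY, ENDED }

    private final String[] words;
    private State state = State.CLOSED;
    private int pos;
    private String current;

    ToyTokenStream(String text) {
        String trimmed = text.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Prepares the stream; only allowed when the previous use was closed. */
    void reset() {
        if (state != State.CLOSED) {
            throw new IllegalStateException("contract violation: close() call missing");
        }
        pos = 0;
        state = State.READY;
    }

    /** Advances to the next token; only allowed after reset(). */
    boolean incrementToken() {
        if (state != State.READY) {
            throw new IllegalStateException("contract violation: reset()/close() call missing");
        }
        if (pos >= words.length) {
            return false;
        }
        current = words[pos++];
        return true;
    }

    String term() { return current; }

    /** Marks the end of consumption. */
    void end() { state = State.ENDED; }

    /** Releases the stream so that it can be reset and reused. */
    void close() { state = State.CLOSED; }
}

class ContractDemo {
    /** Consumes a stream in the required order: reset, incrementToken loop, end, close. */
    static List<String> consume(ToyTokenStream ts) {
        List<String> terms = new ArrayList<>();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                terms.add(ts.term());
            }
            ts.end();
        } finally {
            ts.close(); // always runs, even if the loop throws
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(consume(new ToyTokenStream("lucene token stream")));
        try {
            // Skipping reset() triggers the contract error in this model.
            new ToyTokenStream("oops").incrementToken();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Placing close() in a finally block is the important design choice here: the stream returns to a reusable state even when consumption fails halfway.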

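A correct example follows. reset() is called before the incrementToken() loop, and the stream is closed in the finally block:

import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;

// Assumes an `analyzer` field (for example an IKAnalyzer) and the author's `IOs` helper; neither is shown in the original post.
public Set<String> slicing(String text) {
    Set<String> result = new HashSet<>();
    StringReader reader = null;
    TokenStream tokenStream = null;
    try {
        reader = new StringReader(text);
        tokenStream = analyzer.tokenStream("", reader);
        // addAttribute returns the attribute if it is already registered, otherwise it registers it
        CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
        OffsetAttribute offsetAttribute = tokenStream.addAttribute(OffsetAttribute.class);
        // reset() must be called before the first incrementToken()
        tokenStream.reset();
        while (tokenStream.incrementToken()) {
            int startOffset = offsetAttribute.startOffset();
            int endOffset = offsetAttribute.endOffset();
            // keep only terms that span more than one character
            if ((endOffset - startOffset) > 1) {
                result.add(charTermAttribute.toString());
            }
        }
        // end() finishes the stream before it is closed
        tokenStream.end();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // close() must always run so that the analyzer can hand out the stream again
        IOs.close(tokenStream, reader);
    }
    return result;
}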
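The IOs.close(tokenStream, reader) call in the example relies on a helper class that the post does not show. Below is a minimal sketch of what such a helper might look like, assuming it only needs to close several resources without letting one failure block the others. The class name QuietCloser and its return value are hypothetical choices for this example, not the author's implementation.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for the IOs helper used above; the original class is not shown. */
final class QuietCloser {
    private QuietCloser() {}

    /**
     * Closes each non-null resource in the given order. A failure on one
     * resource is reported to stderr and does not stop the remaining ones.
     * Returns the number of resources that were closed without an error.
     */
    static int close(Closeable... resources) {
        if (resources == null) {
            return 0;
        }
        int closed = 0;
        for (Closeable resource : resources) {
            if (resource == null) {
                continue; // nothing was opened for this slot
            }
            try {
                resource.close();
                closed++;
            } catch (IOException e) {
                System.err.println("close failed: " + e.getMessage());
            }
        }
        return closed;
    }
}
```

With this sketch, the finally block would read QuietCloser.close(tokenStream, reader), since both TokenStream and StringReader implement java.io.Closeable. For the same reason, a try-with-resources statement is an alternative that avoids a helper altogether.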

 

http://www.lizi.pw/archives/56

 

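Another exception seen with org.wltea.analyzer.lucene.IKAnalyzer. Its message translates to "The dictionary has not been initialized; call the initial method first":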

Exception in thread "main" java.lang.IllegalStateException: 词典尚未初始化,请先调用initial方法
at org.wltea.analyzer.dic.Dictionary.getSingleton(Dictionary.java:137)
at org.wltea.analyzer.core.CJKSegmenter.analyze(CJKSegmenter.java:80)
at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:116)
at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:88)

 



From: https://blog.51cto.com/u_15147537/5953065
