Analyzers (4) -- Using an Analyzer

Published: 2022-10-09 22:46:02
Specifying an analyzer when creating an index

Creating the index

PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer" :{
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  }
}

# Response: index created
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "test_index"
}
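
A misspelled key inside the analyzer definition is easy to miss by eye, so it can help to build the request body in code and check it before sending. The following is a minimal sketch, not part of the original post: `ALLOWED_KEYS` and `unknown_analyzer_keys` are hypothetical names, and the key list is an assumption based on the parameters documented for analyzers of type `custom`.

```python
import json

# Assumption: parameters documented for an analyzer of type "custom".
ALLOWED_KEYS = {"type", "tokenizer", "char_filter", "filter", "position_increment_gap"}


def unknown_analyzer_keys(body: dict) -> dict:
    """Return {analyzer_name: [unknown keys]} for every analyzer in the body."""
    analyzers = body.get("settings", {}).get("analysis", {}).get("analyzer", {})
    problems = {}
    for name, definition in analyzers.items():
        extra = sorted(set(definition) - ALLOWED_KEYS)
        if extra:
            problems[name] = extra
    return problems


# Same settings as the request above, written as a Python dict.
body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "char_filter": ["html_strip"],
                    "filter": ["lowercase"],
                }
            }
        }
    }
}

# An empty dict means no unexpected keys were found.
print(unknown_analyzer_keys(body))
# JSON body that can be pasted after `PUT test_index`.
print(json.dumps(body, indent=2))
```

The check only looks at key names. It does not verify that the referenced tokenizer or filters exist, which Elasticsearch itself validates when the index is created.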

Result

POST test_index/_analyze
{
  "analyzer": "my_custom_analyzer",
  "text": "this is a <b>box</b> ?"
}

# Response
{
  "tokens": [
    {
      "token": "this",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "is",
      "start_offset": 5,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "a",
      "start_offset": 8,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "box",
      "start_offset": 13,
      "end_offset": 20,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
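
To see why the response contains these four tokens, it may help to walk through the order in which a custom analyzer applies its parts: character filters first, then the tokenizer, then token filters. The sketch below is only a rough approximation in plain Python, assuming simple regular expressions in place of the real `html_strip` and `standard` implementations; the function names are hypothetical and it does not reproduce offsets or positions.

```python
import re


def html_strip(text: str) -> str:
    """Rough stand-in for the html_strip char filter: drop anything that looks like a tag."""
    return re.sub(r"<[^>]+>", "", text)


def standard_tokenize(text: str) -> list:
    """Rough stand-in for the standard tokenizer: keep runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens: list) -> list:
    """Stand-in for the lowercase token filter."""
    return [token.lower() for token in tokens]


def analyze(text: str) -> list:
    # Order matters: char filter -> tokenizer -> token filter.
    return lowercase(standard_tokenize(html_strip(text)))


tokens = analyze("this is a <b>box</b> ?")
print(tokens)  # ['this', 'is', 'a', 'box']
```

The `<b>` tags are removed before tokenization, and the trailing `?` is dropped by the tokenizer because it is punctuation, which matches the token list in the response above.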

From: https://www.cnblogs.com/mister-liu/p/16773978.html