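Using Elastic intervals

Tags: index Elastic content intervals usage query POST example match

In Elasticsearch, the intervals query matches documents based on the order and proximity of terms. It runs against analyzed text fields and builds complex matching rules out of a small set of simpler ones. It is well suited to cases that call for pattern matching over text rather than matching a single term.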

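intervals syntax

GET your_index/_search
{
  "query": {
    "intervals": {
      "field_name": {
        "all_of": { // top-level rule; rules include match, prefix, wildcard, fuzzy, all_of, any_of
          "ordered": true, // sub-rules must match in the order listed
          "max_gaps": 2, // at most 2 unmatched positions in between; -1 (the default) means no limit
          "intervals": [
            { "match": { "query": "some text pattern" } }, // match analyzed text
            {
              "any_of": { // matches if any one of its sub-rules matches
                "intervals": [
                  { "match": { "query": "first alternative" } },
                  { "prefix": { "prefix": "alt" } }
                ]
              }
            }
          ],
          "filter": { // optional filter on the matched intervals
            "not_containing": { "match": { "query": "excluded term" } }
          }
        }
      }
    }
  }
}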

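GET your_index/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "intervals": [
            { "match": { "query": "A" } },
            { "match": { "query": "B" } }
          ]
        }
      }
    }
  }
}

In the example above, Elasticsearch searches the content field for documents that contain "A" followed by "B" anywhere later in the field. Because max_gaps is not set, any number of words may separate the two terms.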
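The same ordered request can also be assembled programmatically. The sketch below is a minimal Python illustration, not part of any client library: the helper name `ordered_all_of` is hypothetical, and it assumes the `all_of` rule shape with `ordered`, `max_gaps`, and an `intervals` array. It only builds the request body and does not contact a cluster.

```python
import json


def ordered_all_of(field, terms, max_gaps=-1):
    """Build an intervals query body requiring `terms` in order on `field`.

    Hypothetical helper. max_gaps=-1 places no limit on the distance
    between the terms, which is the documented default.
    """
    return {
        "query": {
            "intervals": {
                field: {
                    "all_of": {
                        "ordered": True,
                        "max_gaps": max_gaps,
                        "intervals": [{"match": {"query": t}} for t in terms],
                    }
                }
            }
        }
    }


body = ordered_all_of("content", ["A", "B"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `GET your_index/_search` request.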

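any_of: matches when any one of the listed sub-rules matches.
all_of: matches only when every listed sub-rule matches; the optional ordered and max_gaps parameters constrain the order of the sub-rule matches and the distance between them.
none_of: the intervals query has no none_of rule. To exclude matches, attach a filter such as not_containing to a rule, or put a separate intervals query in the must_not clause of a bool query.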
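The two ways of excluding a term behave differently, which is easier to see side by side. The Python sketch below builds both request bodies; the function names are hypothetical and the terms "A", "B", "C" are placeholders. It assumes the `not_containing` filter of the intervals query and the `must_not` clause of the bool query, and it does not send anything to a cluster.

```python
import json


def _all_of(terms):
    """all_of rule over single-term match rules (no order or gap limits)."""
    return {"all_of": {"intervals": [{"match": {"query": t}} for t in terms]}}


def exclude_inside_span(field, terms, excluded):
    """Drop only the matching spans that contain `excluded`."""
    rule = _all_of(terms)
    rule["all_of"]["filter"] = {
        "not_containing": {"match": {"query": excluded}}
    }
    return {"query": {"intervals": {field: rule}}}


def exclude_whole_document(field, terms, excluded):
    """Reject every document whose field contains `excluded` anywhere."""
    return {
        "query": {
            "bool": {
                "must": [{"intervals": {field: _all_of(terms)}}],
                "must_not": [
                    {"intervals": {field: {"match": {"query": excluded}}}}
                ],
            }
        }
    }


print(json.dumps(exclude_inside_span("content", ["A", "B"], "C"), indent=2))
print(json.dumps(exclude_whole_document("content", ["A", "B"], "C"), indent=2))
```

With the filter, a document can still match if "C" appears outside the span that covers "A" and "B"; with must_not, any occurrence of "C" in the field rules the document out.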

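{
  "intervals": {
    "content_field": {
      "all_of": {
        "ordered": true,
        "max_gaps": 2, // limits the number of words allowed between the two terms
        "intervals": [
          { "match": { "query": "start_word" } },
          { "match": { "query": "end_word" } }
        ]
      }
    }
  }
}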

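{
  "intervals": {
    "content_field": {
      "all_of": {
        "ordered": true,
        "intervals": [
          { "match": { "query": "word1" } },
          { "match": { "query": "word2" } },
          { "match": { "query": "word3" } }
        ]
      }
    }
  }
}

An all_of rule with ordered set to true matches a strict word order: the terms must appear in the document in the order listed.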

Examples

Scenario 1

Use an intervals query to find documents in which "quick" and "dog" are separated by no more than two words.

Index creation

PUT /example_index
{
  "mappings": {
    "properties": {
      "sentence": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

Document insertion

POST /example_index/_doc
{
  "sentence": "The quick brown fox jumps over the lazy dog."
}

POST /example_index/_doc
{
  "sentence": "Red roses are blue violets are red."
}

POST /example_index/_doc
{
  "sentence": "The cat in the hat sat on the mat."
}

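Query

GET /example_index/_search
{
  "query": {
    "intervals": {
      "sentence": {
        "all_of": {
          "ordered": true,
          "max_gaps": 2,
          "intervals": [
            { "match": { "query": "quick" } },
            { "match": { "query": "dog" } }
          ]
        }
      }
    }
  }
}

With the sample documents above, six words separate "quick" and "dog" in the first sentence, so this query returns no hits; a max_gaps of 6 or more would match that document.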
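To see what a gap limit of 2 means for the sample sentence, the distance between the two terms can be counted by hand. The Python sketch below is a simplified model, not Elasticsearch's implementation: `tokenize` and `min_gap` are hypothetical helpers, and the regular expression only approximates the standard analyzer for plain English text like this.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a rough stand-in for the standard analyzer."""
    return re.findall(r"\w+", text.lower())


def min_gap(text, first, second):
    """Fewest tokens between `first` and a later `second`, or None if absent."""
    tokens = tokenize(text)
    gaps = [
        j - i - 1
        for i, a in enumerate(tokens) if a == first
        for j, b in enumerate(tokens) if b == second and j > i
    ]
    return min(gaps) if gaps else None


sentence = "The quick brown fox jumps over the lazy dog."
gap = min_gap(sentence, "quick", "dog")
print(gap)       # 6: brown, fox, jumps, over, the, lazy
print(gap <= 2)  # False: a limit of 2 rejects this sentence
```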

Scenario 2

Use an intervals query to find documents that contain the consecutive sequence of numbers from "3" to "7".

Index creation

PUT /example_index_numbers
{
  "mappings": {
    "properties": {
      "numbers": {
        "type": "text",
        "analyzer": "whitespace"
      }
    }
  }
}

Document insertion

POST /example_index_numbers/_doc
{
  "numbers": "1 2 3 4 5 6 7"
}

POST /example_index_numbers/_doc
{
  "numbers": "8 9 10 11 12 13 14"
}

POST /example_index_numbers/_doc
{
  "numbers": "15 16 17 18 19 20 21"
}

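Query

GET /example_index_numbers/_search
{
  "query": {
    "intervals": {
      "numbers": {
        "match": {
          "query": "3 4 5 6 7",
          "ordered": true,
          "max_gaps": 0
        }
      }
    }
  }
}

The match rule analyzes its query text with the field's analyzer, so "3 4 5 6 7" becomes five terms that must appear in this order with nothing in between. Only the first document matches.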
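The requirement of a consecutive run from 3 to 7 can be modelled outside Elasticsearch to check which sample documents should match. The Python sketch below is a simplified model under two assumptions: `str.split` stands in for the whitespace analyzer, and "in order with no gaps" is treated as a contiguous sublist. The helper `contains_run` is hypothetical.

```python
def contains_run(text, run):
    """True if the whitespace tokens of `run` appear back to back in `text`."""
    tokens, wanted = text.split(), run.split()
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


docs = ["1 2 3 4 5 6 7", "8 9 10 11 12 13 14", "15 16 17 18 19 20 21"]
hits = [d for d in docs if contains_run(d, "3 4 5 6 7")]
print(hits)  # ['1 2 3 4 5 6 7']
```

Tokens are compared as whole strings, so "13" in the second document does not count as a match for "3".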

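Scenario 3

Find documents whose job description contains "software" immediately followed by "developer".

Index creation (this reuses the index name example_index from Scenario 1; if that index still exists, remove it first with DELETE /example_index, otherwise the PUT below fails because the index already exists)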

PUT /example_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

Document insertion

POST /example_index/_doc
{
  "content": "John Doe is an engineer from New York working at XYZ Corp."
}

POST /example_index/_doc
{
  "content": "Jane Smith is a software developer based in California."
}

POST /example_index/_doc
{
  "content": "Michael Johnson works as a data scientist at ABC Inc. in Texas."
}

POST /example_index/_doc
{
  "content": "Sarah Brown is a product manager living in Illinois."
}

POST /example_index/_doc
{
  "content": "Emily Davis, an architect from Washington DC, joined XYZ Corp last year."
}

POST /example_index/_doc
{
  "content": "Robert Harris, who lives in Oregon, is a senior software engineer."
}

POST /example_index/_doc
{
  "content": "Jessica Wilson works in marketing for ABC Inc., located in Florida."
}

Query

GET /example_index/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "max_gaps": 0,
          "intervals": [
            { "match": { "query": "software" } },
            { "match": { "query": "developer" } }
          ]
        }
      }
    }
  }
}

Scenario 4

Suppose we want to find articles whose content mentions "prepare", "toppings", and "bake" in that order.

Index creation

PUT /blog-posts
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

Document insertion

POST /blog-posts/_doc
{
  "title": "How to make pizza from scratch",
  "content": "First, prepare the dough. Then add toppings like tomato sauce, mozzarella cheese, mushrooms, pepperoni, and bake it for 10 minutes at 425°F."
}

POST /blog-posts/_doc
{
  "title": "Best practices for gardening",
  "content": "Plant seeds, water regularly, fertilize when needed, prune dead branches, and enjoy the fruits of your labor."
}

POST /blog-posts/_doc
{
  "title": "Building a birdhouse",
  "content": "Cut wood to size, assemble pieces, attach roof, drill entrance hole, then hang the birdhouse in a suitable location."
}

Query

GET /blog-posts/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "intervals": [
            { "match": { "query": "prepare" } },
            { "match": { "query": "toppings" } },
            { "match": { "query": "bake" } }
          ]
        }
      }
    }
  }
}
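Matching three terms in order with any distance between them amounts to a subsequence check over the tokens. The Python sketch below applies that check to the three sample posts. It is a simplified model rather than Elasticsearch's implementation: `appears_in_order` is a hypothetical helper, and the regular expression only approximates the standard analyzer.

```python
import re


def appears_in_order(text, terms):
    """True if `terms` occur in `text` in the given order, any distance apart."""
    tokens = iter(re.findall(r"\w+", text.lower()))
    # `term in tokens` consumes the iterator up to the first hit, so each
    # later term is only searched for after the previous one.
    return all(term in tokens for term in terms)


posts = {
    "How to make pizza from scratch": "First, prepare the dough. Then add toppings like tomato sauce, mozzarella cheese, mushrooms, pepperoni, and bake it for 10 minutes at 425°F.",
    "Best practices for gardening": "Plant seeds, water regularly, fertilize when needed, prune dead branches, and enjoy the fruits of your labor.",
    "Building a birdhouse": "Cut wood to size, assemble pieces, attach roof, drill entrance hole, then hang the birdhouse in a suitable location.",
}

hits = [
    title for title, content in posts.items()
    if appears_in_order(content, ["prepare", "toppings", "bake"])
]
print(hits)  # ['How to make pizza from scratch']
```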

From： https://blog.csdn.net/qq_29312279/article/details/136661604
