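In Elasticsearch, the intervals query matches documents based on the order and proximity of terms. It runs against an analyzed text field and combines a small set of matching rules into more complex expressions. It is a good fit when you need to match a pattern of words and not just a single term.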
intervals syntax
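GET your_index/_search
{
  "query": {
    "intervals": {
      "field_name": {
        "all_of": { // top-level rule: match, prefix, wildcard, fuzzy, all_of or any_of
          "ordered": true, // sub-rules must appear in the listed order
          "max_gaps": 2, // optional: at most 2 positions between them (-1 = no limit)
          "intervals": [
            { "match": { "query": "some text pattern" } }, // match analyzed text
            { "any_of": { "intervals": [
              { "match": { "query": "alternative one" } },
              { "prefix": { "prefix": "alt" } }
            ] } }
            // other sub-rules...
          ],
          "filter": { // optional filter rule
            "not_containing": { "match": { "query": "unwanted" } }
          }
        }
      }
    }
  }
}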
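The nesting of rules is easy to get wrong when the JSON is written by hand. As a minimal sketch, the request body can be assembled from small Python helpers. The helper names (`match_rule`, `any_of`, `all_of`, `intervals_query`) are hypothetical: they only build plain dictionaries and are not part of any Elasticsearch client.

```python
import json
from typing import Any, Dict, List

Rule = Dict[str, Any]


def match_rule(query: str, **options: Any) -> Rule:
    """Build a `match` rule; options may carry e.g. ordered or max_gaps."""
    return {"match": {"query": query, **options}}


def any_of(rules: List[Rule]) -> Rule:
    """Build an `any_of` rule: one matching sub-rule is enough."""
    return {"any_of": {"intervals": rules}}


def all_of(rules: List[Rule], ordered: bool = False, max_gaps: int = -1) -> Rule:
    """Build an `all_of` rule: every sub-rule must match."""
    return {"all_of": {"ordered": ordered, "max_gaps": max_gaps, "intervals": rules}}


def intervals_query(field: str, rule: Rule) -> Dict[str, Any]:
    """Wrap a single top-level rule into a search request body."""
    return {"query": {"intervals": {field: rule}}}


body = intervals_query(
    "field_name",
    all_of(
        [
            match_rule("some text pattern"),
            any_of([match_rule("alternative one"), match_rule("alternative two")]),
        ],
        ordered=True,
        max_gaps=2,
    ),
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana console after a `GET your_index/_search` line.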
GET your_index/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "intervals": [
            { "match": { "query": "A" } },
            { "match": { "query": "B" } }
          ]
        }
      }
    }
  }
}
In the example above, we ask Elasticsearch for documents whose content field contains "A" followed by "B" at any later position.
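- any_of: matches when at least one of the listed sub-rules matches.
- all_of: matches only when every listed sub-rule matches.
- none_of: the intervals query has no none_of rule. Exclusion is usually expressed by placing an intervals query inside the must_not clause of a bool query, so that a document is rejected if it satisfies any of the listed sub-rules.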
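The exclusion described in the last item can be expressed in two ways that behave differently. The sketch below builds both request bodies as plain Python dictionaries; the field name `content` and the terms "A", "B", "C" are placeholders. A `bool` query with `must_not` rejects every document in which the unwanted rule matches anywhere, while a `not_containing` filter only discards those intervals that contain the unwanted term.

```python
import json

unwanted = {"match": {"query": "B"}}

# Variant 1: documents must contain "A" and must not contain "B" anywhere.
exclude_documents = {
    "query": {
        "bool": {
            "must": [{"intervals": {"content": {"match": {"query": "A"}}}}],
            "must_not": [{"intervals": {"content": unwanted}}],
        }
    }
}

# Variant 2: match "A" followed by "C", but only if no "B" lies inside
# the span between them. "B" may still appear elsewhere in the document.
exclude_intervals = {
    "query": {
        "intervals": {
            "content": {
                "all_of": {
                    "ordered": True,
                    "intervals": [
                        {"match": {"query": "A"}},
                        {"match": {"query": "C"}},
                    ],
                    "filter": {"not_containing": unwanted},
                }
            }
        }
    }
}

print(json.dumps(exclude_documents, indent=2))
print(json.dumps(exclude_intervals, indent=2))
```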
{
  "intervals": {
    "content_field": {
      "all_of": {
        "ordered": true,
        "max_gaps": 2, // at most two words between the two terms
        "intervals": [
          { "match": { "query": "start_word" } },
          { "match": { "query": "end_word" } }
        ]
      }
    }
  }
}
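{
  "intervals": {
    "content_field": {
      "all_of": {
        "ordered": true,
        "intervals": [
          { "match": { "query": "word1" } },
          { "match": { "query": "word2" } },
          { "match": { "query": "word3" } }
        ]
      }
    }
  }
}
- all_of with "ordered": true matches a strict word sequence: the terms must appear in the document in the specified order. The intervals query has no separate sequence rule.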
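To make the effect of ordering and a gap limit concrete, here is a small pure-Python simulation. It is a sketch under stated assumptions, not Lucene's actual algorithm: `tokenize` is only a rough stand-in for the standard analyzer, every rule is a single term, and the gap count is taken as the number of positions inside the matched span that are not occupied by one of the terms.

```python
import re
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    """Rough approximation of the standard analyzer: lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_gaps_ordered(tokens: List[str], terms: List[str]) -> Optional[int]:
    """Smallest gap count over all in-order occurrences of terms, or None."""
    if not terms:
        return None
    best: Optional[int] = None
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos = start
        found = True
        for term in terms[1:]:
            try:
                # The earliest later occurrence keeps the span as short as possible.
                pos = tokens.index(term, pos + 1)
            except ValueError:
                found = False
                break
        if found:
            gaps = (pos - start + 1) - len(terms)
            if best is None or gaps < best:
                best = gaps
    return best


def matches(tokens: List[str], terms: List[str], max_gaps: int = -1) -> bool:
    """True if the terms occur in order within max_gaps (-1 means no limit)."""
    gaps = min_gaps_ordered(tokens, terms)
    return gaps is not None and (max_gaps < 0 or gaps <= max_gaps)


tokens = tokenize("word1 filler word2 word3")
print(min_gaps_ordered(tokens, ["word1", "word2", "word3"]))
print(matches(tokens, ["word1", "word2", "word3"], max_gaps=0))
print(matches(tokens, ["word3", "word1"]))
```

In this sample one filler token sits between word1 and word2, so the ordered sequence is found with one gap, fails a limit of zero gaps, and the reversed order does not match at all.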
Examples
Scenario 1
Use an intervals query to find documents in which no more than two words separate "quick" and "dog" in the sentence.
Create the index
PUT /example_index
{
"mappings": {
"properties": {
"sentence": {
"type": "text",
"analyzer": "standard"
}
}
}
}
Insert documents
POST /example_index/_doc
{
"sentence": "The quick brown fox jumps over the lazy dog."
}
POST /example_index/_doc
{
"sentence": "Red roses are blue violets are red."
}
POST /example_index/_doc
{
"sentence": "The cat in the hat sat on the mat."
}
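Query
GET /example_index/_search
{
  "query": {
    "intervals": {
      "sentence": {
        "all_of": {
          "ordered": true,
          "max_gaps": 2,
          "intervals": [
            { "match": { "query": "quick" } },
            { "match": { "query": "dog" } }
          ]
        }
      }
    }
  }
}
Note: in the first sample document six words separate "quick" and "dog", so with "max_gaps": 2 this query returns no hits for the sample data. A max_gaps of 6 or more matches that document.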
Scenario 2
Use an intervals query to find documents that contain the consecutive number sequence "3" through "7".
Create the index
PUT /example_index_numbers
{
"mappings": {
"properties": {
"numbers": {
"type": "text",
"analyzer": "whitespace"
}
}
}
}
Insert documents
POST /example_index_numbers/_doc
{
"numbers": "1 2 3 4 5 6 7"
}
POST /example_index_numbers/_doc
{
"numbers": "8 9 10 11 12 13 14"
}
POST /example_index_numbers/_doc
{
"numbers": "15 16 17 18 19 20 21"
}
Query
GET /example_index_numbers/_search
{
  "query": {
    "intervals": {
      "numbers": {
        "match": {
          "query": "3 4 5 6 7",
          "ordered": true,
          "max_gaps": 0
        }
      }
    }
  }
}
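The consecutive-run requirement can be checked locally before querying. The following sketch generates the run "3" through "7", builds a `match` rule with `ordered` and `max_gaps: 0` from it, and tests the three sample documents with a plain Python sliding window. The helper `contains_run` is hypothetical and splits on whitespace, mirroring the whitespace analyzer of the mapping in a simplified way.

```python
import json
from typing import List


def contains_run(tokens: List[str], run: List[str]) -> bool:
    """True if `run` appears in `tokens` as one contiguous block."""
    n = len(run)
    return any(tokens[i:i + n] == run for i in range(len(tokens) - n + 1))


docs = [
    "1 2 3 4 5 6 7",
    "8 9 10 11 12 13 14",
    "15 16 17 18 19 20 21",
]

run = [str(n) for n in range(3, 8)]  # "3" through "7"

# Terms must be in order with no positions between them.
rule = {"match": {"query": " ".join(run), "ordered": True, "max_gaps": 0}}
body = {"query": {"intervals": {"numbers": rule}}}
print(json.dumps(body, indent=2))

hits = [doc for doc in docs if contains_run(doc.split(), run)]
print(hits)
```

Only the first sample document contains the run, so it is the only one a query of this shape would be expected to return.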
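Scenario 3
Find documents whose job description contains "software" immediately followed by "developer".
Create the index (Scenario 1 already created example_index; delete it first with DELETE /example_index, otherwise this PUT fails because the index already exists)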
PUT /example_index
{
"mappings": {
"properties": {
"content": {
"type": "text",
"analyzer": "standard"
}
}
}
}
Insert documents
POST /example_index/_doc
{
"content": "John Doe is an engineer from New York working at XYZ Corp."
}
POST /example_index/_doc
{
"content": "Jane Smith is a software developer based in California."
}
POST /example_index/_doc
{
"content": "Michael Johnson works as a data scientist at ABC Inc. in Texas."
}
POST /example_index/_doc
{
"content": "Sarah Brown is a product manager living in Illinois."
}
POST /example_index/_doc
{
"content": "Emily Davis, an architect from Washington DC, joined XYZ Corp last year."
}
POST /example_index/_doc
{
"content": "Robert Harris, who lives in Oregon, is a senior software engineer."
}
POST /example_index/_doc
{
"content": "Jessica Wilson works in marketing for ABC Inc., located in Florida."
}
Query
GET /example_index/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "max_gaps": 0,
          "intervals": [
            { "match": { "query": "software" } },
            { "match": { "query": "developer" } }
          ]
        }
      }
    }
  }
}
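Scenario 4
Suppose we want to find articles whose content mentions the three words "prepare", "toppings" and "bake" one after another, in that order (not necessarily adjacent).
Create the index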
PUT /blog-posts
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"content": {
"type": "text",
"analyzer": "standard"
}
}
}
}
Insert documents
POST /blog-posts/_doc
{
"title": "How to make pizza from scratch",
"content": "First, prepare the dough. Then add toppings like tomato sauce, mozzarella cheese, mushrooms, pepperoni, and bake it for 10 minutes at 425°F."
}
POST /blog-posts/_doc
{
"title": "Best practices for gardening",
"content": "Plant seeds, water regularly, fertilize when needed, prune dead branches, and enjoy the fruits of your labor."
}
POST /blog-posts/_doc
{
"title": "Building a birdhouse",
"content": "Cut wood to size, assemble pieces, attach roof, drill entrance hole, then hang the birdhouse in a suitable location."
}
Query
GET /blog-posts/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "intervals": [
            { "match": { "query": "prepare" } },
            { "match": { "query": "toppings" } },
            { "match": { "query": "bake" } }
          ]
        }
      }
    }
  }
}
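The same search can be issued from code instead of the Kibana console. The sketch below uses only Python's standard library and assumes an unsecured cluster reachable at `http://localhost:9200`; that URL and the helper names `build_search_request` and `send` are assumptions for illustration. The `_search` endpoint accepts POST as well as GET, which avoids sending a body with a GET request.

```python
import json
import urllib.request
from typing import Any, Dict, List


def ordered_terms_query(field: str, terms: List[str]) -> Dict[str, Any]:
    """Build an intervals query requiring the terms in the given order."""
    rules = [{"match": {"query": term}} for term in terms]
    return {"query": {"intervals": {field: {"all_of": {"ordered": True, "intervals": rules}}}}}


def build_search_request(
    index: str,
    body: Dict[str, Any],
    base_url: str = "http://localhost:9200",  # assumed local, unsecured cluster
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response (needs a running cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_search_request(
    "blog-posts", ordered_terms_query("content", ["prepare", "toppings", "bake"])
)
print(request.get_method(), request.full_url)
```

With a cluster running, `send(request)["hits"]["hits"]` would hold the matching documents; for the sample data that should be the pizza article, since it is the only one containing all three words in this order.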
From: https://blog.csdn.net/qq_29312279/article/details/136661604