Elasticsearch 7: Introduction and Basic Syntax

Tags: index keyword GET introduction syntax elasticsearch7 query type match

Reference blog: https://blog.csdn.net/qq_47387991/article/details/129349790

What is `elasticsearch`?

An open-source, distributed search engine that can be used for search, log statistics, analytics, system monitoring, and similar tasks.
elasticsearch is built on top of Lucene.

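Lucene is a search engine library written in Java. It is a top-level project of the Apache Software Foundation and was created by Doug Cutting in 1999. Official site: https://lucene.apache.org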

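Advantages of Lucene:

- Easy to extend
- High performance (based on an inverted index)

Drawbacks of Lucene:

- Usable only from Java
- Steep learning curve
- No built-in horizontal scaling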

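How do `elasticsearch` and `mysql` differ?

Elasticsearch: good at searching, analyzing, and computing over massive amounts of data
MySQL: good at transactional operations, and can guarantee data safety and consistency
In practice, companies usually combine the two:
- Writes with strict safety requirements go to MySQL
- Search workloads that need high query performance go to Elasticsearch
- The two are kept in sync through some mechanism so that their data stays consistent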

What is an inverted index?

See https://blog.csdn.net/qq_47387991/article/details/129349790

It is covered in detail there, so it is not repeated here.
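
As a brief reminder of the core idea (a mapping from each term to the documents that contain it), here is a minimal Python sketch. It is a toy model, not how Lucene stores its index: the `tokenize` helper, the function names, and the sample addresses are all assumptions made for illustration.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Made-up sample documents: id -> address text.
docs = {
    1: "198 Mill Lane",
    2: "263 Aviation Road",
    3: "123 Mill Road",
}
index = build_inverted_index(docs)
print(index["mill"])  # ids of the documents that contain "mill"
print(index["road"])  # ids of the documents that contain "road"
```

A lookup by term is then a single dictionary access rather than a scan over every document, which is where the search speed comes from.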

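Concepts in elasticsearch

| MySQL | Elasticsearch | Description |
| --- | --- | --- |
| Table | Index | An index is a collection of documents, similar to a table in a database |
| Row | Document | A document is a single record, similar to a row in a database; documents are in JSON format |
| Column | Field | A field is a field of the JSON document, similar to a column in a database |
| Schema | Mapping | A mapping defines the constraints on the documents in an index, such as field types; similar to a table schema in a database |
| SQL | DSL | DSL (Domain Specific Language) is the JSON-style request language provided by elasticsearch, used to operate elasticsearch and perform CRUD |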

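Common elasticsearch requests (DSL)

Check elasticsearch status ⬇

GET /_cat/nodes: list all nodes
GET /_cat/health: show the cluster health
GET /_cat/master: show the master node
GET /_cat/indices: list all indices
GET /_cat/indices?v: the v parameter adds column headers, giving a more readable table with index name, health, document count, store size, and other details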

PUT /my_index # create an index named my_index
GET my_index/_doc/1 # fetch the document with id 1 from the my_index index

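Update data ⬇

# Option 1: partial update
POST customer/_update/1
{
  "doc": {
    "name": "John Doew"
  }
}
# Update and add a new field at the same time
POST customer/_update/1
{
  "doc": { "name": "Jane Doe", "age": 20 }
}
# Option 2: index the whole document again with POST
POST customer/_doc/1
{
  "name": "John Doe2"
}
# Option 3: index the whole document again with PUT
PUT customer/_doc/1
{
  "name": "John Doe"
}
# Option 1 compares the request with the existing document; if nothing changes, it is a no-op and the document version does not increase

# Options 2 and 3 always save the document again and increment its version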
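
The version behavior described above can be modeled with a small in-memory Python sketch. It is a toy stand-in, not Elasticsearch code: the `ToyDocStore` class and its methods are hypothetical, and the partial update here merges only top-level fields.

```python
class ToyDocStore:
    """In-memory stand-in that tracks a version number per document id."""

    def __init__(self):
        self.docs = {}      # id -> source dict
        self.versions = {}  # id -> version number

    def index(self, doc_id, source):
        """Full replace (like PUT/POST to _doc/<id>): always bumps the version."""
        self.docs[doc_id] = dict(source)
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return "created" if self.versions[doc_id] == 1 else "updated"

    def update(self, doc_id, partial):
        """Partial update (like _update with "doc"): no-op when nothing changes."""
        current = self.docs[doc_id]
        merged = {**current, **partial}
        if merged == current:
            return "noop"
        self.docs[doc_id] = merged
        self.versions[doc_id] += 1
        return "updated"

store = ToyDocStore()
store.index(1, {"name": "John Doe"})
first = store.update(1, {"name": "John Doe"})              # same data
second = store.update(1, {"name": "Jane Doe", "age": 20})  # changed data
third = store.index(1, {"name": "Jane Doe", "age": 20})    # same data, saved again anyway
print(first, second, third, store.versions[1])
```

Only the partial update can skip the write; a full index request has nothing to compare against, so it always produces a new version.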

Delete data ⬇

# Delete a single document
DELETE customer/_doc/1
# Delete the entire index
DELETE customer
# Bulk operations
# Bulk requests can improve performance significantly because they reduce network round trips and resource usage
# Bulk-index documents
POST /_bulk
{ "index" : { "_index" : "customer" } }
{ "name": "John Doe" }
{ "index" : { "_index" : "customer" } }
{ "name": "Jane Smith" }

# Bulk-update documents
POST /_bulk
{ "update" : {"_id" : "1", "_index" : "customer"} }
{ "doc" : { "name" : "John Updated" } }
{ "update" : {"_id" : "2", "_index" : "customer"} }
{ "doc" : { "name" : "Jane Updated" } }

# Mixed operations
POST /_bulk
{ "index" : { "_index" : "customer" } }
{ "name": "First Document" }
{ "delete" : {"_id" : "3", "_index" : "customer"} }
{ "update" : {"_id" : "4", "_index" : "customer"} }
{ "doc" : { "name" : "Fourth Document Updated" } }
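
A bulk body is newline-delimited JSON: one action line, followed by a source line for actions that carry data. When such a body is generated from application code, it helps to build it programmatically. The sketch below uses only Python's standard `json` module; `build_bulk_body` is a hypothetical helper, not part of any Elasticsearch client, and sending the result over HTTP is left out.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body.

    source is None for actions such as "delete" that take no second line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("index", {"_index": "customer"}, {"name": "First Document"}),
    ("delete", {"_index": "customer", "_id": "3"}, None),
    ("update", {"_index": "customer", "_id": "4"},
     {"doc": {"name": "Fourth Document Updated"}}),
])
print(body, end="")
```

Each JSON object must stay on a single line, which is why the helper serializes with `json.dumps` defaults instead of pretty-printing.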

Search ⬇

# Send search parameters in the REST request URI
GET bank/_search?q=*&sort=account_number:asc

# REST request body
POST bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": {
        "order": "desc"
      }
    }
  ]
}

Advanced search with a REST request body
match_all: match every document ⬇

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "balance": {
        "order": "desc"
      }
    }
  ],
  "from": 5,
  "size": 5,
  "_source": ["balance", "firstname"]
}

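# "from": 5 sets the offset of the first result to return; 5 means results start from the sixth hit.

# "size": 5 sets how many results to return; 5 means five hits are returned.

# "_source": ["balance", "firstname"] selects which source fields to return; here only the balance and firstname values are returned.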
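
To make the zero-based offset concrete, here is a small Python sketch that applies the same three parameters to an in-memory list. The `page` function and the twelve sample accounts are made up for illustration; this is a model of the behavior, not how Elasticsearch implements paging.

```python
def page(hits, from_=0, size=10, source_fields=None):
    """Return one page of hits, optionally keeping only the listed fields."""
    window = hits[from_:from_ + size]
    if source_fields is None:
        return window
    return [{k: doc[k] for k in source_fields if k in doc} for doc in window]

# Twelve made-up accounts, already sorted by balance in descending order.
hits = [{"firstname": f"user{i}", "balance": 1000 - i, "age": 20 + i}
        for i in range(12)]

result = page(hits, from_=5, size=5, source_fields=["balance", "firstname"])
print(result[0])    # the sixth hit overall, reduced to two fields
print(len(result))  # number of hits on this page
```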

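match: match against the given field ⬇
A match query analyzes the query text, so for a query such as "Mill road" there are two kinds of hits:
- Full match: the "address" field contains the whole phrase "Mill road", for example "123 Mill road".
- Partial match: the query "Mill road" is tokenized into "mill" and "road", and by default a document matches if its "address" field contains either term, for example "198 Mill Lane" or "263 Aviation Road".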

GET bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  }
}

match_phrase: phrase matching ⬇

GET bank/_search
{
  "query": {
    "match_phrase": {
      "address": "Mill Road"
    }
  }
}
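
The difference between match and match_phrase can be illustrated with a rough Python model. It assumes a simplified analyzer (lowercase, split on whitespace) and ignores scoring and the slop parameter; the function names and sample addresses are assumptions for illustration only.

```python
def analyze(text):
    """Lowercase and split on whitespace (a rough stand-in for the standard analyzer)."""
    return text.lower().split()

def match(field_value, query):
    """True if the field shares at least one term with the query (OR semantics)."""
    return bool(set(analyze(field_value)) & set(analyze(query)))

def match_phrase(field_value, query):
    """True if the query terms appear in the field consecutively and in order."""
    tokens, wanted = analyze(field_value), analyze(query)
    return any(tokens[i:i + len(wanted)] == wanted
               for i in range(len(tokens) - len(wanted) + 1))

# Made-up sample addresses.
addresses = ["123 Mill Road", "198 Mill Lane", "263 Aviation Road", "990 Mill Hill Road"]
match_hits = [a for a in addresses if match(a, "Mill Road")]
phrase_hits = [a for a in addresses if match_phrase(a, "Mill Road")]
print(match_hits)
print(phrase_hits)
```

In this model "990 Mill Hill Road" satisfies match because it contains both terms, but not match_phrase because the terms are not adjacent.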

multi_match: match across multiple fields ⬇

GET bank/_search
{
  "query": {
    "multi_match": {
      "query": "Albemarle",
      "fields": ["address", "firstname"]
    }
  }
}

bool compound queries
must: conditions that have to match ⬇

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ]
    }
  }
}

must_not: conditions that must not match ⬇

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "28"
          }
        }
      ]
    }
  }
}

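should ⬇
- A should clause lists optional conditions. A document does not have to satisfy them, but each one it does satisfy raises its relevance score
- should is often combined with the minimum_should_match parameter, which sets the minimum number of should conditions that must be satisfied. When a bool query has should clauses but no must or filter clauses, the default is 1; otherwise it is 0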

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "18"
          }
        }
      ],
      "should": [
        {
          "match": {
            "lastname": "Wallace"
          }
        }
      ],
      "minimum_should_match": 0
    }
  }
}

filter ⬇
- Semantically, filter is very close to must, but unlike must, a filter clause does not affect the relevance score (_score).
- range matches numeric or date fields whose value falls within a given range.

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "age": "18"
          }
        }
      ],
      "should": [
        {
          "match": {
            "lastname": "Wallace"
          }
        }
      ],
      "filter": [
        {
          "range": {
            "age": {
              "gte": 20,
              "lte": 30
            }
          }
        }
      ], 
      "minimum_should_match": 1
    }
  }
}
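
How the four clause types combine to decide whether a document matches can be sketched in Python. This is a simplified model under stated assumptions: every clause is a plain predicate, scoring is not modeled at all, and `matches_bool` and the sample accounts are hypothetical.

```python
def matches_bool(doc, must=(), must_not=(), should=(), filter_=(),
                 minimum_should_match=None):
    """Evaluate a simplified bool query; every clause is a predicate doc -> bool."""
    if minimum_should_match is None:
        # Default: 1 when there are should clauses but no must/filter, else 0.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filter_):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    satisfied = sum(1 for clause in should if clause(doc))
    return satisfied >= minimum_should_match

# Made-up sample accounts.
accounts = [
    {"gender": "M", "address": "198 Mill Lane", "age": 25, "lastname": "Wallace"},
    {"gender": "M", "address": "990 Mill Road", "age": 25, "lastname": "Holland"},
    {"gender": "M", "address": "715 Mill Avenue", "age": 38, "lastname": "Wallace"},
    {"gender": "F", "address": "12 Mill Street", "age": 25, "lastname": "Wallace"},
]
query = dict(
    must=[lambda d: d["gender"] == "M",
          lambda d: "mill" in d["address"].lower().split()],
    must_not=[lambda d: d["age"] == 18],
    should=[lambda d: d["lastname"] == "Wallace"],
    filter_=[lambda d: 20 <= d["age"] <= 30],
)
strict = [d["address"] for d in accounts
          if matches_bool(d, minimum_should_match=1, **query)]
relaxed = [d["address"] for d in accounts
           if matches_bool(d, minimum_should_match=0, **query)]
print(strict)
print(relaxed)
```

With minimum_should_match set to 1 the should clause becomes mandatory; with 0 it only influences ranking, which this sketch does not compute.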

term: exact-match query that skips analysis

GET bank/_search
{
  "query": {
    "term": {
      "age": 28
    }
  }
}

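keyword: exact queries
- Purpose: exact matching against fields of type keyword.
- Details: keyword is a field type rather than a query type. Keyword fields are not run through an analyzer at index time, so a query against them (a term query, or a match query as in the example) matches the exact string value, unaffected by tokenization.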

{
  "query": {
    "match": {
      "username.keyword": "john_doe"
    }
  }
}
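
Why the keyword sub-field matters for exact matching can be shown with a toy Python model. It assumes a simplified analyzer (lowercase, split on whitespace) for text fields; the three functions and the sample value "New York" are made up for illustration.

```python
def index_text(value):
    """A text field is analyzed: lowercased and split into separate terms."""
    return set(value.lower().split())

def index_keyword(value):
    """A keyword field is stored as one unmodified term."""
    return {value}

def term_query(indexed_terms, value):
    """term looks the value up as-is, with no analysis of the query string."""
    return value in indexed_terms

city = "New York"
on_text = term_query(index_text(city), "New York")        # analyzed text field
on_keyword = term_query(index_keyword(city), "New York")  # keyword sub-field
on_token = term_query(index_text(city), "york")           # a single analyzed token
print(on_text, on_keyword, on_token)
```

The exact string only exists as a term in the keyword sub-field; the text field holds the individual lowercase tokens instead.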

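aggregations: aggregation analysis

Here are a few aggregation examples:

Find the age distribution and average age of everyone whose address contains mill, without returning the individual documents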

GET bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  },
  "aggs": {
    "group_by_age": {
      "terms": {
        "field": "age"
      }
    },
    "avg_age": {
      "avg": {
        "field": "age"
      }
    }
  },
  "size": 0
}

Aggregate by age, and request the average balance for each age

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_age": {
      "terms": {
        "field": "age"
      },
      "aggs": {
        "avg_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
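
What a terms aggregation with a nested avg computes can be sketched in plain Python. The sketch assumes made-up account data and a hypothetical `terms_with_avg` helper, and it ignores the limit on the number of buckets that a real terms aggregation applies.

```python
from collections import defaultdict

def terms_with_avg(docs, group_field, avg_field):
    """Group docs by group_field and report doc_count plus the average of avg_field."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[avg_field])
    buckets = [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    # Order buckets by doc_count, largest first; ties are broken here by key.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))

# Made-up sample accounts.
accounts = [
    {"age": 31, "balance": 1000},
    {"age": 31, "balance": 3000},
    {"age": 28, "balance": 500},
]
buckets = terms_with_avg(accounts, "age", "balance")
for bucket in buckets:
    print(bucket)
```

Each bucket corresponds to one distinct age, and the nested average is computed only over the documents in that bucket.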

Defining the structure of an index (mapping)

PUT /my-index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      },
      "date": {
        "type": "date",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

Add a new field mapping ⬇

PUT /my-index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false
    }
  }
}

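Changing a mapping & migrating data ⬇

The mapping of an existing field in an index cannot be changed.

If a change is unavoidable, the only option is to create a new index and migrate the data into it.

Create the new index ⬇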

PUT newbank
{
  "mappings" : {
      "properties" : {
        "account_number" : {
          "type" : "long"
        },
        "address" : {
          "type" : "text"
        },
        "age" : {
          "type" : "long"
        },
        "balance" : {
          "type" : "long"
        },
        "city" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "email" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "employer" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "firstname" : {
          "type" : "keyword"
        },
        "gender" : {
          "type" : "keyword"
        },
        "lastname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "state" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
}

Migrate the data ⬇

POST /_reindex
{
  "source": {
    "index": "bank"
  },
  "dest": {
    "index": "newbank"
  }
}

From: https://blog.csdn.net/m0_70484213/article/details/145063564
