Installing Elasticsearch 8.4 and Using the Java API Client


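Introduction

ELK refers to three open-source projects: Elasticsearch, Logstash, and Kibana.
1. Elasticsearch (ES for short) is a distributed, near-real-time search platform built on Lucene and accessed through a RESTful API.
2. Logstash is the central data-flow engine of ELK: it collects data in different formats from different sources (files, data stores, message queues), filters it, and sends it to different destinations (files, message queues, Redis, Elasticsearch, Kafka, and so on).
3. Kibana presents the data stored in ES through a friendly web UI and provides real-time analysis.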

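I. Installing Elasticsearch

Download from the official site: Past Releases of Elastic Stack Software | Elastic;
Layout of the extracted directory:
1. bin: startup scripts;
2. config: configuration files;
  1. log4j2.properties: logging configuration;
  2. jvm.options: JVM settings, such as the heap size;
  3. elasticsearch.yml: the ES configuration file; the default HTTP port is 9200;
3. lib: dependency JARs;
4. modules: functional modules;
5. plugins: installed plugins;
Starting the service:
1. Note: ES 8 enables security (TLS and authentication) by default, so a plain HTTP request to port 9200 fails out of the box;
2. Workaround for a local test environment: edit elasticsearch.yml:
  - change xpack.security.enabled to false (the default is true);
  - add ingest.geoip.downloader.enabled: false to turn off the GeoIP database downloader (it otherwise reports errors);

Start: run elasticsearch.bat in the bin directory;

Open http://127.0.0.1:9200; the page shows a JSON document like the following: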

{
  "name" : "LAPTOP-3GVBG58O",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "pepFhZOUTGOlEvKxBEAu9A",
  "version" : {
    "number" : "8.4.0",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "f56126089ca4db89b631901ad7cce0a8e10e2fe5",
    "build_date" : "2022-08-19T19:23:42.954591481Z",
    "build_snapshot" : false,
    "lucene_version" : "9.3.0",
    "minimum_wire_compatibility_version" : "7.17.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
 "tagline" : "You Know, for Search"
}
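
The same check can be done from code. The following is only a sketch under stated assumptions: the class name `EsRootInfo` is hypothetical, it uses the JDK's `java.net.http` client (Java 11+) rather than any Elasticsearch library, it assumes security has been disabled as described above, and it reads the version with a regular expression purely to stay dependency-free (a real project would use a JSON parser).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original article.
class EsRootInfo {

    // Matches "number" : "8.4.0" inside the "version" object of the root response.
    private static final Pattern VERSION_NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Extracts the version number from the root-endpoint JSON, if present. */
    static Optional<String> extractVersion(String json) {
        Matcher m = VERSION_NUMBER.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Fetches the root endpoint over plain HTTP and returns the response body. */
    static String fetchRoot(String baseUrl) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

With the node running, `EsRootInfo.extractVersion(EsRootInfo.fetchRoot("http://127.0.0.1:9200"))` should yield the value shown in the `number` field.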

II. Installing the Visual Interface (elasticsearch-head) Plugin

Install Node.js and npm first: the plugin is a web project and is built and run with npm;
Download: GitHub - mobz/elasticsearch-head;

Build and start:

 git clone https://github.com/mobz/elasticsearch-head.git
 cd elasticsearch-head
 npm install
 npm run start

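URL: http://localhost:9100;
Cross-origin problem: the page served on port 9100 sends requests to ES on port 9200, which is a different origin, so the browser blocks them until ES allows cross-origin (CORS) requests.

Fix: edit elasticsearch.yml. Note: the configuration file must not contain Chinese characters, otherwise ES reports an error!

# New settings so that the head plugin can reach ES; required for cross-origin access
http.cors.enabled: true
http.cors.allow-origin: "*"

Restart ES and es-head, open port 9100, and connect to the ES server on port 9200; the cluster should now be shown as healthy;
Menu overview:
1. Overview: basic information about the ES server;
2. Indices: each index is comparable to a database, and what it stores resembles the tables inside one;
3. Browser: view the contents of an index;
4. Query: basic and compound queries (this article uses Kibana for queries instead);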

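III. Installing Kibana

Official site: Kibana: Explore, Visualize, and Analyze Data | Elastic;
Download: the version must match the ES version! Kibana is also a web application, but the download bundles its own Node.js runtime, so no separate npm environment is needed;
Start: after extracting the archive, run kibana.bat in the bin directory;
URL: http://localhost:5601;
Chinese UI: in kibana.yml under the config directory, set i18n.locale: "zh-CN" (the default is "en");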

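IV. ES Core Concepts

1. Index;

2. Field types (mapping);

3. Documents;

4. Shards (inverted index);

Elasticsearch is document-oriented and all data is stored as JSON. A rough comparison with a relational database:

Relational DB	Elasticsearch
Database	Index (indices)
Table	Type (types; deprecated, replaced by _doc)
Row	Document (documents)
Column	Field (fields)
Physical design: behind the scenes ES splits each index into multiple shards, and each shard can be moved between servers in the cluster.
Inverted index: ES uses Lucene's inverted index as its underlying structure, which is well suited to fast full-text search. An inverted index consists of the list of all distinct terms that appear in the documents, and for each term a list of the documents that contain it;
In Elasticsearch an index is divided into multiple shards, and each shard is a Lucene index, so one Elasticsearch index is made up of multiple Lucene indices.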
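
To make the inverted-index idea concrete, here is a minimal sketch in plain Java. It is a toy, not how Lucene stores its data: the class name `TinyInvertedIndex` is hypothetical, and tokenization is simply lower-casing plus splitting on whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only; Lucene's real data structures are far more elaborate.
class TinyInvertedIndex {

    // term -> sorted set of IDs of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    /** Lower-cases the text, splits it on whitespace, and records the document ID per term. */
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the IDs of the documents containing the term, in ascending order. */
    List<Integer> search(String term) {
        return new ArrayList<>(postings.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of()));
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "You know for search");
        index.add(2, "Search is fast");
        index.add(3, "Lucene is a library");
        System.out.println(index.search("search")); // [1, 2]
        System.out.println(index.search("is"));     // [2, 3]
        System.out.println(index.search("kibana")); // []
    }
}
```

A lookup touches only the entry for the queried term, which is why this layout suits full-text search better than scanning every document.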


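V. The IK Analyzer

What is the IK analyzer?
1. Tokenization: splitting a piece of Chinese text (or any string) into individual keywords. At search time the input is tokenized as well, and its tokens are matched against the tokens stored in the database or index.
2. IK analyzer: the default analyzer treats every single Chinese character as a word; installing the IK analyzer solves this problem.
3. The IK analyzer provides two algorithms:
  - ik_smart: coarsest split (fewest tokens);
  - ik_max_word: finest-grained split;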
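
The difference between a coarse and a fine split can be sketched with a toy dictionary-based segmenter. This is explicitly not IK's actual algorithm: `ToySegmenter`, its two methods, and the dictionary are hypothetical, and an unspaced Latin string stands in for a run of Chinese characters so that the example stays ASCII-only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy illustration of coarse vs. fine dictionary segmentation; not the IK implementation.
class ToySegmenter {

    /** Coarse split: at each position take the longest dictionary word (forward maximum matching). */
    static List<String> coarse(String text, Set<String> dict) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = i + 1; // fall back to a single character when nothing matches
            for (int j = text.length(); j > i; j--) {
                if (dict.contains(text.substring(i, j))) {
                    end = j;
                    break;
                }
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    /** Fine split: emit every dictionary word found at every start position, longest first. */
    static List<String> fine(String text, Set<String> dict) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (int j = text.length(); j > i; j--) {
                String candidate = text.substring(i, j);
                if (dict.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("elasticsearch", "elastic", "last", "search");
        System.out.println(coarse("elasticsearch", dict)); // [elasticsearch]
        System.out.println(fine("elasticsearch", dict));   // [elasticsearch, elastic, last, search]
    }
}
```

The coarse variant returns one token for the whole input, while the fine variant also returns the shorter, overlapping dictionary words it contains.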
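Installation:
1. Project page: GitHub - infinilabs/analysis-ik;
2. Note: download the IK release that matches your Elasticsearch version (v8.4.0); the zip archive is all you need;
3. Extract: unzip it into the plugins folder of the Elasticsearch installation (the paths later in this article use plugins\ik);
4. Restart Elasticsearch and watch the log for the line that loads the IK plugin;
5. List the loaded plugins: open cmd in the bin directory and run elasticsearch-plugin list;

Testing with Kibana:

ik_smart: coarsest split;

// Request
GET _analyze
{
  "analyzer": "ik_smart",  // analyzer to use
  "text": "中国共产党"       // input text
}

// Response: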
{
  "tokens": [
    {
      "token": "中国共产党",
      "start_offset": 0,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 0
    }
  ]
}

ik_max_word: finest-grained split;

// Request
GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "中国共产党"
}

// Response:
{
  "tokens": [
    {
      "token": "中国共产党",
      "start_offset": 0,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "中国",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "国共",
      "start_offset": 1,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "共产党",
      "start_offset": 2,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 3
    },
    {
      "token": "共产",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "党",
      "start_offset": 4,
      "end_offset": 5,
      "type": "CN_CHAR",
      "position": 5
    }
  ]
}

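Problem: the analyzer does not always split text the way we want;

Analyzer configuration: a custom user dictionary (a file ending in .dic)

Configuration file: elasticsearch-8.4.0\plugins\ik\config\IKAnalyzer.cfg.xml;
Create a my.dic file in the same directory and put your own words in it, one per line:

Note: the file must be encoded as UTF-8, otherwise it has no effect.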
```
德玛西亚
诺克萨斯
```
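
Because the dictionary is ignored unless it is saved as UTF-8, one option is to generate it from code with an explicit charset instead of relying on an editor's default. The following is a minimal sketch: `DictionaryWriter` is a hypothetical helper, and the target path is whatever file you pass in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for producing a .dic file with a guaranteed encoding.
class DictionaryWriter {

    /** Encodes one word per line as UTF-8 bytes, independent of the platform default charset. */
    static byte[] encode(List<String> words) {
        return (String.join("\n", words) + "\n").getBytes(StandardCharsets.UTF_8);
    }

    /** Writes the encoded dictionary to the given file, replacing any existing content. */
    static void write(Path file, List<String> words) {
        try {
            Files.write(file, encode(words));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the dictionary above, this would be called with the path of my.dic and the two words as the list.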

Register the new file in the configuration: <entry key="ext_dict">my.dic</entry>

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionary here -->
    <entry key="ext_dict">my.dic</entry>
    <!-- Configure your own extension stop-word dictionary here -->
    <entry key="ext_stopwords"></entry>
    <!-- Configure a remote extension dictionary here -->
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
    <!-- Configure a remote extension stop-word dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

Restart Elasticsearch and check the log for the line that loads the user's .dic file:

[Dict Loading] D:\IT\elasticsearch\elasticsearch-8.4.0\plugins\ik\config\my.dic

Test again:

ik_smart:

// Request
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "德玛西亚之力"
}

// Response
{
  "tokens": [
    {
      "token": "德玛西亚",
      "start_offset": 0,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "之力",
      "start_offset": 4,
      "end_offset": 6,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}

ik_max_word:

// Request
GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "德玛西亚之力"
}

// Response
{
  "tokens": [
    {
      "token": "德玛西亚",
      "start_offset": 0,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "西亚",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "之力",
      "start_offset": 4,
      "end_offset": 6,
      "type": "CN_WORD",
      "position": 2
    }
  ]
}

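VI. REST Style: the Approach ES Recommends

Kibana requests follow RESTful conventions. Mapping types are deprecated, so use _doc in their place; mind the version you are running.

Method	URL	Description
PUT	localhost:9200/<index>/_doc/<id>	Create (or replace) a document with the given ID
POST	localhost:9200/<index>/_doc	Create a document with an auto-generated ID
POST	localhost:9200/<index>/_update/<id>	Update a document
DELETE	localhost:9200/<index>/_doc/<id>	Delete a document by ID
GET	localhost:9200/<index>/_doc/<id>	Get a document by ID
POST	localhost:9200/<index>/_search	Search all documents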
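
The table maps directly onto plain HTTP requests. As a sketch, the hypothetical `DocRequests` class below builds (but does not send) such requests with the JDK's `java.net.http` API; the base URL and the assumption that security is disabled are taken from the local setup described earlier.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

// Hypothetical request builders mirroring the table; nothing is sent here.
class DocRequests {

    private static final String BASE = "http://localhost:9200";

    /** PUT /{index}/_doc/{id}: create (or replace) a document with a known ID. */
    static HttpRequest put(String index, String id, String json) {
        return withJsonBody("PUT", "/" + index + "/_doc/" + id, json);
    }

    /** POST /{index}/_update/{id}: partial update; the fields are wrapped in "doc". */
    static HttpRequest update(String index, String id, String partialJson) {
        return withJsonBody("POST", "/" + index + "/_update/" + id,
                "{\"doc\":" + partialJson + "}");
    }

    /** GET /{index}/_doc/{id}: fetch a document by ID. */
    static HttpRequest get(String index, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc/" + id))
                .GET().build();
    }

    /** DELETE /{index}/_doc/{id}: delete a document by ID. */
    static HttpRequest delete(String index, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc/" + id))
                .DELETE().build();
    }

    private static HttpRequest withJsonBody(String method, String path, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + path))
                .header("Content-Type", "application/json")
                .method(method, BodyPublishers.ofString(json))
                .build();
    }
}
```

Each request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.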

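VII. Index Operations

1. The PUT command

Create an index and a document:

PUT mytest/_doc/1       // mytest is the index, _doc is a fixed keyword, 1 is the document ID
{   // document body
  "name": "我的测试",
  "descripe": "ES根据ID创建索引测试，索引为1"
}

// Response
{
  "_index": "mytest",       // index name
  "_id": "1",               // document ID
  "_version": 1,            // version number; 1 means the document was just created
  "result": "created",      // outcome: created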
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

Create an index and specify the field types (mapping):

PUT mytest2     // creates the index only
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      },
      "birth": {
        "type": "date"
      }
    }
  }
}

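2. The GET command

Get information about an index:
```
GET mytest
```
Get a document from the index by its ID:
```
GET mytest/_doc/1
```

Get the ES status:

GET _cat/health     // cluster health

GET _cat/indices?v      // overview of the indices: which ones exist, their document counts, and other statistics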

3. The POST command

Update a document:

POST mytest/_update/1
{
  "doc": {      // "doc" is a fixed key
    "name": "测试名称修改"
  }
}

4. The DELETE command

Delete an index together with its documents

DELETE mytest2      // removes the index and all of its documents

Delete a single document
```
DELETE mytest/_doc/1
```

VIII. Document Operations

Simple search: query directly with GET; the result has the following format:

// Response
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,           // total number of hits
      "relation": "eq"      // eq: the total is exact (gte would mean a lower bound)
    },
    "max_score": 0.9808291,         // highest relevance score among the hits
    "hits": [
      {
        "_index": "mytest3",
        "_id": "2",
        "_score": 0.9808291,        // relevance score of this hit; higher means a better match
        "_source": {            // the stored document
          "name": "李四",
          "age": 34,
          "descripe": "ls喜欢张三"
        }
      }
    ]
  }
}

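Complex search: sorting, paging, highlighting, fuzzy matching, exact matching, and so on

GET query

GET mytest3/_search?q=name:三        // q holds the query; chain further URL parameters with &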
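
When this URI search is issued from code rather than from Kibana, the query string has to be percent-encoded, especially for Chinese text. A small sketch, assuming a hypothetical helper class `UriSearch` and the JDK's `URLEncoder` (Java 10+ for the `Charset` overload):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the URI for a q= search.
class UriSearch {

    /** Builds {baseUrl}/{index}/_search?q={field}:{value} with the query percent-encoded as UTF-8. */
    static URI build(String baseUrl, String index, String field, String value) {
        String q = URLEncoder.encode(field + ":" + value, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + index + "/_search?q=" + q);
    }
}
```

For the request above, the character U+4E09 in `name:三` is sent as `%E4%B8%89` and the colon as `%3A`.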

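POST query

Note: match runs the query text through an analyzer before matching, whereas values of keyword fields are not analyzed;

POST mytest3/_search        // search with a request body
{
  "query": {
    "match": {      // match against the name field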
      "name": "王"
    }
  }
}

Selecting fields

POST mytest3/_search
{
  "query": {
    "match": {
      "name": "王"
    }
  },
  "_source": ["name", "age"]        // return only these fields from _source
}

Sorting

POST mytest3/_search
{
  "query": {
    "match": {
      "name": "王"
    }
  },
  "sort": [         // sort by a field
    {
      "age": {
        "order": "asc"
      }
    }
  ]
}

Paging

POST mytest3/_search
{
  "query": {
    "match": {
      "name": "王"
    }
  },
  "from": 0,        // offset of the first hit to return, starting at 0
  "size": 2         // number of hits per page
}
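
Since from is an offset rather than a page number, a page-based UI has to convert. A minimal sketch, with `Paging` as a hypothetical helper and 1-based page numbers as an assumption:

```java
// Hypothetical helper converting a page number into the "from" offset.
class Paging {

    /** Converts a 1-based page number and a page size into the "from" offset. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        return Math.multiplyExact(page - 1, size);
    }

    /** Renders the paging part of a search body for the given page. */
    static String pagingJson(int page, int size) {
        return "\"from\": " + from(page, size) + ", \"size\": " + size;
    }

    public static void main(String[] args) {
        System.out.println(pagingJson(1, 2)); // "from": 0, "size": 2
        System.out.println(pagingJson(3, 2)); // "from": 4, "size": 2
    }
}
```

Keep in mind that by default Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting).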

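Combining conditions: must behaves like AND, should behaves like OR

POST mytest3/_search
{
  "query": {
    "bool": {       // bool query: combines several clauses
      "must": [     // every clause must match; other clause types include "must_not" and "should"
        {
          "match": {        // the following field must match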
            "name": "王"
          }
        },
        {
          "match": {
            "age": 46
          }
        }
      ]
    }
  }
}

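Filtering by condition: greater than, equal to, less than, ...

POST mytest3/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {        // range of values
            "age": {
              "gt": 50,     // greater than; gte: greater than or equal to
              "lt": 70      // less than; lte: less than or equal to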
            }
          }
        }
      ]
    }
  }
}

Containment, similar to SQL LIKE

POST mytest3/_search
{
  "query": {
    "match": {
      "descripe": "测试 三"        // matches documents containing any of the terms; separate multiple terms with a space
    }
  }
}

Exact query: term looks the value up in the inverted index as-is, without analyzing it

POST mytest3/_search
{
  "query": {
    "term": {
      "name": "三"
    }
  }
}
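
The practical difference between term and match is whether the query text is analyzed before the lookup. The toy below simulates that with a trivial analyzer (lower-casing plus whitespace splitting); `TermVsMatch` is hypothetical and only illustrates the principle, not Elasticsearch's scoring or its real analyzers.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Toy simulation of term vs. match lookups against the terms of one indexed text field.
class TermVsMatch {

    /** Toy analyzer: lower-cases the text and splits it on whitespace. */
    static Set<String> analyze(String text) {
        return new LinkedHashSet<>(
                Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    /** term-style lookup: the query value is compared to the indexed terms unchanged. */
    static boolean term(Set<String> indexedTerms, String value) {
        return indexedTerms.contains(value);
    }

    /** match-style lookup: the query text is analyzed first; any shared term counts as a hit. */
    static boolean match(Set<String> indexedTerms, String queryText) {
        for (String t : analyze(queryText)) {
            if (indexedTerms.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("Quick Brown Fox"); // stored terms: quick, brown, fox
        System.out.println(term(indexed, "Quick"));     // false: "Quick" is not a stored term
        System.out.println(match(indexed, "Quick"));    // true: analyzed to "quick" first
        System.out.println(match(indexed, "lazy fox")); // true: "fox" is shared
    }
}
```

This is why a term query against an analyzed text field can miss values that a match query finds.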

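Highlighting: in matching documents, the matched text in the chosen field is wrapped in HTML highlight tags;

POST mytest3/_search
{
  "query": {
    "match": {
      "name": "四"
    }
  },
  "highlight": {
    "pre_tags": "<p class=\"key\">",        // custom opening HTML tag
    "post_tags": "</p>",        // custom closing HTML tag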
    "fields": {
      "name": {}
    }
  }
}
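
What the highlighter hands back is simply the field text with the matched part wrapped in the two tags. The sketch below imitates that with plain string replacement; `ToyHighlighter` is hypothetical and ignores analysis, fragments, and everything else the real highlighter does.

```java
// Toy imitation of the highlight output; not how Elasticsearch implements highlighting.
class ToyHighlighter {

    /** Wraps every occurrence of the term in the given opening and closing tags. */
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text;
        }
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        System.out.println(highlight("li si", "si", "<p class=\"key\">", "</p>"));
        // li <p class="key">si</p>
    }
}
```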

IX. Spring Boot Integration with the Java API Client

Official Elasticsearch Clients documentation: Elasticsearch Clients | Elastic
Pick your language; this article uses Java Client: 8.4;
Spring Boot version: Spring Boot 2.7.5;

Adding the ES dependencies:

<!-- Keep in sync with the ES version -->
<dependency>
    <groupId>co.elastic.clients</groupId>
    <artifactId>elasticsearch-java</artifactId>
    <version>8.4.0</version>
</dependency>

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.12.3</version>
</dependency>

<!-- Resolves JSON-related exceptions -->
<dependency>
    <groupId>jakarta.json</groupId>
    <artifactId>jakarta.json-api</artifactId>
    <version>2.0.1</version>
</dependency>

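ES configuration class:

import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticsearchConfig {

    @Bean
    public ElasticsearchClient elasticsearchClient() {
        // Create the low-level client
        RestClient restClient = RestClient.builder(
                new HttpHost("127.0.0.1", 9200)).build();

        // Create the transport with a Jackson mapper
        ElasticsearchTransport transport =
                new RestClientTransport(restClient, new JacksonJsonpMapper());

        // And create the API client
        return new ElasticsearchClient(transport);
    }
}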

Test class:

Inject the ElasticsearchClient:

@Autowired
private ElasticsearchClient client;

Create an index:

@Test
public void createIndex() throws IOException {
    // CreateIndexRequest request =
    //         new CreateIndexRequest.Builder().index("es_springboot_test").build();
    // CreateIndexResponse createResponse = client.indices().create(request);

    // Shorthand with a lambda expression
    CreateIndexResponse createResponse =
            client.indices().create(r -> r.index("es_springboot_test"));
    System.out.println(createResponse);
}

Check whether an index exists:

Note: this ExistsRequest has the same class name as the one used to check for a document, but it comes from a different package;

import co.elastic.clients.elasticsearch.indices.ExistsRequest;

@Test
public void existIndex() throws IOException {
    // ExistsRequest request = new ExistsRequest.Builder()
    //         .index("es_springboot_test").build();
    // BooleanResponse exists = client.indices().exists(request);

    // Shorthand with a lambda expression
    BooleanResponse exists = client.indices()
            .exists(r -> r.index("es_springboot_test"));
    boolean value = exists.value();
    System.out.println(value);
}

Delete an index:

@Test
public void removeIndex() throws IOException {
    // DeleteIndexRequest request =
    //         new DeleteIndexRequest.Builder()
    //         .index("es_springboot_test").build();
    // DeleteIndexResponse response = client.indices().delete(request);

    // Shorthand with a lambda expression
    DeleteIndexResponse response =
            client.indices().delete(r -> r.index("es_springboot_test"));
    System.out.println(response);
}

Create a document from a Java object:

@Test
public void addDocument() throws IOException {
    Student student = new Student(UUID.randomUUID().toString(),
            "张三",
            23,
            Student.Sex.female,
            "北京");

    // IndexRequest<Object> request = new IndexRequest.Builder<>()
    //         .index("es_springboot_test")
    //         .id("1")
    //         .document(student)
    //         .build();
    // IndexResponse response = client.index(request);

    // Shorthand with a lambda expression
    IndexResponse response = client.index(r -> r
            .index("es_springboot_test")
            .id("1")
            .document(student));
    System.out.println(response);
}
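
The tests use a Student class that the article never shows. The version below is an assumption reconstructed from how it is called (the five-argument constructor, the no-argument constructor, setAddress, and Student.Sex.female); the field names id, name, age, sex, and address, the male constant, and the use of Integer for age are all guesses. In a real project it would be a public class in its own file.

```java
// Assumed shape of the Student document class; reconstructed, not taken from the original.
class Student {

    /** female appears in the tests; male is an assumed counterpart. */
    enum Sex { male, female }

    private String id;
    private String name;
    private Integer age;   // boxed so that an unset age stays null in partial updates
    private Sex sex;
    private String address;

    /** No-argument constructor, needed for JSON deserialization and partial updates. */
    Student() {
    }

    Student(String id, String name, Integer age, Sex sex, String address) {
        this.id = id;
        this.name = name;
        this.age = age;
        this.sex = sex;
        this.address = address;
    }

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public Integer getAge() { return age; }
    public void setAge(Integer age) { this.age = age; }

    public Sex getSex() { return sex; }
    public void setSex(Sex sex) { this.sex = sex; }

    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }

    @Override
    public String toString() {
        return "Student{id=" + id + ", name=" + name + ", age=" + age
                + ", sex=" + sex + ", address=" + address + "}";
    }
}
```

The getters and setters are what the Jackson mapper configured earlier relies on to convert the object to and from JSON.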

Check whether a document exists:

Note: this ExistsRequest has the same class name as the one used to check for an index, but it comes from a different package;

import co.elastic.clients.elasticsearch.core.ExistsRequest;

@Test
public void existDocument() throws IOException {
    // ExistsRequest request = new ExistsRequest.Builder()
    //         .index("es_springboot_test")
    //         .id("1")
    //         .build();
    // BooleanResponse response = client.exists(request);

    // Shorthand with a lambda expression
    BooleanResponse response =
            client.exists(r -> r
                    .index("es_springboot_test")
                    .id("1"));
    boolean value = response.value();
    System.out.println(value);
}

Get a document by ID:

@Test
public void getDocument() throws IOException {
    // GetRequest request = new GetRequest.Builder()
    //         .index("es_springboot_test")
    //         .id("1")
    //         .build();
    // GetResponse<Student> response = client.get(request, Student.class);

    // Shorthand with a lambda expression
    GetResponse<Student> response =
            client.get(r -> r.index("es_springboot_test").id("1"),
                    Student.class);
    System.out.println(response);
}

Update a document by ID:

@Test
public void updateDocument() throws IOException {
    Student student = new Student();
    student.setAddress("河南");
    // UpdateRequest<Student, Student> request = new UpdateRequest.Builder<Student, Student>()
    //         .index("es_springboot_test")
    //         .id("1")
    //         .doc(student)
    //         .build();
    // UpdateResponse<Student> response = client.update(request, Student.class);

    // Shorthand with a lambda expression
    UpdateResponse<Student> response =
            client.update(r -> r.index("es_springboot_test").id("1").doc(student),
                    Student.class);
    System.out.println(response);
}

Delete a document by ID:

@Test
public void deleteDocument() throws IOException {
    // DeleteRequest request = new DeleteRequest.Builder()
    //         .index("es_springboot_test")
    //         .id("1")
    //         .build();
    // DeleteResponse response = client.delete(request);

    // Shorthand with a lambda expression
    DeleteResponse response =
            client.delete(r -> r.index("es_springboot_test").id("1"));
    System.out.println(response);
}

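Bulk operations on documents (index, create, update, and delete in a single request):

@Test
public void bulkDocument() throws IOException {
    // Fill this list with the documents to index; a bulk request needs at least one operation
    ArrayList<Student> list = new ArrayList<>();

    BulkRequest.Builder builder = new BulkRequest.Builder()
            .index("es_springboot_test");

    // Add one operation per document
    for (int i = 0; i < list.size(); i++) {
        String index = String.valueOf(i + 1);
        Student student = list.get(i);
        // Outer lambda: the kind of operation (index, create, update, delete); inner lambda: that operation's settings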
        builder.operations(item -> item.index(v -> v
                .id(index)
                .document(student)));
    }
    BulkRequest request = builder.build();

    BulkResponse response = client.bulk(request);
    System.out.println(response);
}
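
On the wire, a bulk request is newline-delimited JSON: one action line followed, for index operations, by one source line per document. The sketch below builds such a body by hand to show the format; `BulkBody` is a hypothetical helper, the documents are passed in as ready-made JSON strings, and no escaping of the index name or IDs is attempted.

```java
import java.util.Map;

// Hypothetical helper showing the NDJSON layout of a bulk body with index actions.
class BulkBody {

    /** Builds the body: an action line and a source line per document, each ending in a newline. */
    static String indexActions(String index, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : jsonById.entrySet()) {
            // Action line: which operation, on which index, with which document ID
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Source line: the document itself; the final line must also end with a newline
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }
}
```

The Java client assembles this format itself from the operations added to the builder; the sketch only makes visible what is sent to POST /_bulk.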

Search: use the query method of SearchRequest to run the query, and highlight to mark the matches;

@Test
public void search() throws IOException {
    // Highlight highlight = new Highlight.Builder()
    //         .preTags("<p class=\"highlight\">")
    //         .postTags("</p>")
    //         .fields("sex", f -> f)
    //         .build();
    // SearchRequest request = new SearchRequest.Builder()
    //         .index("es_springboot_test")
    //         .highlight(highlight)
    //         .query(q -> q.term(t -> t.field("sex").value(Student.Sex.female.toString())))
    //         .build();

    // SearchResponse<Student> response = client.search(request, Student.class);

    // Shorthand with a lambda expression
    SearchResponse<Student> response = client.search(r -> r
                    .index("es_springboot_test")
                    .highlight(h -> h
                            .preTags("<p class=\"highlight\">")
                            .fields("sex", v -> v)
                            .postTags("</p>"))
                    .query(q -> q
                            .term(t -> t
                                    .field("sex")
                                    .value(Student.Sex.female.toString()))),
            Student.class);
    System.out.println(response);

    HitsMetadata<Student> hits = response.hits();
    List<Hit<Student>> list = hits.hits();
    list.forEach(item -> System.out.println(item.source()));
}

Source: https://blog.csdn.net/fyx_demo/article/details/139293731
