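Batch Reindexing in Elasticsearch
Version 7.16.2
Elasticsearch is designed for search, and its support for modification is weaker, especially for changes to the data structure. Unlike in a relational database, changing the structure does not alter data that has already been indexed. For existing data to pick up the modified structure and settings, the index has to be rebuilt (reindexed).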
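The general reindexing flow is:
- Create a new index, index_new, from the structure of the old index index_old with the relevant configuration modified.
- Copy the data from index_old to index_new with reindex.
- Delete index_old.
- Give index_new the alias index_old, so the application keeps using the name index_old to operate on the index.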
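The last two steps leave a short window in which the name index_old resolves to nothing, because the index is deleted before the alias exists. As a hedged sketch, the `_aliases` API also accepts a `remove_index` action, so the delete and the new alias can travel in one request. `alias_swap_body` is a hypothetical helper that only builds the request body. The `curl` line in the comment assumes a cluster at `http://localhost:9200`, which is not from the original.

```bash
#!/bin/bash
# Hypothetical helper: print an _aliases request body that adds an alias named
# after the old index to the new index and deletes the old index in one call.
alias_swap_body() {
    local old=$1 new=$2
    printf '{"actions":[{"add":{"index":"%s","alias":"%s"}},{"remove_index":{"index":"%s"}}]}\n' \
        "$new" "$old" "$old"
}

# Print the body for inspection; nothing is sent to Elasticsearch here
alias_swap_body index_old index_new

# To send it (assumed URL, not run here):
# curl -s -X POST -H "Content-Type: application/json" \
#     "http://localhost:9200/_aliases" -d "$(alias_swap_body index_old index_new)"
```

Building the body separately from sending it keeps the destructive request easy to review before it runs.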
Kibana Dev Tools
When there are only a few indices, Kibana Dev Tools is a good fit: the operations are visual and user-friendly.
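# 1. Get the source index definition, as a backup or as the basis for the new index
GET my_index
# 2. Create the new index from the modified definition; disabling refresh and replicas on it before copying data is recommended
PUT my_index_alias
{
  "mappings": {
    //...new mappings
  },
  "settings": {
    //...new settings
    "index": {
      // Disable refresh and replicas on the new index to speed up the writes that follow
      "refresh_interval": "-1",
      "number_of_replicas": "0"
    }
  }
}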
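# 3. Run the reindex asynchronously to copy the data into the new index
# slices=auto runs the copy in parallel slices, a larger size raises the number of documents per batch, and proceed skips conflicting documents (in theory a new index has none)
POST _reindex?slices=auto&wait_for_completion=false
{
  "source": {
    "index": "my_index",
    "size": 5000
  },
  "dest": {
    "index": "my_index_alias",
    "op_type": "create"
  },
  "conflicts": "proceed"
}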
# 4. Check the task progress
## Query by the ID returned in the previous step
GET /_tasks/CzIa7FVORqu6sRH1U0LUMw:2873350476
## List all reindex tasks
GET _tasks?detailed=true&actions=*reindex
# 5. Delete the old index once the reindex has finished
DELETE my_index
# 6. Add the old index name as an alias of the new index
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "my_index_alias",
        "alias": "my_index"
      }
    }
  ]
}
# 7. Re-enable replicas and refresh on the new index
PUT my_index_alias/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": "1"
  }
}
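Step 4 shows how to fetch the task, and step 5 is destructive, so it helps to confirm that the task has completed and reported no failures before deleting anything. A minimal sketch under stated assumptions: `task_done` is a hypothetical helper that inspects a `_tasks` response passed in as a string and looks for a top-level `completed` flag and an empty `failures` array. The sample responses are abbreviated and made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: return 0 only if a _tasks response reports that the
# task completed with an empty failures array.
# Returns 1 while the task is still running, 2 if it finished without a clean result.
task_done() {
    local body=$1
    # Strip whitespace so compact and ?pretty responses are matched the same way
    body=$(printf '%s' "$body" | tr -d ' \n\t')
    [[ $body == *'"completed":true'* ]] || return 1
    [[ $body == *'"failures":[]'* ]] || return 2
}

# Abbreviated, made-up sample responses
sample_running='{"completed":false,"task":{"action":"indices:data/write/reindex"}}'
sample_ok='{"completed":true,"response":{"total":3,"created":3,"failures":[]}}'

if task_done "$sample_ok"; then
    echo "reindex finished cleanly, safe to continue"
fi
if ! task_done "$sample_running"; then
    echo "reindex not finished yet"
fi
```

String matching keeps the sketch free of extra tools. A JSON parser would be a sturdier choice where one is available.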
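Batch Rebuild with a Script
In production, indices of the same kind are usually split by day or by type into many indices. This simplifies operations but makes reindexing tedious.
For that case, the following script can be used: calling it in a loop with each index name rebuilds the indices in bulk.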
#!/bin/bash
es_url=http://ip:port
username=username
password=password
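# Index to rebuild, taken from the first argument so the script can be called in a loop
index_old_name=${1:?"usage: $0 INDEX_NAME"}
index_new_name=${index_old_name}_new
function error_exit() {
    echo -e "\e[31m Operation failed \e[0m"
    exit 1
}
set -e
script_path=$(cd "$(dirname "$0")" && pwd)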
index_dest_config=""
one_line_config=""
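if [[ -f "${script_path}/config.json" ]]; then
    echo "0. Found new index config: ${script_path}/config.json"
    index_dest_config=$(cat "${script_path}/config.json")
    echo "$index_dest_config"
    one_line_config=$(echo $index_dest_config)
else
    echo "0. No new index config found; generated one from the old index at ${script_path}/config.json. Edit it, then run the script again."
    # Write the file next to the script, which is where the check above looks for it.
    # The GET response is wrapped in the index name and carries read-only settings
    # (uuid, creation_date, version, provided_name); remove those while editing.
    curl -ks -u ${username}:${password} -X GET "${es_url}/${index_old_name}?pretty" > "${script_path}/config.json"
    exit 0
fi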
echo "1. Creating the new index..."
result=$(curl -ks -u ${username}:${password} -X PUT -H "Content-Type: application/json" "${es_url}/${index_new_name}" -d "@${script_path}/config.json")
echo $result
echo "$result" | grep '"acknowledged":true' || error_exit
echo "2. Fetching the new index and saving its initial replica count and refresh interval so they can be restored later..."
result=$(curl -ks -u ${username}:${password} -X GET "${es_url}/${index_new_name}?pretty")
echo $result
echo $result | grep -v 'error' || error_exit
duplicate=$(echo "$result" | grep number_of_replicas | sed 's/,//g')
refresh=$(echo "$result" | grep refresh_interval | sed 's/,//g')
echo -e "${duplicate}\n${refresh}"
echo "3. Disabling refresh and replicas before the reindex to speed up writes to the new index..."
result=$(curl -ks -u ${username}:${password} -X PUT -H "Content-Type: application/json" "${es_url}/${index_new_name}/_settings" -d \
'{"index": {"refresh_interval": "-1","number_of_replicas": "0"}}')
echo $result
echo $result | grep '"acknowledged":true' || error_exit
echo "4. Starting the reindex..."
task=`curl -ks -u ${username}:${password} -X POST -H "Content-Type: application/json" "${es_url}/_reindex?slices=auto&wait_for_completion=false" -d \
'
{
"source": {
"index": "'${index_old_name}'",
"size": 5000
},
"dest": {
"index": "'${index_new_name}'",
"op_type": "create"
},
"conflicts": "proceed"
}
'`
echo "$task"
echo "$task" | grep '"task":' || error_exit
task_id=`echo "$task" | awk -F '"' '{print $4}'`
echo "task_id=$task_id"
while true
do
    sleep 1
    task_status=$(curl -ks -u ${username}:${password} -X GET "${es_url}/_tasks/${task_id}?pretty")
    echo "$task_status"
    if [[ -n $(echo "$task_status" | grep complete | grep true) ]]; then
        echo "$index_old_name -> $index_new_name reindex finished."
        break
    fi
    echo "$index_old_name -> $index_new_name reindexing..."
done
echo "5. Deleting the old index"
result=$(curl -ks -u ${username}:${password} -X DELETE -H "Content-Type: application/json" "${es_url}/${index_old_name}")
echo $result
echo $result | grep '"acknowledged":true' || error_exit
echo "6. Adding the old index name as an alias of the new index"
result=$(curl -ks -u ${username}:${password} -X POST -H "Content-Type: application/json" "${es_url}/_aliases" -d \
'{
"actions": [
{
"add": {
"index": "'${index_new_name}'",
"alias": "'${index_old_name}'"
}
}
]
}'
)
echo $result
echo $result | grep '"acknowledged":true' || error_exit
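echo "7. Restoring replica and refresh settings"
# refresh_interval is absent from the saved settings unless config.json set it; null resets it to the default
refresh=${refresh:-'"refresh_interval": null'}
result=$(curl -ks -u ${username}:${password} -X PUT -H "Content-Type: application/json" "${es_url}/${index_new_name}/_settings" -d \
"{\"index\": {${duplicate},${refresh}}}")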
echo $result
echo $result | grep '"acknowledged":true' || error_exit
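To apply the script to a set of daily indices, the calling loop can stay outside of it. A minimal sketch, assuming the script above is saved under the hypothetical name `reindex_one.sh` and reads the index name from its first argument. `rebuild_all` and the index names are illustrative only.

```bash
#!/bin/bash
# Hypothetical wrapper: run a per-index rebuild command once per index name,
# stopping at the first failure so the remaining indices are left untouched.
rebuild_all() {
    local cmd=$1
    shift
    local index
    for index in "$@"; do
        echo "=== rebuilding ${index} ==="
        "$cmd" "$index" || { echo "rebuild failed for ${index}" >&2; return 1; }
    done
}

# Dry run: echo stands in for the real script, so nothing is sent to Elasticsearch
rebuild_all echo logs-2023.08.01 logs-2023.08.02

# Real run (assumed file name, not run here):
# rebuild_all ./reindex_one.sh logs-2023.08.01 logs-2023.08.02
```

Because the script reads a single config.json, this suits indices that share one mapping. Stopping at the first failure is a deliberate choice here, since each run deletes an index.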
From: https://www.cnblogs.com/longkang/p/17628169.html