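Background:
Problem to solve: a single shard holds too many documents and exceeds the Lucene limit (2,147,483,519 documents per shard).
Analysis
1. This is usually log or OLAP data, so the simplest fix is to delete the index and rebuild it.
2. Keep the existing index and create a new one
- Write new data to the new index; at query time, search both old_index and new_index
3. Try split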
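For option 2, a search can cover both indices by listing them comma-separated in the request path, which Elasticsearch accepts for multi-index search. A minimal Python sketch of building such a path; the helper name search_path is hypothetical, and old_index / new_index are the placeholder names used above:

```python
def search_path(*indices: str) -> str:
    """Build a _search path that targets several indices at once."""
    if not indices:
        raise ValueError("at least one index name is required")
    # Elasticsearch accepts a comma-separated list of index names in the path.
    return "/" + ",".join(indices) + "/_search"


print(search_path("old_index", "new_index"))
```

An alias pointing at both indices would serve the same purpose without changing the query path.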
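Split index API
The split index API can be used to increase the number of primary shards of an existing index.
It creates a new index and leaves the original index in place.
Steps:
Make sure the source index is read-only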
PUT source_index/_settings
{
"settings": {
"index.blocks.write": true
}
}
Use the split API to change the number of primary shards
POST source_index/_split/new_index
{
"settings": {
"index.number_of_shards": 10
}
}
Monitor progress
GET _cat/recovery/new_index
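The three steps above can be sketched as an ordered list of requests. This is only an illustration in Python: the function name split_requests is hypothetical, nothing is sent to a cluster, and the index names are the placeholders used above.

```python
import json


def split_requests(source: str, target: str, target_shards: int):
    """Return the (method, path, body) requests for the split steps, in order."""
    return [
        # 1. Block writes so the source index is read-only for the resize.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2. Split into a new index with more primary shards.
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": target_shards}}),
        # 3. Watch shard recovery of the new index.
        ("GET", f"/_cat/recovery/{target}", None),
    ]


for method, path, body in split_requests("source_index", "new_index", 10):
    print(method, path, json.dumps(body) if body is not None else "")
```

Keeping the steps as data makes it easy to review the order before sending them with whatever HTTP client is in use.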
Test
Version 7.17.5
# Create the test index (it had 3 primary shards in this test, see the second error below)
PUT test_split
{
}
# Block writes on the source index
PUT /test_split/_settings
{
"settings": {
"index.blocks.write": true
}
}
# Run the split API
POST /test_split/_split/test_split_new
{
"settings": {
"index.number_of_shards": 12
}
}
Errors encountered while running the split API, and how they were resolved:
1. The source index must be read-only
{
"error": {
"root_cause": [
{
"type": "illegal_state_exception",
"reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""
}
],
"type": "illegal_state_exception",
"reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""
},
"status": 500
}
2. The number of source shards (3) must be a factor of the number of target shards (so the target cannot be 11, but can be 12)
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "the number of source shards [3] must be a factor of [11]"
}
],
"type": "illegal_argument_exception",
"reason": "the number of source shards [3] must be a factor of [11]"
},
"status": 400
}
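The factor rule from the second error can be checked before calling the API. A small Python sketch, with a hypothetical helper name; it covers only this one rule and not the routing-shard constraint, so a True result does not guarantee that the split is accepted:

```python
def source_is_factor_of_target(source_shards: int, target_shards: int) -> bool:
    """True if target_shards is a larger multiple of source_shards."""
    return target_shards > source_shards and target_shards % source_shards == 0


# The test index above has 3 primary shards.
for target in (11, 12):
    print(target, source_is_factor_of_target(3, target))
```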
Application
Cluster version 6.8.5
After setting "index.blocks.write": true on the source index, the split API call failed:
{
"error": {
"root_cause": [
{
"type": "remote_transport_exception",
"reason": "[es-log-all-2][10.xx.x.xx:9300][indices:admin/resize]"
}
],
"type": "illegal_state_exception",
"reason": "the number of routing shards [5] must be a multiple of the target shards [20]"
},
"status": 500
}
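That is, the number of primary shards of the target index must be a factor of index.number_of_routing_shards.
Note: number_of_routing_shards is a static setting; it can only be set when the index is created and cannot be changed dynamically.
Conclusion: on ES 6.8 the split API could not fix the too-few-shards problem for this index. In 6.x an index can only be split if a suitable index.number_of_routing_shards was set when it was created.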
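Both shard-count rules seen so far can be combined into one pre-check. The Python sketch below is an illustration only: the function name is hypothetical, the messages mirror the error texts above, and the source shard count of 5 for the 6.8.5 index is an assumption (the error only reports 5 routing shards).

```python
def split_violations(source_shards: int, target_shards: int, routing_shards: int) -> list:
    """Return the shard-count rules that a split request would break."""
    problems = []
    # Rule 1: the target must be a larger multiple of the source shard count.
    if target_shards <= source_shards or target_shards % source_shards != 0:
        problems.append(
            f"the number of source shards [{source_shards}] "
            f"must be a factor of [{target_shards}]"
        )
    # Rule 2: the routing shards must be a multiple of the target shard count.
    if routing_shards % target_shards != 0:
        problems.append(
            f"the number of routing shards [{routing_shards}] "
            f"must be a multiple of the target shards [{target_shards}]"
        )
    return problems


# The 6.8.5 case above: 5 routing shards, target of 20 (source count assumed to be 5).
print(split_violations(5, 20, 5))
# Same split, for a hypothetical index created with 40 routing shards.
print(split_violations(5, 20, 40))
```

The second call returns an empty list, which shows why the setting has to be chosen at index creation time on 6.x.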
Official docs: Split index API | Elasticsearch Guide [8.9] | Elastic
Shrink index API
The shrink index API can be used to reduce the number of primary shards of an existing index.
It creates a new index and leaves the original index in place.
(Shrinks an existing index into a new index with fewer primary shards.)
POST /my-index-000001/_shrink/shrunk-my-index-000001
Steps
# Create the index
PUT test_shrink
{
}
# Check which nodes hold the index's shards
GET _cat/shards/test_shrink?v
# Require all primary shards to be on one node (node-es-0 here), set replicas to 0, and block writes
PUT test_shrink/_settings
{
"settings": {
"index.number_of_replicas": 0,
"index.routing.allocation.require._name": "node-es-0",
"index.blocks.write": true
}
}
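Relocating the shards takes time, so it is worth confirming that one node holds every shard before running shrink. The Python sketch below assumes the rows come from GET _cat/shards/test_shrink?format=json, whose entries include the shard, state and node columns; the helper name and the sample rows are hypothetical.

```python
def nodes_holding_every_shard(cat_shards_rows: list) -> list:
    """Return names of nodes that hold a STARTED copy of every shard number."""
    all_shards = {row["shard"] for row in cat_shards_rows}
    shards_by_node = {}
    for row in cat_shards_rows:
        # Only count copies that have finished starting on their node.
        if row["state"] == "STARTED":
            shards_by_node.setdefault(row["node"], set()).add(row["shard"])
    return sorted(
        node for node, shards in shards_by_node.items() if shards == all_shards
    )


# Hypothetical rows for a 3-shard index after relocation has finished.
rows = [
    {"index": "test_shrink", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-es-0"},
    {"index": "test_shrink", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-es-0"},
    {"index": "test_shrink", "shard": "2", "prirep": "p", "state": "STARTED", "node": "node-es-0"},
]
print(nodes_holding_every_shard(rows))
```

An empty result means the relocation is still in progress or the allocation filter could not be satisfied.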
# Run the shrink API
POST /test_shrink/_shrink/new_test_shrink
{
"settings": {
"index.number_of_replicas": 1,
"index.number_of_shards": 1,
"index.codec": "best_compression"
},
"aliases": {
"my_search_indices": {}
}
}
If the command above is changed to:
POST /test_shrink/_shrink/new_test_shrink
{
"settings": {
"index.number_of_replicas": 1,
"index.number_of_shards": 2,
"index.codec": "best_compression"
},
"aliases": {
"my_search_indices": {}
}
}
then the new number_of_shards is not a factor of the source index's number_of_shards, and the following error is returned:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "the number of source shards [3] must be a multiple of [2]"
}
],
"type": "illegal_argument_exception",
"reason": "the number of source shards [3] must be a multiple of [2]"
},
"status": 400
}
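The valid shrink targets for a given source shard count can be listed in advance. A minimal Python sketch with a hypothetical helper name; it checks only the divisibility rule shown in the error above, not other limits such as the per-shard document count:

```python
def valid_shrink_targets(source_shards: int) -> list:
    """List shard counts smaller than source_shards that divide it evenly."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(valid_shrink_targets(3))   # [1]
print(valid_shrink_targets(12))  # [1, 2, 3, 4, 6]
```

For the 3-shard test index the only option is 1, which is why the request with 2 target shards is rejected.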
Official docs: Shrink index API | Elasticsearch Guide [8.9] | Elastic