Elasticsearch node failure: manually resetting primaries and relocating shards

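I took over an Elasticsearch service from another team. The service is deployed on a cloud provider. While the migration plan was being carried out, the cluster status went yellow and many shards were UNASSIGNED.

The cluster runs version 5.5 and is monitored with Kibana; cerebro is not used.

After a quick investigation, the cloud provider reported a disk failure and confirmed that the data could not be fully recovered.

There were 10 active nodes; 2 nodes had lost their data completely, and many indices were in an abnormal state.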

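Elasticsearch relocates and replicates shards automatically in the background only when doing so cannot lose data.

Any allocation that may lose data has to be triggered explicitly by hand.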
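
Before forcing anything, it helps to know which primaries are actually unassigned. The sketch below assumes the JSON output of `GET _cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason`; the function name and the sample rows are hypothetical and only illustrate the filtering step.

```python
import json


def unassigned_primaries(cat_shards_json):
    """Return (index, shard, reason) for every unassigned primary shard.

    cat_shards_json is assumed to be the text returned by
    GET _cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason
    """
    rows = json.loads(cat_shards_json)
    result = []
    for row in rows:
        # prirep is "p" for a primary and "r" for a replica
        if row.get("prirep") == "p" and row.get("state") == "UNASSIGNED":
            result.append(
                (row["index"], int(row["shard"]), row.get("unassigned.reason"))
            )
    return result


# Hypothetical sample response, not taken from the real cluster
sample = json.dumps([
    {"index": "uc_2020", "shard": "2", "prirep": "p",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "uc_2020", "shard": "1", "prirep": "p",
     "state": "STARTED", "unassigned.reason": None},
    {"index": "uc_2019", "shard": "2", "prirep": "r",
     "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
])

print(unassigned_primaries(sample))
```

Unassigned replicas are left out on purpose: once a primary is active again, Elasticsearch can rebuild replicas on its own.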

For the case where a primary shard's data cannot be fully recovered, ES offers two commands, and both require an explicit "accept_data_loss":true

https://www.elastic.co/guide/en/elasticsearch/reference/5.5/cluster-reroute.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html
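- 1 allocate_stale_primary: promote a replica copy to primary (only possible when a usable replica copy still exists). If the node that held the original primary comes back, its data is overwritten by the new primary (the former replica).
  
  For node, pick a node that still holds a replica copy of the shard.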
```
POST http://3464.xyz.com:9200/_cluster/reroute
{
    "commands" : [
        {
            "allocate_stale_primary" : {
                "index" : "uc_2020", "shard" : 2,
                "node" : "3456", "accept_data_loss" : true
            }
        }
    ]
}
```
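- 2 allocate_empty_primary: allocate an empty primary. If the node that held the original primary comes back, the old primary's data is wiped completely.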
```
POST http://3464.xyz.com:9200/_cluster/reroute
{
    "commands" : [
        {
            "allocate_empty_primary" : {
                "index" : "uc_2019", "shard" : 2,
                "node" : "3456", "accept_data_loss" : true
            }
        }
    ]
}
```
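
The two request bodies differ only in the command name, so when many shards need the same treatment they can be generated rather than typed by hand. This is a minimal sketch; the helper names `allocate_primary_command` and `reroute_body` are hypothetical, and the index, shard, and node values are the ones from the examples above.

```python
import json

# The two commands that may lose data and therefore need accept_data_loss
FORCED_PRIMARY_COMMANDS = ("allocate_stale_primary", "allocate_empty_primary")


def allocate_primary_command(kind, index, shard, node):
    """Build one reroute command that forces a primary onto a node."""
    if kind not in FORCED_PRIMARY_COMMANDS:
        raise ValueError("unsupported command: %s" % kind)
    return {
        kind: {
            "index": index,
            "shard": shard,
            "node": node,
            # Always set explicitly: the data-loss risk must be acknowledged
            "accept_data_loss": True,
        }
    }


def reroute_body(commands):
    """Wrap commands in the body expected by POST _cluster/reroute."""
    return {"commands": list(commands)}


body = reroute_body([
    allocate_primary_command("allocate_stale_primary", "uc_2020", 2, "3456"),
    allocate_primary_command("allocate_empty_primary", "uc_2019", 2, "3456"),
])
print(json.dumps(body, indent=4))
```

Several commands can go into a single reroute request, which is why `reroute_body` takes a list.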
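Since the old primaries could never come back, I checked the state of each shard: where a replica copy still existed I used allocate_stale_primary; where every replica copy was gone, or the index had replicas set to 1, I used allocate_empty_primary.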
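
That decision rule can be written down as a small function. This is a sketch under assumptions: `nodes_with_copy` is a hypothetical input listing the nodes that still hold an on-disk copy of the shard (gathered by hand or, for example, from the shard stores API), and `empty_target_node` is whichever healthy node should receive an empty primary.

```python
def choose_primary_command(index, shard, nodes_with_copy, empty_target_node):
    """Pick the reroute command for one unassigned primary shard.

    nodes_with_copy: nodes that still hold a copy of this shard
                     (hypothetical input, collected beforehand).
    empty_target_node: node that gets an empty primary when no copy is left.
    """
    if nodes_with_copy:
        # A copy survived: promote it, accepting that it may be out of date
        kind, node = "allocate_stale_primary", nodes_with_copy[0]
    else:
        # Nothing survived: start over with an empty shard
        kind, node = "allocate_empty_primary", empty_target_node
    return {
        kind: {
            "index": index,
            "shard": shard,
            "node": node,
            "accept_data_loss": True,
        }
    }


# Values reused from the requests above; the copy lists are illustrative
stale = choose_primary_command("uc_2020", 2, ["3456"], "3456")
empty = choose_primary_command("uc_2019", 2, [], "3456")
print(stale)
print(empty)
```

The empty primary is the last resort, because it discards whatever the shard used to contain.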

With two nodes gone, shard allocation was a mess, so I also relocated some shards by hand while I was at it.
```
POST http://3464.xyz.com:9200/_cluster/reroute
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        }
    ]
}
```
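
The reroute API accepts a `dry_run` query parameter that computes the resulting cluster state without applying it, which is a cheap safety check before moving shards. The sketch below only builds the request with Python's standard library; the function name `build_move_request` is hypothetical, and the actual send is left commented out.

```python
import json
import urllib.request


def build_move_request(base_url, index, shard, from_node, to_node, dry_run=True):
    """Build (but do not send) a POST _cluster/reroute request with a move command."""
    body = {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }
    url = base_url.rstrip("/") + "/_cluster/reroute"
    if dry_run:
        # Simulate the reroute only; nothing is changed in the cluster
        url += "?dry_run=true"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_move_request("http://3464.xyz.com:9200", "test", 0, "node1", "node2")
print(req.get_method(), req.full_url)

# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Once the dry run looks right, the same call with `dry_run=False` produces the real request.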
End
Original article: https://www.cnblogs.com/zihunqingxin/p/14459564.html