./ahmedhashim

Reindexing Elasticsearch with Zero Downtime

Elasticsearch doesn’t let you change an existing field’s mapping in place. To change its type or analyzer you have to create a new index, reindex into it, and atomically swap an alias over to point at the new one. The reindex API plus aliases make this possible without downtime.

Say you’ve got a books index and you want to change how the description field works. Maybe a different analyzer, maybe a different type. Either way, you’re reindexing. To do it without downtime, route traffic through an alias and swap it over once the new index is ready.

Define the new index mapping

Create a new index with the updated mapping:

PUT /books_v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard"
      },
      "author": {
        "type": "keyword"
      },
      "description": {
        "type": "text",
        "analyzer": "english"
      },
      "publication_year": {
        "type": "integer"
      }
    }
  }
}

Start the reindex

Pause any pipelines populating the old index before proceeding. The reindex works from a snapshot of the source taken when it starts, so anything written afterward generally won’t make it to the new index, which leaves you with a count mismatch at the end. The longer reindexing takes, the staler your corpus gets during the pause.

Kick off the reindex asynchronously:

POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "books"
  },
  "dest": {
    "index": "books_v2"
  }
}

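For very large indexes, throttle with the requests_per_second URL parameter to ease the load. It goes in the query string rather than the request body. The size field under source sets how many documents each batch copies:

POST /_reindex?wait_for_completion=false&requests_per_second=1000
{
  "source": {
    "index": "books",
    "size": 5000
  },
  "dest": {
    "index": "books_v2"
  }
}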

The response contains a task field in the form node_id:task_number. That value is the TASK_ID you’ll use to monitor progress.
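
If you’d rather script this step than paste it into a console, here’s a minimal Python sketch using only the standard library. The function names and the ES_URL placeholder are my own assumptions, not part of any Elasticsearch client, and it assumes an unauthenticated cluster reachable over plain HTTP. It passes requests_per_second in the query string and reads the task field from the response.

```python
import json
import urllib.request
from urllib.parse import urlencode

ES_URL = "http://your-cluster:9200"  # placeholder host, replace with your own


def build_reindex_request(source, dest, batch_size=None, requests_per_second=None):
    """Return (path, body) for an asynchronous reindex request."""
    params = {"wait_for_completion": "false"}
    if requests_per_second is not None:
        # Throttling is passed as a URL parameter, not in the JSON body.
        params["requests_per_second"] = str(requests_per_second)
    src = {"index": source}
    if batch_size is not None:
        src["size"] = batch_size  # documents copied per batch
    body = {"source": src, "dest": {"index": dest}}
    return "/_reindex?" + urlencode(params), body


def start_reindex(source, dest, **options):
    """Submit the reindex and return the task ID from the response."""
    path, body = build_reindex_request(source, dest, **options)
    request = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task"]
```

Calling start_reindex("books", "books_v2", batch_size=5000, requests_per_second=1000) would send the throttled request and hand back the task ID as a string.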

Monitor progress

Check the reindex status using the TASK_ID from the previous step:

GET /_tasks/TASK_ID

Or watch it tick in real time (pretty satisfying on a large reindex):

# Watch progress every 5 seconds
watch -n 5 'curl -s -X GET "http://your-cluster:9200/_tasks/TASK_ID" | jq ".task.status"'

You can also list every running reindex operation:

GET /_tasks?actions=*reindex&detailed=true
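
The status object in the task response carries counters such as total, created, updated, and deleted, which is enough to turn it into a rough percentage. Here’s a small sketch of that. The reindex_progress helper is hypothetical, the sample response is trimmed and its values are made up, and treating the sum of the counters as "processed" is my own assumption about what counts as progress.

```python
def reindex_progress(task_response):
    """Return the fraction of documents handled so far, between 0.0 and 1.0."""
    status = task_response["task"]["status"]
    total = status.get("total", 0)
    if total == 0:
        # Nothing to report yet (or an empty source index).
        return 0.0
    counters = ("created", "updated", "deleted", "noops", "version_conflicts")
    processed = sum(status.get(key, 0) for key in counters)
    return min(processed / total, 1.0)


# Illustrative response, trimmed to the fields used here (values are made up).
sample = {
    "completed": False,
    "task": {
        "action": "indices:data/write/reindex",
        "status": {
            "total": 20000,
            "created": 5000,
            "updated": 0,
            "deleted": 0,
            "noops": 0,
            "version_conflicts": 0,
            "batches": 1,
        },
    },
}

print(f"{reindex_progress(sample):.0%}")  # prints 25%
```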

Verify everything copied over

Compare counts between the old and new index:

GET /books/_count

GET /books_v2/_count

If the numbers don’t match, the original index was almost certainly updated during the reindex, though it’s worth checking the failures list in the completed task’s response too. Pause your pipelines, drop the new index, and run it again.
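
The comparison is easy to script. One thing to keep in mind: reindex doesn’t refresh the destination by default, so a count taken right after the task finishes can lag until the next refresh. The sketch below refreshes the new index first. The request argument is a hypothetical callable, request(method, path), that performs the HTTP call and returns parsed JSON, so you can plug in whatever client you already use.

```python
def counts_match(request, old_index, new_index):
    """Refresh the new index, then compare document counts.

    `request(method, path)` is a hypothetical helper that performs the HTTP
    call and returns the parsed JSON response.
    Returns (matches, old_count, new_count).
    """
    # Make freshly reindexed documents visible to _count.
    request("POST", f"/{new_index}/_refresh")
    old_count = request("GET", f"/{old_index}/_count")["count"]
    new_count = request("GET", f"/{new_index}/_count")["count"]
    return old_count == new_count, old_count, new_count
```

Returning both counts alongside the boolean makes it easier to log how far off the indexes are when they disagree.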

The atomic switch

In a single _aliases request, delete the old index and create an alias pointing at the new one:

POST /_aliases
{
  "actions": [
    {
      "remove_index": {
        "index": "books"
      }
    },
    {
      "add": {
        "index": "books_v2",
        "alias": "books"
      }
    }
  ]
}

Clients still hitting books won’t notice the change. The deletion and the alias creation are applied together, so there’s no window where the name resolves to nothing.

Trust but verify

Confirm the alias now points at books_v2, then run a search through it:

GET /_cat/aliases/books?v

GET /books/_search
{
  "query": {
    "match": {
      "description": "adventure"
    }
  }
}

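Once you switch to aliases, stick with them. The next migration follows the same steps, except the final request moves the alias from books_v2 to the next index instead of deleting an index named books.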
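
For that next migration, the _aliases body pairs a remove action with an add action, and both are applied in one request. Here’s a rough sketch that builds the body. The build_alias_swap name and the books_v3 index are hypothetical, used only for illustration.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Return an _aliases body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# books_v3 is a hypothetical future index, not something created above.
print(json.dumps(build_alias_swap("books", "books_v2", "books_v3"), indent=2))
```

POST the resulting body to /_aliases. The old index stays in place afterward, so you can delete it once you’re happy with the new one.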