Reindexing Elasticsearch with Zero Downtime
Elasticsearch doesn’t let you change the type or analyzer of an existing field in place. To change a mapping like that, you have to create a new index, reindex into it, and atomically swap an alias over to point at the new one. The reindex API plus aliases make this possible without downtime.
Say you’ve got a books index and you want to change how the description
field works. Maybe a different analyzer, maybe a different type. Either way,
you’re reindexing. To do it without downtime, route traffic through an alias
and swap it over once the new index is ready.
Define the new index mapping
Create a new index with the updated mapping:
PUT /books_v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard"
      },
      "author": {
        "type": "keyword"
      },
      "description": {
        "type": "text",
        "analyzer": "custom_analyzer"
      },
      "publication_year": {
        "type": "integer"
      }
    }
  }
}
Start the reindex
Pause any pipelines populating the old index before proceeding. Anything written during the reindex won’t make it to the new index, which leaves you with a count mismatch at the end. Depending on how long reindexing takes, documents in your corpus may also go stale during the pause.
Kick off the reindex asynchronously:
POST /_reindex?wait_for_completion=false
{
  "source": {
    "index": "books"
  },
  "dest": {
    "index": "books_v2"
  }
}
For very large indexes, throttle with the requests_per_second query parameter to ease the load. The size field inside source sets how many documents each batch reads:
POST /_reindex?wait_for_completion=false&requests_per_second=1000
{
  "source": {
    "index": "books",
    "size": 5000
  },
  "dest": {
    "index": "books_v2"
  }
}
The response contains a task field of the form node_id:task_number. That value is the TASK_ID you can use to monitor progress.
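If you’d rather script this step, here’s a minimal Python sketch using only the standard library. It assumes an unauthenticated cluster at http://localhost:9200 (a placeholder), and build_reindex_request and start_reindex are illustrative names, not part of any Elasticsearch client. Note that the throttle goes in the query string, while the batch size goes in the body under source.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: placeholder address, no auth


def build_reindex_request(source, dest, batch_size=None, requests_per_second=None):
    """Return (path, body) for an asynchronous _reindex call."""
    params = {"wait_for_completion": "false"}
    if requests_per_second is not None:
        # The throttle is a query parameter, not a body field.
        params["requests_per_second"] = str(requests_per_second)
    src = {"index": source}
    if batch_size is not None:
        # source.size is the number of documents read per batch.
        src["size"] = batch_size
    path = "/_reindex?" + urllib.parse.urlencode(params)
    body = {"source": src, "dest": {"index": dest}}
    return path, body


def start_reindex(source, dest, **options):
    """Submit the reindex and return the task ID from the response."""
    path, body = build_reindex_request(source, dest, **options)
    request = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task"]
```

Calling start_reindex("books", "books_v2", batch_size=5000, requests_per_second=1000) would submit the throttled variant and hand back the task ID as a string.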
Monitor progress
Check the reindex status using the TASK_ID from the previous step:
GET /_tasks/TASK_ID
Or watch it tick in real time (pretty satisfying on a large reindex):
# Watch progress every 5 seconds
watch -n 5 'curl -s -X GET "http://your-cluster:9200/_tasks/TASK_ID" | jq ".task.status"'
You can also list every running reindex operation:
GET /_tasks?actions=*reindex&detailed=true
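The task status reports a total alongside created, updated, and deleted counters, and the sum of those three approaches total as the reindex runs. A rough Python sketch of turning a GET /_tasks/TASK_ID response into a progress figure follows. The function name reindex_progress is made up for illustration, and it expects the response already parsed into a dict.

```python
def reindex_progress(task_response):
    """Estimate reindex progress from a parsed GET /_tasks/TASK_ID response."""
    status = task_response["task"]["status"]
    total = status.get("total", 0)
    # Documents handled so far: created + updated + deleted.
    done = (
        status.get("created", 0)
        + status.get("updated", 0)
        + status.get("deleted", 0)
    )
    fraction = done / total if total else 0.0
    return {
        "done": done,
        "total": total,
        "fraction": fraction,
        "completed": task_response.get("completed", False),
    }
```

Treat the fraction as an estimate. Documents skipped as noops or rejected as version conflicts aren’t counted here, so the sum can finish short of total.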
Verify everything copied over
Compare counts between the old and new index:
GET /books/_count
GET /books_v2/_count
If the numbers don’t match, the original index was most likely updated during the reindex, or some documents failed to index (check the failures array in the finished task’s response). Pause your pipelines, delete the new index, and run it again.
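The checks above can be folded into one helper. This is a sketch under assumptions: verify_reindex is a hypothetical name, the two counts come from the _count calls, and task_response is the parsed GET /_tasks/TASK_ID result for the finished task, whose response object carries failures and version_conflicts.

```python
def verify_reindex(source_count, dest_count, task_response):
    """Return a list of problems found; an empty list means the copy looks clean."""
    problems = []
    if not task_response.get("completed", False):
        problems.append("reindex task has not completed yet")

    # A finished task stores its result under "response".
    result = task_response.get("response", {})
    failures = result.get("failures", [])
    if failures:
        problems.append(f"{len(failures)} failure(s) reported by the reindex task")
    conflicts = result.get("version_conflicts", 0)
    if conflicts:
        problems.append(f"{conflicts} version conflict(s) reported")

    if source_count != dest_count:
        problems.append(
            f"count mismatch: source={source_count}, dest={dest_count}"
        )
    return problems
```

An empty list is a reasonable signal to go ahead with the alias swap. Anything else is worth a look before you delete the old index.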
The atomic switch
In a single _aliases request, delete the old index and create an alias
pointing at the new one:
POST /_aliases
{
  "actions": [
    {
      "remove_index": {
        "index": "books"
      }
    },
    {
      "add": {
        "index": "books_v2",
        "alias": "books"
      }
    }
  ]
}
Clients still hitting books won’t notice the change. The deletion and the
alias creation are applied together, so there’s no window where the name
resolves to nothing.
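The shape of the swap depends on whether the name being taken over is still a concrete index or already an alias. Here’s a small Python sketch that builds the _aliases body for either case. The function alias_swap_actions and its old_is_alias_target flag are illustrative, not part of any client library.

```python
def alias_swap_actions(alias, new_index, old_index, old_is_alias_target=False):
    """Build the body for one atomic POST /_aliases request.

    old_is_alias_target=False: old_index is a concrete index that currently
    owns the name, so it is deleted with remove_index (the first migration).
    old_is_alias_target=True: the alias already points at old_index, so it is
    only detached with remove and old_index is left in place.
    """
    if old_is_alias_target:
        first = {"remove": {"index": old_index, "alias": alias}}
    else:
        first = {"remove_index": {"index": old_index}}
    return {"actions": [first, {"add": {"index": new_index, "alias": alias}}]}
```

For the migration in this walkthrough, alias_swap_actions("books", "books_v2", "books") produces the same body as the request above. A later move to a hypothetical books_v3 would pass old_index="books_v2" with old_is_alias_target=True.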
Trust but verify
GET /_cat/aliases/books?v
GET /books/_search
{
  "query": {
    "match": {
      "description": "adventure"
    }
  }
}
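Once you switch to aliases, stick with them. The next migration follows the same steps, except the swap only moves the alias: a remove action takes books off books_v2 and an add action points it at the newer index, since there’s no longer an index named books to delete.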