Painless Enrichment
Search requirements change faster than index schemas. In a retail product catalog, you may need new derived fields for filtering or reporting after the products index is already live.
Painless is the built-in scripting language for Elasticsearch. If the base fields already exist (price, list_price, cost, inventory_count), /_update_by_query can enrich documents in place without a full reindex.
Add target fields to the mapping
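Before touching documents, add the derived fields to the existing products mapping. Adding new fields is non-destructive: existing field mappings and stored documents are left unchanged. The two percentage fields are mapped as integer because the script rounds them to whole numbers.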
PUT /products/_mapping
{
  "properties": {
    "discount_pct": { "type": "integer" },
    "margin_pct": { "type": "integer" },
    "price_band": { "type": "keyword" },
    "stock_state": { "type": "keyword" }
  }
}
Dry run with inline script execution
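Before running updates at scale, test the formulas with /_scripts/painless/_execute and sample values passed as params. Nothing is written to the index, so arithmetic and type errors surface before any document changes. In the default context the result is returned as a string.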
POST /_scripts/painless/_execute
{
  "script": {
    "source": """
      double discount = ((params.list_price - params.price) / params.list_price) * 100.0;
      double margin = ((params.price - params.cost) / params.price) * 100.0;
      return [
        "discount_pct": (int) Math.round(discount),
        "margin_pct": (int) Math.round(margin)
      ];
    """,
    "params": {
      "price": 79.99,
      "list_price": 99.99,
      "cost": 42.00
    }
  }
}
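As a cross-check on the dry run, the same arithmetic can be reproduced outside the cluster. The sketch below is Python rather than Painless, and `java_round` and `pct` are hypothetical helper names; `java_round` approximates Java's `Math.round` (round half up). It also illustrates a pitfall the sample values above do not exercise: Painless follows Java arithmetic, so dividing one integer by another truncates.

```python
import math

def java_round(x: float) -> int:
    """Approximate Java's Math.round: round half up."""
    return math.floor(x + 0.5)

def pct(numerator: float, denominator: float) -> int:
    """Percentage of numerator over denominator, rounded like the script."""
    return java_round(numerator / denominator * 100.0)

# Same sample values as the dry run above.
price, list_price, cost = 79.99, 99.99, 42.00
discount_pct = pct(list_price - price, list_price)
margin_pct = pct(price - cost, price)

# Integer inputs: Java-style integer division truncates toward zero,
# mimicked here with int(a / b). Multiplying by 100.0 first avoids it.
int_price, int_list_price = 80, 100
truncated = int((int_list_price - int_price) / int_list_price) * 100.0
safe = 100.0 * (int_list_price - int_price) / int_list_price

print(discount_pct, margin_pct, truncated, safe)  # prints: 20 47 0.0 20.0
```

If the dry run returns values that differ from this check, the formula or the parameter types are the first things to inspect.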
Store the enrichment script
Once the formula looks right, store a reusable script in cluster state so update jobs can call it by ID. This version adds null checks, computes the numeric fields, and sets simple categories. It multiplies by 100.0 before dividing so that integer-valued source fields (for example, a price stored as 80) are not truncated by integer division.
PUT /_scripts/products-enrichment-v1
{
  "script": {
    "lang": "painless",
    "source": """
      // Discount relative to list price. Multiplying by 100.0 first forces
      // floating-point math even when both fields are stored as integers.
      if (ctx._source.list_price != null &&
          ctx._source.list_price > 0 &&
          ctx._source.price != null) {
        double discount =
          100.0 * (ctx._source.list_price - ctx._source.price) / ctx._source.list_price;
        ctx._source.discount_pct = (int) Math.round(discount);
      }
      // Margin relative to selling price; skipped when price is zero or missing.
      if (ctx._source.price != null &&
          ctx._source.price > 0 &&
          ctx._source.cost != null) {
        double margin =
          100.0 * (ctx._source.price - ctx._source.cost) / ctx._source.price;
        ctx._source.margin_pct = (int) Math.round(margin);
      }
      // Price band: under 25 is budget, under 100 is mid, otherwise premium.
      if (ctx._source.price != null) {
        if (ctx._source.price < 25) {
          ctx._source.price_band = "budget";
        } else if (ctx._source.price < 100) {
          ctx._source.price_band = "mid";
        } else {
          ctx._source.price_band = "premium";
        }
      }
      // Stock state: any positive inventory counts as in stock.
      if (ctx._source.inventory_count != null) {
        if (ctx._source.inventory_count > 0) {
          ctx._source.stock_state = "in_stock";
        } else {
          ctx._source.stock_state = "out_of_stock";
        }
      }
    """
  }
}
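To exercise the business rules with ordinary unit tests before running them against the cluster, the script's logic can be mirrored locally. The following is a minimal Python sketch, not the stored script itself: `enrich` is a hypothetical function name, it works on a plain dict standing in for `ctx._source`, and it uses floating-point division throughout.

```python
import math
from typing import Any, Dict

def enrich(source: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a product document with the derived fields added.

    Mirrors the stored script's rules: a derived field is only set when
    the base fields it depends on are present.
    """
    doc = dict(source)
    price = doc.get("price")
    list_price = doc.get("list_price")
    cost = doc.get("cost")
    inventory = doc.get("inventory_count")

    if list_price is not None and list_price > 0 and price is not None:
        discount = 100.0 * (list_price - price) / list_price
        doc["discount_pct"] = math.floor(discount + 0.5)  # round half up
    if price is not None and price > 0 and cost is not None:
        margin = 100.0 * (price - cost) / price
        doc["margin_pct"] = math.floor(margin + 0.5)
    if price is not None:
        if price < 25:
            doc["price_band"] = "budget"
        elif price < 100:
            doc["price_band"] = "mid"
        else:
            doc["price_band"] = "premium"
    if inventory is not None:
        doc["stock_state"] = "in_stock" if inventory > 0 else "out_of_stock"
    return doc

# Illustrative documents, not real catalog data.
full = enrich({"price": 80, "list_price": 100, "cost": 60, "inventory_count": 0})
partial = enrich({"price": 19.5, "inventory_count": 3})
print(full)
print(partial)
```

The second document has no list_price or cost, so it gains price_band and stock_state but neither percentage field, which matches the null checks in the script.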
Backfill existing products without reindexing
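Now run /_update_by_query asynchronously (wait_for_completion=false) so the request returns immediately with a task ID. With conflicts=proceed, version conflicts are counted instead of aborting the job, and slices=auto lets Elasticsearch parallelize the work. The query selects only documents missing at least one derived field, so a rerun skips documents that are already fully enriched. Documents that lack a base field (for example, no cost) never receive the corresponding derived field and will be selected again on each rerun.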
POST /products/_update_by_query?conflicts=proceed&wait_for_completion=false&slices=auto
{
  "query": {
    "bool": {
      "should": [
        { "bool": { "must_not": { "exists": { "field": "discount_pct" } } } },
        { "bool": { "must_not": { "exists": { "field": "margin_pct" } } } },
        { "bool": { "must_not": { "exists": { "field": "price_band" } } } },
        { "bool": { "must_not": { "exists": { "field": "stock_state" } } } }
      ],
      "minimum_should_match": 1
    }
  },
  "script": {
    "id": "products-enrichment-v1"
  }
}
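If the list of derived fields grows, the "missing at least one field" query is easier to generate than to edit by hand. This is a small Python sketch that builds the same request body as above; `DERIVED_FIELDS` and `missing_any_query` are hypothetical names introduced for illustration.

```python
import json
from typing import Any, Dict, List

# The derived fields added to the mapping earlier.
DERIVED_FIELDS = ["discount_pct", "margin_pct", "price_band", "stock_state"]

def missing_any_query(fields: List[str]) -> Dict[str, Any]:
    """Build a bool query matching documents that lack at least one field."""
    return {
        "bool": {
            "should": [
                {"bool": {"must_not": {"exists": {"field": name}}}}
                for name in fields
            ],
            "minimum_should_match": 1,
        }
    }

body = {
    "query": missing_any_query(DERIVED_FIELDS),
    "script": {"id": "products-enrichment-v1"},
}
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the console request or sent with any HTTP client.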
The response includes a task ID, for example:
{
  "task": "r1A2WoRbTwKZ516z6NEs5A:36619"
}
Then monitor progress every 5 seconds until the task reports "completed": true.
watch -n 5 'curl -s "http://your-cluster:9200/_tasks/r1A2WoRbTwKZ516z6NEs5A:36619" | jq "{completed: .completed, updated: .task.status.updated, version_conflicts: .task.status.version_conflicts, failures: .response.failures}"'
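Where `watch` and `jq` are not available, or the backfill is driven from a script, the same polling can be done in code. The sketch below is a Python equivalent using only the standard library; `fetch_task`, `summarize`, and `wait_for_task` are hypothetical names, the cluster URL is a placeholder, and the response fields read are the same ones the jq filter above selects. Authentication and TLS settings are left out.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

def fetch_task(base_url: str, task_id: str) -> Dict[str, Any]:
    """GET /_tasks/<task_id> and decode the JSON response."""
    with urllib.request.urlopen(f"{base_url}/_tasks/{task_id}") as resp:
        return json.load(resp)

def summarize(task_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out the same fields as the jq filter."""
    status = task_doc.get("task", {}).get("status", {})
    return {
        "completed": task_doc.get("completed", False),
        "updated": status.get("updated"),
        "version_conflicts": status.get("version_conflicts"),
        "failures": task_doc.get("response", {}).get("failures"),
    }

def wait_for_task(
    fetch: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    max_polls: int = 720,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task reports completed, or give up after max_polls."""
    for _ in range(max_polls):
        summary = summarize(fetch())
        if summary["completed"]:
            return summary
        sleep(interval_s)
    raise TimeoutError("task did not complete within the polling budget")

# Example call (placeholder host, task ID from the response above):
# result = wait_for_task(
#     lambda: fetch_task("http://your-cluster:9200", "r1A2WoRbTwKZ516z6NEs5A:36619")
# )
# print(result)
```

Passing the fetch step in as a function keeps the loop testable without a live cluster, and `max_polls` bounds how long a stalled task is watched.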
Verify
Finally, run a small search and inspect _source to confirm the enriched fields are present. Keep the result size low for a quick sanity check.
GET /products/_search
{
  "size": 3,
  "_source": [
    "name",
    "price",
    "discount_pct",
    "margin_pct",
    "price_band",
    "stock_state"
  ],
  "query": {
    "match_all": {}
  }
}
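Eyeballing three hits works for a spot check; for a larger sample the same inspection can be automated. The Python sketch below reports which derived fields are absent from each hit of a search response. `missing_derived` is a hypothetical helper, and `sample_response` is made-up data shaped like a search response, not output from a real cluster.

```python
from typing import Any, Dict, List

DERIVED_FIELDS = ["discount_pct", "margin_pct", "price_band", "stock_state"]

def missing_derived(
    response: Dict[str, Any], fields: List[str] = DERIVED_FIELDS
) -> Dict[str, List[str]]:
    """Map each hit's _id to the derived fields absent from its _source."""
    report: Dict[str, List[str]] = {}
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        absent = [name for name in fields if name not in source]
        if absent:
            report[hit["_id"]] = absent
    return report

# Made-up response for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"name": "Kettle", "price": 79.99,
                                     "discount_pct": 20, "margin_pct": 47,
                                     "price_band": "mid", "stock_state": "in_stock"}},
            {"_id": "2", "_source": {"name": "Mug", "price": 9.5,
                                     "price_band": "budget"}},
        ]
    }
}
report = missing_derived(sample_response)
print(report)  # prints: {'2': ['discount_pct', 'margin_pct', 'stock_state']}
```

A missing field is not necessarily a failure: a document without the required base fields is expected to lack the derived field that depends on them.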
This is a simple maintenance loop: add the new fields to the mapping, backfill them with a stored script, and verify. A full reindex is unnecessary as long as the mappings of existing fields do not change.