Elasticsearch is powerful but unforgiving. Small configuration mistakes can cause 10x performance degradation. As Elasticsearch consultants, we've audited dozens of clusters. These seven mistakes appear in almost every underperforming deployment.
1. Using Dynamic Mapping in Production
The mistake: Letting Elasticsearch auto-detect field types instead of defining explicit mappings.
Why it's bad:
- Elasticsearch guesses wrong. A field that should be keyword becomes text
- Strings are indexed as both text and keyword by default (doubling storage)
- Mapping conflicts when different documents have different types for the same field
- You can't change mappings after indexing without reindexing everything
The fix: Always define explicit mappings before indexing data.
PUT /my-index
{
"mappings": {
"dynamic": "strict", // Reject unmapped fields
"properties": {
"title": { "type": "text", "analyzer": "standard" },
"status": { "type": "keyword" },
"created_at": { "type": "date" },
"price": { "type": "float" }
}
}
}
Setting "dynamic": "strict" rejects documents with unmapped fields, preventing silent failures.
2. Over-Sharding
The mistake: Creating too many shards "for future scalability."
Why it's bad:
- Each shard has overhead (memory, file handles, threads)
- Queries hit all shards, so more shards = more coordination overhead
- Small shards are inefficient (Lucene prefers larger segments)
- A common anti-pattern: 50 shards for an index with 100,000 documents
The fix: Size shards between 10GB and 50GB. For most use cases:
| Data Size | Recommended Shards |
|---|---|
| < 10GB | 1 primary shard |
| 10GB - 50GB | 1-2 primary shards |
| 50GB - 200GB | 3-5 primary shards |
| > 200GB | Plan based on nodes and use case |
You can always split shards later with the Split API. You can't easily merge them.
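As a rough sketch of both ends of that advice (the index name and shard counts are illustrative): set the primary shard count explicitly at creation, then split later if the index outgrows it. Splitting requires a write block first, and the target shard count must be a multiple of the source.
PUT /logs-2025
{
  "settings": { "number_of_shards": 2, "number_of_replicas": 1 }
}
// Later, if needed: block writes, then split 2 primaries into 4
PUT /logs-2025/_settings
{ "index.blocks.write": true }
POST /logs-2025/_split/logs-2025-split
{
  "settings": { "index.number_of_shards": 4 }
}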
3. Using Queries Instead of Filters
The mistake: Putting filter clauses in the must section instead of filter.
Bad:
{
"query": {
"bool": {
"must": [
{ "match": { "title": "elasticsearch" } },
{ "term": { "status": "published" } }, // WRONG!
{ "range": { "date": { "gte": "2025-01-01" } } } // WRONG!
]
}
}
}
Good:
{
"query": {
"bool": {
"must": [
{ "match": { "title": "elasticsearch" } }
],
"filter": [
{ "term": { "status": "published" } }, // CORRECT
{ "range": { "date": { "gte": "2025-01-01" } } } // CORRECT
]
}
}
}
Why it matters:
- filter clauses are cached. After the first query, subsequent queries skip evaluation.
- filter clauses don't calculate relevancy scores (faster).
- This single change can improve query performance by 2-5x for filtered queries.
Rule of thumb: If a clause is yes/no (not "how relevant"), put it in filter.
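If you want to confirm the filter cache is actually being used, the index stats API exposes query cache counters (my-index is a placeholder):
// Node query cache stats (filter results are cached here)
GET /my-index/_stats/query_cache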
4. Returning Too Much Data
The mistake: Returning full documents when you only need a few fields.
Why it's bad:
- Large _source fields consume network bandwidth and memory
- Serialization/deserialization adds latency
- If you're only showing titles in search results, why return the full document body?
The fix: Use source filtering to return only what you need.
{
"query": { "match": { "content": "elasticsearch" } },
"_source": ["title", "author", "created_at"],
"highlight": {
"fields": {
"content": { "fragment_size": 150 }
}
}
}
This returns only title, author, and created_at fields, plus highlighted snippets from content. Response size drops dramatically.
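The inverse also works when you want most of a document but not its heaviest fields: exclude instead of include (the field names here are illustrative):
{
  "query": { "match": { "content": "elasticsearch" } },
  "_source": { "excludes": ["content", "attachments"] }
}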
5. Missing Refresh Interval Tuning
The mistake: Using the default 1-second refresh interval for batch indexing.
Why it's bad:
- Every refresh creates a new Lucene segment
- High refresh frequency = many small segments = slow queries
- During bulk indexing, you're wasting resources refreshing data that isn't being searched yet
The fix: Disable refresh during bulk indexing, then refresh once at the end.
// Disable refresh before bulk indexing
PUT /my-index/_settings
{ "index": { "refresh_interval": "-1" } }
// Bulk index your data...
// Re-enable refresh and force a refresh
PUT /my-index/_settings
{ "index": { "refresh_interval": "30s" } }
POST /my-index/_refresh
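For completeness, the "bulk index your data" step uses the _bulk API, whose body is newline-delimited JSON (the documents below are placeholders):
POST /my-index/_bulk
{ "index": { "_id": "1" } }
{ "title": "First document", "status": "published" }
{ "index": { "_id": "2" } }
{ "title": "Second document", "status": "draft" }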
For production search indexes that don't need real-time indexing, consider 30s or 60s intervals instead of the default 1s.
6. No Synonym or Analyzer Strategy
The mistake: Using default analyzers for domain-specific search.
Why it's bad:
- Legal search: "plaintiff" doesn't find "claimant"
- E-commerce: "laptop" doesn't find "notebook computer"
- Users get frustrated and blame "the search doesn't work"
The fix: Build domain-specific synonyms and custom analyzers.
PUT /products
{
"settings": {
"analysis": {
"filter": {
"product_synonyms": {
"type": "synonym",
"synonyms": [
"laptop, notebook, portable computer",
"phone, mobile, smartphone, cell phone",
"tv, television, flat screen"
]
}
},
"analyzer": {
"product_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase", "product_synonyms", "stemmer"]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "product_analyzer"
}
}
}
}
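Before reindexing anything, you can sanity-check the analyzer with the _analyze API; the sample text is arbitrary:
POST /products/_analyze
{
  "analyzer": "product_analyzer",
  "text": "notebook with a flat screen"
}
The token list in the response shows the synonym expansions that both indexing and queries will see.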
Pro tip: Build your synonym list from search logs. See what users search for and what they actually click on.
7. Not Monitoring Query Performance
The mistake: No visibility into which queries are slow or why.
Why it's bad:
- Slow queries degrade over time as data grows
- One bad query pattern can bring down the cluster
- Without metrics, you're guessing at what to optimize
The fix: Enable slow query logging and monitor with Kibana.
PUT /my-index/_settings
{
"index.search.slowlog.threshold.query.warn": "1s",
"index.search.slowlog.threshold.query.info": "500ms",
"index.search.slowlog.threshold.fetch.warn": "500ms",
"index.search.slowlog.level": "info"
}
Also monitor these Elasticsearch metrics:
- Query latency (p50, p95, p99): Median vs tail latency
- Indexing rate: Documents indexed per second
- JVM heap usage: Should stay below 75%
- Segment count: High counts indicate merge pressure
- Search thread pool rejections: Indicates overload
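Most of these can be spot-checked straight from the REST API before you build dashboards (my-index is a placeholder):
// JVM heap and thread pool stats per node
GET /_nodes/stats/jvm,thread_pool
// Search thread pool rejections at a glance
GET /_cat/thread_pool/search?v&h=node_name,active,queue,rejected
// Segment counts per shard
GET /_cat/segments/my-index?v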
Quick Checklist
Before deploying Elasticsearch to production:
- Define explicit mappings (no dynamic mapping)
- Right-size your shards (10-50GB each)
- Use filter context for yes/no clauses
- Return only the fields you need
- Tune refresh interval for your use case
- Build domain-specific synonyms
- Enable slow query logging and monitoring
Need an Elasticsearch Audit?
We've optimized clusters that went from 5-second queries to under 100ms. As Elasticsearch specialists, we know where to look.
Common results from our performance audits:
- 5-10x query speed improvements
- 30-50% reduction in infrastructure costs
- Elimination of timeout errors