January 8, 2026 · 14 min read

How We Built 100ms Legal Document Search with Elasticsearch

Sub-100ms queries on 1M+ legal documents. Here's the architecture, optimization techniques, and real-world results from building enterprise search for a legal tech platform.

Legal document search is brutal. Lawyers spend hours hunting through contracts, case files, and research materials. Twenty minutes to find a single document. Multiply that across a firm, and you're burning hundreds of thousands in billable hours.

We built a search system that returns results in under 100 milliseconds — even with over 1 million documents indexed. The client recovered $220,000+ per year in time savings.

Here's exactly how we did it.


The Problem: Search That Didn't Scale

Before Elasticsearch, the law firm was using PostgreSQL full-text search. It worked fine with 10,000 documents. At 100,000+ documents, it collapsed:

  • Slow queries: 5-10 seconds for common searches, sometimes timing out entirely
  • Rigid matching: Exact phrase matching only. Searching for "contract termination" wouldn't find "terminating contracts"
  • No relevancy ranking: Results were alphabetical, not useful
  • No typo tolerance: Misspell "plaintiff" as "plantiff" and you'd get zero results
  • No autocomplete: Users had to know exactly what they were searching for

The kicker: Legal search has domain-specific challenges. "Discovery" means something different in law than in everyday English. Synonyms matter. Context matters. Generic search doesn't cut it.


The Architecture

We built a full-stack Elasticsearch implementation integrated with their existing Laravel application:

Layer          Technology
Search Engine  Elasticsearch 7.x (AWS Elasticsearch Service)
Backend        Laravel + elasticsearch/elasticsearch PHP client
Frontend       Vue.js 3 with instant search UI
Analytics      Kibana for query performance monitoring
Hosting        AWS (Elasticsearch Service + EC2 for Laravel)

Why This Stack?

AWS Elasticsearch Service: Managed Elasticsearch means we don't have to worry about cluster management, backups, or scaling. It just works.

Laravel Integration: Their app was already Laravel. We used the official Elasticsearch PHP client to keep the integration clean and maintainable.

Vue.js: Instant search requires reactive UI updates. Vue made it trivial to build autocomplete with real-time results.


Index Mapping: The Foundation of Speed

Most Elasticsearch performance problems start with bad mapping. We designed our index mapping specifically for legal documents:

{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "legal_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword"
          },
          "ngram": {
            "type": "text",
            "analyzer": "ngram_analyzer",
            "search_analyzer": "standard"
          }
        }
      },
      "content": {
        "type": "text",
        "analyzer": "legal_analyzer"
      },
      "document_type": {
        "type": "keyword"
      },
      "created_date": {
        "type": "date"
      },
      "download_count": {
        "type": "integer"
      },
      "tags": {
        "type": "keyword"
      }
    }
  }
}

Key Design Decisions

Multi-field mapping for title: We index the title three ways:

  • text - Full-text search with legal analyzer
  • keyword - Exact match for sorting and aggregations
  • ngram - Partial matching for autocomplete

Keyword types for filters: Document type, tags, and other facets use keyword type. Keyword values are indexed as single, unanalyzed terms, so filtering on them is an exact-term lookup with no query-time analysis or scoring. That keeps filters fast.

Download count: We track how often documents are accessed and use this for relevancy boosting (more on that later).
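
To make the three roles of the title field concrete, here is a minimal Python sketch of request bodies that target each sub-field. It only builds dictionaries in Elasticsearch's query DSL; the function names are illustrative and not part of any client library, and sending the body is left to whichever client you use.

```python
import json


def full_text_query(text: str) -> dict:
    """Analyzed search against the main `title` field (legal_analyzer)."""
    return {"query": {"match": {"title": text}}}


def sorted_by_title(text: str) -> dict:
    """Same search, sorted on the exact-value `title.keyword` sub-field."""
    body = full_text_query(text)
    body["sort"] = [{"title.keyword": "asc"}]
    return body


def autocomplete_query(prefix: str, size: int = 5) -> dict:
    """Partial matching against the `title.ngram` sub-field."""
    return {"size": size, "query": {"match": {"title.ngram": prefix}}}


if __name__ == "__main__":
    print(json.dumps(autocomplete_query("cont"), indent=2))
```

Sorting has to go through title.keyword because analyzed text fields are not sortable by default.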


Custom Analyzers: Domain-Specific Intelligence

Generic analyzers don't work for legal search. We built custom analyzers tailored to legal terminology:

{
  "settings": {
    "index": {
      "max_ngram_diff": 17
    },
    "analysis": {
      "analyzer": {
        "legal_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "legal_synonyms",
            "english_stop",
            "english_stemmer"
          ]
        },
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "ngram_filter"
          ]
        }
      },
      "filter": {
        "legal_synonyms": {
          "type": "synonym",
          "synonyms_path": "synonyms/legal_synonyms.txt"
        },
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "ngram_filter": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 20
        }
      }
    }
  }
}
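
Because the mapping refers to analyzers by name, a cheap pre-flight check before creating the index can catch mismatches early. The sketch below is one hypothetical way to do that in plain Python: it walks a mapping, reports analyzers that the settings never define, and computes the index.max_ngram_diff value the ngram filters need (Elasticsearch 7.x defaults that limit to 1). The helper names, the partial list of built-in analyzers, and the trimmed-down sample data are assumptions for illustration.

```python
def referenced_analyzers(node: dict) -> set:
    """Collect every analyzer name used anywhere in a mapping tree."""
    found = set()
    for key, value in node.items():
        if key in ("analyzer", "search_analyzer") and isinstance(value, str):
            found.add(value)
        elif isinstance(value, dict):
            found |= referenced_analyzers(value)
    return found


# A partial list of built-in analyzer names; extend as needed.
BUILT_IN = {"standard", "simple", "whitespace", "keyword", "stop", "english"}


def undefined_analyzers(mappings: dict, settings: dict) -> set:
    """Analyzers the mapping uses that the settings do not define."""
    defined = set(settings.get("analysis", {}).get("analyzer", {}))
    return referenced_analyzers(mappings) - defined - BUILT_IN


def required_ngram_diff(settings: dict) -> int:
    """Largest (max_gram - min_gram) across all ngram token filters."""
    filters = settings.get("analysis", {}).get("filter", {})
    diffs = [
        f.get("max_gram", 2) - f.get("min_gram", 1)  # filter defaults are 1 and 2
        for f in filters.values()
        if f.get("type") == "ngram"
    ]
    return max(diffs, default=0)


# Trimmed-down copies of the mapping and settings from this article.
MAPPINGS = {
    "properties": {
        "title": {
            "type": "text",
            "analyzer": "legal_analyzer",
            "fields": {
                "keyword": {"type": "keyword"},
                "ngram": {"type": "text", "analyzer": "ngram_analyzer"},
            },
        },
        "content": {"type": "text", "analyzer": "legal_analyzer"},
    }
}
SETTINGS = {
    "analysis": {
        "analyzer": {
            "legal_analyzer": {"type": "custom", "tokenizer": "standard"},
            "ngram_analyzer": {"type": "custom", "tokenizer": "standard"},
        },
        "filter": {
            "ngram_filter": {"type": "ngram", "min_gram": 3, "max_gram": 20},
        },
    }
}

if __name__ == "__main__":
    print(undefined_analyzers(MAPPINGS, SETTINGS))
    print(required_ngram_diff(SETTINGS))
```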

Synonym Handling

Legal terminology is full of synonyms. Our legal_synonyms.txt file maps related terms:

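# Comma-separated terms are equivalent: each one matches the others,
# and the original term stays in the index (needed for fuzzy matching).
plaintiff, complainant, claimant
defendant, respondent, accused
contract, agreement, accord
terminate, end, cancel, conclude
discovery, disclosure, inspection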

Now when someone searches for "plaintiff," they also get results for "complainant" and "claimant" — without having to remember all the legal synonyms.
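
Solr-format synonym files support two rule forms, and the difference matters. An explicit mapping (a => b, c) replaces the left-hand term with the right-hand terms, while a comma-separated group treats every term as equivalent to the others. The toy Python model below illustrates that difference for single-word rules only. It is not how Lucene implements the filter (the real filter also handles multi-word synonyms, token positions, and merging of repeated rules), and parse_rules and expand are hypothetical names.

```python
def parse_rules(lines):
    """Parse Solr-style synonym lines into {term: [replacement terms]}."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by the right side.
            left, right = line.split("=>", 1)
            targets = [t.strip() for t in right.split(",")]
            for source in left.split(","):
                table[source.strip()] = targets
        else:
            # Equivalence group: every term expands to the whole group.
            group = [t.strip() for t in line.split(",")]
            for term in group:
                table[term] = group
    return table


def expand(tokens, table):
    """Apply the synonym table to a list of already-lowercased tokens."""
    out = []
    for token in tokens:
        out.extend(table.get(token, [token]))
    return out


if __name__ == "__main__":
    mapping = parse_rules(["plaintiff => complainant, claimant"])
    print(expand(["the", "plaintiff", "filed"], mapping))

    group = parse_rules(["plaintiff, complainant, claimant"])
    print(expand(["complainant"], group))
```

With the explicit mapping, the token "plaintiff" itself disappears from the output. With the equivalence group, all three terms are kept whichever one appeared in the text.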

N-gram Autocomplete

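The ngram_analyzer breaks each token into substrings of 3 to 20 characters. In Elasticsearch 7.x, a spread that wide exceeds the default index.max_ngram_diff of 1, so that index setting has to be raised when the index is created. This powers instant autocomplete: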

  • User types "cont" → Shows "contract", "contractor", "contractual"
  • User types "discov" → Shows "discovery", "discoverable"

Autocomplete suggestions are ranked by how often users download those documents — popular terms appear first.
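
To see why a short prefix can match a longer word, it helps to list the grams a single token produces. The function below is a small Python approximation of an ngram token filter with min_gram 3 and max_gram 20. It is a sketch for intuition, not Elasticsearch's implementation, and the name ngrams is made up for this example.

```python
def ngrams(token: str, min_gram: int = 3, max_gram: int = 20) -> list:
    """Return every substring of `token` between min_gram and max_gram chars."""
    token = token.lower()
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break  # longer grams would run past the end of the token
            grams.append(token[start:start + size])
    return grams


if __name__ == "__main__":
    grams = ngrams("contract")
    print(len(grams))
    print("cont" in grams)
    print(grams[:6])
```

Every gram is stored in the index, which is what lets "cont" find "contract" with a plain term lookup. It is also why n-gram fields take noticeably more disk space than the text they are built from. A plain ngram filter emits mid-word grams such as "tract" too; if only prefix matching is wanted, Elasticsearch's edge_ngram filter keeps just the grams anchored at the start of each token.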


Query Optimization: Fuzzy + Boost + Filters

Fast indexing doesn't matter if your queries are slow. Here's how we structure search queries for speed and relevancy:

{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "contract termination",
            "fields": ["title^3", "content"],
            "fuzziness": "AUTO",
            "type": "best_fields"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "document_type": "contract"
          }
        }
      ],
      "should": [
        {
          "function_score": {
            "field_value_factor": {
              "field": "download_count",
              "modifier": "log1p",
              "missing": 0
            },
            "boost": 2.0
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {},
      "content": {
        "fragment_size": 150,
        "number_of_fragments": 3
      }
    }
  }
}

How This Works

Multi-match with field boosting: We search both title and content, but title matches are weighted 3x higher (title^3). If the query appears in the title, that document ranks higher.

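Fuzzy matching: fuzziness: "AUTO" handles typos by allowing one edit for terms of 3 to 5 characters and two edits for longer terms. "plantiff" is one edit away from "plaintiff", so it matches automatically.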
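
The AUTO setting scales the allowed edit distance with term length: terms of 1 or 2 characters must match exactly, terms of 3 to 5 characters may differ by one edit, and longer terms by two. The Python sketch below reproduces that rule and checks the "plantiff" example with a plain Levenshtein distance. One simplification to be aware of: Elasticsearch by default counts a swap of two adjacent characters as a single edit, whereas plain Levenshtein counts it as two. The function names are illustrative.

```python
def auto_fuzziness(term: str) -> int:
    """Maximum edits allowed for a term under fuzziness AUTO (AUTO:3,6)."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_match(query_term: str, indexed_term: str) -> bool:
    """True if the indexed term is within the AUTO budget of the query term."""
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)


if __name__ == "__main__":
    print(levenshtein("plantiff", "plaintiff"))
    print(fuzzy_match("plantiff", "plaintiff"))
```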

Filters (not queries): Document type filtering uses filter, not must. Filter clauses skip relevancy scoring and their results can be cached, so they are cheaper than scored clauses.

Popularity boosting: Documents that get downloaded frequently rank higher. Real user behavior informs relevancy.

Highlighting: We return 3 snippets from the content showing where the match occurred. Users see context without opening the document.
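
In application code, the request body is usually assembled from user input rather than written by hand. The following Python sketch shows one way to do that. build_search_body is a hypothetical helper, and the popularity boost is expressed with function_score and field_value_factor, which is one option that works on a plain integer field; treat it as an assumption rather than the only way to boost by download count.

```python
import json
from typing import Optional


def build_search_body(text: str,
                      document_type: Optional[str] = None,
                      size: int = 10) -> dict:
    """Assemble a bool query: scored match, optional filter, popularity boost."""
    bool_query = {
        # Scored clause: fuzzy multi-field match with a 3x boost on title.
        "must": [{
            "multi_match": {
                "query": text,
                "fields": ["title^3", "content"],
                "fuzziness": "AUTO",
                "type": "best_fields",
            }
        }],
        # Optional clause: adds a log-scaled popularity score to every hit.
        "should": [{
            "function_score": {
                "field_value_factor": {
                    "field": "download_count",
                    "modifier": "log1p",
                    "missing": 0,
                },
                "boost": 2.0,
            }
        }],
    }
    if document_type:
        # Filter context: narrows the result set without affecting scores.
        bool_query["filter"] = [{"term": {"document_type": document_type}}]

    return {
        "size": size,
        "query": {"bool": bool_query},
        "highlight": {
            "fields": {
                "title": {},
                "content": {"fragment_size": 150, "number_of_fragments": 3},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("contract termination", "contract"), indent=2))
```

Keeping the filter optional means the same builder serves both the unfiltered search box and faceted views.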


Performance Tuning: The Details That Matter

1. Shard Strategy

We configured the index with 5 primary shards and 1 replica. Why?

  • 5 shards distribute the 1M documents across multiple nodes for parallel query execution
  • 1 replica provides redundancy without excessive storage cost
  • More shards = more overhead. 5 is the sweet spot for this data volume
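
The arithmetic behind that layout is simple enough to check directly. This Python sketch computes the number of shard copies, the approximate documents per primary, and the storage multiplier for a given configuration. It assumes documents spread evenly across primaries, which is only approximately true because routing is hash-based, and shard_layout is an illustrative name.

```python
def shard_layout(total_docs: int, primaries: int, replicas: int) -> dict:
    """Rough sizing figures for an index, assuming even document distribution."""
    return {
        # Each primary has `replicas` copies, so the cluster hosts this many shards.
        "shard_copies": primaries * (1 + replicas),
        # Approximate documents held by each primary shard.
        "docs_per_primary": total_docs // primaries,
        # Total stored data relative to a single copy of the index.
        "storage_multiplier": 1 + replicas,
    }


if __name__ == "__main__":
    print(shard_layout(1_000_000, primaries=5, replicas=1))
```

For 1M documents with 5 primaries and 1 replica, that works out to 10 shard copies, roughly 200,000 documents per primary, and twice the storage of a single copy.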

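2. Filter Caching

Clauses in filter context skip scoring, and Elasticsearch automatically caches the results of frequently reused filters in the node query cache. This is distinct from the shard request cache, which by default only stores size: 0 requests such as aggregations. Common filters like document_type: "contract" stay cheap on repeat queries.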

3. Index Refresh Interval

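We set refresh_interval: 30s (the default is 1s). Legal documents don't change every second, so we don't need near-real-time indexing. A longer refresh interval creates fewer small segments, which reduces I/O load and helps query performance. The trade-off is that a newly indexed document can take up to 30 seconds to become searchable.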
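
The refresh interval is a dynamic index setting, so it can be changed on a live index through the update index settings API (PUT /<index>/_settings). The Python sketch below only builds the request body; the index name in the comment is a placeholder, and refresh_settings is a hypothetical helper.

```python
import json


def refresh_settings(interval: str = "30s") -> dict:
    """Body for the update index settings API that sets the refresh interval."""
    return {"index": {"refresh_interval": interval}}


if __name__ == "__main__":
    # Would be sent as: PUT /legal_documents/_settings  (index name is a placeholder)
    print(json.dumps(refresh_settings(), indent=2))
```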

4. Source Filtering

We only return the fields we need in search results:

"_source": ["title", "document_type", "created_date", "download_count"]

The full content field isn't returned — only highlighted snippets. This massively reduces response size and network transfer time.
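
On the application side, each hit then carries a small _source object plus a highlight object keyed by field name. The Python sketch below shows one way to flatten a hit into a row for the results list. SAMPLE_HIT is made-up data in the shape Elasticsearch returns, and to_result_row is a hypothetical helper.

```python
SOURCE_FIELDS = ["title", "document_type", "created_date", "download_count"]


def to_result_row(hit: dict) -> dict:
    """Flatten one search hit into the fields a results list needs."""
    source = hit.get("_source", {})
    highlight = hit.get("highlight", {})
    return {
        "id": hit.get("_id"),
        # Prefer the highlighted title if the match was in the title.
        "title": highlight.get("title", [source.get("title", "")])[0],
        "document_type": source.get("document_type"),
        # Content is never returned in full, only as highlighted fragments.
        "snippets": highlight.get("content", []),
    }


# Made-up example hit, shaped like an entry in hits.hits.
SAMPLE_HIT = {
    "_id": "doc-1",
    "_score": 7.2,
    "_source": {
        "title": "Master Services Agreement",
        "document_type": "contract",
        "created_date": "2024-03-01",
        "download_count": 42,
    },
    "highlight": {
        "content": ["either party may <em>terminate</em> this agreement"],
    },
}

if __name__ == "__main__":
    print(to_result_row(SAMPLE_HIT))
```

The em tags are Elasticsearch's default highlight markers, so the frontend has to render or sanitize them deliberately.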


The Results

Before Elasticsearch:

  • Average query time: 5-10 seconds
  • Frequent timeouts on complex searches
  • Rigid exact matching only
  • No autocomplete
  • Zero typo tolerance

After Elasticsearch:

  • Average query time: 85ms (even with 1M+ documents)
  • Zero timeouts
  • Fuzzy matching: Handles typos automatically
  • Synonym support: Finds related legal terms
  • Instant autocomplete: Results appear as you type
  • Relevancy boosting: Popular documents rank higher

Business Impact

Time savings: From 20 minutes to 2 minutes per search. That's a 90% reduction.

ROI: $220,000+ recovered annually in billable hours. The system paid for itself in under 3 months.

User satisfaction: Lawyers actually use the search now. Before, they avoided it because it was too frustrating.


Key Takeaways

If you're building Elasticsearch for legal (or any domain-specific) search:

  1. Design your mapping first. Multi-field mappings give you flexibility. Get this wrong and you'll have to reindex everything.
  2. Custom analyzers are essential. Generic analyzers don't understand domain terminology. Build synonym lists with actual users.
  3. Use filters, not queries, for facets. Filters are cached and don't affect scoring. Much faster.
  4. Boost by popularity. User behavior (downloads, clicks) is the best relevancy signal.
  5. Monitor query performance. Use Kibana to identify slow queries and optimize them.
  6. Don't over-shard. More shards = more coordination overhead. Start with 5 primaries for 1M documents.
  7. Only return what you need. Filter _source to reduce response size.

Want Help Building Elasticsearch Search?

We've implemented Elasticsearch for legal tech, healthcare, and enterprise clients. Every project is different, but the patterns are similar:

  • Design index mappings for your data structure
  • Build custom analyzers with domain-specific synonyms
  • Optimize queries for sub-100ms performance
  • Implement autocomplete and fuzzy matching
  • Set up monitoring with Kibana
  • Deploy on AWS with proper scaling

If you're drowning in slow search or considering Elasticsearch for your platform, we can help.

Need help with Elasticsearch performance or architecture?

We offer free 30-minute architecture reviews. We'll analyze your current setup, identify bottlenecks, and give you actionable recommendations — whether you hire us or not.
