June 3, 2026 · 11 min read

Clio API Rate Limits and Batch Design: Safely Normalizing Thousands of Records

If you want to clean up messy data across thousands of Clio matters, the rate limit decides your architecture before anything else does. Here's what we learned building the open-source Clio MCP connector: the real throttle, the 429 backoff, the custom-field id trap that silently corrupts writes, and how to make a long batch resumable so an interruption never means starting over.

A common job for a growing law firm: a custom field that should hold a clean, court-formatted street address actually holds it five different ways, because intake captured whatever the client typed. Multiply that by a couple of hundred custom fields and a few thousand matters, and "just fix the data" turns into a batch-processing problem.

The instinct is to write a quick script that loops over every matter and PATCHes the fix. That script will work for the first dozen records and then start getting throttled. We hit this ourselves building the Clio MCP connector, and the same gotchas show up whether you're normalizing data, migrating off another system, or syncing Clio with something else.

This post is the field guide we wish we'd had. It's specific to the Clio Manage API, but the batch-design principles apply to MyCase, PracticePanther, or any practice-management API with a tight rate limit.


What is the Clio API rate limit?

Plan for roughly 3 requests per second per app against the Clio Manage API. Clio's documentation also references an older 50-requests-per-minute figure. Treat both as ceilings, not targets.

That number sets the shape of everything. At ~3 req/s, a job that has to read and write a few thousand matters does not finish in seconds. It runs for minutes, sometimes hours, especially when each matter needs a read (to fetch current values) and a write (to PATCH the fix). So the batch isn't a quick script you babysit. It's a long-running process you design to survive interruptions.

Clio returns the standard rate-limit headers on every response. Read them instead of guessing:

  • X-RateLimit-Limit and X-RateLimit-Remaining tell you how much budget is left in the current window.
  • Retry-After on an HTTP 429 tells you exactly how long to wait before the next call.

The mistake is hardcoding a fixed sleep between calls and hoping. Honor the headers, throttle to stay comfortably under the limit, and treat a 429 as a normal event you recover from cleanly, not an error that kills the run.
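
As a minimal sketch of what "honor the headers" can look like, the helper below turns the last response's status and headers into a wait time. The function name, the 0.4-second floor, and the "slow down when one request is left" threshold are our illustrative choices, and it assumes Retry-After arrives as a number of seconds.

```python
from typing import Mapping


def delay_before_next_call(status: int, headers: Mapping[str, str],
                           min_interval: float = 0.4) -> float:
    """Seconds to wait before the next request, based on the last response."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    if status == 429:
        # Assumption: Retry-After carries a number of seconds, not an HTTP date.
        return max(float(h.get("retry-after", "1")), min_interval)
    remaining = h.get("x-ratelimit-remaining")
    if remaining is not None and int(remaining) <= 1:
        # Budget almost spent: slow down before the API has to refuse us.
        return min_interval * 5
    # 0.4 s between calls is 2.5 req/s, under the ~3 req/s ceiling.
    return min_interval
```

The caller sleeps for the returned value after every response, so the pace adapts to what the API reports rather than to a guess.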

Rule of thumb: at ~3 req/s, every 1,000 matters you have to read-then-write is roughly 2,000 requests. Divide by 3 per second and you get about 667 seconds, or roughly 11 minutes of pure API time before you add a single second of your own processing. Budget the job in minutes-per-thousand, then design it to be resumable so the runtime doesn't matter.
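
The arithmetic behind that rule of thumb fits in a few lines. This is only a back-of-envelope estimator; the function name and defaults are ours, and it ignores retries and your own processing time.

```python
def estimate_api_minutes(matters: int, requests_per_matter: int = 2,
                         requests_per_second: float = 3.0) -> float:
    """Lower bound on runtime: total requests divided by the allowed rate."""
    total_requests = matters * requests_per_matter
    return total_requests / requests_per_second / 60


print(round(estimate_api_minutes(1_000), 1))  # 11.1
print(round(estimate_api_minutes(5_000), 1))  # 55.6
```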

The custom-field trap that silently corrupts writes

This is the single most expensive gotcha in the Clio API, and nothing warns you while you're writing the code. It costs you when you discover, after the batch ran, that the writes landed on the wrong values.

Reading custom fields is straightforward. You request them on the matter:

GET /matters/{id}?fields=custom_field_values{id,value,field_name,custom_field{id}}

Writing them back is where people get burned. You PATCH the matter with a nested custom_field_values array:

PATCH /matters/{id}
{
  "data": {
    "custom_field_values": [
      { "id": 987654, "value": "123 Main St, Suite 400" }
    ]
  }
}

The trap: the id inside custom_field_values is the value-instance id, not the field-definition id. The value-instance id is the id of that field's value on that specific matter. It's different on every matter. The field-definition id (the thing you see as custom_field:{id}) is the same everywhere, which is exactly why it's tempting to reuse, and exactly why reusing it is wrong.

Send the definition id where the value-instance id belongs and the write does not error in the way you'd hope. It goes somewhere you didn't intend. So the correct sequence for normalization is always read-first: GET the matter, pull the value-instance id for the field you're fixing, then PATCH using that id. You cannot batch the writes blind from a list of field definitions. Every write is paired with a read.

That pairing is also why the rate limit bites harder than people expect. You don't get to do one request per matter. You do two.
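
Here is a small sketch of the read-first step: given the matter you just fetched, find the value-instance id for one field definition and build the PATCH body from it. The function name is ours, and it assumes the parsed response mirrors the fields requested in the GET above (each value carrying its own id plus a nested custom_field id).

```python
from typing import Optional


def build_custom_field_patch(matter: dict, field_definition_id: int,
                             new_value: str) -> Optional[dict]:
    """Return the PATCH body for one field on one matter, or None if absent."""
    for cfv in matter.get("custom_field_values", []):
        if cfv["custom_field"]["id"] == field_definition_id:
            # cfv["id"] is the value-instance id, which differs on every matter.
            return {"data": {"custom_field_values": [
                {"id": cfv["id"], "value": new_value}
            ]}}
    return None  # this matter has no value for that field definition


matter = {"id": 1, "custom_field_values": [
    {"id": 987654, "value": "123 main st ste 400", "custom_field": {"id": 42}},
]}
print(build_custom_field_patch(matter, 42, "123 Main St, Suite 400"))
```

The definition id (42 in this made-up example) is only ever used to find the right value; it never appears in the body you send.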

Inventory your fields by type before promising "normalize everything"

Not every Clio custom field accepts an arbitrary cleaned-up string. The field type constrains what you can write, and a normalization engine that ignores types will fail on the first picklist it hits.

Clio custom field types and their write constraints

Field type | What you can safely write
Free text | Anything. Normalizes cleanly. This is where most of your wins are.
Picklist | Only predefined options (each up to ~55 chars). You must map a messy value onto an existing option, not invent one.
Currency | Rejects decimals in the value. Plan your formatting accordingly.
Date | Needs a valid date. A "cleaned" string that isn't a real date fails.

So step one of any real normalization job is not writing code. It's an inventory: pull every custom-field definition, group by type, and decide the rule per type. Free-text fields are where an LLM like Claude earns its keep, reading the messy entry and emitting your house format. Picklists are a mapping problem, not a generation problem. Currency and date fields are validation problems. Promise "we'll normalize every field" before you've done this inventory and you'll be wrong about a meaningful slice of them.
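
A rough sketch of that inventory step might look like the following. The type labels ("text", "picklist", "currency", "date"), the field_type and name keys, and the ISO date format are placeholders we picked for illustration, not Clio's literal values, so map them to whatever your definitions actually return.

```python
from collections import defaultdict
from datetime import date
from typing import Iterable, Sequence


def group_by_type(definitions: Iterable[dict]) -> dict:
    """Group field names by their type so each type gets its own rule."""
    groups = defaultdict(list)
    for d in definitions:
        groups[d["field_type"]].append(d["name"])
    return dict(groups)


def looks_writable(field_type: str, value: str,
                   picklist_options: Sequence[str] = ()) -> bool:
    """Cheap pre-flight check that mirrors the constraints in the table above."""
    if field_type == "picklist":
        return value in picklist_options       # map onto an existing option
    if field_type == "currency":
        return value.lstrip("-").isdigit()     # no decimal point in the value
    if field_type == "date":
        try:
            date.fromisoformat(value)          # assumption: ISO dates
            return True
        except ValueError:
            return False
    return True                                # free text accepts anything
```

Running the check over a dry-run output tells you which proposed values would bounce before you spend any request budget on them.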

One more boundary worth stating plainly: the OAuth user running the batch needs write access to every matter it touches. If your token belongs to someone whose permissions don't cover the whole book of business, the batch will skip or fail on the matters they can't reach.


How to design a resumable batch runner

Here's the part the quick script always skips. A job that runs for an hour against a throttled API will get interrupted. Your laptop sleeps, the token expires, a 500 comes back from Clio, the network blips. The question isn't whether it stops. It's whether stopping costs you the whole run.

The design that survives all of that has five properties:

1. Paginate explicitly and checkpoint your position

Clio paginates at up to 200 records per page. Walk the pages in order and persist the last successfully processed cursor or matter id. If the job dies on page 47, it restarts on page 47, not page 1. The checkpoint lives outside the process: a small state file or table, written after each page commits.
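
One simple way to keep the checkpoint outside the process is a small JSON file replaced atomically. This is a sketch under our own assumptions: the file layout and the last_matter_id key are hypothetical, and a database row would do the same job.

```python
import json
import os
from pathlib import Path


def save_checkpoint(path: Path, state: dict) -> None:
    """Write the state to a temp file, then swap it in with an atomic rename."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(state))
    os.replace(tmp, path)  # a crash mid-write never leaves a half-written file


def load_checkpoint(path: Path) -> dict:
    """Return the saved position, or a fresh start if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"last_matter_id": None}
```

Call save_checkpoint only after every write on the page has committed, so a restart can at worst repeat one page.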

2. Make every write idempotent

If you re-run a page after a crash, re-processing an already-fixed matter must be a no-op. The cheapest way: before writing, compare the current value to the target value and skip if they already match. Normalization is naturally idempotent when you do this, because a clean value normalizes to itself.
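
The compare-then-skip check is tiny. In this sketch the whitespace normalizer is a toy stand-in for your real rule, and both function names are ours.

```python
from typing import Callable, Optional


def normalize_whitespace(raw: str) -> str:
    """Toy normalizer: collapse runs of whitespace and trim the ends."""
    return " ".join(raw.split())


def plan_write(current: Optional[str],
               normalize: Callable[[str], str] = normalize_whitespace) -> Optional[str]:
    """Return the value to write, or None when the field is already clean."""
    existing = current or ""
    target = normalize(existing)
    return None if target == existing else target


print(plan_write("123  Main St "))  # 123 Main St
print(plan_write("123 Main St"))    # None
```

Because a clean value maps to itself, re-running a page after a crash produces None for every matter that was already fixed, and no request is spent on it.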

3. Treat 429 as flow control, not failure

On a 429, sleep for the Retry-After duration and retry the same request. On a 5xx, use exponential backoff with a cap, then retry. Only after repeated failures do you log the matter to a dead-letter list and move on. The run should never die because one matter misbehaved.
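
The retry policy above can be sketched as one loop. Here send is a hypothetical callable you supply that performs the request and returns (status, headers); the attempt count, delays, and the choice to count 429s against the same budget as 5xx errors are our assumptions, as is Retry-After being expressed in seconds.

```python
import time
from typing import Callable, Mapping, Optional, Tuple


def send_with_retries(send: Callable[[], Tuple[int, Mapping[str, str]]],
                      max_attempts: int = 5, base_delay: float = 1.0,
                      max_delay: float = 60.0,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Retry one request; return its final status, or None to dead-letter it."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status == 429:
            # Flow control: wait exactly as long as the server asked.
            sleep(float(headers.get("Retry-After", base_delay)))
        elif 500 <= status < 600:
            # Server trouble: exponential backoff, capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
        else:
            return status
    return None  # caller logs the matter to the dead-letter list and moves on
```

A None result is the signal to record the matter id and continue with the next one, so a single bad matter never ends the run.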

4. Dry-run and preview before you write a single byte

The most important feature for trust isn't speed, it's a preview mode. Run the whole job read-only first, produce a per-matter diff of "current value to proposed value," and let a human eyeball it. Nobody approves an unattended write across thousands of privileged matters on faith. They approve it after seeing the diff.
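
A preview can be as plain as a CSV of proposed changes. In this sketch the input rows (matter_id plus the current value) and the column names are our own shape, and normalize is whatever rule you plan to apply.

```python
import csv
import io
from typing import Callable, Iterable


def preview_csv(matters: Iterable[dict], normalize: Callable[[str], str]) -> str:
    """Return a CSV of proposed changes. Nothing is written back to Clio."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["matter_id", "current", "proposed"])
    for m in matters:
        proposed = normalize(m["value"])
        if proposed != m["value"]:  # only rows that would actually change
            writer.writerow([m["matter_id"], m["value"], proposed])
    return out.getvalue()


rows = [{"matter_id": 1, "value": "123  main st"},
        {"matter_id": 2, "value": "9 Elm Rd"}]
print(preview_csv(rows, lambda s: " ".join(s.split())))
```

The reviewer sees only the matters that would change, which keeps the diff short enough to actually read.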

5. Keep a per-matter rollback record

Before each write, log the old value alongside the new one. If the normalization rule turns out wrong on field 12, you can replay the log in reverse and restore it. Without this, "undo" means manual cleanup, which is the exact chore you were trying to automate away.
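
A rollback record can be an append-only file with one JSON object per line. The record layout and function names below are hypothetical; the point is only that each entry holds enough to undo one write, and that undo replays newest-first.

```python
import json
from typing import IO, Iterable, List, Tuple


def record_change(log: IO[str], matter_id: int, value_id: int,
                  old: str, new: str) -> None:
    """Append one change to the log before the write is sent."""
    entry = {"matter_id": matter_id, "value_id": value_id, "old": old, "new": new}
    log.write(json.dumps(entry) + "\n")


def rollback_plan(lines: Iterable[str]) -> List[Tuple[int, int, str]]:
    """Return (matter_id, value_id, old_value) tuples, newest change first."""
    entries = [json.loads(line) for line in lines if line.strip()]
    return [(e["matter_id"], e["value_id"], e["old"]) for e in reversed(entries)]
```

Each tuple in the plan is one PATCH that puts the old value back, sent through the same throttled, retrying path as the original writes.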

The honest tradeoff: all five of these make the batch slower to build than the naive loop. They also make it the difference between a tool a law firm will run against live client data and a script nobody trusts to touch their matters. For privileged data, the audit trail and the rollback log aren't nice-to-haves. Our connector keeps an append-only audit log for exactly this reason, framed to support the kind of recordkeeping ABA Opinion 512 and equivalent guidance expect.


Two more Clio API behaviors that change your design

Document upload is a three-call presigned-S3 flow, not a multipart POST

If your batch also writes documents back into matters (say, a generated summary or a corrected form), don't expect a single upload endpoint. Clio uses a three-call dance: POST to /documents to register the file and get a put_url, PUT the raw bytes directly to that S3 URL, then PATCH the document with fully_uploaded: true to commit it. Skip the final PATCH and the document exists but stays invisible. We implement this in the upload_document tool of the connector so callers don't have to hand-roll the S3 step.
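
The order of the three calls is the part worth pinning down, so here is a sketch that shows only the sequencing. It is not the connector's upload_document implementation: request and put_bytes are hypothetical callables you supply, and the request body and the id and put_url keys on the returned dict are placeholders, so check Clio's API reference for the real payload shapes.

```python
from typing import Callable


def upload_document(request: Callable[[str, str, dict], dict],
                    put_bytes: Callable[[str, bytes], None],
                    name: str, content: bytes) -> dict:
    """Register the file, push the bytes to storage, then commit the upload."""
    # 1. Register the document and receive a presigned upload URL.
    created = request("POST", "/documents", {"name": name})
    # 2. Send the raw bytes straight to that URL, not to the Clio API.
    put_bytes(created["put_url"], content)
    # 3. Commit. Without this call the document exists but stays invisible.
    return request("PATCH", f"/documents/{created['id']}", {"fully_uploaded": True})
```

Keeping the HTTP calls injectable also makes the sequence easy to test with fakes before it touches a live account.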

There is no document webhook, so you poll or trigger on matter updates

If your goal is to react to new documents (not just bulk-fix existing data), know that Clio has no document webhook. Webhooks exist for activity, bill, calendar_entry, communication, contact, matter, and task, and they auto-expire after 3 days by default (31 max), so a long-lived integration has to renew them on a schedule. To detect new documents you either poll on an interval or, more cleverly, trigger on the matter updated webhook when a stage change is the thing that causes document generation. Clio Draft's document automation is UI-only, with no API to trigger it, so the "generate then fill" pattern has to be assembled from the pieces that do have API surface.
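
Because webhooks expire, a scheduled job has to decide which ones to renew. A minimal sketch, assuming you keep your own list of webhook records with an id and a timezone-aware expires_at (both key names are ours), and that a 12-hour safety margin suits your schedule:

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


def webhooks_due_for_renewal(webhooks: Iterable[dict], now: datetime,
                             margin: timedelta = timedelta(hours=12)) -> List[str]:
    """Return ids of webhooks that expire within the safety margin."""
    return [w["id"] for w in webhooks if w["expires_at"] - now <= margin]


now = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
hooks = [
    {"id": "matter-updates", "expires_at": now + timedelta(hours=6)},
    {"id": "task-updates", "expires_at": now + timedelta(days=2)},
]
print(webhooks_due_for_renewal(hooks, now))  # ['matter-updates']
```

Run it more often than the margin, so a webhook with the 3-day default is always renewed well before it lapses.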


Where the rate limit meets data residency

One reason batch design matters more for legal data than for a typical SaaS migration: if an LLM is doing the normalization on free-text fields, you're sending field values to a model provider, and for a Canadian family-law practice under PIPEDA and Law Society guidance, that crosses a line you have to account for.

A few facts worth knowing before you architect this. Clio runs a Canadian region (ca.app.clio.com), so your practice-management data can stay in Canada, and your batch must call the CA base URL if the account lives there. Anthropic, by contrast, has no Canadian data region today: Claude processes in the US. The practical mitigation is a zero-data-retention arrangement on the Claude API (note that ZDR is available org-level on the API, not on the Team chat plan), pinned to US inference, with the cross-border step documented for your confidentiality obligations. The slow pace forced by the ~3 req/s limit is almost helpful here: nothing about this pipeline needs to be fast, so "pull, normalize, write back, retain nothing" is an easy posture to hold.

This is the kind of decision we'd rather you make on purpose than discover after the fact. There isn't one correct architecture. A purely deterministic normalizer that never touches an LLM avoids the cross-border question entirely, at the cost of handling fewer messy cases. An LLM-assisted normalizer handles the long tail of human-entered chaos but adds a data-handling step you have to be honest about. The right call depends on your data and your obligations, not on which is more impressive.
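
On the narrower point of calling the right region, it helps to make the base URL an explicit, checked setting rather than a default. In this sketch only the ca.app.clio.com host comes from the discussion above; the US host, the /api/v4 path, and the region labels are our assumptions, so confirm them against your account before relying on them.

```python
# Assumption: hosts and path below should be verified for your account.
CLIO_HOSTS = {
    "us": "app.clio.com",
    "ca": "ca.app.clio.com",
}


def api_base_url(region: str) -> str:
    """Return the API base URL for a region, failing loudly on unknown ones."""
    try:
        host = CLIO_HOSTS[region]
    except KeyError:
        raise ValueError(f"unknown Clio region: {region!r}") from None
    return f"https://{host}/api/v4"


print(api_base_url("ca"))  # https://ca.app.clio.com/api/v4
```

Failing on an unknown region is deliberate: silently falling back to another region's host is exactly the kind of accident this section is about.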


The short version

  • Design around ~3 req/s. Read X-RateLimit-* and Retry-After; never hardcode a sleep.
  • Custom-field writes use the value-instance id, not the field-definition id. Always read before you write.
  • Inventory fields by type first. Free text is easy; picklist, currency, and date have format rules you map before promising anything.
  • Make the batch resumable, idempotent, and reversible, with a dry-run preview, because it will get interrupted and it's touching privileged data.
  • Documents upload via presigned S3 (POST, PUT, PATCH fully_uploaded), and there's no document webhook, so poll or trigger on matter updated.
  • If an LLM normalizes free text, account for data residency: Clio has a Canadian region, Anthropic doesn't, so use ZDR + US-pinned inference and document the cross-border step.

Frequently asked questions

What is the Clio API rate limit?

Plan for roughly 3 requests per second per app. Clio's docs also reference a 50-requests-per-minute legacy figure. Read the X-RateLimit-Remaining and Retry-After headers on every response, throttle below the limit, and back off on any 429. At that pace a job touching thousands of records runs for minutes to hours, so design it to be resumable.

How do you update a Clio custom field via the API?

PATCH the matter with a nested custom_field_values array. The id inside that array is the value-instance id (the id of that field's value on that specific matter), not the field-definition id (custom_field:{id}). Read the value-instance id from GET /matters/{id}?fields=custom_field_values{...} first, then PATCH with it. Sending the definition id does not fail cleanly; the write silently lands in the wrong place.

How do you upload a document to Clio via the API?

Clio uses a three-call presigned-S3 upload. POST to /documents to get a put_url, PUT the file bytes directly to that S3 URL, then PATCH the document with fully_uploaded set to true. It's not a single multipart POST, and skipping the final PATCH leaves the document invisible.

Does Clio have a webhook for new documents?

No. There's no document webhook. Webhooks exist for activity, bill, calendar_entry, communication, contact, matter, and task, and they auto-expire (3 days default, 31 max) so you must renew them. To react to new documents, poll on an interval or trigger on the matter updated webhook when a stage change causes document generation.


Want a second set of eyes on your Clio integration?

We build privilege-aware Clio and MyCase integrations for law firms, and we open-sourced the connectors that handle the upload flow, the audit logging, and the custom-field handling described above. If you're staring at a few thousand messy records, or a workflow you keep doing by hand, we'll look at your setup and give you an honest read on what's a config change, what's net-new connector work, and what the rate limit means for your timeline.
