Giving Claude Write Access to Your Case Management System Safely

We build connectors that let Claude talk to legal case management systems. Our open-source Clio MCP connector ships 26 tools, our MyCase connector is live on the same model, and along the way we hit nearly every sharp edge there is to hit. The hardest question isn't "can Claude read a matter." It's "can Claude write to a matter without me lying awake wondering what it just did to 400 client files."

This post is the answer we arrived at, written for the technically literate founder who already installed an MCP server, already writes good prompts, and now wants the same workflow to run automatically across a team without becoming a liability.

Reading is cheap to get wrong. Writing is expensive. So the entire design has to be built around making writes boring and reversible.

Why "give Claude write access" is the wrong framing

When people say they want Claude to update Clio, they usually picture handing the model an OAuth token with full scope and letting it do whatever the prompt implies. That's the version that should scare you. An LLM is non-deterministic. It will, occasionally, decide a field needs a value you never intended, or apply a change to the wrong matter because two matters had similar names.

The fix isn't a better prompt. It's architecture. You don't give Claude write access to your case management system. You give a piece of software you control write access, expose a small set of narrow tools to Claude, and make every one of those tools refuse to do anything outside its lane. Claude proposes. Your code disposes.
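A minimal sketch of that split, assuming a hypothetical in-process registry rather than any particular MCP SDK. The names Tool, ToolRegistry, and set_mailing_address are illustrative and are not taken from our connectors. The model's request only ever reaches dispatch, which rejects unknown tools and unexpected arguments before any handler runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Tool:
    name: str
    allowed_args: FrozenSet[str]
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry: the only path from a model request to a write."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def dispatch(self, name: str, args: Dict[str, Any]) -> Any:
        tool = self._tools.get(name)
        if tool is None:
            raise PermissionError(f"unknown tool: {name}")
        unexpected = set(args) - tool.allowed_args
        if unexpected:
            raise PermissionError(f"unexpected arguments: {sorted(unexpected)}")
        return tool.handler(**args)


def set_mailing_address(matter_id: int, value: str) -> str:
    # Stand-in handler; a real one would call the case management API.
    return f"matter {matter_id}: mailing address -> {value}"


registry = ToolRegistry()
registry.register(
    Tool("set_mailing_address", frozenset({"matter_id", "value"}), set_mailing_address)
)
```

Anything the model asks for that is not registered, such as a made-up delete_matter tool, fails at dispatch and never reaches the API.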

That single reframe is what separates a demo from something a 10-to-15-person legal network can actually run every day.

Pattern 1: Read-first, write-last

Start every workflow read-only. Claude should be able to pull a matter, its notes, its custom fields, and any linked documents long before it can change a single byte. In practice the read side and the write side are different tools with different permissions, and you build and trust the read side first.

This matters more than it sounds because of an asymmetry in most case management APIs. Reading is forgiving. Writing has rules you only discover when you violate them.

One concrete gap we found in our own connector: a create_note tool can be write-only. It happily creates a note, but it can't read existing notes back. If your workflow needs to combine a lawyer's handwritten notes with new content, "write-only note support" looks complete in a demo and falls over in production. Read-first design forces you to find that gap on day one, not in week six.

The diagnostic: For every write your workflow needs, ask "can I read back exactly what I'm about to overwrite?" If the answer is no, the write is unsafe regardless of how good the model is. Build the read path first, every time.

Pattern 2: Scoped writes (the most important rule)

A scoped write can only touch named things. Not "update the matter." Instead: update these fields on this matter, where the field list is fixed and the matter id is verified before the call goes out.

Clio's custom fields are the perfect example of why scoping is non-negotiable, because the API has a trap waiting for you. You read custom fields via GET matter?fields=custom_field_values{...} and write them via PATCH matter with a nested custom_field_values array. Here's the gotcha that cost us real time:

The value-instance-id trap: the id inside each custom_field_values entry is the value-instance id, the id of that value as it sits on that specific matter. It is NOT the field-definition id (custom_field:{id}). Send the definition id and you write to the wrong place or fail outright. An AI that "knows the Clio API" from training data gets this wrong constantly, because the distinction is buried.

A scoped write tool encodes that distinction once, correctly, in code, and never exposes it to the model. Claude says "set the mailing address field on matter 8842 to this normalized value." Your tool resolves the right value-instance id, validates that the field is writable, and rejects anything it doesn't recognize. The model never touches the raw PATCH.
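Here is a hedged sketch of that resolution step. It assumes the matter has already been read with its custom_field_values, and that each entry carries its own id plus a nested custom_field id. The dictionary shapes, the "data" wrapper, the sample ids, and the WRITABLE_FIELD_DEFINITIONS allowlist are all illustrative assumptions, so check them against the live API before relying on them.

```python
from typing import Any, Dict

# Hypothetical allowlist: field-definition id -> human-readable name.
WRITABLE_FIELD_DEFINITIONS = {101: "mailing_address"}


def build_scoped_patch(
    matter: Dict[str, Any],
    expected_matter_id: int,
    definition_id: int,
    new_value: str,
) -> Dict[str, Any]:
    """Build a PATCH body for one named field on one verified matter."""
    if matter["id"] != expected_matter_id:
        raise ValueError("matter id mismatch; refusing to write")
    if definition_id not in WRITABLE_FIELD_DEFINITIONS:
        raise PermissionError("field is not on the writable list")
    for entry in matter.get("custom_field_values", []):
        if entry["custom_field"]["id"] == definition_id:
            # Use the value-instance id from this matter, never the definition id.
            return {
                "data": {
                    "custom_field_values": [{"id": entry["id"], "value": new_value}]
                }
            }
    raise LookupError("no existing value instance for that field on this matter")


# Made-up sample of a matter as read back from the API.
matter = {
    "id": 8842,
    "custom_field_values": [
        {"id": "text_line-555", "value": "12 main st", "custom_field": {"id": 101}}
    ],
}
patch = build_scoped_patch(matter, 8842, 101, "12 Main St")
```

The model supplies only the matter, the field, and the value. The id lookup lives in code, so the definition-id mistake cannot be made from a prompt.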

Scoping also catches field-type constraints that a generic "normalize everything" prompt will trip over. In Clio, free-text fields normalize cleanly, but a picklist field only accepts predefined options of 55 characters or fewer, a currency field rejects decimals, and a date field needs a valid date. So before you promise blanket normalization across a couple of hundred fields, inventory the fields by type. Free text gets the LLM treatment. Picklists get mapped to allowed options. Currency and date get strict validators. The scoped tool refuses to write a value that doesn't fit the field type, which means a model hallucination becomes a rejected call instead of corrupted client data.
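A small sketch of per-type guards along those lines. The type names ("text", "picklist", "currency", "date") and the function validate_value are hypothetical labels for this example, and the ISO date format is an assumption. The rules themselves mirror the constraints described above.

```python
import re
from datetime import date
from typing import List, Optional


def validate_value(
    field_type: str, value: str, picklist_options: Optional[List[str]] = None
) -> str:
    """Return a value that is safe to write, or raise ValueError."""
    if field_type == "text":
        # Free text: collapse runs of whitespace; content is left to the LLM step.
        return " ".join(value.split())
    if field_type == "picklist":
        if value not in (picklist_options or []):
            raise ValueError(f"{value!r} is not an allowed picklist option")
        return value
    if field_type == "currency":
        # Per the constraint above, decimals are rejected; whole numbers only.
        if not re.fullmatch(r"-?\d+", value):
            raise ValueError(f"{value!r} is not a whole-number currency value")
        return value
    if field_type == "date":
        date.fromisoformat(value)  # raises ValueError for an invalid date
        return value
    raise ValueError(f"unknown field type: {field_type}")
```

For example, validate_value("currency", "12.50") and validate_value("date", "2024-02-30") both raise, so the bad value is rejected before any request is built.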

MyCase, PracticePanther, and the rest have their own versions of these constraints. The principle holds across all of them: enumerate what can be written, validate every value against its field type, and make the tool incapable of doing anything broader.

Pattern 3: Confirm gates and dry-run previews

There are two flavors of "are you sure" and you want both available depending on the risk of the write.

Confirm gate. For anything that touches a live matter or produces a client-facing artifact, a human approves before commit. The AI drafts a consult note, fills a field, or assembles a document, then it stops. A person reviews and clicks go. This isn't just good hygiene; it's a professional duty (more on that below).

Dry-run preview. For bulk operations, the tool runs the entire job without writing and shows you a diff: matter by matter, field by field, old value to proposed new value. You read the preview, you approve, then the same job runs for real. We consider this mandatory before any operation that writes to hundreds of fields across many matters. Nobody should trust a batch normalization run they haven't seen previewed first.

Pair the dry-run with per-matter rollback. If the model proposed something wrong and it slipped through, you can revert that one matter without unwinding the whole batch. The combination of preview plus rollback is what lets a founder actually trust automation against their entire book of clients.

The diagnostic: If a write can't be previewed before it happens and reverted after, treat it as a one-off that always needs a human in the loop, not something to automate at scale.
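A rough sketch of preview plus per-matter rollback, using an in-memory dictionary as a stand-in for the case management system. Store, Change, preview, apply_changes, and rollback_matter are hypothetical names for this example. A real implementation would read and write through the API and persist the captured old values.

```python
from dataclasses import dataclass
from typing import Dict, List

# matter_id -> {field name: value}; a stand-in for the real system.
Store = Dict[int, Dict[str, str]]


@dataclass(frozen=True)
class Change:
    matter_id: int
    field: str
    old: str
    new: str


def preview(store: Store, proposed: Store) -> List[Change]:
    """Dry run: list every change, matter by matter, without writing."""
    changes = []
    for matter_id, fields in sorted(proposed.items()):
        for field, new in sorted(fields.items()):
            old = store[matter_id][field]
            if old != new:
                changes.append(Change(matter_id, field, old, new))
    return changes


def apply_changes(store: Store, changes: List[Change]) -> None:
    """Run the approved job for real, using exactly the previewed changes."""
    for change in changes:
        store[change.matter_id][change.field] = change.new


def rollback_matter(store: Store, changes: List[Change], matter_id: int) -> None:
    """Revert one matter using the old values captured in the preview."""
    for change in changes:
        if change.matter_id == matter_id:
            store[change.matter_id][change.field] = change.old
```

Because the same list of Change records drives the preview, the real run, and the rollback, what a person approved is what gets written, and each matter can be reverted on its own.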

Pattern 4: Append-only audit log

Every write Claude triggers should produce a record that the application cannot retroactively alter: who initiated it, when, which matter, which field, the before and after value, and the model's stated reason. Append-only. Exportable.

Our connectors pair local AES-256-GCM encrypted token storage with an append-only audit log that has an export function, framed to line up with ABA Formal Opinion 512's supervision expectations. The point of "append-only" is that the log is evidence. If a value changed, the log says exactly when, by what process, and from what to what. When a client, a partner, or a regulator asks "why does this field say that," you have a clean answer instead of a shrug.
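One way to make "append-only" checkable is to chain each entry to the previous one with a hash, so a later edit to any record breaks verification. The sketch below illustrates that general technique under our own assumptions; it is not a description of how our connectors store the log, and the AuditLog class and record fields are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(record: Dict[str, Any], prev_hash: str) -> str:
    body = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self._entries: List[Dict[str, Any]] = []

    def append(self, record: Dict[str, Any]) -> None:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        self._entries.append(
            {"record": record, "prev_hash": prev_hash, "hash": _digest(record, prev_hash)}
        )

    def verify(self) -> bool:
        """Recompute the chain; any altered or reordered entry returns False."""
        prev_hash = GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if _digest(entry["record"], entry["prev_hash"]) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

    def export(self) -> str:
        """One JSON object per line, suitable for handing to a reviewer."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._entries)
```

Each record would carry the fields listed above: who initiated the write, when, which matter and field, the before and after values, and the model's stated reason.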

This is the layer most DIY automations skip, and it's the one that turns a clever script into something a regulated practice can defend.

How a real write actually completes: the upload flow

Closing the loop, getting a finished document back into the right matter, has its own mechanics that surprise people. In Clio you don't just POST a file. Uploads are a two-step presigned-S3 flow followed by a finalizing call: you POST /documents to get a target, PUT the bytes to the returned put_url, then PATCH the document to mark it fully_uploaded. Miss the final PATCH and the document exists but never finalizes.
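The sequence can be sketched with an injected HTTP function, so the ordering is visible without tying the example to a client library. The request and response shapes here are deliberately simplified placeholders and are assumptions: the real API may nest put_url differently and may require extra headers or fields, so treat only the POST, PUT, PATCH ordering as the point.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: (method, url, json_payload, raw_body) -> parsed response.
Http = Callable[[str, str, Optional[Dict[str, Any]], Optional[bytes]], Dict[str, Any]]


def upload_document(
    http: Http, base_url: str, matter_id: int, name: str, content: bytes
) -> int:
    # Step 1: create the document record and receive an upload target.
    created = http(
        "POST", f"{base_url}/documents", {"name": name, "matter_id": matter_id}, None
    )
    # Step 2: send the bytes to the presigned URL (assumed to sit at the top level).
    http("PUT", created["put_url"], None, content)
    # Final call: mark the document as fully uploaded so it finalizes.
    http("PATCH", f"{base_url}/documents/{created['id']}", {"fully_uploaded": True}, None)
    return created["id"]
```

Wrapping all three calls in one function means a caller cannot forget the final PATCH.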

There's a related detection problem worth flagging. Clio has no document webhook. The webhooks that exist cover activity, bill, calendar_entry, communication, contact, matter, and task, and they auto-expire (3 days by default, 31 max) so you have to renew them on a schedule. If you need to react to a document being generated, you either poll, or you trigger on the matter updated webhook because a stage change is usually the actual cause of generation. Designing around the webhook you wish existed is a common way to ship something that silently never fires.
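A scheduled renewal check can be as small as the sketch below. It assumes you keep your own record of each webhook's id and expiry time. The dictionary shape, the due_for_renewal name, and the 12-hour safety margin are illustrative assumptions, not values from the Clio API.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def due_for_renewal(
    webhooks: List[Dict[str, Any]],
    now: datetime,
    margin: timedelta = timedelta(hours=12),
) -> List[int]:
    """Return ids of webhooks that expire within the safety margin."""
    return [w["id"] for w in webhooks if w["expires_at"] - now <= margin]


now = datetime(2025, 1, 10, 9, 0, tzinfo=timezone.utc)
webhooks = [
    {"id": 1, "expires_at": now + timedelta(hours=6)},  # expiring soon
    {"id": 2, "expires_at": now + timedelta(days=2)},   # still healthy
]
to_renew = due_for_renewal(webhooks, now)
```

Run a check like this on a timer that fires well inside the expiry window, and renew whatever it returns.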

And respect the rate limits. Treat Clio conservatively at roughly 3 requests per second per app, honor the X-RateLimit-* headers, back off on 429s, and paginate at 200 per page. A batch normalization job across a few hundred matters is going to be slow by design, and a tool that ignores throttling will get itself banned mid-run. Build resumability in so a paused job picks up where it stopped.
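A minimal sketch of a paced, resumable batch loop under those limits. The process callable, the checkpoint file format, and the fixed exponential backoff are assumptions for illustration. A production version would also read the X-RateLimit-* headers rather than rely on a fixed interval.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable


def run_resumable(
    matter_ids: Iterable[int],
    process: Callable[[int], int],  # hypothetical: does the work, returns HTTP status
    checkpoint: Path,
    sleep: Callable[[float], None] = time.sleep,
    min_interval: float = 1 / 3,  # roughly 3 requests per second
    max_retries: int = 5,
) -> None:
    # Load the ids finished by an earlier run, if any.
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    for matter_id in matter_ids:
        if matter_id in done:
            continue  # already handled before the job was paused
        for attempt in range(max_retries):
            if process(matter_id) != 429:
                break
            sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
        else:
            raise RuntimeError(f"matter {matter_id} still throttled; stopping")
        done.add(matter_id)
        # Persist progress after every matter so a crash loses at most one.
        checkpoint.write_text(json.dumps(sorted(done)))
        sleep(min_interval)
```

If the job stops for any reason, rerunning it with the same checkpoint file skips the matters that already completed.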

The privacy layer: where does the data actually go?

Scoped writes protect your data inside Clio. The other half of the question is what happens when content leaves Clio to reach the model. For a regulated firm this is where the architecture earns or loses trust.

Use the right Anthropic surface. Zero Data Retention is available on the Anthropic API at the organization level, and a small firm can obtain it. It is not available on Claude Team chat. If your instinct was that the chat product felt wrong for confidential client work, that instinct was correct: it's the wrong surface, not the wrong vendor. Keep the pipeline on the plain Messages API where ZDR applies, and avoid surfaces that retain data by design.

Mind the geography. Anthropic has no Canadian region; the model runs in the US. For a PIPEDA-regulated Canadian practice, the workable mitigation is ZDR plus a US inference-geo pin plus a plain cross-border disclosure to the client. Worth knowing: Clio itself runs a Canadian region at ca.app.clio.com, so your matter data can stay resident in Canada even while the inference step crosses the border for that one call. Confirm which Clio server your account is on and point the integration at the matching base URL.

None of this requires an enterprise contract. We'd steer a small firm away from Anthropic's Enterprise tier here: the seat minimums (20 self-serve, 50 sales-assisted) mean you'd be buying far more than a 10-to-15-person network needs. Org-level ZDR on the API gets you the guarantee without the over-buy.

When the source is a recording: a worked example

A common workflow is turning a client consult call into a structured note. It's a clean illustration of read-first plus scoped writes plus a confirm gate, and it has its own platform gotchas if you wire in Zoom.

The reliable backbone is the cloud-recording VTT transcript. It requires cloud recording turned on (a paid Zoom tier) and transcription enabled, and the transcript is typically ready after a delay of about twice the meeting length, so a 30-minute consult yields a transcript roughly an hour after the call ends. Design for that delay; don't expect the transcript instantly.

You can poll for recordings fully locally. GET /users/me/recordings needs no public endpoint, so the trigger doesn't require a hosted server. What's fragile is the AI Companion summary via the API: the read scope is blocked in a Server-to-Server OAuth app (you'd need a General or account-managed app), and those summaries auto-delete at 30 days. So treat the AI summary as best-effort. If it isn't available, summarize the VTT transcript yourself. Build on the transcript, not the summary.
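Building on the transcript starts with turning the VTT file into plain text for the model. This is a small sketch of that step, assuming a standard WebVTT layout of optional cue numbers, a timing line, then one or more text lines. The function name vtt_to_text is ours, and the sample transcript is made up.

```python
import re

# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.000";
# the hours part is optional in WebVTT.
TIMING = re.compile(r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(\d{2,}:)?\d{2}:\d{2}\.\d{3}")


def vtt_to_text(vtt: str) -> str:
    """Keep only the spoken lines; drop the header, cue numbers, and timings."""
    spoken = []
    in_cue = False
    for raw in vtt.splitlines():
        line = raw.strip()
        if TIMING.match(line):
            in_cue = True   # text lines follow until the next blank line
        elif not line:
            in_cue = False
        elif in_cue:
            spoken.append(line)
    return "\n".join(spoken)


sample = (
    "WEBVTT\n\n"
    "1\n00:00:01.000 --> 00:00:04.000\nLawyer: Thanks for joining.\n\n"
    "2\n00:00:04.500 --> 00:00:07.000\nClient: Happy to be here.\n"
)
transcript = vtt_to_text(sample)
```

The resulting text, together with the matter's existing notes, is what the drafting prompt receives.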

The write side is the same disciplined flow: Claude drafts a standardized note from the transcript and the matter's existing notes, then the document gets uploaded into the correct matter via the two-step presigned-S3 flow, and the lawyer reviews and finalizes before it counts. Read context, scope the write to one matter, gate the result behind a human.

The human gate isn't optional, it's the rule

The Law Society of Ontario's 2024 guidance and ABA Formal Opinion 512 land in the same place: confidentiality, competence, supervision, mandatory human verification of AI outputs, and no billing AI time as lawyer hours. The LSO guidance is non-binding but evolving, and it's conceptually aligned with Opinion 512.

Translated into architecture, that means the confirm gate from Pattern 3 isn't just a safety feature you might add. It's a professional obligation you have to design in. The AI drafts. A licensed person verifies. Only then is anything final. Any vendor or automation that quietly skips the review step to look more "autonomous" is selling you a compliance problem.
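In code, the gate can be a hard precondition on the commit path rather than a UI convention. The sketch below uses hypothetical names (Draft, approve, commit) and an injected write function. It shows the shape of the check and is not our connector's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Draft:
    matter_id: int
    body: str
    approved_by: Optional[str] = None  # set only by a human reviewer

    def approve(self, reviewer: str) -> None:
        if not reviewer.strip():
            raise ValueError("a named reviewer is required")
        self.approved_by = reviewer


def commit(draft: Draft, write: Callable[[int, str], None]) -> None:
    """Write the draft only after a person has approved it."""
    if draft.approved_by is None:
        raise PermissionError("draft has not been reviewed by a person")
    write(draft.matter_id, draft.body)
```

There is no code path from an AI-generated draft to the write call that skips approve, which is the property the guidance is asking for.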

Build it yourself or have it built?

If you're technical enough to install an MCP server and write your own prompts, you can absolutely stand up a version of this. The value is less the capability than the reliability: getting the value-instance-id handling right, building the dry-run and rollback so you trust a large batch run, handling transcript latency and rate limits and webhook expiry, and packaging it so the same workflow runs identically across every lawyer in a network without each person re-solving the same gotchas.

That packaging and replication is where most DIY efforts stall. A script that works on your laptop is not the same as a tool ten colleagues can run safely against live client data. There are no-code routes too. You can wire some of this with n8n or similar automation platforms, and that's a reasonable place to prototype, but the scoped-write validation, the audit log, and the field-type guards are exactly the parts those tools leave to you.

We built our Clio MCP connector and MyCase connector as open source precisely so the read side is free to anyone. The paid work is the safe write layer and the workflow glue around it.

Frequently Asked Questions

Is it safe to give Claude write access to Clio or MyCase?

Yes, if you constrain it. The safe pattern is read-first design (Claude reads context but a human or rule controls writes), scoped writes (each tool can only touch named matters and named fields, never everything), a confirm gate or dry-run preview before anything is committed, and an append-only audit log of every write. Never give an AI agent a blanket write token to your whole case management account.

What is the most common bug when writing Clio custom fields?

Using the field-definition id instead of the value-instance id. When you PATCH a matter with a custom_field_values array, the id inside each entry is the id of that value instance on that specific matter, not the id of the field definition (custom_field:{id}). Send the definition id and you write to the wrong place or fail. It's the single most common trap when building against the Clio API.

Does Clio send a webhook when a document is created?

No. Clio has no document webhook. Webhooks exist for activity, bill, calendar_entry, communication, contact, matter, and task, and they auto-expire (3 days by default, 31 max) so you must renew them. To detect a new or generated document you poll, or trigger on the matter updated webhook when a stage change is the cause of generation.

Can I use Claude for legal work with zero data retention?

Zero Data Retention is available on the Anthropic API at the organization level, and a small firm can obtain it. It is not available on Claude Team chat, so an instinct that the chat product was the wrong surface for confidential work is correct. Keep the pipeline on ZDR-eligible API surfaces (the plain Messages API), avoid surfaces that hold data, and pin inference geography where the API supports it.

Can Claude run on a Canadian region for a PIPEDA firm?

Anthropic has no Canadian region; the model runs in the US. For a PIPEDA-regulated Canadian firm the mitigation is Zero Data Retention plus a US inference-geo pin plus a plain cross-border disclosure. Clio itself runs a Canadian region (ca.app.clio.com), so your case data can stay in Canada even when the AI inference step crosses the border.

Who is responsible when an AI drafts something wrong in a matter?

The lawyer. Both the Law Society of Ontario guidance and ABA Formal Opinion 512 require a human to verify AI outputs, and you cannot bill AI time as lawyer hours. That's why a mandatory human-review gate belongs in the workflow by design: the AI drafts, a person approves before anything is final.

Next Step

If you're running Claude against Clio, MyCase, or PracticePanther and you want it to write back without becoming a liability, the architecture matters more than the model. Read-first, scoped writes, confirm gates, dry-run plus rollback, append-only audit. Get those right and automation stops being scary.

We offer a free 30-minute architecture review. Bring your current setup or your planned workflow, and we'll map where the write risk lives, which gotchas your case management API is hiding, and what a safe version looks like. No pitch deck, just an honest technical conversation.

Book a free architecture review →
