Ambient scribing for ambulatory care.

The problem

Documentation burden is the single biggest driver of clinician burnout. Existing scribing tools generated text, but not the structured note a clinician could sign, and citations back to the source utterance were essentially nonexistent.

Our approach

We built a scribe pipeline with two non-negotiable invariants:

Every claim has a source. The clinician can hover any sentence and see the ambient transcript span that produced it.
Templates are owned by the clinical team. Note structure (SOAP, narrative, etc.) is defined per-specialty in the CMS; the agent fills the slots.
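
A minimal sketch of how the two invariants could be checked in code. Everything here is an assumption made for illustration: the `Span`, `Claim`, `is_resolved`, `citation_failure_rate`, and `fill_template` names, the character-offset citation format, and the SOAP slot names are hypothetical, not the production schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Character offsets into the ambient transcript (hypothetical shape)."""
    start: int
    end: int


@dataclass(frozen=True)
class Claim:
    """One sentence of the drafted note plus the transcript spans it cites."""
    text: str
    sources: tuple[Span, ...]


def is_resolved(claim: Claim, transcript: str) -> bool:
    """Invariant 1: a claim cites at least one non-empty, in-bounds span."""
    return bool(claim.sources) and all(
        0 <= s.start < s.end <= len(transcript) for s in claim.sources
    )


def citation_failure_rate(claims: list[Claim], transcript: str) -> float:
    """Fraction of claims whose citations do not resolve."""
    if not claims:
        return 0.0
    failed = sum(1 for c in claims if not is_resolved(c, transcript))
    return failed / len(claims)


# Invariant 2: the template (assumed slot names) defines the structure;
# the agent may only fill slots that the template declares.
SOAP_TEMPLATE = ("subjective", "objective", "assessment", "plan")


def fill_template(
    template: tuple[str, ...], drafted: dict[str, list[Claim]]
) -> dict[str, list[Claim]]:
    """Return the note in template order; reject slots the template lacks."""
    unknown = set(drafted) - set(template)
    if unknown:
        raise ValueError(f"slots not in template: {sorted(unknown)}")
    return {slot: drafted.get(slot, []) for slot in template}


transcript = "Patient reports a dry cough for three days. No fever."
drafted = {
    "subjective": [
        Claim("Dry cough for three days.", (Span(0, 43),)),
        Claim("Denies fever.", (Span(44, 53),)),
        Claim("No known allergies.", ()),  # no source: violates invariant 1
    ],
}
note = fill_template(SOAP_TEMPLATE, drafted)
all_claims = [c for claims in note.values() for c in claims]
rate = citation_failure_rate(all_claims, transcript)
print(f"citation-failure rate: {rate:.1%}")
```

Tracking this rate per release is one way a regression in citation quality could be caught before a version ships.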

"The first version we shipped had a 12% citation-failure rate. The third had 0.8%. The evals caught every regression."
Eng lead, Healthtech, US

The architecture

02 · Reference architecture

What we actually ship.

Every system we build follows this shape. Client at the edge, tools in a sandbox, traces everywhere, evaluators gating output. No black boxes, no "it works on my machine."

EDGE
Client: web · mobile · API
Gateway: auth · rate-limit · PII redact

ORCHESTRATION
Orchestrator: planner · router · memory
Durable · exactly-once

AGENTS & TOOLS
Retrieval: hybrid · rerank · cite
Tool calls: sandboxed · timeboxed
Evaluator: gates · rubrics · LLM-judge

DATA & TRUST
Vector + BM25: tenant-isolated
Traces / logs: OTel · replayable
Signed output: auditable · rollback
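
To make "evaluators gating output" concrete, here is a rough sketch of a gate that runs cheap deterministic checks first and calls a judge only when they all pass. The `run_gate` function, the check names, the 0.8 threshold, and the `fake_judge` stand-in are all assumptions for illustration; a real LLM judge would replace the stub.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Check = Callable[[str], bool]   # deterministic rule: True means pass
Judge = Callable[[str], float]  # stand-in for an LLM judge: score in [0, 1]


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failed_checks: tuple[str, ...]
    judge_score: Optional[float]  # None when the judge was never called


def run_gate(
    draft: str, checks: dict[str, Check], judge: Judge, threshold: float = 0.8
) -> GateResult:
    """Run deterministic checks first; score with the judge only if all pass."""
    failed = tuple(name for name, check in checks.items() if not check(draft))
    if failed:
        return GateResult(False, failed, None)
    score = judge(draft)
    return GateResult(score >= threshold, (), score)


# Hypothetical deterministic checks on a drafted sentence.
checks: dict[str, Check] = {
    "non_empty": lambda d: bool(d.strip()),
    "has_citation_marker": lambda d: "[" in d and "]" in d,
}


def fake_judge(draft: str) -> float:
    """Placeholder for a rubric-based LLM judge; always returns 0.9."""
    return 0.9


result_ok = run_gate("Dry cough for three days [1].", checks, fake_judge)
result_bad = run_gate("Dry cough for three days.", checks, fake_judge)
print(result_ok)
print(result_bad)
```

Ordering the deterministic checks before the judge means a draft that fails a hard rule never spends judge tokens.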

A run, on this system

04 · Run

Watch an agent do the job.

Three real production scenarios, replayed at observed latency. Every box is a span; every span has tokens, cost, and an eval gate. This is what shows up in your traces, not a marketing animation.

POST /api/v1/agent/run

7-minute ambient consult → clinician-ready note with source-linked citations.

trace · clinical · span_id 7c1f…

orchestrator.run · 0ms

PLAN · orchestrator.plan · orchestrator · 80ms · 178 tok · $0.0013

Plan rationale

Scribe pipeline: ASR with medical lexicon, retrieve patient context + template, draft per-SOAP-section, evaluator gates clinical-safety claims.

Subtasks

tool.asr · retrieval.context · reasoner · evaluator

LATENCY · budget 3.00s

TOKENS · in + out

COST · budget $0.025

EVAL GATE · deterministic + LLM-judge

READY · clinician review

SOAP note drafted. 14 citations resolved. 2 low-confidence claims surfaced for clinician.

latency 3.12s · cost $0.022 · tokens 3,104 · evals 9/9
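
A summary line like the one above can be derived by rolling up per-span records. The sketch below assumes a hypothetical `TraceSpan` shape, sequential spans (so latencies add), and one eval gate per span. Only the `orchestrator.plan` figures come from the trace shown here; the `tool.asr` values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceSpan:
    """One span of a run (hypothetical record shape)."""
    name: str
    latency_ms: int
    tokens: int
    cost_usd: float
    eval_passed: bool


def summarize(spans: list[TraceSpan]) -> str:
    """Roll sequential spans up into a single summary line."""
    latency_s = sum(s.latency_ms for s in spans) / 1000
    cost = sum(s.cost_usd for s in spans)
    tokens = sum(s.tokens for s in spans)
    passed = sum(1 for s in spans if s.eval_passed)
    return (
        f"latency {latency_s:.2f}s · cost ${cost:.3f} · "
        f"tokens {tokens:,} · evals {passed}/{len(spans)}"
    )


spans = [
    # Figures taken from the PLAN span in the trace.
    TraceSpan("orchestrator.plan", 80, 178, 0.0013, True),
    # Illustrative values only; not taken from the trace.
    TraceSpan("tool.asr", 1200, 0, 0.0040, True),
]
summary = summarize(spans)
print(summary)
```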

Outcomes

Time saved / clinician / week

Citation-failure rate

Clinician sign-off rate