A clinician-facing chronic-care platform with patient intake, structured encounter notes, and an AI scribe that earns clinician trust.
CADUCEUS needed to ship a working clinician console (intake, scheduling, encounter notes, an AI-assisted scribe, and an audit trail) for a pilot with two clinics. Two cycles, sequenced. The scribe was the part they feared would not work.
▷ outcomes
Scribe accuracy (cardio + endocrine): 94%
Time-per-encounter note: −63%
PHI incidents in pilot window: 0
Pilot launch (2 cycles · 38h spare): T+1,402h
[ §01 ] the cycle
How 720 hours actually ran.
Cycle 1 · Day 01 — 30
Console + intake + encounter notes
First cycle: clinician console, patient intake flow, scheduling, structured encounter notes with a hard-coded template per condition. No AI in the loop yet — we wanted the human workflow correct before agents touched it.
↳clinician console ↳intake flow ↳encounter template v1
Cycle 2 · Day 01 — 18
AI scribe + eval harness
build.agent shipped the scribe — Whisper transcription, Claude-based note structuring, clinician confirmation step. Crucially, qa.agent built an eval harness that ran every PR against 47 anonymized historical transcripts. Below threshold, the PR didn't merge.
↳scribe v1 ↳eval harness ↳47 anonymized transcripts
Cycle 2 · Day 19 — 26
Privacy posture + audit trail
Scoped IAM, encryption at rest with customer-managed keys, a signed audit log, and ephemeral inference (no transcript retention beyond clinician confirmation). CADUCEUS's compliance counsel reviewed the architecture mid-cycle.
↳IAM scoped ↳CMK keys ↳audit log
Cycle 2 · Day 27 — 30
Pilot rollout
Two clinics, six clinicians, soft launch behind a feature flag. Daily check-ins with the founding clinician for the first 14 days post-launch. Scribe accuracy threshold met from day three.
↳pilot live ↳clinician training ↳30d on-call
[ §02 ] agent log · selected
What the loop looked like.
[ §03 ] notes from the cycle
CADUCEUS sits in the category of healthtech that fails ninety percent of the time: clinician-facing AI in a regulated workflow. The technical risk isn’t the model. The risk is shipping something a clinician will turn off after the second use because it gets their language wrong.
How we sized the engagement
Two cycles, sequenced — not concurrent. The first cycle shipped the human-only workflow. The second cycle layered the scribe on top. This sequencing was non-negotiable: we don’t build AI into a workflow that doesn’t yet exist as a human process.
What the eval harness actually looks like
qa.agent maintains a corpus of 47 anonymized historical transcripts (provided with explicit patient consent, redacted of identifiers, hashed). Every PR that touches the scribe is run against the corpus and scored on three axes — structural compliance with the encounter template, clinical-vocabulary fidelity, and omission rate of clinically significant information. Below threshold on any axis, the PR does not merge.
The corpus is the product. We treat it that way.
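A minimal sketch of how a three-axis merge gate like this could be expressed, for readers who want something concrete. It is illustrative only, not the harness qa.agent runs. The threshold values, the `TranscriptScore` and `gate` names, and the choice to average each axis across the corpus are all assumptions; the write-up does not publish the real thresholds or the aggregation rule.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds: the real values are not published in this write-up.
THRESHOLDS = {
    "structure": 0.95,   # minimum mean structural compliance with the template
    "vocabulary": 0.90,  # minimum mean clinical-vocabulary fidelity
    "omission": 0.05,    # maximum mean omission rate of significant information
}


@dataclass(frozen=True)
class TranscriptScore:
    """Scores for one corpus transcript, each in the range 0.0 to 1.0."""
    transcript_id: str
    structure: float
    vocabulary: float
    omission: float


def gate(scores):
    """Return (passed, failures). Any axis below threshold blocks the merge."""
    if not scores:
        # An empty run must never pass by accident.
        return False, ["no transcripts scored"]

    # Assumption: each axis is averaged over the whole corpus.
    structure = mean(s.structure for s in scores)
    vocabulary = mean(s.vocabulary for s in scores)
    omission = mean(s.omission for s in scores)

    failures = []
    if structure < THRESHOLDS["structure"]:
        failures.append(f"structure {structure:.3f} below {THRESHOLDS['structure']}")
    if vocabulary < THRESHOLDS["vocabulary"]:
        failures.append(f"vocabulary {vocabulary:.3f} below {THRESHOLDS['vocabulary']}")
    if omission > THRESHOLDS["omission"]:
        failures.append(f"omission {omission:.3f} above {THRESHOLDS['omission']}")
    return (not failures, failures)
```

In a CI job, the boolean would decide the exit code and the failure list would go into the PR comment. A stricter variant could gate on the worst single transcript rather than the mean, so one badly handled encounter cannot hide behind the average.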
What HIPAA-aware actually means in our delivery
We’re not a HIPAA auditor and we don’t claim to be. What we do is bring a posture: scoped IAM by default, customer-managed encryption keys, signed audit trails, ephemeral inference, business-associate-agreement-ready architecture. We partnered with CADUCEUS’s compliance counsel from week two. The certifications belong to the client; the readiness is what we deliver.
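To make "signed audit trails" concrete, here is one common pattern, sketched under stated assumptions: each entry carries an HMAC over its event plus the previous entry's signature, so editing or removing any earlier entry breaks verification. This is not necessarily the scheme used for CADUCEUS. The function names, the entry layout, and the use of a symmetric HMAC key (rather than asymmetric signatures or a managed signing service) are hypothetical choices for illustration.

```python
import hashlib
import hmac
import json

# Placeholder "previous signature" for the first entry in the chain.
GENESIS = "0" * 64


def _sign(key: bytes, prev_sig: str, event: dict) -> str:
    """HMAC-SHA256 over the previous signature and a canonical JSON event."""
    payload = json.dumps({"prev": prev_sig, "event": event}, sort_keys=True)
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event, chaining it to the signature of the last entry."""
    prev_sig = log[-1]["sig"] if log else GENESIS
    entry = {"event": event, "prev": prev_sig, "sig": _sign(key, prev_sig, event)}
    log.append(entry)
    return entry


def verify_log(log: list, key: bytes) -> bool:
    """Recompute every signature in order; any edit or gap fails the check."""
    prev_sig = GENESIS
    for entry in log:
        if entry["prev"] != prev_sig:
            return False
        expected = _sign(key, prev_sig, entry["event"])
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, entry["sig"]):
            return False
        prev_sig = entry["sig"]
    return True
```

In a real deployment the key would live in a key-management service rather than in application memory, and the log would sit in append-only storage. The chain only detects tampering; it does not prevent it.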
from the founder
"I've been a part of three health-tech builds before this one. The first time the engineering team understood that the scribe is the clinical relationship — not just an LLM call."
— Founding clinician · CADUCEUS