INSIGHTS

Writing on agents in production.

What we've learned from shipping. No takes on whether agents are real.

// featured post

EVALS · MAY 12, 2026 · 8 MIN

Why your eval suite belongs in CI.

If your evals don't run on every PR, they're not evals — they're a screenshot from a meeting in March.

Read · Engineering team

ENGINEERING · APR 28, 2026 · 6 MIN

Durable workflows are not retries.

A reasonable mental model for when you actually need a durable orchestrator vs. just better backoff.

Engineering team

INDUSTRY · APR 14, 2026 · 5 MIN

A2A and MCP: pick both.

They solve adjacent problems. The mistake is treating them as a fork in the road.

EVALS · APR 02, 2026 · 10 MIN

How to build a retrieval eval set in a week.

A practical recipe: 200 questions, your real corpus, baselines, and a "do not optimize this" set.

PROCESS · MAR 20, 2026 · 7 MIN

The runbook that ships with every agent.

Failure modes, escalations, rollback conditions. Boring on purpose.

Operations team

PROCESS · MAR 04, 2026 · 4 MIN

Why we run on-call for 60 days after handoff.

Because the first 60 days surface every assumption the eval suite missed.

Operations team

START

Ship the first system.

Fixed-price discovery in 2 weeks. You leave with an architecture, a working spike, and a build plan.

Start a project · See engagement models