Writing

Field notes.

Hands-on writing on AI-native engineering, evals, durable workflows, and the operator side of building companies. From what I have shipped.

AI-native engineering

Latest
AI engineering 16 min read

OCR and the offline reality: a primer on reading text on a phone with no signal

Why OCR is harder than it looks, why the public datasets are skewed toward Latin scripts, and why production offline OCR still runs on classical and CNN/transformer engines like Tesseract, Apple Vision, ML Kit, and PaddleOCR rather than a full VLM on the phone.

Jul 2026
Engineering 14 min read

Scaling WebSockets: what changes when the connection never closes

A realtime chat UI is easy with one server. The hard part starts when a message for user A arrives on a server that does not hold A's socket. This is a primer on why WebSockets scale differently, the pub/sub backplane that fixes the fan-out, and the connection, presence, backpressure, and reconnect problems you inherit by keeping the connection open.

Jul 2026
Engineering 13 min read

WebSockets, SSE, and HTTP/3 from first principles

A primer on the real-time web protocols, derived from scratch: what problem each one solves, how it works on the wire, when to use which, and how HTTP/2 and HTTP/3 (QUIC) change the tradeoffs. With real-world analogies and connection-model diagrams.

Jul 2026
AI engineering 13 min read

Building a ChatGPT-style streaming chat from scratch

A primer on token streaming for chat UIs, built from first principles: why streaming matters, SSE vs WebSockets on the wire, server and client message state, rendering partial tokens, cancellation, and streaming tool calls. With concrete examples from the open-source Alfred codebase.

Jul 2026
AI engineering 16 min read

Propose, verify, iterate: what a compiler paper and my coding-agent fleet have in common

ComPilot (PACT 2025) lets an off-the-shelf LLM optimize loops by proposing transformations a compiler checks for legality and times for speedup. Alfred, my open-source coding-agent fleet, runs the same propose-verify-iterate shape against tests, review, and CI instead of a compiler. A first-principles comparison of the loop, the verifier, and the memory, with the paper's numbers cited.

Jul 2026
AI engineering 12 min read

Writing the verifier is the real work in agent RL

A reward function is a specification of correctness, not a scalar loss. Strict final-state equality is fine for a leaderboard and wrong for RL in real domains. The verifier is about 215 lines; understanding the domain is the cost.

Jun 2026
Engineering 12 min read

Durable by design: two ways to survive failure in AI-native workflows

Every AI-native workflow eventually fails partway through. There are two coherent ways to survive it: resume the work with a durable-execution engine, or make every step idempotent and re-derivable. When to use which, from systems running both.

Jun 2026
AI engineering 50 min read

Deploying a customer-support AI agent to production: testing it, then trusting the numbers

A practitioner's primer on taking a vendor-hosted AI support agent to production and operating it: test it like a customer before launch, then measure what it solves once it is live, because the headline resolution rate overstates the win. Built around Zendesk AI Agents and Ultimate AI.

Jun 2026
AI engineering 13 min read

Three levers for cutting your coding agent's token bill, and how to tell which one works

Coding agents burn tokens on naive grep-and-read loops and raw tool output. Three independent levers cut the bill. I ran a real internal benchmark on a code-graph stack over 17 repos, and the lever I expected to win did not. Here is what I measured, per engine, behind a quality gate.

May 2026
AI engineering 14 min read

Loops, harnesses, and memory: what it takes to run an autonomous coding fleet

What it takes to run coding agents unattended in production: the loop that fires and recovers, the harness that contains a fully-trusted model, and memory that stays reviewable. A practitioner's guide from building Alfred.

May 2026
Engineering 15 min read

Where ML actually fits in a strategy trading system, from first principles

Most 'AI trading' talk skips the part that matters: the system a model has to live inside. A strategy needs features, a label, an offline train step, a backtest that reuses the live code exactly, a serving path into the signal, and guardrails that can veto the model. Here is the whole loop built from scratch, grounded in a real quantitative engine.

May 2026
Engineering 11 min read

The translation layer: bridging static-auth clients to short-lived-token backends

A lot of practical AI infra work is impedance matching between API shapes that almost agree. Here is what I learned putting a small local proxy in the seam between static-bearer clients and Vertex AI's 60-minute OAuth tokens.

Apr 2026
Engineering 14 min read

Event sourcing and CQRS for a trading engine, from first principles

A trading engine has to answer 'how did we get here' for every position and never lose an order to a crash. Event sourcing stores the facts, not the state, and CQRS splits the write path from the read path. Here is how both work, built up from scratch, with sanitized Axon code from a real order book.

Apr 2026
Engineering 12 min read

Idempotency is the load-bearing primitive for ingesting other people's data

When you ingest data from external providers you control none of the delivery semantics. The fix is not cleaner delivery. It is making application correctness invariant to how data arrives, with idempotency as the one primitive everything else leans on.

Apr 2026
Engineering 12 min read

A daily hotel KPI pipeline that survives repeated snapshots

Building a daily hotel-performance KPI pipeline with dbt and DuckDB: dedup before you aggregate, treat occupancy and revenue status asymmetrically, validate at two grains, build byte-reproducibly, and gate CI on an independent reconciliation.

Mar 2026
Engineering 14 min read

Attributing Snowflake cost to the query that caused it

Snowflake bills compute by the second but tells you the total, not who spent it. Here is the architecture I built to reconstruct per-query cost from ACCOUNT_USAGE, the apportionment math that makes it work, the auto-suspend lever that paid for most of the savings, and how it runs as a Snowflake Native App.

Jan 2026

Building companies

Event marketing 12 min read

How to measure B2B event ROI

Events run 10-20% of a B2B marketing budget and most teams cannot prove the return. A first-principles framework: fully loaded cost, sourced vs influenced pipeline, an attribution window matched to your sales cycle, cost per opportunity, and why follow-up speed is the cheapest lever on the funnel.

Feb 2026
Startup 13 min read

From idea to MVP in 3 months: a solo founder's playbook

How I built Mainteny's MVP solo in three months by shipping one workflow and cutting everything else, including the routing detour that cost me two sprints. Core loop first, infrastructure last, boring stack, customers before the product worked.

Dec 2025
Founder sales 12 min read

Technical founders can sell: here's how

Engineers have a real advantage in modern B2B sales: they understand the product most deeply and can answer the question that decides the deal. A first-principles guide to founder-led selling: help instead of pitch, do discovery, handle objections with honesty, and know when to hire.

Nov 2025
Fundraising 13 min read

The seed fundraising playbook

A first-principles guide to raising a first seed round in a tight market: how the raise is a funnel, cold outreach that gets replies, what investors actually underwrite, and the mistakes that cost us weeks. From co-founding Mainteny.

Oct 2025
Startup journey 16 min read

Building a $3.6M ARR product inside a consulting firm

What it takes to build a real software business inside a large consulting firm: distribution before product, the organizational immune system, and why I owned the data layer myself. Lessons from taking Aura at Bain from $0 to $3.6M ARR.

Sep 2025
Founder lessons 12 min read

Too early: what a teen banking startup taught me about timing and unit economics

My first startup was Spendzer, a banking app for teenagers in Germany. The idea was right and the timing was wrong. Here is what building it taught me about unit economics, being early, and why two engineers have to learn to sell.

Aug 2025