Senior AI Systems Architect
Strategy. Architecture. Implementation. Operations.
Most AI architects stop at #2 and hand you a deck. I run all four — and the ship gates that keep the system honest once it ships. Receipts, not claims: one operator, ~30 concurrent coding agents, 55 merged PRs/day on average, built on Next.js, Supabase, Fly, and MCP.
What does a senior AI systems architect do?
A senior AI systems architect owns the path from strategy to production: readiness audit, agent and MCP topology, typed contracts, eval gates, and the ship gates that keep the system honest.
The job is the whole loop, not a deck. In this practice, that means agent fleets, MCP servers, and typed Next.js + Supabase stacks running under branch protection, evals, and audit trails: one coherent system, with one operator accountable end-to-end.
The four domains — and what most architects skip
Strategy
- Readiness audit: Yes, 14-point checklist
- Pilot-to-production roadmap: Yes, 90-day default
- Build-vs-buy calls: Yes, defended in writing
- Risk & compliance framing: Yes, incl. EU AI Act framing
Architecture
- Agent & MCP topology: Yes, production patterns
- Typed contracts: Yes, Zod and OpenAPI
- Eval gates: Yes, golden sets per route
- Cost & latency budgets: Yes, explicit per-endpoint
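As a rough illustration of how "golden sets per route" and "explicit per-endpoint" budgets can combine into one gate, here is a minimal sketch. Every name in it (`GoldenCase`, `RouteBudget`, `runEvalGate`) is hypothetical and not taken from this practice's codebase. It assumes exact-match scoring and a synchronous handler, which real evals rarely get away with.

```typescript
// Hypothetical shapes for illustration only.
interface GoldenCase {
  input: string;
  expected: string;
}

interface RouteBudget {
  maxLatencyMs: number; // latency ceiling for this endpoint
  minPassRate: number;  // fraction of golden cases that must pass, 0..1
}

interface GateResult {
  route: string;
  passRate: number;
  worstLatencyMs: number;
  ok: boolean;
}

// Assumed handler shape: returns the output plus its measured latency.
type Handler = (input: string) => { output: string; latencyMs: number };

function runEvalGate(
  route: string,
  handler: Handler,
  golden: GoldenCase[],
  budget: RouteBudget,
): GateResult {
  // An empty golden set proves nothing, so the gate fails rather than passes.
  if (golden.length === 0) {
    return { route, passRate: 0, worstLatencyMs: 0, ok: false };
  }

  let passed = 0;
  let worstLatencyMs = 0;
  for (const c of golden) {
    const { output, latencyMs } = handler(c.input);
    if (output === c.expected) passed += 1; // exact match: a simplifying assumption
    worstLatencyMs = Math.max(worstLatencyMs, latencyMs);
  }

  const passRate = passed / golden.length;
  // Both the quality bar and the latency budget must hold.
  const ok =
    passRate >= budget.minPassRate && worstLatencyMs <= budget.maxLatencyMs;
  return { route, passRate, worstLatencyMs, ok };
}
```

The design choice worth noting is that quality and latency are checked in the same result, so a route cannot pass its evals while breaching its budget.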
Implementation
- Agent fleet build: Yes, 30+ concurrent agents
- MCP servers in production: Yes, AppHandoff and MCP Beast
- Next.js + Supabase + Fly: Yes, default stack
- CI/CD + branch protection: Yes, CI Gate with auto-merge on green
Operations
- Evals & regressions: Yes, per-output schemas
- Audit & review trails: Yes, every agent action logged
- Cost & drift monitoring: Yes
- On-call runbooks: Yes, handed off, not hoarded
Most architects stop at #2. The interesting part is #3 and #4 — where designs survive Friday deploys, or don't.
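To make the "every agent action logged" item under Operations concrete, here is a minimal sketch of an append-only audit trail. It is an in-memory illustration under stated assumptions: the `AuditEntry` fields and the `AuditLog` class are hypothetical names, and a production trail would persist to durable storage rather than an array.

```typescript
// Hypothetical entry shape for illustration only.
interface AuditEntry {
  readonly seq: number;     // 1-based position in the log
  readonly agentId: string; // which agent acted
  readonly action: string;  // what it did, e.g. "open_pr"
  readonly target: string;  // what it acted on, e.g. a PR or ticket id
  readonly at: string;      // ISO 8601 timestamp
}

class AuditLog {
  private readonly entries: AuditEntry[] = [];

  // Append one entry; entries are frozen so they cannot be edited afterwards.
  record(
    agentId: string,
    action: string,
    target: string,
    now: Date = new Date(),
  ): AuditEntry {
    const entry: AuditEntry = Object.freeze({
      seq: this.entries.length + 1,
      agentId,
      action,
      target,
      at: now.toISOString(),
    });
    this.entries.push(entry);
    return entry;
  }

  // Return a filtered copy, so callers cannot mutate the underlying log.
  byAgent(agentId: string): readonly AuditEntry[] {
    return this.entries.filter((e) => e.agentId === agentId);
  }

  get size(): number {
    return this.entries.length;
  }
}
```

The class exposes no update or delete method, which is the point: a review trail is only useful if nothing in it can be rewritten after the fact.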
Why this practice is different
MCP fluency, in production
AppHandoff is a named MCP server running in production — agent orchestration, ticket lifecycle, typed tool catalogues. Not a tutorial. A live system.
Multi-agent orchestration patterns
~30 concurrent coding agents (Claude Code, Cursor, Codex, Copilot) coordinated by one human. Patterns that don't collapse under load — written up, not whiteboarded.
Agent-fleet ops
infra-gha-runners-fly: self-hosted GitHub Actions fleet on Fly.io, CI Gate fail-closed, GitHub-native auto-merge on green, and 63 reusable composite actions across the fleet.
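"Fail-closed" means the gate blocks a merge unless every required check has explicitly reported success; a missing, pending, or skipped check counts as a block. The sketch below shows that decision rule only. The names `CheckStatus`, `GateDecision`, and `ciGate` are hypothetical, and this is not the actual CI Gate implementation or the GitHub API.

```typescript
// Hypothetical status values for illustration only.
type CheckStatus = "success" | "failure" | "pending" | "skipped";

interface GateDecision {
  allow: boolean;
  blockedBy: string[]; // names of required checks that did not report success
}

function ciGate(
  required: readonly string[],
  reported: ReadonlyMap<string, CheckStatus>,
): GateDecision {
  // A gate with no required checks is treated as misconfigured and blocks.
  if (required.length === 0) {
    return { allow: false, blockedBy: ["<no required checks configured>"] };
  }

  // Anything other than an explicit "success" blocks, including a check
  // that never reported at all (Map.get returns undefined).
  const blockedBy = required.filter(
    (name) => reported.get(name) !== "success",
  );
  return { allow: blockedBy.length === 0, blockedBy };
}
```

A fail-open gate would instead block only on an explicit "failure", which lets a crashed or never-started check slip through. With auto-merge on green, that difference decides whether a silent runner outage can merge unverified code.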
Receipts, not claims
55 merged PRs/day on average over the last 30 days. Peak 111 on 2026-05-21. Green-to-merge in minutes, not days. Every number is measured, not aspirational.
Frequently asked
- What does a senior AI systems architect actually do?
- A senior AI systems architect owns the path from strategy to production: AI readiness audit, agent and MCP topology, typed contracts, eval gates, and the ship gates that keep the system honest. The job is the whole loop, not a deck and a hand-off. In our work, that means agent fleets, MCP servers, typed Next.js APIs, branch protection, evals, and audit trails living in one coherent stack.
- How is this different from a fractional CTO?
- A fractional CTO owns the whole technical org for a few days a week — hiring, vendor calls, board reporting, the lot. An AI systems architect goes deeper on one thing: the architecture and ship gates of the AI system itself. Most engagements run as a fractional CTO with the architect work baked in, because the same person designs the agent topology and signs off the build.
- Fractional vs full-time AI architect — which one do I need?
- Full-time makes sense once you have a permanent AI roadmap, a team to architect for, and the budget for a senior salary. Fractional is the right call before that: when you're past the demo and the production design has to be picked once, correctly. Two to three days a week, three to six months, then either continue or hand off to the hire that the architecture justified.
- What stack do you architect on?
- Next.js + Supabase + Fly + Cloudflare + Infisical as the default, with MCP servers wherever agents need typed tool catalogues. The agent fleet runs on infra-gha-runners-fly — about 30 concurrent coding agents averaging 55 merged PRs/day. New stacks are evaluated against that baseline, not adopted because they trend on Twitter.
- How do we start?
- Three steps. 1) A 20-minute call to scope what's actually being architected. 2) A two-week architecture sprint: readiness audit, topology, eval plan, ship-gate design. 3) Implementation, either with our build team or alongside yours. Book a call on the contact page.
One operator, one swarm. If the architecture has to be picked once, correctly — book a call.