Methodology · Not a gimmick
The experiment
Subterrans is a real game being built by a single developer with AI coding agents doing a real share of the work. Not a toy, not a demo reel for prompt engineering, not a pair-programming anecdote — a game with architectural discipline, a real test suite, and code that has to stand up to another agent's review before it merges.
This page is the short version of how that's structured. Longer write-ups land in the devlog.
How the work flows
- 01
PRD before code
Every phase starts as a requirements document, not an implementation: goals, success criteria, and explicit non-goals. A separate planning pass breaks the PRD into a set of atomic plans with dependencies. Only then does anything get coded.
- 02
GSD workflow
The project ships as a series of numbered phases, each with its own roadmap, plans, research, and audit. Artifacts live in a private `.planning/` directory versioned alongside the code. That structure is what keeps long-running AI-assisted work coherent across sessions.
- 03
Dual reviewer on every PR
Claude and Codex review every PR independently. Either can block; both must approve for merge. Two perspectives catch more than one, and the disagreements are often more instructive than the agreements.
- 04
Architectural tripwires
ESLint rules reject `Math.random()`, `Date.now()`, float arithmetic, and cross-boundary imports inside `src/sim/`. A custom `subterrans-sim-discipline` skill loads on any sim-layer change and walks the checklist. Violations fail the build instead of waiting for a reviewer to catch them.
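As a rough illustration of what such tripwires can look like, here is a minimal sketch of an ESLint flat-config entry scoped to `src/sim/`. It uses only ESLint core rules (`no-restricted-properties`, `no-restricted-imports`, `no-restricted-syntax`). The project's actual rule set is not shown on this page, so the `render` and `ui` directory names, the messages, and the choice to flag the `/` operator as a stand-in for "float arithmetic" are all assumptions.

```typescript
// Hypothetical config entry, not the project's real rules.
// Linting .ts files would also need a TypeScript parser configured elsewhere.
const simDiscipline = {
  files: ["src/sim/**/*.ts"],
  rules: {
    // Ban nondeterministic globals inside the simulation layer.
    "no-restricted-properties": [
      "error",
      { object: "Math", property: "random", message: "Use the seeded PRNG." },
      { object: "Date", property: "now", message: "No wall-clock time in the sim." },
    ],
    // Ban imports that cross out of the sim layer (directory names are assumed).
    "no-restricted-imports": [
      "error",
      {
        patterns: [
          { group: ["**/render/**", "**/ui/**"], message: "The sim must not import from the render side." },
        ],
      },
    ],
    // One possible proxy for float arithmetic: flag raw division so callers
    // go through an integer helper instead.
    "no-restricted-syntax": [
      "error",
      { selector: "BinaryExpression[operator='/']", message: "Use an integer division helper." },
    ],
  },
};
```

An entry like this would sit in the array exported from `eslint.config.js`, and any violation then surfaces as a lint error in CI rather than as a review comment.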
- 05
Headless by default
The simulation runs in Node with no browser. Tests assert on WorldState, not pixels. Determinism tests replay the same seeded input and compare final state bit-for-bit. The render layer is a thin skin over something fully verifiable.
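A determinism test of this kind might look roughly like the sketch below. Everything here is a stand-in: the `WorldState` shape, `step`, and `runSim` are hypothetical, the PRNG is a mulberry32-style generator chosen for brevity, and comparing serialized JSON is a simple substitute for whatever bit-level comparison the real suite uses.

```typescript
// Illustrative only: a toy world with integer positions and PRNG state stored in the world.
interface WorldState { tick: number; rngState: number; antX: number[]; }
type Input = { tick: number; antIndex: number; dx: number };

// mulberry32-style step; keeping the state inside WorldState makes runs replayable.
function nextU32(world: WorldState): number {
  world.rngState = (world.rngState + 0x6d2b79f5) | 0;
  let t = world.rngState;
  t = Math.imul(t ^ (t >>> 15), t | 1);
  t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
  return (t ^ (t >>> 14)) >>> 0;
}

function createWorld(seed: number, antCount: number): WorldState {
  return { tick: 0, rngState: seed | 0, antX: new Array<number>(antCount).fill(0) };
}

// Advance one tick: apply any inputs recorded for this tick, then a seeded random walk.
function step(world: WorldState, inputs: readonly Input[]): void {
  for (const input of inputs) {
    if (input.tick === world.tick) world.antX[input.antIndex] += input.dx;
  }
  for (let i = 0; i < world.antX.length; i++) {
    world.antX[i] += (nextU32(world) % 3) - 1; // move -1, 0, or +1 cells
  }
  world.tick += 1;
}

// Run headlessly (no browser, no renderer) and return the serialized final state.
function runSim(seed: number, inputs: readonly Input[], ticks: number): string {
  const world = createWorld(seed, 4);
  for (let t = 0; t < ticks; t++) step(world, inputs);
  return JSON.stringify(world);
}

const recordedInputs: Input[] = [
  { tick: 3, antIndex: 1, dx: 5 },
  { tick: 10, antIndex: 0, dx: -2 },
];
const firstRun = runSim(1234, recordedInputs, 200);
const secondRun = runSim(1234, recordedInputs, 200);
const statesMatch: boolean = firstRun === secondRun;
```

Because the only sources of variation are the seed and the recorded inputs, any mismatch between the two runs points at nondeterminism that leaked into the sim.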
- 06
Atomic commits, goal-backward verification
Each plan lands as one commit that satisfies one slice of the phase goal. A verifier re-reads the PRD after the phase ships and proves the delivered code actually satisfies the stated criteria — not just that tasks got checked off.
The seven non-negotiables
These are the architectural rules the agents work inside. They're what make multiplayer, testability, multi-platform, and agent comprehension all feasible on the same foundation.
- Simulation isolated from rendering (ESLint-enforced).
- Fixed 20 Hz tick. No variable timestep.
- Single seeded PRNG; no wall-clock time in the sim.
- Integer fixed-point math for all simulation quantities.
- Lightweight ECS — data is data, behavior is functions.
- Shared colony code for player and AI.
- Snapshot saves with replay logging.
They're non-negotiable in the literal sense: breaking them doesn't produce a review comment, it fails the build.
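To make two of these rules concrete, the sketch below pairs a fixed 20 Hz tick with integer fixed-point quantities. It is a minimal illustration under stated assumptions, not the project's code: the 8-bit fractional precision, the `Clock` shape, and every function name are hypothetical. The accumulator lives on the host side, so the sim itself only ever sees whole ticks.

```typescript
// All names and constants below are illustrative assumptions.
const TICK_HZ = 20;
const TICK_MS = 1000 / TICK_HZ; // 50 ms per tick
const FP_SHIFT = 8;             // 8 fractional bits (hypothetical precision)
const FP_ONE = 1 << FP_SHIFT;   // 256 represents 1.0

// Convert a whole number to fixed-point.
const fromInt = (whole: number): number => whole << FP_SHIFT;
// Fixed-point multiply; exact as long as the product stays below 2^53.
const fpMul = (a: number, b: number): number => Math.trunc((a * b) / FP_ONE);

interface Clock { accumulatorMs: number; tick: number; }

// Host-side loop: turn elapsed real time into whole fixed-size ticks.
// Leftover time stays in the accumulator; the sim never sees a variable timestep.
function advance(clock: Clock, elapsedMs: number, stepSim: (tick: number) => void): number {
  clock.accumulatorMs += elapsedMs;
  let ticksRun = 0;
  while (clock.accumulatorMs >= TICK_MS) {
    stepSim(clock.tick);
    clock.tick += 1;
    clock.accumulatorMs -= TICK_MS;
    ticksRun += 1;
  }
  return ticksRun;
}

let positionFp = 0;
const velocityFp = FP_ONE + FP_ONE / 2; // 1.5 cells per tick, stored as 384
const clock: Clock = { accumulatorMs: 0, tick: 0 };
const move = (): void => { positionFp += velocityFp; };

const ticksA = advance(clock, 120, move); // 120 ms covers 2 ticks, 20 ms left over
const ticksB = advance(clock, 30, move);  // leftover 20 ms + 30 ms covers 1 more tick
const doubled = fpMul(velocityFp, fromInt(2)); // 1.5 * 2 in fixed-point
```

After both calls the position is 1152 in fixed-point, which is 4.5 cells, and it would be the same on any machine regardless of how the 150 ms of real time was split across frames.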
Tools
- Claude Code: Primary orchestrator. Planning, execution, code review.
- Codex: Delegated subagent for mechanical work. Second independent reviewer on every PR.
- GSD: The planning-and-execution framework. Open source. What keeps multi-day agent work coherent.
- subterrans-sim-discipline: Custom skill. Loads whenever work touches the sim layer; walks the seven principles as a checklist before any change ships.
- subterrans-pr-review: Custom skill that reviews diffs against the architectural principles. Used locally before opening PRs and by both reviewer agents in CI.
What this isn't
It isn't "the AI built a game while I watched." It's a loop where a human picks scope, writes direction, and reviews the output; the agents do the bulk of the typing, the dumb parts, and a shocking amount of the thinking, inside a shape where mistakes get caught fast. The interesting observation isn't that they can code — it's that they can stay inside a constraint system for weeks without drifting, provided the constraints are real.