Methodology · Not a gimmick

The experiment

Subterrans is a real game being built by a single developer with AI coding agents doing a real share of the work. Not a toy, not a demo reel for prompt engineering, not a pair-programming anecdote — a game with architectural discipline, a real test suite, and code that has to stand up to another agent's review before it merges.

This page is the short version of how that's structured. Longer write-ups land in the devlog.

How the work flows

01
PRD before code

Every phase starts as a requirements document, not an implementation. Goals, success criteria, and explicit non-goals. A separate planning pass breaks the PRD into a set of atomic plans with dependencies. Only then does anything get coded.
02
GSD workflow

The project ships as a series of numbered phases, each with its own roadmap, plans, research, and audit. Artifacts live in a private `.planning/` directory versioned alongside the code. The structure is what makes long-running AI-assisted work stay coherent across sessions.
03
Dual reviewer on every PR

Claude and Codex review every PR independently. Either can block; both must approve for merge. Two perspectives catch more than one, and the disagreements are often more instructive than the agreements.
04
Architectural tripwires

ESLint rules reject `Math.random()`, `Date.now()`, float arithmetic, and cross-boundary imports inside `src/sim/`. A custom `subterrans-sim-discipline` skill loads on any sim-layer change and walks the checklist. Violations fail the build instead of surfacing as code-review comments.
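As a rough illustration of what such tripwires could look like, here is a minimal sketch in the shape of an ESLint flat-config entry. `no-restricted-properties` and `no-restricted-imports` are core ESLint rules; the glob patterns, the messages, the `render` directory name, and the `simTripwires` identifier are assumptions made for this example, not the project's actual configuration. The float-arithmetic ban is left out because it would most likely need a custom rule.

```typescript
// Hypothetical sketch of a flat-config entry scoped to the sim layer.
// In a real project this array would be the default export of eslint.config.js.
// Paths, messages, and names below are illustrative assumptions.

type Severity = "off" | "warn" | "error";
type RuleSetting = Severity | [Severity, ...unknown[]];

interface ConfigEntry {
  files: string[];
  rules: Record<string, RuleSetting>;
}

const simTripwires: ConfigEntry[] = [
  {
    // Only files under the simulation directory are affected.
    files: ["src/sim/**/*.ts"],
    rules: {
      // Reject nondeterministic sources of randomness and time.
      "no-restricted-properties": [
        "error",
        { object: "Math", property: "random", message: "Use the seeded PRNG." },
        { object: "Date", property: "now", message: "No wall-clock time in the sim." },
      ],
      // Reject imports that reach from the sim into the render layer.
      "no-restricted-imports": [
        "error",
        {
          patterns: [
            { group: ["**/render/**"], message: "The sim must not import rendering code." },
          ],
        },
      ],
    },
  },
];

console.log(JSON.stringify(simTripwires, null, 2));
```

Because the rules are set to `"error"`, a lint step in CI would exit non-zero on any violation, which is what turns a convention into a build failure.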
05
Headless by default

The simulation runs in Node with no browser. Tests assert on WorldState, not pixels. Determinism tests replay the same seeded input and compare final state bit-for-bit. The render layer is a thin skin over something fully verifiable.
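A determinism test of this kind can be sketched in a few lines. Everything below is a hypothetical stand-in: the real `WorldState` shape and step function are not shown here, so `WorldState`, `nextRandom`, `step`, and `runSim` are assumptions chosen only to show the pattern of running the same seed twice and comparing final state.

```typescript
// Hypothetical stand-ins for the real sim; names and fields are assumptions.
interface WorldState {
  tick: number;
  rngState: number; // PRNG state lives in the world, so a replay restores it
  antX: number[];   // integer positions only
}

// Small 32-bit linear congruential generator using integer math only.
function nextRandom(state: number): number {
  return (Math.imul(state, 1664525) + 1013904223) >>> 0;
}

function createWorld(seed: number, ants: number): WorldState {
  return { tick: 0, rngState: seed >>> 0, antX: new Array<number>(ants).fill(0) };
}

// Advance one tick: every ant moves by -1, 0, or +1, chosen by the seeded PRNG.
function step(world: WorldState): WorldState {
  let rng = world.rngState;
  const antX = world.antX.map((x) => {
    rng = nextRandom(rng);
    return x + ((rng >>> 16) % 3) - 1;
  });
  return { tick: world.tick + 1, rngState: rng, antX };
}

// Headless run: no browser, no rendering, just state in and state out.
function runSim(seed: number, ticks: number): WorldState {
  let world = createWorld(seed, 8);
  for (let i = 0; i < ticks; i++) world = step(world);
  return world;
}

// Replay the same seed twice and compare the serialized final states.
const firstRun = runSim(1234, 200);
const secondRun = runSim(1234, 200);
const identical = JSON.stringify(firstRun) === JSON.stringify(secondRun);
console.log(identical ? "deterministic" : "diverged");
```

Comparing serialized state stands in for a bit-for-bit comparison here; it is exact in this sketch because every field is an integer.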
06
Atomic commits, goal-backward verification

Each plan lands as one commit that satisfies one slice of the phase goal. A verifier re-reads the PRD after the phase ships and proves the delivered code actually satisfies the stated criteria — not just that tasks got checked off.

The seven non-negotiables

These are the architectural rules the agents work inside. They're what make multiplayer, testability, multi-platform support, and agent comprehension all feasible on the same foundation.

Simulation isolated from rendering (ESLint-enforced).
Fixed 20 Hz tick. No variable timestep.
Single seeded PRNG; no wall-clock time in the sim.
Integer fixed-point math for all simulation quantities.
Lightweight ECS — data is data, behavior is functions.
Shared colony code for player and AI.
Snapshot saves with replay logging.

They're non-negotiable in the literal sense: breaking them doesn't produce a review comment, it fails the build.
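To make two of these rules concrete, here is a minimal sketch of a fixed 20 Hz tick driving integer fixed-point state. The names (`createDriver`, `Mover`, `posMilli`, `velMilliPerTick`) and the scale of 1000 units per whole unit are assumptions for illustration; the project's actual representation is not described here.

```typescript
// Hypothetical sketch: fixed timestep on the host side, integer math in the sim.
const TICK_RATE_HZ = 20;
const TICK_MS = 1000 / TICK_RATE_HZ; // 50 ms per simulation tick

// Sim quantities are stored as integers scaled by 1000 (an assumed scale).
interface Mover {
  posMilli: number;        // position in thousandths of a unit
  velMilliPerTick: number; // velocity in thousandths of a unit per tick
}

// One simulation tick: integer addition only, no delta time, no floats.
function simTick(mover: Mover): void {
  mover.posMilli += mover.velMilliPerTick;
}

// The host loop feeds in variable elapsed time; the sim only ever sees whole ticks.
function createDriver(onTick: () => void) {
  let accumulatorMs = 0;
  let ticksRun = 0;
  return {
    advance(elapsedMs: number): void {
      accumulatorMs += elapsedMs;
      while (accumulatorMs >= TICK_MS) {
        accumulatorMs -= TICK_MS;
        onTick();
        ticksRun++;
      }
    },
    getTicksRun(): number {
      return ticksRun;
    },
  };
}

const mover: Mover = { posMilli: 0, velMilliPerTick: 125 }; // 0.125 units per tick
const driver = createDriver(() => simTick(mover));

// Four 16 ms frames (64 ms total) yield one tick; a 120 ms stall yields two more.
for (let i = 0; i < 4; i++) driver.advance(16);
driver.advance(120);

console.log(driver.getTicksRun(), mover.posMilli); // prints: 3 375
```

Wall-clock time appears only in the driver, outside the sim. The tick function itself takes no time argument, so the same sequence of ticks produces the same state regardless of frame rate.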

Tools

Claude Code: Primary orchestrator. Planning, execution, code review.
Codex: Delegated subagent for mechanical work. Second independent reviewer on every PR.
GSD: The planning-and-execution framework. Open source. What keeps multi-day agent work coherent.
subterrans-sim-discipline: Custom skill. Loads whenever work touches the sim layer; walks the seven principles as a checklist before any change ships.
subterrans-pr-review: Custom skill that reviews diffs against the architectural principles. Used locally before opening PRs and by both reviewer agents in CI.

What this isn't

It isn't "the AI built a game while I watched." It's a loop where a human picks scope, writes direction, and reviews the output; the agents do the bulk of the typing, the dumb parts, and a shocking amount of the thinking, inside a shape where mistakes get caught fast. The interesting observation isn't that they can code — it's that they can stay inside a constraint system for weeks without drifting, provided the constraints are real.
