Lights-out software delivery

The factory runs dark. The people moved up.

A dark factory isn't no humans. It's no humans on the floor — because the floor is specified, jigged, and gated tightly enough to build web software unattended. We didn't remove anyone. We moved them up a level: from writing code to defining intent and the checks that decide what ships.

The line / six stations

Raw intent goes in. Shippable software comes out.

Warm stations run automated. Cool stations are where human judgment lives. Nothing ships until it passes inspection.

Specification

The raw material. Domain models, stories, contracts — unambiguous, machine-readable. Output quality is capped here.

Human layer

Jigs

House standards as tooling — skills, conventions, guardrails. The fixtures that stop the machine drifting from your style.

Tooling

Machines

Headless agents under orchestration. They produce candidates in parallel — never the final word on quality.

Automated

Inspection

Every build is checked by deterministic inspectors — build, full tests, mutation, architecture, contracts. Nothing ships until it passes.

Automated

Exception Desk

A person handles escalations and judgment calls, and owns the release. The lights-on layer the dark floor depends on.

Human layer

Ratchet

Every escaped defect upgrades a jig or gate, so the next defect of that class is caught automatically. The line gets better while you sleep.

Compounding

Machine layer: runs automated
Human layer: where judgment lives

For everyone

One claim, stated plainly.

"Can a software company raise its quality bar and get far more done — without it turning into AI hand-waving I can't trust?"

The shift

Most "AI delivery" pitches ask you to trust the model. This one doesn't. The model produces candidates; a fixed set of deterministic checks decides what's acceptable. The reliability comes from the checks, not from faith in the machine.

Why it holds

Quality is enforced before anyone sees the work. Standards live in tooling, not in one person's head. Every build leaves an audit trail. And a human still owns the release.

The honest part

It's only as good as the spec and the checks. Garbage intent in, garbage out — so the real work moves up front, to getting the requirement right. This is a twilight factory, not a fully dark one: lights mostly off, a human at the exception desk and the release sign-off.

What changes, in one line each

Owners get more delivered, to a higher standard, against a signed spec.
Developers design the line instead of typing boilerplate.
Design & accounts own the scarce human layer: intent and relationship.
Finance sees margin decouple from headcount, with controllable variable cost.

Station 04 / the trust mechanism

Inspection is the whole safety system.

This is what lets the lights go off. Each check is a deterministic inspector the agent must pass — no negotiation. The model's job is to produce something that survives inspection; inspection, not your trust in the model, is the guarantee.

Compile & analyzers

It builds clean, with nullability and static analysis enforced as errors, not warnings.

Test suite

The full suite passes. Coverage thresholds are a gate, not a dashboard.

Mutation testing

Catches tests that assert nothing. Small bugs are injected on purpose; if the suite still passes, it isn't biting.
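A toy illustration of the idea, not a real mutation-testing tool. Every name here is hypothetical, and the mutant is written by hand, whereas real tools generate mutants automatically. It shows why a test with no assertions lets a mutant survive.

```python
from typing import Callable

Predicate = Callable[[int], bool]


def is_adult(age: int) -> bool:
    """Original behaviour under test."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-written mutant: the boundary operator >= was changed to >."""
    return age > 18


def weak_test(fn: Predicate) -> bool:
    """Calls the code but asserts nothing, so it can never fail."""
    fn(18)
    return True


def strong_test(fn: Predicate) -> bool:
    """Pins the boundary: 18 counts as adult, 17 does not."""
    return fn(18) is True and fn(17) is False


def kills_mutant(test: Callable[[Predicate], bool]) -> bool:
    """A test kills the mutant if it passes on the original and fails on the mutant."""
    return test(is_adult) and not test(is_adult_mutant)
```

Under these assumptions, `kills_mutant(strong_test)` is true and `kills_mutant(weak_test)` is false: the weak test gives the same green result for correct and broken code.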

Architecture fitness

Layering and boundary rules are tested like any other behaviour.
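One way a boundary rule could be expressed as a test, sketched with Python's standard `ast` module. The three layer names and the `ALLOWED` table are assumptions for illustration, not a prescribed architecture.

```python
import ast

# Hypothetical layering rule: each layer may import only from the layers listed.
ALLOWED = {
    "domain": set(),
    "application": {"domain"},
    "infrastructure": {"domain", "application"},
}


def imported_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of source code."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found


def boundary_violations(layer: str, source: str) -> list[str]:
    """List the known layers this source imports but is not allowed to depend on."""
    known_layers = imported_packages(source) & set(ALLOWED)
    return sorted(known_layers - ALLOWED[layer] - {layer})
```

A fitness test then asserts that `boundary_violations(...)` is empty for every file in a layer, so a domain module that reaches into infrastructure fails the build like any other broken behaviour.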

Contract tests

API output is checked against the signed contract from the spec.
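A minimal sketch of the contract check, assuming the contract can be reduced to a map of field names to types. `ORDER_CONTRACT` and its fields are invented for illustration; a real line would more likely validate against a schema format such as OpenAPI or JSON Schema.

```python
# Hypothetical contract: field name -> required Python type.
ORDER_CONTRACT: dict[str, type] = {"id": str, "total_cents": int, "currency": str}


def contract_errors(payload: dict, contract: dict[str, type]) -> list[str]:
    """Compare an API response with the contract and list every mismatch."""
    errors = []
    for name, expected in contract.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif type(payload[name]) is not expected:  # exact match, so True is not an int
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in payload.keys() - contract.keys():
        errors.append(f"unexpected field: {name}")
    return sorted(errors)
```

An empty list means the output matches the contract; anything else fails the gate with a reason a person or an agent can act on.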

Secrets & security scan

Hooks block anything that leaks a secret or trips a security rule.
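A sketch of what such a hook might do, assuming it receives the changed lines as text. The three patterns are illustrative only; a real scanner ships a much larger, maintained rule set.

```python
import re

# Illustrative patterns only, not a complete rule set.
SECRET_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS-style access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hard-coded password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]"),
}


def scan(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every line that trips a rule."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits


def hook_exit_code(lines: list[str]) -> int:
    """Non-zero is the usual way a hook signals that the change is blocked."""
    return 1 if scan(lines) else 0
```

The point of the design is that the hook returns a verdict, not a warning: any hit blocks the change until the line is removed.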

AI review — advisory only

A model reviews for smell and intent. It advises. It never authorizes a release.
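The split between blocking checks and advisory review can be sketched in a few lines. `CheckResult` and `may_proceed_to_release_review` are hypothetical names; the sketch assumes each inspector reports a simple pass or fail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    advisory: bool = False  # advisory results are reported but never gate


def may_proceed_to_release_review(results: list[CheckResult]) -> bool:
    """True only if at least one deterministic check ran and all of them passed.

    Passing only queues the build for human sign-off; it does not ship anything.
    """
    blocking = [r for r in results if not r.advisory]
    return bool(blocking) and all(r.passed for r in blocking)
```

In this sketch a failing AI review cannot stop a build and a passing one cannot rescue it. Only the deterministic results count, and an empty set of them counts as a failure.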

For finance / unit economics

The shape of the numbers.

An illustrative feature of representative size, traditional delivery versus the line. The structure is the point; the figures are placeholders for you to set from your own data.

Traditional delivery

Quality baseline: Best-effort
Tests & monitoring: Often deferred
Throughput: Baseline
Variable cost: n/a
Cost driver: Headcount-hours

The line

Quality baseline: Enforced
Tests & monitoring: Built in
Throughput: Multiplied
Variable cost: Model spend (capped)
Cost driver: Spec + checks

// Illustrative only — replace every figure with your measured numbers before presenting.

Read this first / where it stays human

Twilight, not fully dark — by choice.

The credible version of this is honest about its edges. Three places the human stays, deliberately.

It can't decide what you haven't

The line can't disambiguate a requirement the client hasn't settled. Undecided intent surfaces as an escalation, not a guess.
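A sketch of how "escalation, not a guess" might look at intake. The `Spec` shape, the `open_questions` field, and the routing labels are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    story: str
    acceptance_criteria: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)  # decisions still pending


def route(spec: Spec) -> tuple[str, list[str]]:
    """Send a spec to the build queue only when nothing is left undecided."""
    reasons = list(spec.open_questions)
    if not spec.acceptance_criteria:
        reasons.append("no acceptance criteria")
    return ("escalate", reasons) if reasons else ("build", [])
```

Nothing in this sketch lets the agent fill the gap itself: an unanswered question goes to a person, along with the reason it was raised.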

More of the known, not the unknown

It excels at producing well-understood shapes at speed. Novel architecture and genuinely new problems are still human work.

A human owns the release

Unattended merge to a client's production is a commercial and contractual decision, kept human on purpose — not a technical limitation.

Adoption / a real sequence

How a company stands the line up.

Smallest trustworthy loop first — not the whole cathedral. Each step earns the next.

Codify the spec format

One accepted input: story, acceptance criteria, contract. Intent becomes machine-readable.

Encode standards as jigs

Consolidate house patterns into skills, conventions, and hooks the agent can't ignore.

Build inspection first

Make the deterministic checks genuinely strict before automating any generation.

Wire one event-triggered loop

Spec → headless build → inspection → human review. One task class, one repo.
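The loop above can be sketched as a single function. `build` stands in for whatever headless agent you run and `inspect` for your deterministic checks; both are hypothetical callables. The retry-with-feedback behaviour and the attempt limit are assumptions, not something the sequence prescribes.

```python
from typing import Callable, Optional


def run_loop(
    spec: str,
    build: Callable[[str, list[str]], str],  # (spec, previous failures) -> candidate
    inspect: Callable[[str], list[str]],     # candidate -> list of failed checks
    max_attempts: int = 3,
) -> tuple[str, Optional[str]]:
    """Return ("human_review", candidate) on a clean pass, else ("escalate", None)."""
    failures: list[str] = []
    for _ in range(max_attempts):
        candidate = build(spec, failures)
        failures = inspect(candidate)
        if not failures:
            # Inspection passed: hand the candidate to a person, never straight to production.
            return ("human_review", candidate)
    # The agent could not satisfy inspection within the budget: a human takes over.
    return ("escalate", None)
```

Both exits end at a person. The only difference is whether they receive a candidate that survived inspection or a task the machine could not finish.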

Add orchestration

Only once that loop is boring: parallel agents under an orchestrator you control.

Install the ratchet

Every escaped defect updates a jig or check, so its whole class can't return.
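A minimal sketch of a ratchet as a grow-only registry of checks. The `Ratchet` class and the example defect class are hypothetical; in practice the "check" might be a new test, analyzer rule, or hook instead of a Python callable.

```python
from typing import Callable

Check = Callable[[str], bool]  # takes a candidate artifact, returns True if it passes


class Ratchet:
    """Grow-only registry: checks can be added, never removed or replaced."""

    def __init__(self) -> None:
        self._checks: dict[str, Check] = {}

    def record_escaped_defect(self, defect_class: str, check: Check) -> None:
        # Keep the first check registered for a class; later reports cannot weaken it.
        self._checks.setdefault(defect_class, check)

    def failures(self, candidate: str) -> list[str]:
        """Names of every recorded defect class the candidate reintroduces."""
        return sorted(name for name, check in self._checks.items() if not check(candidate))
```

For example, after an unbounded query escapes to production, registering a check that requires a `LIMIT` clause means every later candidate is tested for that class of mistake.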

Widen the envelope

Grant more autonomy per task class as trust data accumulates — never globally at once.
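One way to keep autonomy scoped per task class is to derive it from that class's own record. The thresholds, level names, and `TrustRecord` fields below are placeholders to set from your own trust data, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustRecord:
    runs: int             # completed loops for this task class
    escaped_defects: int  # defects found after inspection had passed


# Placeholder thresholds; replace with values from your own measurements.
MIN_RUNS = 50
MAX_ESCAPE_RATE = 0.02


def autonomy_level(record: TrustRecord) -> str:
    """Decide the review level for one task class from its own history."""
    if record.runs < MIN_RUNS:
        return "review_every_change"  # not enough evidence yet
    if record.escaped_defects / record.runs > MAX_ESCAPE_RATE:
        return "review_every_change"  # evidence says the checks still leak
    return "sampled_review"


def policy(records: dict[str, TrustRecord]) -> dict[str, str]:
    """Autonomy is computed per task class, never as one global switch."""
    return {task_class: autonomy_level(r) for task_class, r in records.items()}
```

Because each class is judged on its own history, a well-proven task class can loosen up while a new or leaky one stays under full review.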