Level 3 · The Full Workflow

Part 3 of 3, for Janika

Inside the
machine
room.

You’ve driven the car. Now let’s look under the bonnet. Feature packs, phase gates, parallel reviews, the whole workflow that makes bigger builds possible.

Project: Ship It · Stage: 03 of 03 · Scale: Graduate · Rev: A

Chapter 01 · Welcome back

Six things you
already know.

Say the answer out loud before you reveal it. You’ve met all of these in Parts 1 and 2.

The four files in state/?

NOW.md, IMPLEMENTATION_PROGRESS.md, KNOWN_PITFALLS.md, DECISION_LOG.md. The project’s heartbeat. Claude reads them at session start.


What a hook does?

Fires at a fixed tool-call moment. Enforces a rule automatically. Auto-format, scope-guard, no-verify blocker.


What /implementation-guide enforces?

The visual-phase gate (design audit + screenshots required before greenlighting a visual phase) and a context-budget check before each phase.


Worktrees, in one line?

Two copies of your project on one Mac. Two Claudes can work on different features in parallel without colliding.


The 70% problem?

AI gets you 70% fast. The last 30% (edges, consistency, the details) is where projects stall. CLAUDE.md + quality gates close the gap.


/wrap vs asking for a commit?

Asking Claude to commit saves a local checkpoint. /wrap tidies the state/ files (NOW.md, progress, pitfalls) and commits those locally at session end. Neither pushes. Say “push” when you want it on GitHub.


Self-efficacy check

You’ve already run half of the commands you’ll see in this deck. This isn’t new territory; it’s the same ground, one floor deeper.

Chapter 02 · The big picture

Three decks,
one journey.

Before we dive in, here’s where each deck lives. Part 3 is the inside of the stages Part 2 named.

Think of it as a drawing scale: Part 1 is 1:200 (whole site). Part 2 is 1:50 (floor plans). Part 3 is 1:5 (details). Same building.

Chapter 03 · The unit of work

A product is
a stack of packs.

Trying to build a whole product at once is like laying out a whole hospital in one drawing. It doesn’t work. Ship It breaks every product into feature packs: a single feature’s worth of spec, contract, states, acceptance, and build plan, all in one folder.

01

What’s in a pack

__spec.md (what it does), __contracts.md (exact field names and API shapes), __states.md (every UI state), __acceptance.md (pass/fail tests), and, when planning begins, __build.md (the phase plan).
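
To make the pack idea concrete, here is a minimal sketch of a completeness check for one pack folder. The file suffixes come from the list above; the function name, the folder layout, and matching by suffix (rather than exact file name) are assumptions for illustration, not something Ship It is confirmed to do.

```python
from pathlib import Path

# Suffixes taken from the deck. Matching by suffix is an assumption,
# since the deck does not show what (if anything) precedes "__".
REQUIRED_SUFFIXES = ("__spec.md", "__contracts.md", "__states.md", "__acceptance.md")
PLANNING_SUFFIX = "__build.md"  # only expected once planning has begun


def missing_pack_files(pack_dir, planning_started=False):
    """Return the required suffixes that no file in pack_dir ends with."""
    names = [p.name for p in Path(pack_dir).iterdir() if p.is_file()]
    wanted = REQUIRED_SUFFIXES + ((PLANNING_SUFFIX,) if planning_started else ())
    return [s for s in wanted if not any(n.endswith(s) for n in names)]
```

An empty list back means the folder has everything a pack needs at that stage.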

02

Vertical slices, not horizontal layers

Each phase inside a pack builds a vertical slice: a small end-to-end piece that works and is testable. Types + DB + backend + UI for one feature. Never “do all the backend, then all the UI”: that way lies endless integration pain.

Architectural parallel

Like issuing a single room in full: plans, sections, details, M&E, finishes schedule, the works. When it’s signed off, you move to the next room. The room is the unit of delivery, not the drawing type.

Chapter 04 · Catching the mistakes before building

Before you write
a single line, audit.

Writing a spec is not the same as writing a good spec. Ship It runs five passes over every blueprint before building begins. Each pass catches different kinds of problem.

A

Hygiene

Terminology consistent? Names match? Cross-references resolve? Every named thing defined? Catches the silly stuff before it hardens into the build.

A.5

Navigation

A dedicated sweep just for screen-to-screen flow. Does every “tap here” lead somewhere real? Any orphaned screens? This is where UX confusion is caught.

B

Contract tracing

Every field in every contract traced back to the spec. If the spec says “booking has a date” and the contract doesn’t, the contract is wrong.

C

Flow validation

End-to-end user journeys walked through. If booking requires login but login isn’t in the flow, that’s a blocker. Found at audit, not at build.

D

Dry-run buildability

Can someone build this from the docs alone, without asking you questions? If the answer is “maybe”, the blueprint isn’t ready.

✓

Result: ready to build

(not a pass, an outcome)

Only when all five passes above are clean. This discipline is the single biggest difference between projects that ship and projects that stall.
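
Two of the passes above are mechanical enough to sketch in code. What follows is a rough illustration of Pass B (contract tracing) and Pass A.5 (orphaned screens), assuming the spec fields, contract fields, and screen links have already been pulled out into plain lists and dicts. The function names and data shapes are hypothetical; the real passes read the blueprint documents themselves.

```python
from collections import deque


def untraced_fields(spec_fields, contract_fields):
    """Pass B sketch: report fields present on one side but not the other."""
    spec, contract = set(spec_fields), set(contract_fields)
    return {
        "missing_from_contract": sorted(spec - contract),
        "invented_in_contract": sorted(contract - spec),
    }


def orphaned_screens(links, start):
    """Pass A.5 sketch: screens that cannot be reached from the start screen.

    links maps each screen name to the screens its taps lead to.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in links.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    every_screen = set(links) | {t for targets in links.values() for t in targets}
    return sorted(every_screen - seen)
```

If the spec says a booking has a date and the contract leaves it out, "date" shows up under missing_from_contract. That is the booking example from Pass B, caught before any code exists.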

Chapter 05 · Before you draw, list everything that needs drawing

Screen inventory,
room data sheets.

Before starting any visual build, Ship It takes a deliberate pause. Two skills fire in sequence.

01

/screen-inventory

Produces a comprehensive list of every screen in the product: what it is, which feature pack owns it, what states it has. Once per project, at the top.

Like a room data sheet before detailed design starts.

02

/design-preflight

Fires before each visual phase. Enriches the builder prompt with motion specs, touch targets, glass/material choices, asset needs, and emotional tone. Closes gaps that would otherwise get winged.

Like a pre-issue checklist before any drawing set leaves your desk.

Why both

Screen inventory gives you the total picture. Design preflight makes each phase specific. Together, the design never feels improvised.

Chapter 06 · Your materials library

Every project
needs stuff.

Icons, images, illustrations, animations, screen mockups, copy. Collectively: assets. Ship It’s /asset-forge inventories what you need and writes locked prompts for each. The trick is picking the right tool for each type.

A flat-lay of design assets: colour swatches, an icon reference card, typography specimen, a pen and bulldog clip

Hero images

GPT Image 2.0

OpenAI’s latest image model. Great for atmospheric, magazine-style visuals. Best when you want something photoreal or illustrated with real style.

Paid, via OpenAI API

Variations & volume

Gemini 2.5 / Imagen

Google’s image generation. Good when you need ten variants cheaply, or want to explore styles before committing.

Free tier, then paid

Full screen mockups

Stitch (via MCP)

Claude can drive Stitch directly to generate complete screens in your design system, with layout, components, and typography applied. Best for fast visual iteration.

Subscription

Icons, always

Icons8

Curated library of thousands of icons in matching families. Drop-in consistent assets, every time. Nothing you generate will match this for uniformity.

Free tier generous

Precision diagrams

Inline SVG (from Claude)

For charts, architectural diagrams, custom shapes. Claude writes SVG directly. Sharper, lighter, more controllable than any raster alternative.

Free

Motion

CSS, Framer Motion, Lottie

CSS keyframes for simple reveals. Framer Motion for complex interactive UI animation. Lottie for designer-authored JSON animations imported from After Effects.

Free / library-dependent

The principle

Match the tool to the task. GPT Image 2.0 for a hero photo, never for a chart icon. Inline SVG for your itinerary’s route line, never for a person’s face. Icons8 for the button glyphs, never hand-drawn.

Try this: “Claude, draw me an SVG of a five-stop trip as a transit-style route. One coloured line, a round marker at each stop, the next stop filled in, stop names underneath.”
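
For a feel of what comes back from a prompt like that, here is a small sketch that builds the same kind of SVG from a list of stop names. The function name, sizes, and default colour are illustrative assumptions; Claude would normally write the SVG markup directly rather than a generator.

```python
from xml.sax.saxutils import escape


def route_svg(stops, next_index, colour="crimson", width=600):
    """Build a transit-style route: one line, a marker per stop, names below."""
    if len(stops) < 2:
        raise ValueError("need at least two stops")
    margin, y, radius = 50, 40, 8
    step = (width - 2 * margin) / (len(stops) - 1)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} 90">',
        f'<line x1="{margin}" y1="{y}" x2="{width - margin}" y2="{y}" '
        f'stroke="{colour}" stroke-width="4"/>',
    ]
    for i, name in enumerate(stops):
        x = margin + i * step
        fill = colour if i == next_index else "white"  # only the next stop is filled
        parts.append(
            f'<circle cx="{x:.1f}" cy="{y}" r="{radius}" fill="{fill}" '
            f'stroke="{colour}" stroke-width="3"/>'
        )
        parts.append(
            f'<text x="{x:.1f}" y="{y + 30}" text-anchor="middle" '
            f'font-size="12">{escape(name)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Save the returned string as a .svg file and open it in a browser. Every shape is a line of text you can read and adjust, which is why SVG suits precise diagrams better than a generated image.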

Chapter 07 · Checks that fire at every phase

Seven gates.
Plus a design loop.

Every phase of every pack passes seven gates before it can be called done. For visual phases, a small between-phase design loop fires too. This is how Ship It stops Claude from shipping mess.

The seven gates (sequential)

1

Spec check

Does the code match the spec exactly?

2

Design check

Tokens, layout, no raw hex.

3

Pitfalls

Traps from KNOWN_PITFALLS avoided?

4

Security

Auth, injection, exposed secrets.

5

Code quality

No duplicates, no dead imports.

6

Terminal

lint + typecheck + test = zero errors.

7

Done-when

Every done-when checklist item verified.
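
“Sequential” can be sketched as a simple loop. This is an illustration only: the deck says the gates run in order, and stopping at the first failure is an assumption made here, as are the function names and the idea that each gate is a callable returning True or False.

```python
def run_gates(gates):
    """Run (name, check) pairs in order; stop at the first gate that fails.

    Each check is a zero-argument callable returning True when the gate passes.
    Returns (passed_names, failed_name_or_None).
    """
    passed = []
    for name, check in gates:
        if not check():
            return passed, name
        passed.append(name)
    return passed, None


def terminal_gate(exit_codes):
    """Gate 6 sketch: lint, typecheck and test must all exit with code zero."""
    return all(code == 0 for code in exit_codes.values())
```

A phase counts as done only when the second value returned is None, meaning no gate failed.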

The visual loop (after visual phases)

/design-audit → code-only check
↓
/design-advisor → suggest fixes
↓
/design-build → apply them
↓
/feel-check → on-device test

Runs between every visual phase. Small, tight, repeatable. Design never drifts.

Chapter 08 · Hooks deeper, plus parallel development

Two force
multipliers.

Hooks beyond the basics

Part 2 introduced three beginner hooks. Ship It’s starter pack adds five more worth knowing once you’re committing real work:

credential-read-guard: blocks reads of credential files
known-pitfalls: reminds of recorded traps before risky ops
playbook-capture-nudge: soft reminder to capture learnings
filing-scratch-guard: prevents writes to scratch paths
code-quality-check: runs terminal gates after edits

Grow the set by incident. Every hook that earns a place there does so because something real went wrong once.
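
To show what the decision inside a hook like credential-read-guard might look like, here is a minimal sketch. The file patterns and function names are assumptions for illustration; the deck does not list which files the real hook guards, and how the hook is wired into Claude is left out entirely.

```python
from pathlib import PurePosixPath

# Illustrative patterns only; not the real hook's list.
CREDENTIAL_NAMES = {".env", ".netrc", "credentials.json", "id_rsa", "id_ed25519"}
CREDENTIAL_SUFFIXES = (".pem", ".key")


def is_credential_path(path):
    """Return True when the file name looks like a credential file."""
    name = PurePosixPath(path).name
    return (
        name in CREDENTIAL_NAMES
        or name.startswith(".env.")
        or name.endswith(CREDENTIAL_SUFFIXES)
    )


def guard_read(path):
    """Return (allowed, message) for a requested file read."""
    if is_credential_path(path):
        return False, f"Blocked: {path} looks like a credential file."
    return True, ""
```

The rule is tiny on purpose. A hook earns its place by blocking one specific thing reliably, not by being clever.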

Parallelism: two agents, two features

/parallelism-advisor studies your plan and tells you:

→ which packs can run in parallel worktrees
→ which phases inside a pack are independent (the Diamond Pattern)
→ a ready-made worktree setup command
→ sequential vs parallel timeline comparison

Diamond Pattern example
Phase 1 (types + DB)
↓ fan out
Phase 2 (backend)
Phase 3 (UI shell)
↓ converge
Phase 4 (integration)
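
The diamond above can be worked out mechanically from “which phase depends on which”. Here is a sketch using Python’s standard graphlib module (Python 3.9+). The phase names are the ones from the example; the function name is hypothetical, and this is not a description of how /parallelism-advisor works inside.

```python
from graphlib import TopologicalSorter

# Each phase maps to the phases it depends on (the diamond from the example).
DEPENDS_ON = {
    "Phase 1 (types + DB)": set(),
    "Phase 2 (backend)": {"Phase 1 (types + DB)"},
    "Phase 3 (UI shell)": {"Phase 1 (types + DB)"},
    "Phase 4 (integration)": {"Phase 2 (backend)", "Phase 3 (UI shell)"},
}


def parallel_waves(depends_on):
    """Group phases into waves; phases in one wave don't depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Calling parallel_waves(DEPENDS_ON) gives three waves: Phase 1 alone, then Phases 2 and 3 together (the fan-out), then Phase 4 (the converge). Any wave with more than one phase is a candidate for parallel worktrees.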

Chapter 09 · The truth about AI-written code

LLMs make
the same mistakes
every human does.

Just faster. Dead code, repeated logic, silent failures, drifted names, scope creep. Ship It’s whole pipeline is, in effect, a set of specialised mistake-catchers. Here’s what each one catches.

Dead code (never-run branches)

caught by /pr-review-toolkit:code-simplifier (every 3rd phase)

Duplicated logic across files

caught by /pr-review-toolkit:code-reviewer

Silent failures (try/catch swallow)

caught by /pr-review-toolkit:silent-failure-hunter

Field names drift from spec

caught by /review-diff

Security holes (auth bypass, injection)

caught by /security-review (required every pack)

Scope creep (edits outside phase)

caught by scope-guard hook + /diff-audit

Missing tests

caught by /pr-review-toolkit:test-coverage-analyzer

Comments that lie about the code

caught by /pr-review-toolkit:comment-accuracy-checker

The real win

These don’t run one by one. At the end of a pack, /pr-review-toolkit:review-pr spawns its six specialists in parallel, while /review-diff and /security-review run alongside. /diff-audit follows once those three have finished. Most of the team reviews your work at once, in about a minute. Then you read the reports.

Chapter 10 · The gate at the end

Four specialists.
In parallel.

At the end of every pack, Ship It runs a 4-step review. Steps 1 to 3 fire in parallel. Step 4 runs last. The whole thing takes about as long as one specialist working alone.

01

/pr-review-toolkit:review-pr

Six review specialists spawn in parallel: simplifier, reviewer, silent-failure hunter, test coverage, type design, comment accuracy.

02

/review-diff

Spec-compliance check. Catches naming drift, invented fields, and contract mismatches.

03

/security-review

Injection, auth bypass, crypto, data exposure, service-account leaks, dependency vulnerabilities.

04

/diff-audit

Scope sanity. Only intended files changed? No accidental formatting churn? File count matches the plan? Runs after the first three.
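
The shape of that review (three steps at once, one step after) can be sketched with a thread pool. This is an illustration under stated assumptions: the real skills run inside Claude, so the callables here are stand-ins, and the function name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pack_review(parallel_steps, final_step):
    """Run the parallel steps at once, then the final step after they finish.

    parallel_steps maps a step name to a zero-argument callable returning a report.
    final_step is a (name, callable) pair.
    """
    reports = {}
    with ThreadPoolExecutor(max_workers=max(1, len(parallel_steps))) as pool:
        futures = {name: pool.submit(fn) for name, fn in parallel_steps.items()}
        for name, future in futures.items():
            reports[name] = future.result()  # blocks until that step is done
    final_name, final_fn = final_step
    reports[final_name] = final_fn()  # only starts once every report is in
    return reports
```

Because the first three steps overlap, the total time is roughly the slowest of them plus the final step, which is why the whole review takes about as long as one specialist working alone.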

Why this exists

Study after study finds AI-generated code ships more security holes than reviewed human code. You can’t eye-review your way out of that. You need specialists.

Chapter 11 · Quick check

What would you
do if...

One scenario. Pick your answer. Reveal to check. No grade, just noticing.

You’re at Phase 4. Gate 3 (Pitfalls) flags a trap recorded from a previous project, but the fix adds a day to your schedule. Do you:

A. Fix it now before moving on.

B. Log it in DECISION_LOG.md with reasoning and continue.

Answer: it depends. If the pitfall is on the critical path (touches code you’re about to build on), fix it now, because Phase 5 builds on this. If it’s cosmetic or touches an area you’re not editing this phase, log it in DECISION_LOG.md with “deferred, will address in Phase N” and move on. The rule: never ignore a pitfall silently. You either fix it or you explicitly defer it.


Chapter 12 · Now, the real thing

Plan the next stage
with the real tools.

This is not a rebuild. Your trip planner already works. This is a planning exercise, seeing what /spec-writer and /build-pack produce when you point them at a real growth decision. Takes fifteen minutes.

Step 1 of 5 · Run /spec-writer

Pick one slice, feed it in.

Your trip planner could grow three ways: a memory book of past trips with photos and notes, sharing a trip with a partner so two people can tick one packing list, or a simple map view of your stops. Each of those is a feature pack in its own right: one feature’s worth of spec, contracts, states, acceptance. Pick the one you actually want. Open Claude in your trip-planner workspace, paste this with your pick filled in, and answer its questions honestly. At the end you’ll have a SPECS.md that’s better than anything you’ve written for this project before. (Drop the weather clause from the paste if you skipped the stretch goal.)

Paste /spec-writer

My trip planner shows one trip: name and dates, a countdown, a packing checklist, itinerary stops as stations on a line, notes, and live destination weather. The next feature is [your pick: the memory book / partner sharing / the map view]. Treat it as a single feature pack. Walk me through the five-phase spec flow.

Step 2 of 5 · Run /feature-plan

Turn the spec into the pack.

Paste /feature-plan

Use the SPECS.md you just wrote. Plan [your pick] as a single feature pack, kept small.

This turns the spec into the pack: the contracts (exact field names and API shapes), the states, the acceptance criteria. The detailed drawing for this one thing.

Step 3 of 5 · Run /screen-inventory

List the screens before anyone draws.

Paste /screen-inventory --coverage

It lists every screen the feature adds or changes, and the states each screen can be in. Inventory before drawing, same instinct as Chapter 05.

Step 4 of 5 · Run /build-pack

Now the phase plan.

Paste /build-pack

Use the SPECS.md and the feature pack you just wrote. Plan it as a single pack, kept small, no more than 5 phases.

Its gates now have what they need: the pack files and the screen inventory. Read what it produces. Notice the screen list came first (inventory before drawing), then done-when checklists for each phase, context budgets, pitfalls flagged from KNOWN_PITFALLS. This is what a production plan looks like.

Scale contrast

Your feature-pack plan probably came out with 3, 4, or 5 phases. The same tool produced a 15-phase plan for the finance dashboard build, about 8,400 lines of code when it was done. Same workflow, different scale. The tools adjust; the discipline doesn’t.

Step 5 of 5 · Look at the difference

Three things you couldn’t do in Part 2.

Take two minutes. Type what’s genuinely new.


Chapter 13 · Starting your own project

Ship It
is a template.

The trip planner was your practice. Now you take the same playbook and start something you actually care about. Six steps.

A worked example · your TfL widget

Remember the TfL delay-times widget, the app that tells you how your lines are running before you leave the house? It was your idea, and it’s the example this whole workshop was first built around (before the trip planner took its place in these decks). It is also exactly the right size for a first solo project: one screen, one free data feed, one person who genuinely wants it. That’s the pattern to copy. Pick something you personally care about, and keep it small. If nothing else is calling to you yet, build the widget. Two practical notes from experience: register at api.tfl.gov.uk (the activation email loves the spam folder), and pick the free 500-requests-per-minute product when it asks.
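
To show how small the widget’s core really is, here is a rough sketch of the data side. The endpoint URL, the app_key query parameter, and the field names (name, lineStatuses, statusSeverityDescription) are assumptions based on TfL’s Unified API as commonly documented; check them against the API docs once you’ve registered. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and field names; verify against TfL's API documentation.
STATUS_URL = "https://api.tfl.gov.uk/Line/Mode/tube/Status"


def summarise(lines, my_lines):
    """Map each of my line names to its list of status descriptions."""
    wanted = {name.lower() for name in my_lines}
    summary = {}
    for line in lines:
        if line.get("name", "").lower() in wanted:
            summary[line["name"]] = [
                s.get("statusSeverityDescription", "Unknown")
                for s in line.get("lineStatuses", [])
            ]
    return summary


def fetch_status(app_key, my_lines):
    """Fetch live status; needs network access and a registered key."""
    url = f"{STATUS_URL}?{urlencode({'app_key': app_key})}"
    with urlopen(url, timeout=10) as response:
        return summarise(json.load(response), my_lines)
```

The parsing is kept separate from the network call on purpose: summarise can be tested with a saved response, with no key and no connection, which makes it a natural first vertical slice.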

01

Get the playbook onto your Mac

Ship It lives in a private GitHub repository, shared by invitation. Munim invites your GitHub account (a one-time email invite; accept it and you’re in). Then ask Claude: “clone github.com/munimmoiz0-sys/ship-it into my home folder as ship-it”. It uses the GitHub login you set up in the step below.

02

Make a project folder

Decide the one thing you’re building (nothing too big, a tiny first feature is fine). Tell Claude: “make a new folder called my-project next to ship-it and start me in it”.

03

Run the starter script

One prompt does this:

Set up this project from the Ship It playbook: run ~/ship-it/scripts/init.sh ~/my-project for me and tell me when you’re done.

04

Spec it first

Run /spec-writer and answer its questions honestly. It writes SPECS.md. Read what it writes before you go on; ten minutes here saves hours later.

05

Scaffold from the spec

Run /bootstrap-project. It reads the spec and writes your CLAUDE.md and the state/ files, and installs the working skills. Then copy in the five Tier 1 rules from templates/CLAUDE.md.starter, delete anything you’re not ready for, and keep it under 60 lines.

06

Plan, build, repeat

For each phase: /feature-plan, then /build-pack, then /build-team writes the code and Claude runs the 7 quality checks automatically. Ask Claude to commit when something works. End of pack: the 4-check review. Then merge and start the next pack. Same loop, forever.

Remember

Ship It doesn’t build for you; it gives you the scaffolding and the discipline. The muscle you’ve built across these three decks is the thing that ships your project.

When you want it saved online · GitHub

Local commits live on your Mac only. GitHub is where you put them to back up, share, and (eventually) collaborate. Three one-time steps, then it’s one prompt away for every future project.

One-time · 1. Make a GitHub account

Open github.com and click Sign up. Free is fine. Pick a username you don’t mind being public (e.g. your first name plus a number).

One-time · 2. Install the GitHub CLI

A clean way to talk to GitHub from your project folder. In Terminal (or ask Claude to run it for you), paste:

brew install gh

If Terminal says brew: command not found, ask Claude “install Homebrew for me” and it will walk you through it.

One-time · 3. Pair the CLI with your account

gh auth login

Accept the defaults (GitHub.com, HTTPS, Login via web browser). It shows a one-time code, click the link, paste it into the browser, authorise. Come back to Terminal. You’re paired.

From now on · Publish any project

Open Claude inside the project folder and say:

create a private github repo for this project and push everything to it

Claude runs the right commands (gh repo create --private --push --source=.) and you have a backed-up repository in under a minute. From then on, after any commit, just say “push” and your latest work lands on GitHub.

Start every repo private by default. You can flip it public later. Anything with an API key, a draft, or anyone else’s data stays private, always.

When you want to put it live on the web

Static sites (decks, demos, landing pages) deploy to Cloudflare Pages in one command: wrangler pages deploy <folder> --project-name=<name>. The deploy script at scripts/deploy-decks.sh in Ship It is the pattern: copy and adapt.

You’re actually ready now.

You know what happens inside each stage. You know why the gates exist. You know which skill catches which kind of mistake. Nothing in a real project will be a surprise.

★

Your one next action
Pick one real project you care about. Run /spec-writer on it. Ship a single pack. Notice the difference.

For monthly maintenance, a quick ten minutes with /tidy, /prune, and /playbook-feedback keeps the repo breathing.

Open Part 4: The Studio →

End of the coursework. The studio is next.
