Designing Unreal Expeditions

A complete product design process — from a one-paragraph idea to pixel-ready pages — conducted entirely as a conversation between a human and an AI design partner.

6 chapters · 11 sessions · 140 messages · 19 components · 3 pages

5 design stages · 19 components · 3 pages assembled · 50 user decisions · 11 sessions
Introduction

What is this?

This documents the complete product design process for Unreal Expeditions — a website for a Discord community that runs collaborative research expeditions into Unreal Engine source code.

The entire process was conducted as a conversation between a human (Nick) and an AI design partner (Claude), using a custom pipeline that enforced structured stages: Discovery, Information Architecture, Style Direction, Component Generation, and Page Assembly.

What makes this worth reading isn't the output — it's the process. The pushback moments, the variant debates, the scope negotiations, and ultimately the meta-realization that good decision-making doesn't automatically produce good output without a self-evaluation loop.

The Pipeline

1. Discovery → 2. Information Architecture → 3. Style Direction → 4. Components → 5. Assembly

The Product

Unreal Expeditions is a public-facing website with two jobs: showcasing the community's expedition findings and converting visitors into Discord members.

Three routes: Landing page (/), Archive (/expeditions), and Expedition detail (/expeditions/:slug).

The Machinery

The Design Agent Pipeline

A Python orchestrator + CLAUDE.md system prompt that enforces structured stages before any pixels get pushed

This conversation didn't happen freeform. It was guided by a pipeline — a set of Python scripts and a system prompt that enforce a specific order of operations. The AI can't jump to components without finishing discovery. It can't assemble pages without approved components.

The pipeline was itself designed in a separate Claude session before this project began. What you're reading is the first real use of it.

How it works

The orchestrator manages stage transitions. Each stage has its own module that drives the stage's conversation and writes its artifacts; the orchestrator won't advance until those artifacts exist.
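A minimal sketch of what that gating might look like, assuming hypothetical function shapes (the artifact names are the ones this process actually produced, but the real orchestrator.py may be structured differently):

```python
# Hypothetical sketch of the orchestrator's stage gate; the real
# orchestrator.py may differ. Stage order mirrors the pipeline above.
from pathlib import Path

STAGES = ["scaffold", "discovery", "ia", "style", "components", "assembly"]

# Artifacts each stage must produce before the next one may start
# (names taken from this document's appendix; paths are illustrative).
REQUIRED_ARTIFACTS = {
    "discovery": ["brief.json", "features.json", "flows.json"],
    "ia": ["sitemap.json", "responsive_strategy.json"],
    "style": ["selected.json", "tailwind.config.js"],
    "components": ["registry.json"],
}

def stage_complete(project_dir: Path, stage: str) -> bool:
    """A stage counts as complete once all of its artifacts exist."""
    return all((project_dir / name).exists()
               for name in REQUIRED_ARTIFACTS.get(stage, []))

def next_stage(project_dir: Path) -> str:
    """Return the first stage whose artifacts are missing; the AI
    cannot jump past it."""
    for stage in STAGES:
        if not stage_complete(project_dir, stage):
            return stage
    return "done"
```

Under this shape, components cannot begin until discovery's brief, features, and flows all exist on disk, which is exactly the ordering described above.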

The CLAUDE.md system prompt shapes the AI's behavior — defining it as an opinionated design partner (not an order-taker), specifying conversation phases, and providing JSON schemas for all output artifacts.

The eval scripts were built to evaluate visual output quality — detecting layout issues, accessibility problems, and design token violations. These informed the self-evaluation protocol that emerged in Chapter 6.
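As one concrete illustration of the kind of check involved, an overflow detector can be written in a few lines with a headless browser. Playwright is an assumption here; the linked eval scripts may use different tooling:

```python
# Illustrative overflow check, assuming Playwright; not necessarily
# what the linked eval scripts actually use.
from playwright.sync_api import sync_playwright

def has_horizontal_overflow(url: str, width: int = 375) -> bool:
    """Render the page at a given viewport width and report whether
    content spills past the right edge (a common mobile bug, like
    the progress bar issue found in Chapter 6)."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": width, "height": 812})
        page.goto(url)
        overflow = page.evaluate(
            "() => document.documentElement.scrollWidth"
            " > document.documentElement.clientWidth"
        )
        browser.close()
    return overflow
```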

Stage scripts

orchestrator.py

Master controller — routes between stages, loads context, enforces ordering

View source →

scaffold.py

Stage 0 — initializes project directory structure

View source →

discovery.py

Stage 1 — brief, features, flows. Forces problem understanding before solutions.

View source →

ia.py

Stage 2 — sitemap, wireframes, responsive strategy

View source →

style.py

Stage 3 — token candidates, comparison, selection, evolution

View source →

components.py

Stage 4 — component generation, variants, promotion to shared library

View source →

assembly.py

Stage 5 — page composition from approved components

View source →

System prompt & evaluation

CLAUDE.md

The system prompt that defines the AI's role, conversation protocol, and output schemas

View source →

SPEC.md

Full specification for the design agent pipeline

View source →

Eval scripts

Visual quality evaluation — adversarial testing, layout detection, two-pass review

Browse →
Chapter 1

Discovery

Cold start to design brief — problem understanding, persona definition, feature prioritization, flow mapping

The conversation started with a product pitch. The AI's job was to interrogate assumptions, force prioritization, and produce a structured brief before any visual work began.

Artifacts produced:

Design Brief

Product definition, personas, success metrics, constraints

View brief.json →

Feature List

MoSCoW-prioritized features with user stories

View features.json →

User Flows

Step-by-step paths for 4 key user journeys

View flows.json →
Chapter 2

Information Architecture

Sitemap, screen extraction, responsive strategy, wireframe generation

With the brief locked, the next stage extracted concrete screens from the approved flows, defined the sitemap, and established responsive strategies for each page.
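A reflow screen keeps one structure and lets it compress at smaller widths; a divergent screen gets a structurally different mobile layout (the Archive page in Chapter 5 is the divergent case). A hypothetical sketch of the shape a per-screen entry might take, with invented field names:

```python
# Hypothetical shape of per-screen entries; the real
# responsive_strategy.json may use different field names.
responsive_strategy = {
    "/": {
        "strategy": "reflow",        # same structure, smaller widths
        "breakpoints": [375, 1280],
    },
    "/expeditions": {
        "strategy": "divergent",     # different structure per breakpoint
        "desktop": "horizontal cards",
        "mobile": "vertical cards",  # per the Archive page in Chapter 5
        "breakpoints": [375, 1280],
    },
}
```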

Key decisions:

Sitemap

3 pages with global elements and external destinations

View sitemap.json →

Responsive Strategy

Reflow vs. divergent strategy per screen

View responsive_strategy.json →
Chapter 3

Style Direction

Token exploration, 4 candidates, user ranking, and the birth of "Wayfinder Evolved"

Four style directions were generated as complete design token sets. The user ranked them, provided specific feedback, and a fifth hybrid option was created from the best elements.

The candidates:

Basecamp

Warm, earthy, exploration-themed

View preview →

Cartographer

Technical, precise, blueprint-inspired

View preview →

Signal

High-contrast, bold, signal-to-noise

View preview →

Wayfinder

Near-black + amber accent, expedition-themed

View preview →

The user preferred Wayfinder's palette but wanted refinements. This produced Wayfinder Evolved — the final token set that powers everything from here forward.

Wayfinder Evolved — Final token set preview Open full →
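For a sense of what a token set at this stage contains, here is a hypothetical excerpt in the near-black + amber direction described above. Every value is invented for illustration; the real values live in selected.json, linked in the Appendix:

```python
# Invented token excerpt in the spirit of "Wayfinder Evolved";
# see selected.json for the actual set.
tokens = {
    "color": {
        "bg": "#0b0b0c",       # near-black surface
        "accent": "#f0a020",   # amber accent
        "text": "#e8e6e1",
    },
    "space": [4, 8, 12, 16, 24, 32, 48, 64],  # spacing scale (px)
    "font": {
        "display": "system-ui",
        "body": "system-ui",
    },
}
```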
Chapter 4

Component Iteration

19 components, multiple variant rounds, side-by-side comparisons at both breakpoints

Each key component went through variant generation → side-by-side preview → user selection → refinement. The previews were standalone HTML files showing components at both 1280px (desktop) and 375px (mobile).
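A minimal sketch of how a two-breakpoint preview file could be generated. This helper is hypothetical; the project's actual preview files may have been produced differently:

```python
# Hypothetical helper for a side-by-side breakpoint preview.
import html

def write_preview(component_html: str, out_path: str) -> None:
    """Embed the same component markup in two fixed-width iframes:
    1280px (desktop) and 375px (mobile)."""
    doc = html.escape(component_html, quote=True)
    page = f"""<!doctype html>
<body style="display:flex;gap:24px;align-items:flex-start">
  <iframe srcdoc="{doc}" style="width:1280px;height:800px"></iframe>
  <iframe srcdoc="{doc}" style="width:375px;height:800px"></iframe>
</body>"""
    with open(out_path, "w") as f:
        f.write(page)
```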

Component variant previews:

Hero Section (Round 1)

Variants A, B, C — centered, split, compact

View preview →

Hero Section (Round 2)

Variants A, D, E, F — more visual energy

View preview →

How It Works (Round 1)

Timeline, cards, compact

View preview →

How It Works (Round 2)

Full explainer variants X and Y

View preview →

Mobile NavBar Variants

4 alternatives to hamburger menu

View preview →

Expedition Cards

Standard, minimal, featured

View preview →

Phase Progress Bar

Dots, numbered, contained dark bar

View preview →

Shared Components

NavBar, StatusElement, JoinCTA, badges

View preview →
Chapter 5

Page Assembly

Composing 19 components into 3 complete pages at desktop and mobile breakpoints

All three pages were assembled simultaneously, each rendered as a side-by-side desktop/mobile preview. Real sample content (Replication Graph expedition data) was used throughout.
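Assembly draws only from the approved component registry. A sketch of that constraint, assuming registry.json holds a list of named components (the exact schema is in the linked file; the real assembly.py may enforce this differently):

```python
# Illustrative check that a page uses only registry-approved
# components; the registry schema here is an assumption.
import json

def validate_page(page_components: list[str], registry_path: str) -> list[str]:
    """Return any components on the page that were never approved."""
    with open(registry_path) as f:
        approved = {c["name"] for c in json.load(f)["components"]}
    return [name for name in page_components if name not in approved]

# e.g. validate_page(["NavBar", "HeroSection", "JoinCTA"], "registry.json")
```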

Landing Page

Landing Page — Desktop (1280px) + Mobile (375px) Open full →

Expedition Page

Expedition Page — Full article with code blocks, video, related Open full →

Archive Page

Archive Page — Divergent layout: horizontal cards (desktop), vertical cards (mobile) Open full →
Chapter 6

The Self-Evaluation Gap

The meta-lesson: good decision processes don't automatically produce good output

When the user reviewed the assembled pages, they found missing filter chips, an overflowing progress bar, inadequate code block padding on mobile, and spacing inconsistencies: all issues that should have been caught before presenting.

"To be honest, I feel like all this work we did to make sure the screens are evaluated by Claude with special design tools and principles applied didn't even happen? The process of IA, style decisions, components and variants, they all went really well and did a good job of focusing what we were building. But we're still severely lacking in the ability to self evaluate these generations."

— Nick, after reviewing page assemblies

The evaluation protocol that emerged

After this feedback, a structured self-audit loop was established: generate → audit against checklist → fix critical issues (max 3 passes) → present with evaluation notes.
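In code terms the protocol is a bounded fix loop. A sketch, with generate, audit, and fix left abstract since they are conversational steps in the real pipeline:

```python
# Sketch of the self-audit loop described above; `generate`, `audit`,
# and `fix` stand in for conversational steps in the real process.
MAX_PASSES = 3

def present_with_audit(generate, audit, fix):
    """Generate, audit against the checklist, fix critical issues
    (max 3 passes), then present with evaluation notes."""
    output = generate()
    notes = []
    for _ in range(MAX_PASSES):
        issues = audit(output)   # checklist: completeness, overflow,
        notes = issues           # spacing, mobile, IA cross-reference
        critical = [i for i in issues if i["severity"] == "critical"]
        if not critical:
            break
        output = fix(output, critical)
    return output, notes         # present with evaluation notes
```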

Check                Result    Notes
Completeness         PASS      All 19 registry components present on correct pages
Overflow             PASS      All layouts fit containers (after fixing progress bar)
Spacing              PASS      Consistent rhythm using token scale
Mobile               CONCERN   3 touch target groups below 44px minimum
IA cross-reference   PASS      Routes, strategy, global elements all match spec

The takeaway

The upstream stages (discovery, IA, style, component variants) did their job — scoping, prioritizing, constraining. But the generation step had no quality gate. Output was produced and handed directly to the user as the only reviewer.

The fix wasn't complicated: add a self-audit step between generation and presentation. The hard part was recognizing that the problem existed — it took a user calling it out for the gap to become visible.

Appendix

All Artifacts

Every file produced during the design process

Discovery

brief.json

Product definition, personas, constraints

View →

features.json

Prioritized feature list with user stories

View →

flows.json

Critical user flows

View →

Architecture

sitemap.json

3 routes + global elements

View →

responsive_strategy.json

Reflow vs divergent per screen

View →

Style

selected.json

Wayfinder Evolved — complete token set

View →

tailwind.config.js

Generated Tailwind configuration

View →

Components

registry.json

19 approved components with metadata

View →

JSX Source Files

23 files — components + page assemblies

Browse →

Full Conversation

parsed.json

140 messages, 6 chapters, structured timeline

View →

parser.py

Script that produced parsed.json from the local transcript

View →

Pipeline Source

7 stage scripts + orchestrator

The Python pipeline that enforced the design process

Browse →

CLAUDE.md + SPEC.md

System prompt and full pipeline specification

View CLAUDE.md →

Eval scripts

Visual quality evaluation tools

Browse →

Project Changelog

9 changelog entries

From project creation through page assembly

Browse →

project.json

Stage completion metadata

View →