87 types of UX interviews. Each one needs its own analysis prompt and a sample JSON output. I thought I’d knock them out with a template over a couple of evenings - turns out without a system it’s 174 files of manual work stretching over three weeks. In 4 hours with Claude Code subagents I built a pipeline that generates them on its own.
Context: I’m building a platform for AI-powered analysis of UX research. Discovery, usability, JTBD, exit interviews - 87 subtypes. Each subtype is analyzed differently. Each one needs a prompt + sample output with computed metrics like satisfaction_score, feature_demand_score, usability_score.
Sat down with Claude Opus 4.6 at 8pm. Got up at midnight with a working pipeline.
Phase 1: JSON Output Structure (~1 hour)
The first hour went into agreeing on the format. Boring work, but one broken format multiplies into 87 broken outputs.
7 open questions resolved through AskUserQuestion with a visual preview right in the terminal. Key forks:
- Single `sections[]` array instead of two separate ones - easier to parse on the backend
- Flat topics + hierarchical mindmap with IDs - both search and visualization covered
- `confidence: 0-1` required on every block - noise filtering
- Computed metrics: envelope + polymorphic scores - shared wrapper, but each interview type adds its own metrics inside
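The schema itself isn't reproduced in the post, so here is a hypothetical Python sketch of an output shaped by these four decisions. Everything beyond `sections`, `topics`, `mindmap`, and `confidence` - the IDs, titles, labels, and values - is invented for illustration:

```python
# Hypothetical example of the agreed shape: one sections[] array,
# flat topics plus an ID-linked mindmap, confidence on every block.
example_output = {
    "sections": [
        {"id": "s1", "title": "Key pain points", "confidence": 0.82},
        {"id": "s2", "title": "Feature requests", "confidence": 0.55},
    ],
    "topics": [
        {"id": "t1", "label": "onboarding friction"},
        {"id": "t2", "label": "pricing confusion"},
    ],
    # The mindmap references topic IDs instead of duplicating text,
    # so search (flat) and visualization (tree) stay in sync.
    "mindmap": {"id": "root", "children": [{"id": "t1"}, {"id": "t2"}]},
}

# The 0-1 confidence requirement is checkable in one pass:
assert all(0.0 <= s["confidence"] <= 1.0 for s in example_output["sections"])
```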
9 architectural decisions documented. Sounds bureaucratic. But when a subagent on the 80th prompt asks “what’s the confidence format?” - the answer is already in _common/schema.md.
Phase 2: Mass Generation Architecture (~30 min)
The main decision: one backend assistant, 87 swappable prompts. One JSON schema for all. Only the prompt changes. I went with a unified schema even though some interview types (say, diary study vs. card sorting) differ significantly - but at least the backend doesn’t turn into a zoo of parsers.
Folder structure:
```
prompts/
├── _common/              # Shared files
│   ├── schema.md         # JSON schema
│   └── metrics.md        # Metric formulas
├── discovery/
│   └── general/
│       ├── prompt.md     # Analysis prompt
│       └── example.json  # Output example
├── usability/
│   └── moderated/
│       ├── prompt.md
│       └── example.json
└── ... (85 subtypes)
```
Phase 3: Generator Subagent
Created .claude/agents/insights-prompt-agent.md - an autonomous agent on Opus. Each instance gets one subtype and produces two files: a prompt + a sample JSON.
Key decisions in agent design:
The agent doesn’t ask questions. Everything it needs comes from _common/ and a metrics synthesis doc. With 87 runs, any question to a human is a bottleneck that kills the whole autonomy.
The metrics synthesis doc is 53KB. Reading it whole is pointless: context gets polluted, hallucinations increase. The agent greps by subtype name. The first run showed that without this constraint the agent starts mixing up metrics from neighboring subtypes.
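The post doesn't show the exact grep the agent runs. Assuming the synthesis doc has one level-2 heading per subtype, the constraint amounts to extracting a single slice instead of feeding all 53KB into context - roughly this:

```python
import re

def extract_subtype_section(doc: str, subtype: str) -> str:
    """Return only the '## <subtype>' section of the synthesis doc.
    Assumes (hypothetically) that level-2 headings delimit subtypes."""
    pattern = rf"(?ms)^## {re.escape(subtype)}\n(.*?)(?=^## |\Z)"
    match = re.search(pattern, doc)
    if match is None:
        raise KeyError(f"no metrics section for subtype: {subtype}")
    return match.group(1).strip()
```

The point is the boundary, not the tool: whether it's `grep` or a regex, the agent never sees metrics from neighboring subtypes.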
Every example.json gets piped through python3 -m json.tool. Broken JSON gets caught immediately - not when the backend crashes on the 93rd subtype.
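The same check can also run as a pre-flight pass over the whole tree before anything reaches the backend. A sketch, with paths assumed from the folder structure above:

```python
import json
from pathlib import Path

def validate_examples(root: Path) -> list[str]:
    """Parse every example.json under root; return the paths that fail,
    so broken JSON surfaces now - not when the backend crashes later."""
    broken: list[str] = []
    for path in sorted(root.rglob("example.json")):
        try:
            json.loads(path.read_text())
        except json.JSONDecodeError:
            broken.append(str(path))
    return broken
```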
Orchestration: 18 batches of 5 subagents in parallel. Only the orchestrator updates the tracker - subagents don’t touch it. Otherwise race condition.
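The batch math is 87 subtypes / 5 parallel subagents = 18 batches (the last one short). A hedged sketch of the loop - `run_subagent` is a stub standing in for launching a Claude Code subagent; the structural point is that only the orchestrator writes the tracker:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 5

def run_subagent(subtype: str) -> bool:
    """Stub: a real version would spawn the generator subagent
    for one subtype and report success or failure."""
    return True

def orchestrate(subtypes: list[str]) -> dict[str, str]:
    tracker: dict[str, str] = {}  # single writer: the orchestrator
    for i in range(0, len(subtypes), BATCH_SIZE):
        batch = subtypes[i:i + BATCH_SIZE]
        with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
            results = list(pool.map(run_subagent, batch))
        # Subagents never touch the tracker - no race condition.
        for subtype, ok in zip(batch, results):
            tracker[subtype] = "DONE" if ok else "FAILED"
    return tracker

# FAILED entries become the retry list for the next pass.
retry_list = [s for s, status in orchestrate(
    [f"subtype-{n}" for n in range(87)]).items() if status == "FAILED"]
```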
Phase 4: Metrics Research (~40 min)
Before generating 87 prompts, I needed to understand what computed metrics actually exist for UX research.
40 sources via Exa + 5 queries in Perplexity + synthesis in Claude. Result - a 53KB document classifying metrics by interview type.
Found 10 new metric types I hadn’t planned:
- `satisfaction_trend` - satisfaction dynamics across interviews
- `feature_advocacy` - how willing the user is to recommend a feature
- `task_completion_confidence` - confidence in task completion
Pattern: envelope + discriminated union. A shared wrapper with metric_type, value, confidence. Inside - specific fields for each metric type.
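Spelled out, the pattern might look like this - the wrapper fields `metric_type`, `value`, `confidence` are from the post; the type-specific payload fields are invented for illustration:

```python
# Shared envelope: every metric carries metric_type, value, confidence.
# The discriminant (metric_type) tells the consumer what else to expect.
satisfaction = {
    "metric_type": "satisfaction_score",
    "value": 0.74,
    "confidence": 0.81,
    "per_question": [0.9, 0.6, 0.72],  # hypothetical type-specific field
}

usability = {
    "metric_type": "usability_score",
    "value": 0.58,
    "confidence": 0.66,
    "failed_tasks": ["checkout"],  # a different payload, same wrapper
}

def dispatch(metric: dict) -> str:
    # One parser switches on the discriminant - no zoo of formats.
    return metric["metric_type"]
```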
Confidence thresholds: >=0.70 show it, 0.50-0.70 soft warning, <0.50 suppress it. Based on Gainsight’s health score practices, adapted for UX context.
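As a sketch, the threshold policy is a three-way branch (function name hypothetical):

```python
def display_policy(confidence: float) -> str:
    """Map a block's confidence to a UI decision:
    >=0.70 show, 0.50-0.70 show with a soft warning, <0.50 suppress."""
    if confidence >= 0.70:
        return "show"
    if confidence >= 0.50:
        return "warn"
    return "suppress"
```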
Phase 5: Agent Review via /btw - 8 Bugs With Zero Effort
A trick that levels up any Claude Code session. The /btw command launches a parallel agent right from the terminal - it works in the background while you’re busy with the main task. I asked it to thoroughly analyze all created files and find errors and inconsistencies. A couple of minutes later I got a report, copied it into the main terminal - and Claude fixed everything itself. 8 issues:
- Interactive skill in an autonomous agent - the skill asks questions, the agent is autonomous. Removed it, baked the logic straight into the prompt
- Race condition on the tracker - subagents were updating the tracker in parallel. Moved updates to the orchestrator
- No `mkdir -p` - the agent tried writing to nonexistent directories
- Reading 53KB in full - replaced with grep by subtype
- No JSON validation - added `python3 -m json.tool`
- First subtype specifics in the shared schema - generalized
- No error handling - added FAILED status + retry list
- No low-confidence example - added an example with `confidence: 0.48`
Three of the eight are potentially critical at scale. The tracker race condition would have broken half the batches. Reading 53KB in each of 87 agents is roughly 4.5MB of extra context - more hallucinations and slower execution. And the missing `mkdir -p` made the agent fail silently.
Result
In 4 hours:
- 9 architectural decisions
- 25+ files
- 40+ research sources
- 1 of 87 prompts fully complete (template for the rest)
- A pipeline for mass-generating the remaining 86
One autonomous subagent + an orchestrator with a tracker = scaling without losing quality. Writing one good prompt is half the battle. Designing a system that produces 87 with predictable quality - that’s where architecture matters.
And one more thing: the parallel review via /btw caught 8 bugs the main agent missed. I often forget to run it - but every time I do, something critical shows up.
Sources
- Anthropic: Claude Code Subagents
- Claude Code Multi-Agents and Subagents Guide - TURION.AI
- How to Use Claude Code Sub-Agents for Parallel Work - Tim Dietrich
- CascadeAgent: Prompt Engineering at Scale - ICLR 2026
- Orchestrated multi-agents sustain accuracy under clinical-scale workloads - Nature
- Agent Orchestration Patterns - gurusup.com
- Gainsight: Customer Health Scores
- Automated Qualitative Coding - UserCall