Spec-Driven Development: AI Development Frameworks Catalog

20+ frameworks and patterns that turn chaotic vibe coding into a manageable process. Collected over 2 weeks from YouTube streams, GitHub, Telegram channels, and articles - so you don't have to.

Why SDD

  • METR study (July 2025): experienced open-source developers took 19% longer on real tasks when using AI tools. A likely contributor - debugging loops from unstructured prompts. Confirmed in a follow-up study in February 2026
  • "Why do we sometimes get slop? Because we under-specified it" - Denis Kiselyov. Low-quality code usually traces back to under-specification
  • Vibe coding breaks at scale. A prototype in an evening - sure. A maintainable product - no. Monolith, 0.5% test coverage, process debt - a typical outcome after "wild vibe coding"
  • 30+ frameworks in the past year. The industry is searching for answers. Medium, Thoughtworks Technology Radar (Assess ring), Augment Code - everyone is tracking SDD
  • The formula: Spec → Plan → Tasks → Code. Instead of "write code from a prompt" - "write a spec, and the agent implements it"
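
The formula above can be sketched as plain data handed from stage to stage. This is a minimal illustration, not any framework's actual API: `Spec`, `Task`, `plan_tasks`, and `unplanned_criteria` are hypothetical names, and the planning step is a trivial one-task-per-criterion mapping where a real workflow would call an agent.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """What to build, stated as checkable acceptance criteria."""
    title: str
    acceptance_criteria: list[str]


@dataclass
class Task:
    """One unit of work, traceable back to a criterion in the spec."""
    description: str
    criterion: str
    done: bool = False


def plan_tasks(spec: Spec) -> list[Task]:
    # Hypothetical planner: one task per acceptance criterion.
    # A real planner (human or agent) would split and order the work.
    return [Task(description=f"Implement: {c}", criterion=c)
            for c in spec.acceptance_criteria]


def unplanned_criteria(spec: Spec, tasks: list[Task]) -> list[str]:
    # Criteria with no task are gaps to close before code is generated.
    covered = {t.criterion for t in tasks}
    return [c for c in spec.acceptance_criteria if c not in covered]


spec = Spec(
    title="Password reset",
    acceptance_criteria=[
        "Reset link expires after 30 minutes",
        "Old sessions are revoked after reset",
    ],
)
tasks = plan_tasks(spec)
print(unplanned_criteria(spec, tasks))  # empty list: every criterion has a task
```

The point of the traceability check is that code generation starts only when every criterion in the spec maps to at least one task.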

SDD Maturity Model

Level 1: Spec-First

A spec per task. Can be discarded after implementation. Quick start, minimal ceremony. Good for prototypes and one-off tasks.

Level 2: Spec-Anchored

Spec is a living document. All changes start with updating the spec. Code follows the spec, not the other way around. Works for products in active development.

Level 3: Spec-as-Source

Spec is the only artifact. Code is compiled output. Radical approach: you change the spec, the agent regenerates the code. Tessl ($125M raised) is building a platform on this principle.

Category 1: SDD Frameworks (spec → plan → code)

BMAD-METHOD ~41k stars

Type: Agentic Agile · GitHub

12+ AI roles, including Analyst, PM, Architect, and Scrum Master. In effect, it simulates an agile team inside a single agent. Creates a PRD, architecture, and dev stories with full context. Requires Node.js v20+; v6-alpha is a full rewrite.

The most popular SDD framework. Powerful but complex - the learning curve is steep. For those willing to invest time in setup to get a full-fledged workflow.

"BMAD, OpenSpec, and all the rest - these are purely applied tools aimed at breaking a task into manageable pieces" - Ivan (QuintCode/NeuralStack)

Best for: Teams and advanced individuals who need a full agile cycle with AI.

GitHub Spec Kit

Type: Static spec (Markdown) · GitHub

MIT license, by Microsoft/GitHub. Agent-agnostic - works with 8+ AI agents. Standardizes the specification format in Markdown. Not tied to a specific IDE or model.

Focuses on format, not process. Spec Kit defines how to write specs, not how to implement them. Pairs well with other frameworks.

Best for: Teams that need a unified spec format across different tools.

OpenSpec

Type: Semi-living spec (delta markers) · GitHub

Semi-living specifications with delta markers. 20+ supported agents. Built for brownfield projects and iterative changes - when the code already exists and needs to evolve, not be written from scratch.

Key difference from Spec Kit: specs aren't static but "semi-living" - they track the delta between the current and desired state.
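
To make the delta idea concrete, here is a minimal sketch of applying a change set to an existing spec. It does not reproduce OpenSpec's file format or CLI; the `SpecDelta` structure, the `apply_delta` helper, and the requirement IDs are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SpecDelta:
    """Hypothetical change set: the difference between current and desired spec."""
    added: dict[str, str] = field(default_factory=dict)
    modified: dict[str, str] = field(default_factory=dict)
    removed: list[str] = field(default_factory=list)


def apply_delta(current: dict[str, str], delta: SpecDelta) -> dict[str, str]:
    """Return the desired spec without mutating the current one."""
    for req_id in delta.added:
        if req_id in current:
            raise ValueError(f"{req_id} already exists; use 'modified'")
    for req_id in list(delta.modified) + delta.removed:
        if req_id not in current:
            raise ValueError(f"{req_id} is not in the current spec")

    desired = {k: v for k, v in current.items() if k not in delta.removed}
    desired.update(delta.modified)
    desired.update(delta.added)
    return desired


current_spec = {
    "AUTH-1": "Users log in with email and password.",
    "AUTH-2": "Sessions expire after 24 hours.",
}
delta = SpecDelta(
    added={"AUTH-3": "Users can log in with a one-time code."},
    modified={"AUTH-2": "Sessions expire after 8 hours."},
)
desired_spec = apply_delta(current_spec, delta)
print(desired_spec["AUTH-2"])
```

Keeping the delta separate from the spec means a reviewer sees only what changes, which is what matters in a brownfield project.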

Best for: Developers of existing projects who need a spec-driven approach without rewriting from scratch.

GSD (Get Shit Done) - lightweight framework for prototypes. Minimal ceremony, quick start. When you need to build an MVP, not an enterprise process.

PromptX - structured prompts for the SDD approach. Organizes prompts into reusable templates.

LeanSpec - lightweight alternative to BMAD and Spec Kit. For those who find full frameworks overkill, but are tired of prompt chaos.

Category 2: Reasoning & Decision Making

QuintCode + FPF ~241 stars (FPF)

Type: Structured Reasoning / Thinking OS · QuintCode GitHub · FPF GitHub · Substack

Two related tools. FPF (First Principles Framework, Anatoly Levenchuk) - a "thinking operating system for LLMs." Formal, complex, powerful. QuintCode (Ivan Zakutniy) - a practical wrapper around FPF for developers.

ADI Cycle (Abduction, Deduction, Induction) in 5 phases:

  • Hypothesize (Abduction) - generate 3-5 competing approaches
  • Verify (Deduction) - check logical consistency
  • Test (Induction) - gather evidence
  • Audit - WLNK (weakest-link) analysis, check for blind spots
  • Decide - create a Design Rationale Record

Solves a key problem: three months later you won't remember why you chose that approach. The decision lives in a chat thread you'll never find again. QuintCode captures reasoning in Design Rationale Records.
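
A decision record of this kind can be as simple as a small data structure saved next to the code. The sketch below is an assumption about what such a record might contain, not QuintCode's actual format: the field names, the 0-1 `confidence` scale, and the weakest-link rule (an option is only as reliable as its least reliable piece of evidence) are illustrative, and the numbers are invented for the example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Evidence:
    claim: str
    confidence: float  # 0.0 (guess) to 1.0 (measured); illustrative scale


@dataclass
class Option:
    name: str
    evidence: list[Evidence]

    def reliability(self) -> float:
        # Weakest-link rule: one shaky assumption caps the whole option.
        return min((e.confidence for e in self.evidence), default=0.0)


@dataclass
class DesignRationaleRecord:
    question: str
    options: list[Option]

    def decide(self) -> Option:
        return max(self.options, key=lambda o: o.reliability())

    def to_json(self) -> str:
        # Serialize the question, all options, and the chosen one.
        record = asdict(self)
        record["decision"] = self.decide().name
        return json.dumps(record, indent=2)


drr = DesignRationaleRecord(
    question="Where do we deploy the service?",
    options=[
        Option("Managed Kubernetes", [
            Evidence("Scales to many nodes", 0.9),
            Evidence("Team can operate it", 0.3),
        ]),
        Option("Single-node Docker Swarm", [
            Evidence("Fits current load", 0.7),
            Evidence("Team can operate it", 0.8),
        ]),
    ],
)
print(drr.decide().name)  # Single-node Docker Swarm
```

Committing the JSON output alongside the code keeps the rejected options and their evidence findable long after the chat thread is gone.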

"Got 52 pages of documentation and about 280 feature files in Gherkin/BDD. Fed it to Claude Code - burned through a week's subscription limits overnight. The project came out pretty good, tests from the feature files actually work." - Ivan (QuintCode), on using FPF + ChatGPT Pro, 2 evenings of work
"Claude Code on its own suggested Kubernetes, EKS. QuintCode struggled, asked questions, and ultimately arrived at the cheapest but most stable solution - a single-node Docker Swarm." - Ivan (QuintCode), on the ADI cycle choosing Docker Swarm over the intuitive choice, Kubernetes
"Learn to program, not a language. Same here - you need to learn to think." - Ivan (QuintCode/NeuralStack)
"Nobody will think for you, not even FPF." - Anatoly Levenchuk

Best for: Developers and architects who want evidence-based decisions, not intuitive ones.

SGR (Schema-Guided Reasoning) - structured output combined with chain of thought. A reasoning_steps field in the JSON response improves the accuracy of non-reasoning models and makes the output debuggable.
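
A minimal sketch of the idea, using only the standard library: the schema asks the model to fill in `reasoning_steps` before the answer, and a response without them is rejected. Apart from the `reasoning_steps` field itself, everything here is an assumption: the schema shape, the `parse_response` helper, and the hand-written sample response. In practice the schema would be passed to whatever structured-output feature the model provider offers.

```python
import json

# JSON Schema for the response. Listing reasoning_steps first nudges the
# model to write its reasoning before committing to an answer.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "reasoning_steps": {"type": "array", "items": {"type": "string"}},
        "answer": {"type": "string"},
    },
    "required": ["reasoning_steps", "answer"],
}


def parse_response(raw: str) -> dict:
    """Check that a model response carries the required fields."""
    data = json.loads(raw)
    for key in RESPONSE_SCHEMA["required"]:
        if key not in data:
            raise ValueError(f"missing field: {key}")
    steps = data["reasoning_steps"]
    if not isinstance(steps, list) or not steps:
        raise ValueError("reasoning_steps must be a non-empty list")
    return data


# Hand-written stand-in for a model response.
raw = json.dumps({
    "reasoning_steps": [
        "The ticket mentions a failed card payment.",
        "Payment failures are handled by the billing team.",
    ],
    "answer": "billing",
})
result = parse_response(raw)
for i, step in enumerate(result["reasoning_steps"], 1):
    print(f"{i}. {step}")  # each step can be inspected when debugging
```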

Category 3: Autonomous Loops

Ralph Loop / Smart Ralph

Type: Agentic loop · Original GitHub · Claude Code GitHub · Website

Autonomous AI agent: runs a coding tool (Claude Code / Codex) again and again until every item in the PRD is done. Each iteration gets fresh context. Simple principle, sometimes surprisingly effective.

Smart Ralph adds an SDD workflow on top: asks clarifying questions before generation, structures the process.

Critics call it "monkey coding" - just pressing the button until it works. It handles certain tasks well, but without a spec you can easily end up in an infinite loop.
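
The loop itself is only a few lines. The sketch below is not the Ralph implementation: `run_agent` is a hypothetical callable standing in for one fresh-context invocation of a coding tool, and the PRD is reduced to a dict of item names and done flags. The `max_iterations` cap is the guard against the infinite loop mentioned above.

```python
from typing import Callable


def ralph_loop(
    prd: dict[str, bool],
    run_agent: Callable[[str], bool],
    max_iterations: int = 20,
) -> int:
    """Run the agent on the next open PRD item until all are done.

    Returns the number of iterations used; raises if the cap is hit.
    """
    for iteration in range(1, max_iterations + 1):
        open_items = [item for item, done in prd.items() if not done]
        if not open_items:
            return iteration - 1
        item = open_items[0]
        # Each call starts from scratch: only the item text is passed in,
        # no history from earlier iterations.
        prd[item] = run_agent(item)
    if all(prd.values()):
        return max_iterations
    raise RuntimeError(f"gave up after {max_iterations} iterations")


# Fake agent for demonstration: fails the first attempt at each item.
attempts: dict[str, int] = {}


def flaky_agent(item: str) -> bool:
    attempts[item] = attempts.get(item, 0) + 1
    return attempts[item] >= 2


prd = {"login form": False, "password reset": False}
used = ralph_loop(prd, flaky_agent)
print(used, prd)
```

With the fake agent, each of the two items needs two attempts, so the loop finishes after four iterations. An agent that never succeeds hits the cap and raises instead of spinning forever.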

"Letting an agent loose on production is how you become a headline" - Denis Kiselyov

Best for: Automating routine tasks by PRD. Use with caution in production.

Taskmaster AI - task management drop-in for Cursor, Lovable, Windsurf, Roo. PRD → tasks → implementation. Works with any AI chat. GitHub

Category 4: Session/Memory Management

Claude-Flow ~21k stars

Type: Agent orchestration platform · GitHub · Beginner's Guide

Enterprise-grade AI agent orchestration platform. Multi-agent swarms, autonomous workflow coordination, distributed swarm intelligence, RAG integration, 100+ MCP tools. Native Claude Code and Codex integration.

I tested it personally for autonomous MVP and POC development. It works: in my runs, agents executed roughly 80-90% of tasks correctly. The remaining 10-20% were minor bugs that surfaced during testing, fixable in a couple of additional sessions. Don't expect fully autonomous, bug-free builds just yet.

Vatsal Shah's Beginner's Guide is a great starting point - used it myself, highly recommend.

Best for: Solo developers and teams who need multi-agent orchestration for autonomous product builds (MVPs, POCs, personal projects).

cc-sdd ~2,880 stars

Type: AI-DLC lifecycle · GitHub

Kiro-style SDD commands for 8 tools: Claude Code, Cursor, Gemini CLI, Codex CLI, GitHub Copilot, Qwen Code, OpenCode, Windsurf. "Stop losing 70% of development context."

Enforced workflow: Requirements → Design → Tasks. Won't let you skip planning and jump straight into code. Structured AI-DLC (AI-Driven Development Lifecycle).
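
One way to picture the enforcement is a small state machine that refuses to advance until every earlier phase has been approved. This illustrates the gating idea only; the `Workflow` class, its method names, and the extra `implementation` phase are hypothetical and do not reflect cc-sdd's commands or internals.

```python
PHASES = ["requirements", "design", "tasks", "implementation"]


class Workflow:
    """Hypothetical phase gate: each phase must be approved in order."""

    def __init__(self) -> None:
        self.approved: set[str] = set()

    def approve(self, phase: str) -> None:
        self._check_prerequisites(phase)
        self.approved.add(phase)

    def start(self, phase: str) -> str:
        self._check_prerequisites(phase)
        return f"started {phase}"

    def _check_prerequisites(self, phase: str) -> None:
        index = PHASES.index(phase)  # raises ValueError for unknown phases
        missing = [p for p in PHASES[:index] if p not in self.approved]
        if missing:
            raise PermissionError(
                f"cannot enter '{phase}': approve {', '.join(missing)} first"
            )


wf = Workflow()
try:
    wf.start("implementation")  # skipping straight to code is rejected
except PermissionError as err:
    print(err)

for phase in ["requirements", "design", "tasks"]:
    wf.approve(phase)
print(wf.start("implementation"))
```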

Best for: Developers who want a Kiro-style workflow in any IDE.

Memory Bank / MBB (Denis Kiselyov) - project memory using the C4 model. Memory Bank Bible - seed principles for project structure. Progressive disclosure - gradually revealing context to the agent. More on YouTube

Category 5: Platforms & Orchestrators

Intent (Augment Code) - living spec platform. Bidirectional specs: code and spec stay in sync. Coordinator + specialists. $60/mo. augmentcode.com

Kiro (AWS) - static spec with EARS (Easy Approach to Requirements Syntax) notation. Claude-focused via Amazon Bedrock (Claude 3.7 and 4.0, more models planned). Free tier (50 credits/mo). For AWS greenfield projects.
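
EARS constrains each requirement to one of a few sentence templates built around the word "shall". The sketch below classifies a requirement by its leading keyword. It is a simplified checker written for illustration, not Kiro's parser, and the example requirements are invented.

```python
import re
from typing import Optional

# The five basic EARS templates. Order matters: the keyword-led patterns
# are tried before the plain "The <system> shall ..." form.
EARS_PATTERNS = {
    "event-driven": re.compile(r"^When .+, the .+ shall .+", re.IGNORECASE),
    "state-driven": re.compile(r"^While .+, the .+ shall .+", re.IGNORECASE),
    "unwanted behavior": re.compile(r"^If .+, then the .+ shall .+", re.IGNORECASE),
    "optional feature": re.compile(r"^Where .+, the .+ shall .+", re.IGNORECASE),
    "ubiquitous": re.compile(r"^The .+ shall .+", re.IGNORECASE),
}


def classify(requirement: str) -> Optional[str]:
    """Return the EARS pattern name, or None if the text fits no template."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS.items():
        if pattern.match(text):
            return name
    return None


examples = [
    "When the user submits the form, the system shall validate all fields.",
    "If the payment fails, then the system shall show an error message.",
    "The system shall log every login attempt.",
    "Make login faster",
]
for req in examples:
    print(classify(req), "-", req)
```

A requirement that returns `None`, like the last example, is a candidate for rewriting before it reaches the agent.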

Zenflow (Zencoder) - PRD-to-code orchestrator. PRD → spec review → plan → phased implementation via Claude + Codex.

Compyle (YC F25) - "Lovable for engineers" with an SDD approach. Asks clarifying questions before code generation. Claude Code under the hood. Multi-repo support, free.

Tessl ($125M) - radical spec-as-source approach. Spec = the only artifact, code is generated automatically. The most ambitious bet in the SDD space.

Devika - open-source autonomous SE agent. An alternative to Devin. Full cycle: planning → research → coding. Web interface. GitHub

Category 6: Patterns & Techniques

AI-TDD - the agent reruns the tests and revises the code until they pass. Caveat: LLMs may cheat by writing stubs instead of real code.
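
A cheap guard against that kind of cheating is to scan the generated code for functions that do nothing before trusting a green test run. The sketch below uses Python's standard `ast` module. The `find_stubs` helper and its definition of a stub (a body made only of `pass`, `...`, a docstring, or `raise NotImplementedError`) are assumptions for illustration, and the check will not catch hard-coded return values.

```python
import ast


def _is_stub_statement(node: ast.stmt) -> bool:
    if isinstance(node, ast.Pass):
        return True
    # A bare docstring or `...` is an expression statement holding a constant.
    if isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
        return True
    if isinstance(node, ast.Raise) and node.exc is not None:
        exc = node.exc.func if isinstance(node.exc, ast.Call) else node.exc
        return isinstance(exc, ast.Name) and exc.id == "NotImplementedError"
    return False


def find_stubs(source: str) -> list[str]:
    """Return names of functions whose bodies contain no real logic."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and all(_is_stub_statement(stmt) for stmt in node.body)
    ]


generated = '''
def add(a, b):
    return a + b

def charge_card(amount):
    """TODO"""
    pass

def refund(amount):
    raise NotImplementedError("later")
'''
print(find_stubs(generated))  # flags charge_card and refund, not add
```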

Metaprompting - an LLM iteratively improves a prompt through a TDD cycle. It's not the code that improves - it's the prompt.

Context Engineering - the key discipline: preparing context for AI agents. MD files as a navigation layer, dependency graphs.

AI-Ready Codebase - adapting legacy code for AI agents. Hierarchy of MD files, minimal documentation set, grounding the agent on existing code.

Reflection Pattern - the agent double-checks itself at each step, finds errors, restarts the block.

Judge Pattern - agent + judge sub-agents with iterative improvement. Up to 5 judges evaluate the result.

Comparison Table: Tier-1 Frameworks

Framework         | Focus                         | Complexity | IDE                 | Memory            | Best for
BMAD-METHOD       | Agile team of AI roles        | High       | Any                 | Docs-as-code      | Teams, enterprise
Spec Kit          | Spec format (standard)        | Medium     | Any                 | None              | Cross-agent teams
OpenSpec          | Brownfield, iterative specs   | Medium     | Any                 | Delta markers     | Existing projects
QuintCode/FPF     | Reasoning, decision rationale | High       | Claude Code, Cursor | .fpf/context.md   | Architects
Ralph/Smart Ralph | Autonomous loop to completion | Low        | Claude Code, Codex  | Fresh context     | Automation
Claude-Flow       | Agent orchestration, swarms   | Medium     | Claude Code, Codex  | Persistent memory | MVP/POC, solo & teams
cc-sdd            | AI-DLC lifecycle (8 IDEs)     | Medium     | 8 tools             | Structured        | Developers

Expert Quotes

"Hallucinations aren't a bug - they're an architectural feature: the model fills in gaps when you don't set boundaries" - Denis Kiselyov, DEKSDEN
"BMAD, OpenSpec, and all the rest are purely applied tools for breaking tasks into pieces. First FPF, then OpenSpec, BMAD, and so on" - Ivan (QuintCode/NeuralStack)
"Learn to program, not a language. Same thing here - you need to learn to think" - Ivan (QuintCode/NeuralStack)
"Stakhanovism 2026: there are plenty of machines (agents), what we need are quality supervisors" - Anatoly Levenchuk
"Native language for specifications: precision of wording matters more than saving tokens" - Rodion Mostovoy
"Vibe coding is neither magic nor garbage - the truth is in the middle" - Pavel Molyanov
"Humans hallucinate no worse than LLMs - we also approximate and never question our own adequacy" - Ivan (QuintCode/NeuralStack)

Sources

Telegram Channels

  • @ai_driven - Rodion Mostovoy, AI-Driven Development
  • @deksden - Denis Kiselyov, DEKSDEN
  • QuintCode / NeuralStack - Ivan Zakutniy

More on AI-driven development, spec-driven workflows, and practical experiments:
Telegram: @prodfeatai