36 Framework Fixtures in One Session: How Beads + Claude Code Changed Our Testing Game
We built test fixtures for 36 web frameworks in a single session. Not days. Not a week of grinding through documentation. Hours.
Here’s what happened and why it matters.
The Problem
api2spec is a CLI tool that parses source code to generate OpenAPI specifications. To test it properly, we needed real, working API projects for every supported framework—consistent endpoints, predictable responses, the whole deal.
We started with 5 frameworks: Laravel, Axum, Flask, Gin, and Express. The goal was to cover all 36 supported frameworks with fixture projects we could use to validate our parsers.
What We Actually Built
36 fixture repositories across 15 programming languages. Each one includes:
- Health check endpoints (`GET /health`, `GET /health/ready`)
- Full User CRUD (`GET/POST /users`, `GET/PUT/DELETE /users/:id`)
- Nested resources (`GET /users/:id/posts`)
- Post endpoints with pagination (`GET /posts?limit=&offset=`)
- Consistent JSON response structures
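To make that contract concrete, here's a minimal sketch of what one fixture's surface looks like, written against Express in TypeScript. The in-memory stores and handler bodies are illustrative assumptions, not the actual api2spec-fixture-express code; only the routes mirror the list above.

```ts
import express from "express";

const app = express();
app.use(express.json());

interface User { id: number; name: string; }
interface Post { id: number; userId: number; title: string; }

// In-memory stores stand in for a real database (assumption for the sketch).
const users: User[] = [{ id: 1, name: "Ada" }];
const posts: Post[] = [{ id: 1, userId: 1, title: "Hello" }];

// Health checks, identical across every fixture.
app.get("/health", (_req, res) => { res.json({ status: "ok" }); });
app.get("/health/ready", (_req, res) => { res.json({ status: "ready" }); });

// Full User CRUD.
app.get("/users", (_req, res) => { res.json({ data: users }); });
app.post("/users", (req, res) => {
  const user: User = { id: users.length + 1, name: req.body.name };
  users.push(user);
  res.status(201).json({ data: user });
});
app.get("/users/:id", (req, res) => {
  const user = users.find((u) => u.id === Number(req.params.id));
  if (user) { res.json({ data: user }); } else { res.status(404).json({ error: "not found" }); }
});
app.put("/users/:id", (req, res) => {
  const user = users.find((u) => u.id === Number(req.params.id));
  if (!user) { res.status(404).json({ error: "not found" }); return; }
  user.name = req.body.name;
  res.json({ data: user });
});
app.delete("/users/:id", (req, res) => {
  const idx = users.findIndex((u) => u.id === Number(req.params.id));
  if (idx === -1) { res.status(404).json({ error: "not found" }); return; }
  users.splice(idx, 1);
  res.status(204).end();
});

// Nested resource: a user's posts.
app.get("/users/:id/posts", (req, res) => {
  res.json({ data: posts.filter((p) => p.userId === Number(req.params.id)) });
});

// Pagination via limit/offset query params.
app.get("/posts", (req, res) => {
  const limit = Number(req.query.limit ?? 10);
  const offset = Number(req.query.offset ?? 0);
  res.json({ data: posts.slice(offset, offset + limit), limit, offset });
});

app.listen(3000);
```

The same routes and response shapes repeat in all 36 fixtures, so the parser tests can assert against one expected spec per framework.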
The language coverage tells the story:
- Go: Chi, Echo, Fiber, Gin
- Rust: Actix, Axum, Rocket
- TypeScript/JS: Elysia, Express, Fastify, Hono, Koa, NestJS
- Python: Django REST Framework, FastAPI, Flask
- Java: Micronaut, Spring
- Kotlin: Ktor
- Scala: Play, Tapir
- PHP: Laravel, Slim, Symfony
- Ruby: Rails, Sinatra
- C#/.NET: ASP.NET, FastEndpoints, Nancy
- C++: Crow, Drogon, Oat++
- Swift: Vapor
- Haskell: Servant
- Elixir: Phoenix
- Gleam: Wisp
For languages without local runtimes on my machine—Haskell, Elixir, Gleam, Scala, Java, Kotlin—we created Docker Compose configurations with both an app service and a dev service for interactive development.
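Here's the shape such a configuration might take, using the Phoenix fixture as an example. The image tag, port, and commands are assumptions rather than the published file; the point is the app/dev service split.

```yaml
services:
  app:
    build: .
    ports:
      - "4000:4000"          # Phoenix's default dev port
    environment:
      MIX_ENV: dev
    command: mix phx.server  # assumed start command
  dev:
    image: elixir:1.16       # assumed tag; match whatever the fixture targets
    working_dir: /app
    volumes:
      - .:/app               # mount the source for interactive work
    command: sleep infinity  # keep alive for `docker compose exec dev iex -S mix`
```

The dev service keeps a container alive so you can shell in and use the toolchain without installing it locally.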
How Beads Made This Possible
We used beads (a lightweight git-native issue tracker) to manage the work. The structure was simple:
- 40 total issues created
- 36 closed in one session
- 5 Docker setup tasks marked as P2 priority (these blocked dependent fixtures)
- 31 fixture tasks at P3 priority
- 4 remaining for future work
The dependency tracking was key. Docker environments had to be ready before their fixtures could be worked on, and beads handled that automatically.
When I’d finish a Docker setup task, the blocked fixture tasks became available.
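In beads terms, that unblocking flow looks roughly like this. A sketch against the `bd` CLI; the flags and placeholder IDs are assumptions and may differ across versions:

```sh
bd create "Docker env: Elixir/Phoenix" -p 2    # P2 setup task
bd create "Fixture: Phoenix endpoints" -p 3    # P3 fixture task
bd dep add <fixture-id> <docker-id>            # fixture is blocked by the env

bd ready                  # fixture doesn't show yet; it's blocked
bd close <docker-id>      # finish the Docker setup
bd ready                  # fixture now appears as ready work
```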
Claude Code agents worked through the fixture implementations in parallel where possible.
The combination of clear task definitions, dependency management, and AI-assisted coding meant we weren’t context-switching between “what do I need to do next?” and “how do I implement this?”
The Numbers
| Metric | Value |
|---|---|
| Total Issues | 40 |
| Closed | 36 |
| Avg Lead Time | 0.9 hours |
| New GitHub Repos | 31 |
| Languages Covered | 15 |
That average lead time of under an hour per framework includes everything: creating the repo, implementing the endpoints, testing, and pushing.
What’s Left
Four tasks queued for follow-up sessions:
- Drift detection - Compare generated specs against expected output
- Configurable report formats - JSON, HTML, and log output options
- CLAUDE.md files - Development instructions for each fixture
- Claude agents - Framework-specific coding assistants
The Takeaway
Doing this today felt like having a superpower. You say "I need to test across 36 frameworks," and with agents, Opus 4.5, and beads, the fixtures are actually ready. BAM. Done.
Beads gave us the structure to track dependencies and progress.
Claude Code agents handled the repetitive-but-different implementation work across languages and frameworks.
The combination let us focus on the interesting problems instead of the mechanical ones.
All 36 repos are live at github.com/api2spec with the api2spec-fixture-* naming convention.
Have you tried this approach yet?