AI
-
The Death of Clever Code
One positive product of working with agentic tools is that they rarely suggest clever code. No arcane one-liners, no "look how smart I am" abstractions. And, well, I'm here for it.
Before we continue it helps to understand a bit about how LLMs work. These models are optimized for pattern recognition. They’ve been trained on massive amounts of code and learned what patterns appear most frequently.
Clever code, by definition, is bespoke. It’s the unusual pattern, the one-off trick. There just isn’t enough training data for cleverness. The AI gravitates toward the common, readable solution instead.
Let me give you an example.
Show Me the Code
Here’s a nested ternary:
```javascript
const result = a > b ? (c > d ? 'high' : 'mid') : (e > f ? 'low' : 'none');
```

I'd be impressed if you could explain that correctly on your first try. What happens when there's a bug in one of those conditions? Good luck debugging that.
Now here’s the same logic:
```javascript
let result;
if (a > b) {
  if (c > d) {
    result = 'high';
  } else {
    result = 'mid';
  }
} else {
  if (e > f) {
    result = 'low';
  } else {
    result = 'none';
  }
}
```

A lot easier, right? If it's easy to read, it's easy to maintain. The AI tooling doesn't struggle to read either version, but you might, and when there is a bug, explaining exactly what needs to change becomes the hard part.
Actually wait. It turns out, not all complexity is created equal.
Two Kinds of Complexity
Essential complexity is the complexity of the problem itself. If you’re building a mortgage calculator or doing tax calculations, there’s inherent complexity in understanding the domain. You can’t simplify that away, and you shouldn’t try.
Accidental complexity is the stuff you introduce. The nested ternary instead of the if/else. Five layers of abstraction for the sake of abstraction that only runs in a specific edge case. Generic utility functions where you’ve tried to cover every possible scenario, but realistically you only need two or three cases.
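To make that last point concrete, here's a toy sketch (the names and scenario are mine, purely illustrative): an over-generic formatter trying to cover every case, next to the two small functions the codebase actually needed.

```javascript
// Accidental complexity: a "flexible" formatter with an option soup,
// in a codebase that only ever formats two things.
function formatValue(value, opts = {}) {
  const { style = 'plain', precision, prefix, suffix } = opts;
  if (style === 'percent') {
    return (value * 100).toFixed(precision ?? 0) + '%';
  }
  if (style === 'fixed') {
    return `${prefix ?? ''}${value.toFixed(precision ?? 2)}${suffix ?? ''}`;
  }
  return `${prefix ?? ''}${value}${suffix ?? ''}`;
}

// The same needs, written directly: no options to misremember.
const formatPrice = (n) => `$${n.toFixed(2)}`;
const formatRatio = (n) => `${Math.round(n * 100)}%`;
```

The generic version handles cases nobody asked for, and every caller has to re-learn its options. The direct version says exactly what it does.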
OK, but what about abstraction? Abstraction is where accidental complexity loves to hide.
Good Abstraction vs. Bad Abstraction
Abstraction shows up everywhere in programming, but let’s think about it in two flavors.
Good abstraction hides details the caller doesn't need to care about. The interface clearly communicates what it does. Think `array.sort()`: you look at it and immediately know what's happening. Those dang arrays getting some sort of sorted. You know exactly what it does without caring about the implementation.

Bad abstraction hides details you do need to understand in order to use it correctly. Think of a `processData()` method that's doing six different things with internal state that's nearly impossible to test. And splitting it into `processData1()` through `processData6()` doesn't help either. That's just moving the vegetables around on your plate, which doesn't mean you've actually finished dinner.

AI Signals
So why does any of this matter for working with AI coding tools?
Because if the agents keep getting your code wrong, if they consistently misunderstand what a function does or there are incorrect modifications, that’s a signal.
It’s telling you that your code has some flavor of cleverness that makes it hard to reason about. Not just for the AI, but for your team, and for you six months from now.
The goal is to code where the complexity comes from the problem, not from the solution. The AI struggling with your code is like a canary in the coal mine for maintainability.
/ AI / Programming / Code-quality
-
Your AI Agent Needs a Task Manager
If you’ve spent time working with AI coding tools, you’ve probably hit the compaction wall. Suddenly, your agent knows what it’s currently working on but has completely forgotten the five other things connected to it.
This is the memory problem, and it’s a big one.
The Context Window Isn’t Enough
Your AI agent needs some sort of memory system that lives outside the context window. When you’re working on simple, one-off tasks, the chat-as-workspace approach works fine. You ask a question, you get an answer, you move on. But the moment you’re tackling a complex set of related tasks? It breaks down fast.
I’ve been thinking about this through the lens of a framework I’m calling the Agentic Maturity Model. The short version is that there are distinct levels to how teams and developers use AI agents, and moving between levels isn’t about using “better” tools, but rather it’s a shift in how you approach the work.
Four months ago, there were no real options. The good news? It seems like all the model providers recognize this is the next frontier. Memory and persistence are where I’m looking for the actual progress to happen next.
Claude Code has certainly gotten better in these areas over the last couple of months. They’ve added an auto memory feature in beta. They added a lightweight Tasks system based on a Todo system called Beads built by Steve Yegge. His key idea was that the task state should live outside the context window.
These are meaningful building blocks towards an actual working memory system that persists across sessions and survives compaction.
We’re Almost There
The tooling and harnesses we've built on top of LLMs are already changing how software gets built. But where are we headed? Here's what I think:
- auto-improving memory: where the agent learns your patterns, your codebase, your preferences
- persistent task tracking that survives compaction: Tasks, todos, issues, whatever you want to call them. The point is they exist outside the conversation.
When those two pieces come together properly, the workflow for everyone will change again.
Your agent doesn’t just respond to the current prompt. It knows where it is in a larger plan, what’s been done, what’s blocked, and what’s next. That’s the difference between a helpful chatbot and an actual collaborator.
We are so close I can taste the blood in the water, oh wait, that’s mine. ☠️
-
Using Claude to Think Through a Space Elevator
When I say I wanted to understand the engineering problems behind building a space elevator, I mean I really wanted to dig in. Not just read about it. I wanted to work through the challenges, piece by piece, with actual math backing things up.
So I decided to see what Claude and I could do with this kind of problem.
Setting it Up
I have an Obsidian vault that Claude Code/CoWork has access to, and I started by asking it to help me understand the core challenges of building a space elevator. First things first: clearly state all the problems. What are the engineering hurdles? What makes this so hard?
From there, I started asking questions. Could we use an asteroid as the anchor point and manufacture the cable in space? How would we spool enough cable to reach all the way down to Earth? Would it make more sense to build up from the ground, down from orbit, or meet somewhere in the middle?
I’ll admit I made some mistakes along the way. I confused low Earth orbit with geostationary orbit at one point but Claude corrected me and explained the difference. That’s part of what makes this approach work. You’re not just passively reading; you’re actively thinking through problems and getting corrected when your mental model is off.
Backing It Up With Math
Here’s where it got really interesting. I told Claude: don’t just describe the problems. Prove them. Back up every challenge with actual math and physics calculations.
I also told it not to try cramming everything into one massive document. Write an overview document first, then create supporting documents for each problem so we could work through them individually.
So Claude started writing Python code to validate all the calculations. I hadn’t planned on that initially, but once it started writing code, I jumped in with my typical guidance. Use a package manager, write tests for all the code.
What we ended up with is a Python module covering about 12 of the hardest engineering challenges for a space elevator. There’s a script that calls into the module, runs all the math, and spits out the results. It’s not a complete formal proof of anything, but it’s a structured way to think through problems where the code can actually catch mistakes in the reasoning.
And it did catch mistakes. That’s the whole point of this approach, you’re using the calculations as a check on the thinking, not just trusting the narrative.
Working Through Problems Together
As we worked through each challenge, I kept asking clarifying questions. What about this edge case? How would we handle that constraint?
It was genuinely collaborative, me bringing curiosity and some engineering intuition, Claude bringing the ability to quickly formalize ideas into code and calculations.
The code isn’t public or anything. But the approach is what I think is worth sharing.
The Hard Part Is Still Hard
My main limiting factor is time. The math looks generally fine to me, but if I really wanted to verify everything thoroughly, I'd need to spend a lot more time with it. A mathematician or physicist who's deeply familiar with these calculations would be much faster at spotting issues and providing guidance like, "no, you shouldn't use this formula here; that approach is wrong."
I can do that work. It’s just going to take me significantly longer than someone with that specialized background.
This is what I mean when I talk about working with agentic tools on hard problems. It’s not about asking an AI for the answer. It’s about using it as a thinking partner; one that can write code, run calculations, and help you check your reasoning as you go.
For me, that’s the real power of tools like Claude. Not replacing expertise, but amplifying curiosity.
/ AI / Claude / Space / Engineering
-
Voice-to-Text in 2026: The Tools and Models Worth Knowing About
As natural language becomes a bigger part of how we build software, it’s worth looking at the state of transcription models. What’s the best way to get voice to text right now?
For a lot of people, talking to your computer is faster than typing. You can stream-of-thought your way through an idea, prompt your tools, and get things moving without your fingers being the bottleneck. If you haven’t tried it yet, it will change how you work with your machine. I’m not exaggerating.
The Tools
Here’s what people are actually using for desktop voice-to-text:
- Willow Voice — Popular choice, lots of people swear by it
- SuperWhisper — My current pick
- Wispr Flow — Another well-regarded option
- Voice Ink — Worth a look?
- Aiko — From an Open Source dev, Sindre Sorhus
- MacWhisper — Solid Mac-native option
I've tried several of these, and the biggest pain point is that many require monthly subscriptions. I've been happy with SuperWhisper, and it's worth mentioning they still offer a pay-once (lifetime) option, so you don't get locked into monthly payments forever. That said, Willow Voice and Wispr Flow both have strong followings.
The Models Behind the Magic
Most of these tools started with OpenAI’s Whisper, the voice model released and open-sourced back in 2022. With Whisper, you could run solid transcription locally on your own hardware.
But we’re a few years past that now, and there are some more models to choose from. Here is a summary table of the current state of the transcription models.
| Model | Company | Released | Local Run? | Used in Desktop Tools? | Best For |
| --- | --- | --- | --- | --- | --- |
| Whisper Large-v3 | OpenAI | Nov 2023 | Yes | Yes (the standard) | Multilingual accuracy (99+ languages) |
| Whisper v3 Turbo | OpenAI | Oct 2024 | Yes | Yes (fast settings) | Best speed-to-accuracy ratio for local use |
| Nova-3 | Deepgram | Apr 2025 | Self-host | Limited (API-based) | Real-time agents; handling messy background noise |
| Parakeet TDT 1.1B | NVIDIA | May 2025 | Yes | Developer-focused / CLI | Ultra-low latency; significantly faster than Whisper |
| SenseVoice-Small | Alibaba | July 2024 | Yes | Emerging (fringe) | High-precision Mandarin/English and emotion detection |
| Canary-1B | NVIDIA | Oct 2025 | Yes | Developer-focused | Beating Whisper on technical jargon & punctuation |
| Voxtral Mini V2 | Mistral | Feb 2026 | Yes | Yes (privacy apps) | High-speed local transcription on low-VRAM devices |
| Granite Speech 3.3 | IBM | Jan 2026 | Yes | No (enterprise focus) | Reliable technical ASR with an Apache 2.0 license |
| Scribe v2 | ElevenLabs | Jan 2026 | No | Via API | Extremely lifelike punctuation and speaker labels |

We're at an interesting inflection point. You can articulate your thoughts faster by speaking than typing, and it's becoming a real productivity gain. It's not just an accessibility aid anymore. People who can type perfectly well are using these tools daily.
That’s all for now!
/ Productivity / AI / Tools / Voice
-
Your Context Window Is a Budget — Here's How to Stop Blowing It
If you’re using agentic coding tools like Claude Code, there’s one thing you should know by now: your context window is a budget, and everything you do spends it.
I've been thinking about how to manage that budget. As we learn to use sub-agents, MCP servers, and all these powerful capabilities, we haven't been thinking enough about the cost of using them. The dollars and cents matter too if you're using API access, but the raw token budget you burn through in a single session affects everyone regardless. Once it's gone, compaction kicks in, and it's a crapshoot whether the new session knows how to pick up where you left off.
Before we talk about what you can do about it, let's talk about where your tokens actually go.
Why Sub-Agents Are Worth It (But Not Free)
Sub-agents are one of the best things to have in agentic coding. The whole idea is that work happens in a separate context window, leaving your primary session clean for orchestration and planning. You stay focused on what needs to change while the sub-agent figures out how.
Sub-agents still burn through your session limits faster than you might expect. There are actually two limits at play here:
- the context window of your main discussion
- the session-level caps on how many exchanges you can have in a given time period.
Sub-agents hit both. They’re still absolutely worth using and working without them isn’t an option, but you need to be aware of the cost.
The MCP Server Problem
MCP servers are another area where things get interesting. They’re genuinely useful for giving agentic tools quick access to external services and data. But if you’ve loaded up a dozen or two of them? You’re paying a tax at the start of every session just to load their metadata and tool definitions. That’s tokens spent before you’ve even asked your first question.
My suspicion, and I haven’t formally benchmarked this, is that we’re headed toward a world where you swap between groups of MCP servers depending on the task at hand. You load the file system tools when you’re coding, the database tools when you’re migrating, and the deployment tools when you’re shipping. Not all of them, all the time.
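One low-tech way to do that today is to keep per-task config files and swap them in. As a sketch, assuming your tool reads a project-level `.mcp.json` (check your tool's docs for the exact filename and schema; the server entry below is illustrative), a coding-focused file might list only the filesystem server:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./src"]
    }
  }
}
```

Swap in a different file with the database or deployment servers when the task changes, instead of paying the metadata tax for all of them every session.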
There are likely more subtle problems too. When you have overlapping MCP servers that can accomplish similar things, the agent can get confused about which tool to call. It might head down the wrong path, try something that doesn't work, backtrack, and try something else. Every one of those steps spends your token budget on nothing productive.
The Usual Suspects
Beyond sub-agents and MCP servers, there are the classic context window killers:
- Web searches that pull back pages of irrelevant results
- Log dumps that flood your context with thousands of lines
- Raw command output that’s 95% noise
- Large file reads when you only needed a few lines
The pattern is the same every time: you need a small slice of data, but the whole thing gets loaded into your context window. You’re paying full price for information you’ll never use.
And here’s the frustrating part — you don’t know what the relevant data is until after you’ve loaded it. It’s a classic catch-22.
Enter Context Mode
Somebody (Mert Köseoğlu - mksglu) built a really clever solution to this problem. It’s available as a Claude Code plugin called context-mode. The core idea is simple: keep raw data out of your context window.
Instead of dumping command output, file contents, or web responses directly into your conversation, context-mode runs everything in a sandbox. Only a printed summary enters your actual context. The raw data gets indexed into a SQLite database with full-text search (FTS5), so you can query it later without reloading it.
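The pattern is easy to picture. Here's a toy sketch of the idea (an in-memory Map standing in for the plugin's SQLite FTS5 index; the function names are mine, not context-mode's actual API):

```javascript
// Keep raw data out of the "context": index it, return only a summary.
const index = new Map(); // key -> full raw output

function runAndSummarize(key, rawOutput) {
  index.set(key, rawOutput); // the raw data goes to storage, not the conversation
  const lines = rawOutput.split('\n');
  return `${key}: ${lines.length} lines captured (first: "${lines[0]}")`;
}

// Later, query the stored data without reloading all of it.
function search(term) {
  const hits = [];
  for (const [key, text] of index) {
    for (const line of text.split('\n')) {
      if (line.includes(term)) hits.push(`${key}: ${line}`);
    }
  }
  return hits;
}
```

Only the one-line summary ever enters the conversation; the thousands of lines of raw output sit in the index until a search actually needs them.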
It gives Claude a handful of new tools that replace the usual chaining of bash and read calls:
- ctx_execute — Run code in a sandbox. Only your summary enters context.
- ctx_execute_file — Read and process a file without loading the whole thing.
- ctx_fetch_and_index — Fetch a URL and index it for searching, instead of pulling everything into context with WebFetch.
- ctx_search — Search previously indexed content without rerunning commands.
- ctx_batch_execute — Run multiple commands and search them all in one call.
There are also slash commands to check how much context you’ve saved in a session, run diagnostics, and update the plugin.
The approach is smart. All the data lives in a SQLite FTS5 database that you can index and search, surfacing only the relevant pieces when you need them. If you’ve worked with full-text search in libSQL or Turso, you’ll appreciate how well this maps to the problem. It’s the right tool for the job.
The benchmarks are impressive. The author reports overall context savings of around 96%. When you think about how much raw output typically gets dumped into a session, it makes sense. Most of that data was never being used anyway.
What This Means for Your Workflow
I think the broader lesson here is that context management is becoming a first-class concern for anyone doing serious work with agentic tools. It’s not just about having the most powerful model, it’s about using your token budget wisely so you can sustain longer, more complex sessions without hitting the wall.
A few practical takeaways:
- Be intentional about MCP servers. Load what you need, not everything you have.
- Use sub-agents for heavy lifting, but recognize they cost session tokens.
- Avoid dumping raw output into your main context whenever possible.
- Tools like context-mode can dramatically extend how much real work you get done per session.
We’re still early in figuring out the best practices for working with these tools. But managing your context window? That’s one of the things that separates productive sessions from frustrating ones.
Hopefully something here saves you some tokens.
/ AI / Programming / Developer-tools / Claude
-
AI-Powered Process Orchestration Across the Enterprise | Appian
Simplify digital operations with Appian’s agentic automation platform - purpose-built for enterprise growth.
/ AI / links / agent / automation / platform
-
How to Write a Good CLAUDE.md File
Every time you start a new chat session with Claude Code, it's starting from zero knowledge about your project. It doesn't know your tech stack, your conventions, or where anything lives. A well-written `CLAUDE.md` file fixes that by giving Claude the context it needs before it writes a single line of code.

This is context engineering, and your `CLAUDE.md` file is one of the most important pieces of it.

Why It Matters
Without a context file, Claude has to discover basic information about your project — what language you're using, how the CLI works, where tests live, what your preferred patterns are. That discovery process burns tokens and time. A good `CLAUDE.md` front-loads that knowledge so Claude can get to work immediately.

If you haven't created one yet, you can generate a starter file with the `/init` command. Claude will analyze your project and produce a reasonable first draft. It's a solid starting point, but you'll want to refine it over time.

The File Naming Problem
If you're working on a team, people use different tools: Cursor has its own context file, OpenAI has theirs, and Google has theirs. You can easily end up with three separate context files that all contain slightly different information about the same project. That's a maintenance headache.
It would be nice if Anthropic made the filename a configuration setting in `settings.json`, but as of now they don't. Some tools like Cursor do let you configure the default context file, so it's worth checking.

My recommendation? Look at what tools people on your team are actually using and try to standardize on one file, maybe two. I've had good success with the symlink approach, where you pick your primary file and symlink the others to it. So if `CLAUDE.md` is your default, you can symlink `AGENTS.md` or `GEMINI.md` to point at the same file.

It's not perfect, but it beats maintaining three separate files with diverging information.
Keep It Short
Brevity is crucial. Your context file gets loaded into the context window every single session, so every line costs tokens. Eliminate unnecessary adjectives and adverbs. Cut the fluff.
A general rule of thumb that Anthropic recommends is to keep your `CLAUDE.md` under 200 lines. If you're over that, it's time to trim.

I recently went through this exercise myself. I had a bunch of Python CLI commands documented in my context file, but most of them I rarely needed Claude to know about.

We don't need to list every single possible command in the context file. That information is better off in a `docs/` folder or your project's documentation. Just add a line in your `CLAUDE.md` pointing to where that reference lives, so Claude knows where to look when it needs it.

Maintain It Regularly
A context file isn’t something you write once and forget about. Review it periodically. As your project evolves, sections become outdated or irrelevant. Remove them. If a section is only useful for a specific type of task, consider moving it out of the main file entirely.
The goal is to keep only the information that’s frequently relevant. Everything else should live somewhere Claude can find it on demand, not somewhere it has to read every single time.
Where to Put It
Something that's easy to miss: you can put your project-level `CLAUDE.md` in two places.

- `./CLAUDE.md` (project root)
- `./.claude/CLAUDE.md` (inside the `.claude` directory)

A common pattern is to `.gitignore` the `.claude/` folder. So if you don't want to check in the context file — maybe it contains personal preferences or local paths — putting it in `.claude/` is a good option.

Rules Files for Large Projects
If your context file is getting too large and you genuinely can't cut more, you have another option: rules files. These go in the `.claude/rules/` directory and act as supplemental context that gets loaded on demand rather than every session.

You might have one rule file for style guidelines, another for testing conventions, and another for security requirements. This way, Claude gets the detailed context when it's relevant without bloating the main file.
Auto Memory: The Alternative Approach
Something you might not be aware of is that Claude Code now has auto memory, where it automatically writes and maintains its own memory files. If you’re using Claude Code frequently and don’t want to manually maintain a context file, auto memory can be a good option.
The key thing to know is that you should generally use one approach or the other. If you're relying on auto memory, delete the `CLAUDE.md` file, and vice versa.

Auto memory is something I'll cover in more detail in another post, but it's worth knowing the feature exists. Just make sure you enable it in your `settings.json` if you want to try it.

Quick Checklist
If you're writing or revising your `CLAUDE.md` right now, here's what I'd focus on:

- Keep it under 200 lines — move detailed references to docs
- Include your core conventions — package manager, runtime, testing approach
- Document key architecture — how the project is structured, where things live
- Add your preferences — things Claude should always or never do
- Review monthly — cut what’s no longer relevant
- Consider symlinks — if your team uses multiple AI tools
- Use rules files — for detailed, task-specific context
That’s All For Now. 👋
/ AI / Programming / Claude-code / Developer-tools
-
Claude Code Skills vs Plugins: What's the Difference?
If you’ve been building with Claude Code, you’ve probably seen the terms “skill,” “plugin,” and “agent” thrown around. They’re related but distinct concepts, and understanding the difference will help you build better tooling. Let’s focus on skills versus plugins since those two are the most closely related.
Skills: Reusable Slash Commands
Skills are user-invocable slash commands, essentially reusable prompts that run directly in your main conversation. You trigger them with `/skill-name` and they execute inline. They can be workflows or common tasks that you do frequently.

Skills can live inside your `.claude/skills/` folder, or they can live inside a plugin (where they're called "commands" instead). Same concept, different home.

The important frontmatter property to pay attention to is `allowed-tools`. This defines which tool calls the skill can access, and there are three formats you can use:

- Comma-separated names — `Bash, Read, Grep`
- Comma-separated with filters — `Bash(gh pr view:*), Bash(gh pr diff:*)`
- JSON array — `["Bash", "Glob", "Grep"]`
I don't think there's a meaningful speed difference between them. The filtered format might take slightly longer to parse if you have a huge list, but in practice it's negligible. Pick whichever is most readable for your use case.
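For illustration, here's what a hypothetical skill's frontmatter might look like using the filtered format (the skill name and description are made up; check the Claude Code docs for the exact set of supported fields in your version):

```yaml
---
name: pr-review
description: Summarize and review the current pull request
allowed-tools: Bash(gh pr view:*), Bash(gh pr diff:*), Read, Grep
---
```

The filters mean the skill can run `gh pr view` and `gh pr diff`, but not arbitrary shell commands.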
The real power here is that skills can define tool calls and launch subagents. That turns a simple slash command into something that can orchestrate complex workflows.
Plugins: The Full Package
A plugin is a bigger container. It can bundle commands (skills), agents, hooks, and MCP servers together as a single distributable unit. Every plugin needs a `.claude-plugin/plugin.json` file, which is just a name, description, and author.

Plugins are a good way to bundle agents with skills. If your workflow needs a specialized agent that gets triggered by a slash command, a plugin is a good option for that.
Pushing the Boundaries of Standalone Skills
I wanted to experiment with what's actually possible using standalone skills, so I built upkeep. It turns out you can bundle actual compiled binaries inside a skill directory and call them from the skill. That opens up a lot of possibilities.
Here’s how I did it:
- The skill has a prerequisite section that checks for a `bin/` folder containing the binary
- A workflow calls the binary, passing in the commands to run
- Each step defines what we expect back from the binary
You can see the full implementation in the SKILL.md file. It’s a pattern that lets you distribute real functionality, not just prompts, through the skill.
Quick Summary
- Skills are slash commands. Reusable prompts with tool access that run in your conversation.
- Plugins bundle skills, agents, hooks, and MCP servers together with a `plugin.json`.
- Skills are more flexible than you might expect: you can call subagents, distribute binaries, and build real workflows.
If you’re just getting started, skills are the easier entry point. When you need to package multiple pieces together or distribute agents alongside commands, that’s when you reach for a plugin.
Have fun building!
/ AI / Development / Claude-code
-
Claude Code Now Has Two Different Security Review Tools
If you’re using Claude Code, you might have noticed that Anthropic has been quietly building out security tooling. There are now two distinct features worth knowing about. They sound similar but do very different things, so let’s break it down.
The /security-review Command
Back in August 2025, Anthropic added a `/security-review` slash command to Claude Code. This one is focused on reviewing your current changes. Think of it as a security-aware code reviewer for your pull requests. It looks at what you've modified and flags potential security issues before you merge.

It's useful, but it's scoped to your diff. It's not going to crawl through your entire codebase looking for problems that have been sitting there for months.
The New Repository-Wide Security Scanner
Near the end of February 2026, Anthropic announced something more ambitious: a web-based tool that scans your entire repository and operates more like a security researcher than a linter. This is the thing that will help you identify and fix security issues across your entire codebase.
First we need to look at what already exists to understand why it matters.
SAST tools — Static Application Security Testing. SAST tools analyze your source code without executing it, looking for known vulnerability patterns. They’re great at catching things like SQL injection, hardcoded credentials, or buffer overflows based on pattern matching rules.
If a vulnerability doesn’t match a known pattern, it slips through. SAST tools also tend to generate a lot of false positives, which means teams start ignoring the results.
What Anthropic built is different. Instead of pattern matching, it uses Claude to actually reason about your code the way a security researcher would. It can understand context, follow data flows across files, and identify logical vulnerabilities that a rule-based scanner would never catch. Think things like:
- Authentication bypass through unexpected code paths
- Authorization logic that works in most cases but fails at edge cases
- Business logic flaws that technically “work” but create security holes
- Race conditions that only appear under specific timing
These are the kinds of issues that usually require a human security expert to find, or a real attacker.
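To make the distinction concrete, here's a toy sketch (both functions are illustrative, not from any real codebase). The first is the kind of thing a pattern-matching SAST rule flags immediately; the second is a logic flaw that only reasoning about the code reveals:

```javascript
// Pattern-matchable: string-concatenated SQL. SAST rules catch this shape.
function buildUserQuery(name) {
  return "SELECT * FROM users WHERE name = '" + name + "'"; // injectable
}

// Logic flaw: authorization that works for the common case but fails at an edge.
// A user with no role at all is not a 'viewer', so they slip through.
function canDelete(user) {
  return user.role !== 'viewer';
}
```

A pattern matcher sees nothing wrong with `canDelete`. Noticing that `role` can be missing requires following how user objects are constructed elsewhere in the codebase, which is exactly the reasoning a security researcher (or a model that can follow data flows) does.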
SAST tools aren’t going away, and you should still use them. They’re fast, they catch the common stuff, and they integrate easily into CI/CD pipelines.
Also, the new repository-wide security scanner isn't out yet, so stick with what you've got until it's ready.
/ DevOps / AI / Claude-code / security
-
Ever wanted your CLAUDE.md to automatically update from your current session before the next compact? There’s a skill for that and it’s been helpful. In case you missed it, here’s a link to the skill:
/ AI / Claude-code
-
Managing Your Context Window in Claude Code
If you're using Claude Code, there's a feature you should know about that gives you visibility into how your context window is being used. The `/context` skill breaks everything down so you can see exactly where your tokens are going.

Here's what it shows you:
- System prompt – the base instructions Claude Code operates with
- System tools – the built-in tool definitions
- Custom agents – any specialized agents you’ve configured
- Memory files – your CLAUDE.md files and auto-memory
- Skills – any skills loaded into the session
- Messages – your entire conversation history
Messages is where you have the most control, and it's also what grows the fastest. Every prompt you send, every response you get back, every file read, every tool output: it all shows up in your message history.
Then there’s the free space, which is what’s left for actual work before a compaction occurs. This is the breathing room Claude Code has to think, generate responses, and use tools.
You’ll also see a buffer amount that’s reserved for auto-compaction. You can’t use this space directly, it’s set aside so Claude Code has enough room to summarize the conversation and hand things off cleanly.
Why This Matters
Understanding your context usage helps you work more efficiently. A few ways to keep your context lean:
- Start fresh sessions for new tasks instead of reusing a long-running one
- Be intentional about file reads — only read what you need, not entire directories
- Use sub-agents — when you delegate work to a sub-agent, it runs in its own context window instead of yours. All those file reads, tool calls, and intermediate reasoning happen over there, and you just get the result back. It’s one of the best ways to preserve your primary context for the work that actually needs it.
- Trim your CLAUDE.md — everything in your memory files loads every session, so keep it tight
I’ll dig into sub-agents more in a future post. For now, don’t forget about /context.
/ AI / Claude-code / Developer-tools
-
I published an Agentic Maturity Model on GitHub, a mental framework for thinking about and categorizing AI tools. It’s open to contributions and I’m looking for coauthors.
/ AI / Open-source / Agentic
-
As you’ve probably noticed, something is happening over at Anthropic. They are a spaceship that is beginning to take off.
-
Are We Becoming Architects or Butlers to LLMs?
In a recent viral post, Matt Shumer dramatically declares that we’ve crossed an irreversible threshold. He asserts that the latest AI model…
/ AI / links / automation / reflections
-
Don’t sleep on OpenClaw. There are a ton of people building with it right now who aren’t talking about it yet. The potential is real, and when those projects start surfacing, it’s going to turn heads. Sometimes the most exciting stuff happens quietly before it hits the mainstream.
/ AI / Open-source / Openclaw
-
REPL-Driven Development Is Back (Thanks to AI)
So you’ve heard of TDD. Maybe BDD. But have you heard of RDD?
REPL-driven development. I think most programmers these days don’t work this way. The closest equivalent most people are familiar with is something like Python notebooks—Jupyter or Colab.
But RDD is actually pretty old. Back in the 70s and 80s, Lisp and Smalltalk were basically built around the REPL. You’d write code, run it immediately, see the result, and iterate. The feedback loop was instant.
Then the modern era of software happened. We moved to a file-based workflow, probably stemming from Unix, C, and Java. You write source code in files. There’s often a compilation step. You run the whole thing.
The feedback loop got slower and more disconnected. Some languages we use today, like Python, Ruby, JavaScript, and PHP, include a REPL, but that’s not usually how we develop. We write files, run tests, refresh browsers.
Here’s what’s interesting: AI coding assistants are making these interactive loops relevant again.
The new RDD is natural language as a REPL.
Think about it. The traditional REPL loop was:
- Type code
- System evaluates it
- See the result
- Iterate
The AI-assisted loop is almost identical:
- Type (or speak) your intent in natural language
- AI interprets and generates code
- AI runs it and shows you the result
- Iterate
You describe what you want. The AI writes the code. It executes. You see what happened. If it’s not right, you clarify, and the loop continues.
This feels fundamentally different from the file-based workflow most of us grew up with. You’re not thinking about which file to open; you’re thinking about what you want to happen, and you’re having a conversation until it does.
Of course, this isn’t a perfect analogy. With a traditional REPL, you have more control. You understand exactly what is being evaluated because you wrote it.
>>> while True:
...     history.repeat()
/ AI / Programming / Development
-
I usually brainstorm spec docs using Gemini or Claude, so if you are like me, this prompt offers interesting insight into your software decisions.
Based off our previous chats and the previous documents you've helped me with, provide a detailed summary of all my software decisions and preferences when it comes to building different types of applications.
/ AI / Development
-
Here’s a tip: if you ask Claude (via an API, not Claude Code) to vibe-code a typing hacker game, make sure to tell it not to return valid exploits. I asked Claude to use actual Python code snippets in the game today and… GitHub’s security scanner was not happy with me. Oopsie doopsie. Lesson learned!
-
Knowledge Without a Knower
How do we define knowledge in the age of AI? Can new knowledge even be created if we’re outsourcing our thinking to the models or the systems we built around the models?
Let’s start with what knowledge actually is. Traditionally, to know something, you have to believe it’s true and have some justification for that belief. It’s implicit knowledge earned through experience, study, or reasoning.
AI doesn’t work that way. What a model has is a probabilistic map of patterns extracted from massive amounts of text. There’s no belief, no understanding in the human sense. It’s knowledge without a knower.
That distinction matters more than we might think.
From Retention to Curation
The way we work with knowledge is shifting. For centuries, the paradigm was retention: memorize facts, write things down, build personal libraries of information.
Now we have tools that can do that for us, often better and faster than we ever could.
So what’s our new role?
Curation.
The skills that matter now are about what we can retrieve, what we can verify, and what we can synthesize.
We don’t need to remember everything, we need to know how to find it, evaluate it, and combine it in useful ways.
The Skills We Actually Need
If we’re not going to be the primary repositories of knowledge anymore, what should we focus on?
Spotting bullshit. This might be the most important skill of the next decade. When the tool outputs something that doesn’t match what we know to be true, can we catch it? AI systems are confident even when they’re wrong. They don’t hedge. They don’t say “I’m not sure about this.” So we need that internal alarm that goes off when something doesn’t add up.
Asking good questions. This has always been important, but it’s now essential. Understanding the problem means knowing where the gaps in your knowledge actually lie. A well-formed question is half the answer. An AI can give you a thousand responses, but only a good question will get you a useful one.
Reasoning about reasoning. How did the system arrive at that answer? What steps did it take? Why does it think that’s the case? We need to be able to trace the logic, not just accept the output. This is meta-cognition applied to our tools.
The Human in the Loop
New knowledge will continue to need humans. Not for the grunt work of data processing or pattern matching; AI can handle that better than we ever could.
Instead, our role is to identify the anomalies. We need to become detectives, finding the errors in the data. Skepticism will be extremely valuable in the times ahead.
So will critical thinking. We need to be able to evaluate the evidence, weigh the pros and cons, and make informed decisions.
In computing, we see error correction used in the semiconductor industry, and a different technique used in quantum computing. And while reducing the number of errors in a given system will continue to be important, what are we really after here?
Well, the truth, right?
I propose we come up with a new name for truth. I think it should be called “HAT” or a “human accepted truth.”
The aggregate of HATs is what we shall call “knowledge.” Knowledge is the sum of all human accepted truths.
-
The Broken Promise of Reach
AI is changing something we take for granted: the relationship between effort and reach.
For years, the implicit promise of the internet was straightforward. Put in the work, create something valuable, and you’d find your audience. Maybe not millions, but someone.
The effort you invested had a reasonable correlation to the impact you could achieve. A thoughtful blog post might get shared. A well-crafted tutorial could help thousands of developers. The work mattered because it reached people who needed it.
That equation is no longer guaranteed.
Now we’re in a world where AI can generate endless content at near-zero cost.
The supply of words, images, and ideas has become functionally infinite.
So, what happens to the value of any individual piece?
- Your carefully researched article drowns in a database of a thousand AI-generated summaries.
- Your authentic ideas are lost in a sea of algorithmic content designed for engagement.
So, why put in the work if the reward isn’t there?
The dream used to be building something sustainable, or something big enough to matter. Enough to support yourself while doing work you care about. And it still can be that: a dream.
We need to return to finding value in the act itself.
Don’t let your self-worth depend on metrics decided by a platform.
Your entire creative output shouldn’t be measured in likes, shares, and subscriber counts. That’s when they win.
We can’t hand over the definition of “success” to the algorithms.
The old promise that the platforms will provide is broken.
We’re not going back.
Now we need to build something new.
What we build is up to us.
/ AI / Indieweb / Publishing
-
First Impressions of OpenAI's Codex App
I’ve been experimenting with OpenAI’s new Codex app for engineering work, which launched on February 2nd, and I’m not impressed.
No subagents from what I can tell. It gets stuck on stuff that shouldn’t be blockers. The gpt-5.2-codex model feels slow. I don’t care about the existing skills enough to try to set one up for it.
I did sign up for a free month trial of ChatGPT Plus though, so I’m going to give it a few more attempts before my time runs out. But so far, it doesn’t feel like a force multiplier the way Claude Code or Amp Code does. Even OpenCode feels more productive.
Maybe I’ll have better luck with Codex on the CLI? We’ll see.
There’s something about flat-fee billing that feels so much better than watching tokens drain away. Less constrained and more open to trying new things I guess.
I appreciate that they went through the effort of building what looks like a native Swift app instead of yet another VS Code fork.
I think the Codex desktop app is aimed at competing with Cursor for market share, or maybe it’s an attempt at solving whatever direction Claude Desktop is heading.
It doesn’t feel like it’s solving my problems as a developer who already has workflows that I know work.
Back to the CLI pour moi.
/ AI / Developer-tools / Openai
-
Why IndiePub Matters More Than Ever
You might have heard the term IndiePub. It’s short for independent publishing—the practice of creating and distributing your work (books, stories, games, articles) without the financial backing or editorial control of a larger corporation.
It’s closely related to the IndieWeb movement in software development, which has been advocating for personal ownership of digital presence for years.
In the last two decades, we’ve watched massive platforms rise to dominance. They promised reach and convenience, and they delivered—for a while. But now we’re seeing the consequences: algorithmic timelines that bury your work, arbitrary policy changes that can wipe out years of audience building, and the degradation of platform quality as engagement-farming content floods every feed.
This platform decay is exactly why anyone publishing content on the internet should own their own distribution.
I’m not saying you have to stop publishing on platforms. But you shouldn’t publish there first.
Publish your work on something you control: your own site, your own domain. Then push it out to the platforms. Your home base stays yours. The platforms get a copy.
The AI Content Flood Changes Everything
Now we’re facing something new. With the rise of AI-generated content, these platforms are becoming saturated with noise. Human authenticity is at a premium.
The irony? Authentic human content is what trained these large language models in the first place.
So there’s a reclaiming happening here. We’re taking our data back.
Our authentic human content belongs on the platforms we control. The AI stuff can continue flowing through the corporate channels.
It’s Never Been Easier
I’m not going to dive deep into the complexities of book publishing or all the decisions writers face when navigating traditional vs. self-publishing. But I will say this: it’s never been easier to find a way to publish your thoughts and ideas online.
There are definitely some better decisions you can make about where and how to publish. If you’re wondering whether you’ve made the best choice for the type of content you want to put out there, I created a guided decision-making tool to help answer those questions: blog-picker.logan.center
IndiePub matters now more than ever because it’s our primary defense against the commoditization of creativity.
When everything becomes content for platforms, when algorithms decide what gets seen, and when AI can generate infinite variations of “good enough” content, ownership is what’s left.
Own your words. Own your distribution. The platforms are guests at your table, not the other way around.
/ AI / Indieweb / Publishing / Creativity
-
Why Data Modeling Matters When Building with AI
If you’ve started building software recently, especially if you’re leaning heavily on AI tools to help you code, here’s something that might not be obvious: data modeling matters more now than ever.
AI is remarkably good at getting the local stuff right. Functions work. Logic flows. Tests pass. But when it comes to understanding the global architecture of your application? That’s where things get shaky.
Without a clear data model guiding the process, you’re essentially letting the AI do whatever it thinks is best. And what the AI thinks is best isn’t always what’s best for your codebase six months from now.
The Flag Problem
When you don’t nail down your data structure upfront, AI tools tend to reach for flags to represent state. You end up with columns like is_draft, is_published, and is_deleted, all stored as separate boolean fields.
This seems fine at first. But add a few more flags, and suddenly you’ve got rows where is_draft = true AND is_published = true AND is_deleted = true. That’s an impossible state. Your code can’t handle it because it shouldn’t exist.
Instead of multiple flags, use an enum: status: draft | published | deleted. One field. Clear states. No contradictions.
This is just one example of why data modeling early can save you from drowning in technical debt later.
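To make the flags-versus-enum tradeoff concrete, here’s a minimal TypeScript sketch. The field and type names are hypothetical, not from any specific codebase:

```typescript
// Boolean flags allow contradictory rows: nothing stops all three
// fields from being true at the same time.
interface PostWithFlags {
  is_draft: boolean;
  is_published: boolean;
  is_deleted: boolean;
}

// A single status field makes the impossible states unrepresentable.
type PostStatus = "draft" | "published" | "deleted";

interface Post {
  status: PostStatus;
}

function isVisible(post: Post): boolean {
  // Exactly one state per row, so the check is unambiguous.
  return post.status === "published";
}

console.log(isVisible({ status: "published" })); // true
console.log(isVisible({ status: "draft" }));     // false
```

The compiler now rejects any status outside the three valid values, which is exactly the guarantee the boolean flags couldn’t give you.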
Representation, Storage, and Retrieval
If data modeling is about the shape of your data, data structures determine how efficiently you represent, store, and retrieve it.
This matters because once you’ve got a lot of data, migrating from one structure to another, or switching database engines, becomes genuinely painful.
When you’re designing a system, think about its lifetime.
- How much data will you store monthly? Yearly?
- How often do you need to retrieve it?
- Does recent data need to be prioritized over historical data?
- Will you use caches or queues for intermediate storage?
Where AI Takes Shortcuts
AI agents inherit our bad habits. Lists and arrays are everywhere in their training data, so they default to using them even when a set, hash map, or dictionary would perform dramatically better.
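Here’s a quick TypeScript sketch of that difference (the banned-ID scenario is invented for illustration). An array answers membership questions by scanning every element; a Set answers them with a hash lookup:

```typescript
// What a model often reaches for: an array scanned on every lookup, O(n).
const bannedIds: string[] = ["u1", "u2", "u3"];

function isBannedSlow(id: string): boolean {
  return bannedIds.includes(id); // linear scan each call
}

// Usually the better fit for membership checks: a Set, O(1) on average.
const bannedSet = new Set(bannedIds);

function isBannedFast(id: string): boolean {
  return bannedSet.has(id); // hash lookup
}

console.log(isBannedSlow("u2"), isBannedFast("u2")); // true true
console.log(isBannedFast("u9"));                     // false
```

On three elements nobody will notice, but inside a hot loop over thousands of items the array version quietly becomes quadratic.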
In TypeScript, I see another pattern constantly: when the AI hits type errors, it makes everything optional.
Problem solved, right? Except now your code is riddled with null checks and edge cases that shouldn’t exist.
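A small sketch of that shortcut (the User shape is made up for illustration). Marking fields optional silences the compiler, but every consumer then inherits a null check:

```typescript
// The shortcut: make everything optional so the type error goes away...
interface LooseUser {
  id?: string;
  email?: string;
}

function greetLoose(user: LooseUser): string {
  // ...but now every access needs a fallback that shouldn't exist.
  return `Hello ${user.email ?? "unknown"}`;
}

// The fix: keep required fields required, so the type system
// guarantees the data is actually there.
interface User {
  id: string;
  email: string;
}

function greet(user: User): string {
  return `Hello ${user.email}`; // no check needed
}

console.log(greetLoose({}));                               // Hello unknown
console.log(greet({ id: "1", email: "ada@example.com" })); // Hello ada@example.com
```

The loose version even accepts an empty object, which is precisely the kind of edge case that shouldn’t be representable in the first place.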
Then there are the object-oriented problems. When building software that should use proper OOP patterns, AI often takes shortcuts in how it represents data. Those shortcuts feel fine in the moment but create maintenance nightmares down the road.
The Prop Drilling Epidemic
LLM providers have optimized their agents to be nimble, managing context windows so they can stay productive. That’s a good thing. But that nimbleness means the agents don’t always understand the full structure of your code.
In TypeScript projects, this leads to prop drilling: passing the entire global application object down through nested components.
Everything becomes tightly coupled. When you need to change the structure of an object, it’s like dropping a pebble in a pond. The ripples spread everywhere.
You change one thing, and suddenly you’re fixing a hundred other places that all expected the old structure.
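A minimal sketch of that coupling in TypeScript (plain functions stand in for UI components, and the AppState shape is hypothetical):

```typescript
interface AppState {
  user: { name: string; theme: string };
  cart: { items: string[] };
  // ...dozens more fields in a real app
}

// Drilled: every layer takes the whole AppState, even though the leaf
// only needs one string. Change AppState and every layer ripples.
function headerDrilled(app: AppState): string {
  return avatarDrilled(app);
}
function avatarDrilled(app: AppState): string {
  return `avatar:${app.user.name}`;
}

// Narrow: each layer declares exactly what it uses, so structural
// changes stop at the call site.
function header(userName: string): string {
  return avatar(userName);
}
function avatar(name: string): string {
  return `avatar:${name}`;
}

const app: AppState = {
  user: { name: "ada", theme: "dark" },
  cart: { items: [] },
};

console.log(headerDrilled(app));    // avatar:ada
console.log(header(app.user.name)); // avatar:ada
```

Both versions render the same thing today; the difference is how many functions you have to touch when `user` gets restructured tomorrow.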
The Takeaway
If you’re building with AI, invest time in data modeling before you start coding. Define your data structures. Think about how your data will grow and how you’ll access it.
The AI can help you build fast. But you still need to provide the architectural vision. That’s not something you can blindly trust the AI to handle, not yet, anyway.
-
2026: The Year We Stop Blaming the Tools
Here’s a hard truth we’re going to have to face in 2026: sometimes the bottleneck isn’t the technology, it’s us.
I’ve been thinking about how we use tools, how we find the merit in their use. We have access to increasingly powerful tools, but their value depends entirely on our understanding of them.
A hammer is useless if you don’t know which end to hold. The same goes for AI assistants, automation frameworks, and the growing ecosystem of agentic systems.
The rapid adoption of tools like OpenClaw’s agentic assistant tells me something important: people and companies are starting to see the real potential in building autonomous systems. Not just as toys or experiments, but as genuine productivity multipliers. That’s a shift from where we were even a year ago.
I think 2026 will be the year we see more widespread adoption of genuinely useful tools. The Gartner hype cycle, and how it does or doesn’t apply to AI adoption, is really interesting, but I won’t cover it here. I’d like to write more about that in future articles.
The companies that build genuinely useful tools will be the ones that survive. They’ll be the ones that understand the value of tools and how to use them effectively. They’ll be the ones that embrace the future of work, where humans and machines work together to achieve more.
It’s not about replacing humans. It’s about humans getting better at wielding the tools we’ve built. That’s always been how technology works. This time is no different.
/ AI / Tools / automation / 2026
-
The Rise of Spec-Driven Development: A Guide to Building with AI
Spec-driven development isn’t new. It has its own Wikipedia page and has been around longer than you might realize.
With the explosion of AI coding assistants, this approach has found new life and we now have a growing ecosystem of tools to support it.
The core idea is simple: instead of telling an AI “hey, build me a thing that does the boops and the beeps” then hoping it reads your mind, you front-load the thinking.
It’s kinda obvious, with it being in the name, but in case you are wondering, here is how it works.
The Spec-Driven Workflow
Here’s how it typically works:
- Specify: Start with requirements. What do you want? How should it behave? What are the constraints?
- Plan: Map out the technical approach. What’s the architecture? What “stack” will you use?
- Task: Break the plan into atomic, actionable pieces. Create a dependency tree: this must happen before that. Define the order of operations. This is often done by the tool.
- Implement: You work with whatever tool to build the software from your task list. The human is (or should be) responsible for deciding when a task is completed.
You are still a part of the process. It’s up to you to make the decisions at the beginning. It’s up to you to define the approach. And it’s up to you to decide you’re done.
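The dependency-tree idea from the Task step can be sketched as a tiny topological sort in TypeScript. The task names are invented, and this is a sketch only: it assumes the graph has no cycles, which real tools have to handle:

```typescript
interface Task {
  id: string;
  dependsOn: string[];
}

// Order tasks so every prerequisite comes before the task that needs it.
function orderTasks(tasks: Task[]): string[] {
  const byId = new Map(tasks.map((t) => [t.id, t] as [string, Task]));
  const visited = new Set<string>();
  const order: string[] = [];

  function visit(id: string): void {
    if (visited.has(id)) return;
    visited.add(id);
    for (const dep of byId.get(id)!.dependsOn) visit(dep); // prerequisites first
    order.push(id);
  }

  for (const t of tasks) visit(t.id);
  return order;
}

const plan: Task[] = [
  { id: "write-tests", dependsOn: ["define-schema"] },
  { id: "define-schema", dependsOn: [] },
  { id: "implement-api", dependsOn: ["define-schema", "write-tests"] },
];

console.log(orderTasks(plan)); // [ 'define-schema', 'write-tests', 'implement-api' ]
```

However the tooling presents it, this is the core of the Task step: the plan isn’t a flat to-do list, it’s an ordering constraint the agent can execute one piece at a time.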
So how do you get started?
The Tool Landscape
The problem we have now is that there is no unified standard. The tool makers are too busy building moats to take the time to agree.
Standalone Frameworks:
- Spec-Kit - GitHub’s own toolkit that makes “specifications executable.” It supports multiple AI agents through slash commands and emphasizes intent-driven development.
- BMAD Method - Positions AI agents as “expert collaborators” rather than autonomous workers. Includes 21+ specialized agents for different roles like product management and architecture.
- GSD (Get Shit Done) - A lightweight system that solves “context rot” by giving each task a fresh context window. Designed for Claude Code and similar tools.
- OpenSpec - Adds a spec layer where humans and AI agree on requirements before coding. Each feature gets its own folder with proposals, specs, designs, and task lists.
- Autospec - A CLI tool that outputs YAML instead of markdown, enabling programmatic validation between stages. Claims up to 80% reduction in API costs through session isolation.
Built Into Your IDE:
The major AI coding tools have adopted this pattern too:
- Kiro - Amazon’s new IDE with native spec support
- Cursor - Has a dedicated plan mode
- Claude Code - Plan mode for safe code analysis
- VSCode Copilot - Chat planning features
- OpenCode - Multiple modes including planning
- JetBrains Junie - JetBrains' AI assistant
- Google Antigravity - Implementation planning docs
- Gemini Conductor - Orchestration for Gemini CLI
Memory Tools
- Beads - Use it to manage your tasks. Works very well with your Agents in Claude Code.
Why This Matters
When first getting started building with AI, you might dive right in and say “go build thing.” Then you just keep iterating on a task until it falls apart once you try to do anything substantial.
You end up playing a game of whack-a-mole, where you fix one thing and break another. This probably sounds familiar to a lot of you from the olden times of two years ago, when we puny humans did all the work. The point being: even the robots make mistakes.
Another thing you come to realize is that the AI is not a mind reader. It’s a prediction engine. So be predictable.
What did we learn? With spec-driven development, you’re in charge. You are the architect. You decide. The AI handles the details and the execution. But the AI needs structure, and these methods are how we provide it.
/ AI / Programming / Tools / Development
-