Developer-tools
-
Your Context Window Is a Budget — Here's How to Stop Blowing It
If you’re using agentic coding tools like Claude Code, there’s one thing you should know by now: your context window is a budget, and everything you do spends it.
I’ve been thinking about how to manage that budget. As we learn to use sub-agents, MCP servers, and all these powerful capabilities, we haven’t been thinking enough about the cost of using them. The dollars and cents matter too if you’re using API access, but the raw token budget you burn through in a single session affects everyone regardless. Once it’s gone, compaction kicks in, and it’s a crapshoot whether the new session manages to pick up where you left off.
Before we talk about what you can do about it, let’s talk about where your tokens actually go.
Why Sub-Agents Are Worth It (But Not Free)
Sub-agents are one of the best things to have in agentic coding. The whole idea is that work happens in a separate context window, leaving your primary session clean for orchestration and planning. You stay focused on what needs to change while the sub-agent figures out how.
Sub-agents still burn through your session limits faster than you might expect. There are actually two limits at play here:
- the context window of your main discussion
- the session-level caps on how many exchanges you can have in a given time period.
Sub-agents hit both. They’re still absolutely worth using, and I wouldn’t want to work without them, but you need to be aware of the cost.
The MCP Server Problem
MCP servers are another area where things get interesting. They’re genuinely useful for giving agentic tools quick access to external services and data. But if you’ve loaded up a dozen or two of them? You’re paying a tax at the start of every session just to load their metadata and tool definitions. That’s tokens spent before you’ve even asked your first question.
My suspicion, and I haven’t formally benchmarked this, is that we’re headed toward a world where you swap between groups of MCP servers depending on the task at hand. You load the file system tools when you’re coding, the database tools when you’re migrating, and the deployment tools when you’re shipping. Not all of them, all the time.
There are likely more subtle problems too. When you have overlapping MCP servers that can accomplish similar things, the agent can get confused about which tool to call. It might head down the wrong path, try something that doesn’t work, backtrack, and try something else. Every one of those steps spends your token budget on nothing productive.
The Usual Suspects
Beyond sub-agents and MCP servers, there are the classic context window killers:
- Web searches that pull back pages of irrelevant results
- Log dumps that flood your context with thousands of lines
- Raw command output that’s 95% noise
- Large file reads when you only needed a few lines
The pattern is the same every time: you need a small slice of data, but the whole thing gets loaded into your context window. You’re paying full price for information you’ll never use.
And here’s the frustrating part — you don’t know what the relevant data is until after you’ve loaded it. It’s a classic catch-22.
Enter Context Mode
Mert Köseoğlu (mksglu) built a really clever solution to this problem. It’s available as a Claude Code plugin called context-mode. The core idea is simple: keep raw data out of your context window.
Instead of dumping command output, file contents, or web responses directly into your conversation, context-mode runs everything in a sandbox. Only a printed summary enters your actual context. The raw data gets indexed into a SQLite database with full-text search (FTS5), so you can query it later without reloading it.
It gives Claude a handful of new tools that replace the usual chaining of bash and read calls:
- ctx_execute — Run code in a sandbox. Only your summary enters context.
- ctx_execute_file — Read and process a file without loading the whole thing.
- ctx_fetch_and_index — Fetch a URL and index it for searching, instead of pulling everything into context with WebFetch.
- ctx_search — Search previously indexed content without rerunning commands.
- ctx_batch_execute — Run multiple commands and search them all in one call.
There are also slash commands to check how much context you’ve saved in a session, run diagnostics, and update the plugin.
The approach is smart. All the data lives in a SQLite FTS5 database that you can index and search, surfacing only the relevant pieces when you need them. If you’ve worked with full-text search in libSQL or Turso, you’ll appreciate how well this maps to the problem. It’s the right tool for the job.
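The index-then-search pattern is easy to sketch with Python’s built-in sqlite3 module. This is just an illustration of the idea, not context-mode’s actual schema — the table and column names here are made up:

```python
import sqlite3

# Toy version of the index-then-search idea. The "chunks" table and its
# columns are illustrative, not context-mode's real schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE chunks USING fts5(source, body)")

# Index raw command output line by line instead of pasting it into context.
raw_output = [
    "INFO boot ok",
    "DEBUG cache warm",
    "ERROR db timeout on replica",
    "INFO retry ok",
]
db.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [("app.log", line) for line in raw_output],
)

# Later, pull back only the matching slice, not the whole dump.
hits = [
    body
    for (body,) in db.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ?", ("timeout",)
    )
]
print(hits)
```

Four lines of log went in; one relevant line comes back out. Scale that up to thousands of lines of output per session and the savings add up fast.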
The benchmarks are impressive. The author reports overall context savings of around 96%. When you think about how much raw output typically gets dumped into a session, it makes sense. Most of that data was never being used anyway.
What This Means for Your Workflow
I think the broader lesson here is that context management is becoming a first-class concern for anyone doing serious work with agentic tools. It’s not just about having the most powerful model; it’s about using your token budget wisely so you can sustain longer, more complex sessions without hitting the wall.
A few practical takeaways:
- Be intentional about MCP servers. Load what you need, not everything you have.
- Use sub-agents for heavy lifting, but recognize they cost session tokens.
- Avoid dumping raw output into your main context whenever possible.
- Tools like context-mode can dramatically extend how much real work you get done per session.
We’re still early in figuring out the best practices for working with these tools. But managing your context window? That’s one of the things that separates productive sessions from frustrating ones.
Hopefully something here saves you some tokens.
/ AI / Programming / Developer-tools / Claude
-
How to Write a Good CLAUDE.md File
Every time you start a new chat session with Claude Code, it’s starting from zero knowledge about your project. It doesn’t know your tech stack, your conventions, or where anything lives. A well-written `CLAUDE.md` file fixes that by giving Claude the context it needs before it writes a single line of code.
This is context engineering, and your `CLAUDE.md` file is one of the most important pieces of it.
Why It Matters
Without a context file, Claude has to discover basic information about your project — what language you’re using, how the CLI works, where tests live, what your preferred patterns are. That discovery process burns tokens and time. A good `CLAUDE.md` front-loads that knowledge so Claude can get to work immediately.
If you haven’t created one yet, you can generate a starter file with the `/init` command. Claude will analyze your project and produce a reasonable first draft. It’s a solid starting point, but you’ll want to refine it over time.
The File Naming Problem
Things get messier if you’re working on a team where people use different tools: Cursor has its own context file, OpenAI has theirs, and Google has theirs. You can easily end up with three separate context files that all contain slightly different information about the same project. That’s a maintenance headache.
It would be nice if Anthropic made the filename a configuration setting in `settings.json`, but as of now they don’t. Some tools like Cursor do let you configure the default context file, so it’s worth checking.
My recommendation? Look at what tools people on your team are actually using and try to standardize on one file, maybe two. I’ve had good success with the symlink approach, where you pick your primary file and symlink the others to it. So if `CLAUDE.md` is your default, you can symlink `AGENTS.md` or `GEMINI.md` to point at the same file.
It’s not perfect, but it beats maintaining three separate files with diverging information.
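In a shell the whole trick is one `ln -s CLAUDE.md AGENTS.md` per alias. Here’s a self-contained sketch of the same idea in Python, run in a scratch directory with made-up file contents:

```python
import os
import tempfile

# Scratch directory so this sketch doesn't touch a real project.
os.chdir(tempfile.mkdtemp())

# The primary file holds the real content...
with open("CLAUDE.md", "w") as f:
    f.write("# Project context\n- Run tests with pytest\n")

# ...and the other tools' filenames just point at it
# (equivalent to `ln -s CLAUDE.md AGENTS.md`, etc.).
for alias in ("AGENTS.md", "GEMINI.md"):
    os.symlink("CLAUDE.md", alias)

# Every tool now reads the same bytes.
same = open("AGENTS.md").read() == open("CLAUDE.md").read()
print(same)  # True
```

Edit `CLAUDE.md` once and every tool sees the change, which is the whole point.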
Keep It Short
Brevity is crucial. Your context file gets loaded into the context window every single session, so every line costs tokens. Eliminate unnecessary adjectives and adverbs. Cut the fluff.
A general rule of thumb that Anthropic recommends is to keep your `CLAUDE.md` under 200 lines. If you’re over that, it’s time to trim.
I recently went through this exercise myself. I had a bunch of Python CLI commands documented in my context file, but most of them I rarely needed Claude to know about.
We don’t need to list every single possible command in the context file. That information is better off in a `docs/` folder or your project’s documentation. Just add a line in your `CLAUDE.md` pointing to where that reference lives, so Claude knows where to look when it needs it.
Maintain It Regularly
A context file isn’t something you write once and forget about. Review it periodically. As your project evolves, sections become outdated or irrelevant. Remove them. If a section is only useful for a specific type of task, consider moving it out of the main file entirely.
The goal is to keep only the information that’s frequently relevant. Everything else should live somewhere Claude can find it on demand, not somewhere it has to read every single time.
Where to Put It
Something that’s easy to miss: you can put your project-level `CLAUDE.md` in two places.
- `./CLAUDE.md` (project root)
- `./.claude/CLAUDE.md` (inside the `.claude` directory)
A common pattern is to `.gitignore` the `.claude/` folder. So if you don’t want to check in the context file — maybe it contains personal preferences or local paths — putting it in `.claude/` is a good option.
Rules Files for Large Projects
If your context file is getting too large and you genuinely can’t cut more, you have another option: rules files. These go in the `.claude/rules/` directory and act as supplemental context that gets loaded on demand rather than every session.
You might have one rule file for style guidelines, another for testing conventions, and another for security requirements. This way, Claude gets the detailed context when it’s relevant without bloating the main file.
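As a sketch, a split like that could look something like this (the file names here are my own, not a prescribed set):

```
.claude/
  rules/
    style.md      # naming and formatting guidelines
    testing.md    # testing conventions
    security.md   # secrets handling, security requirements
```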
Auto Memory: The Alternative Approach
Something you might not be aware of is that Claude Code now has auto memory, where it automatically writes and maintains its own memory files. If you’re using Claude Code frequently and don’t want to manually maintain a context file, auto memory can be a good option.
The key thing to know is that you should generally use one approach or the other. If you’re relying on auto memory, delete the `CLAUDE.md` file, and vice versa.
Auto memory is something I’ll cover in more detail in another post, but it’s worth knowing the feature exists. Just make sure you enable it in your `settings.json` if you want to try it.
Quick Checklist
If you’re writing or revising your `CLAUDE.md` right now, here’s what I’d focus on:
- Keep it under 200 lines — move detailed references to docs
- Include your core conventions — package manager, runtime, testing approach
- Document key architecture — how the project is structured, where things live
- Add your preferences — things Claude should always or never do
- Review monthly — cut what’s no longer relevant
- Consider symlinks — if your team uses multiple AI tools
- Use rules files — for detailed, task-specific context
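To make that concrete, here’s a hypothetical skeleton that follows the checklist. Every project detail in it is made up:

```markdown
# MyApp

## Stack
- Python 3.12, dependencies managed with uv (never pip)
- CLI entry point in myapp/cli.py; tests in tests/, run with pytest

## Conventions
- Type hints everywhere; run ruff before committing
- Never commit directly to main

## Where to look
- Full CLI command reference: docs/cli.md
- Deployment runbook: docs/deploy.md
```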
That’s All For Now. 👋
/ AI / Programming / Claude-code / Developer-tools
-
Managing Your Context Window in Claude Code
If you’re using Claude Code, there’s a feature you should know about that gives you visibility into how your context window is being used. The `/context` skill breaks everything down so you can see exactly where your tokens are going.
Here’s what it shows you:
- System prompt – the base instructions Claude Code operates with
- System tools – the built-in tool definitions
- Custom agents – any specialized agents you’ve configured
- Memory files – your CLAUDE.md files and auto-memory
- Skills – any skills loaded into the session
- Messages – your entire conversation history
Messages is where you have the most control, and it’s also what grows the fastest. Every prompt you send, every response you get back, every file read, every tool output: it all shows up in your message history.
Then there’s the free space, which is what’s left for actual work before a compaction occurs. This is the breathing room Claude Code has to think, generate responses, and use tools.
You’ll also see a buffer amount that’s reserved for auto-compaction. You can’t use this space directly; it’s set aside so Claude Code has enough room to summarize the conversation and hand things off cleanly.
Why This Matters
Understanding your context usage helps you work more efficiently. A few ways to keep your context lean:
- Start fresh sessions for new tasks instead of reusing a long-running one
- Be intentional about file reads — only read what you need, not entire directories
- Use sub-agents — when you delegate work to a sub-agent, it runs in its own context window instead of yours. All those file reads, tool calls, and intermediate reasoning happen over there, and you just get the result back. It’s one of the best ways to preserve your primary context for the work that actually needs it.
- Trim your CLAUDE.md — everything in your memory files loads every session, so keep it tight
I’ll dig into sub-agents more in a future post. For now, don’t forget about `/context`.
/ AI / Claude-code / Developer-tools
-
Claude Code Prompts for Taming Your GitHub Repository Sprawl
Some useful Claude Code prompts for GitHub repository management.
1. Archive stale repositories
```
Using the GitHub CLI (gh), find all of my repositories that haven't been pushed to in over 5 years and archive them. List them first and ask for my confirmation before archiving. Use gh repo list <user> --limit 1000 --json name,pushedAt to get the data, then filter by date, and archive with gh repo archive <user>/<repo> --yes.
```
2. Add missing descriptions
```
Using the GitHub CLI, find all of my repositories that have an empty or missing description. Use gh repo list <user> --limit 1000 --json name,description,url to get the data. For each repo missing a description, look at the repo's README and any other context to suggest an appropriate description. Present your suggestions to me for approval, then apply them using gh repo edit <user>/<repo> --description "<description>".
```
3. Add missing topics/tags
```
Using the GitHub CLI, find all of my repositories that have no topics. Use gh repo list <user> --limit 1000 --json name,repositoryTopics,description,primaryLanguage to get the data. For each repo with no topics, analyze the repo name, description, and primary language to suggest relevant topics. Present your suggestions for approval, then apply them using gh api -X PUT repos/<user>/<repo>/topics -f '{"names":["tag1","tag2"]}'.
```
To make #1 easier, repjan is a TUI tool that pulls all your repos into an interactive dashboard. It flags archive candidates based on inactivity and engagement, lets you filter and sort through everything, and batch archive in one sweep. If you’ve got hundreds of repos piling up, it’s way faster than doing it one by one.
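If you’re curious what the date filter in prompt #1 boils down to, here’s a rough Python sketch against the JSON shape that `gh repo list --json name,pushedAt` returns. The repo names are made up, and “now” is pinned to a fixed date so the example is reproducible; in practice you’d use `datetime.now(timezone.utc)`:

```python
import json
from datetime import datetime, timedelta, timezone

# Stand-in for `gh repo list <user> --limit 1000 --json name,pushedAt` output.
repos = json.loads("""[
  {"name": "dotfiles",     "pushedAt": "2025-01-15T12:00:00Z"},
  {"name": "old-blog",     "pushedAt": "2017-03-09T08:30:00Z"},
  {"name": "weekend-hack", "pushedAt": "2019-06-21T19:45:00Z"}
]""")

# Pinned "now" so the example is deterministic; use datetime.now(timezone.utc)
# for real runs.
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
cutoff = now - timedelta(days=5 * 365)

stale = [
    r["name"]
    for r in repos
    # gh emits ISO 8601 with a trailing Z; swap it for +00:00 so
    # fromisoformat parses it on older Pythons too.
    if datetime.fromisoformat(r["pushedAt"].replace("Z", "+00:00")) < cutoff
]
print(stale)
```

That `stale` list is the set of candidates you’d confirm before letting Claude run `gh repo archive` on anything.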
/ Productivity / Claude-code / Developer-tools / Github
-
First Impressions of OpenAI's Codex App
I’ve been experimenting with OpenAI’s new Codex app for engineering work, which launched on February 2nd, and I’m not impressed.
No subagents from what I can tell. It gets stuck on stuff that shouldn’t be blockers. The gpt-5.2-codex model feels slow. I don’t care about the existing skills enough to try to set one up for it.
I did sign up for a free month trial of ChatGPT Plus though, so I’m going to give it a few more attempts before my time runs out. But so far, it doesn’t feel like a force multiplier the way Claude Code or Amp Code does. Even Open Code feels more productive.
Maybe I’ll have better luck with Codex on the CLI? We’ll see.
There’s something about flat-fee billing that feels so much better than watching tokens drain away. Less constrained and more open to trying new things I guess.
I appreciate that they went through the effort of building what looks like a native Swift app instead of yet another VS Code fork.
I think the Codex desktop app is aimed at competing with Cursor’s market share, or maybe it’s an attempt to counter whatever direction Claude Desktop is heading in.
It doesn’t feel like it’s solving my problems as a developer who already has workflows that I know work.
Back to the CLI for me.
/ AI / Developer-tools / Openai
-
AMP Code: First Impressions of a Claude Code Competitor
I tried AMP Code last weekend and came away genuinely impressed. I didn’t think anything currently available was at Claude Code’s level.
That said, AMP is in a somewhat unfortunate position. Similar to Cursor, they have to pay the Anthropic tax, and you really want your primary model to be Opus 4.5 for the best results.
So while I was able to get some things done, once you start paying per token… you feel constrained. I’m speaking from a personal budget perspective here, but I blew through ten dollars of credits on their free tier pretty easily.
I could see how with billing enabled and all the sub-agents they make super easy to use, you could burn through a hundred-dollar Claude Code Max plan budget in a week, or even a day, depending on your usage.
What I Really Like
There’s a lot to appreciate about what AMP is doing.
Team collaboration is a standout feature. It’s incredibly easy to share a discussion with other people on your team. Being able to collaborate with your team on something using agents is extremely powerful.
Their TUI is exceptional. I mean, it’s so much better than Claude Code’s terminal interface. They probably have the best TUI on the market right now. It’s definitely better than Open Code.
Sub-agents work out of the box. All the complicated sub-agent stuff I’ve set up manually for my Claude Code projects? It just comes ready to go with AMP. They’ve made really smart decisions about which agents handle which tasks and which models to use. You don’t have to configure any of it, it’s all done for you.
The Bottom Line
I think for enterprise use cases, AMP Code is going to make a lot of sense for a lot of companies.
For individual developers on a personal budget, the cost model is something to think carefully about.
-
Claude Code’s built-in tasks are pretty solid—they work well for what they do. But I still find myself reaching for Beads. There’s something about having persistent issue tracking that lives with your code, syncs with git, and doesn’t disappear when you close your terminal. Different tools for different jobs, I suppose.
-
Beads: Git-Native Issue Tracking for AI-Assisted Development
If you’re working with AI coding agents like Claude Code, you’ve probably noticed a friction point: context.
Every time you start a new session, you’re rebuilding mental state. What was I working on? What’s blocked? What’s next?
I’ve been using Beads, and it’s changed how I manage work across multiple AI sessions.
What Makes Beads Different?
Beads takes a fundamentally different approach. Issues live in your repo as a `.beads/issues.jsonl` file, syncing like any other code. This means:
- No context switching: Your AI agent can read and update issues without leaving the terminal
- Always in sync: Issues travel with your branch and merge with your code
- Works offline: No internet required, just git
- Branch-aware: Issues can follow your branch workflow naturally
The CLI-first design is what makes it click with AI coding agents. When I’m working with Claude Code, I can say “check what’s ready to work on” and it runs `bd ready` to find unblocked issues. No copying and pasting from a browser tab.
Getting Started
Getting up and running takes about 30 seconds:
```
# Install Beads
curl -sSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash

# Initialize in your repo
bd init

# Create your first issue
bd create --title="Try out Beads" --type=task
```
From there, the workflow is straightforward:
- `bd ready` shows issues with no blockers
- `bd update <id> --status=in_progress` to claim work
- `bd close <id>` when you’re done
- `bd sync` to commit beads changes
Why This Matters for AI Workflows
The real power shows up when you’re juggling multiple tasks across sessions. Your AI agent can:
- Pick up exactly where you left off by reading the issue state
- Track dependencies between tasks (this issue blocks that one)
- Create new issues for discovered work without breaking flow
- Close completed work and update status in real-time
I’ve found this especially useful for longer projects where I’m bouncing between features, bugs, and cleanup tasks. The AI doesn’t lose track because the state is right there in the repo.
Is It Right for You?
Beads isn’t trying to replace GitHub Issues for team collaboration or complex project management.
It’s designed for a specific workflow: developers using AI coding agents who want persistent, agent-friendly task tracking.
If you’re already working with Claude Code, Aider, or similar tools, give it a try. The setup cost is minimal, and you might find it solves a problem you didn’t realize you had.
/ Productivity / AI / Developer-tools / Git