-
SAST vs AI PR Review: Two Tools, Different Jobs
If you have worked in DevSecOps, you might be wondering if AI pull request review tools are going to replace traditional SAST scanners. Short answer: no. Longer answer: they’re solving different problems, and if you’re picking one over the other, you might be making a mistake.
Here is how I think about it.
SAST is the Compliance Gatekeeper
Static Application Security Testing tools (think Semgrep, SonarQube, Checkmarx, Fortify) parse your source code, usually into an Abstract Syntax Tree, and hunt for known vulnerability patterns. They don’t run the code. They just read it and “pattern-match” against rules.
The focus here is security, compliance, and strict rule enforcement. SAST is the automated gatekeeper that makes sure your code clears the OWASP Top 10 bar before it merges.
What SAST does well:
- It’s deterministic. If a rule matches a pattern, the engine flags it every single time. Run it twice on the same code, get the same result.
- It satisfies auditors. Frameworks like PCI-DSS, SOC 2, and HIPAA expect documented secure-development practices, and a formal SAST scanner is the easiest way to produce that evidence. AI agents don’t count here, at least not yet.
- It can do real taint analysis. Enterprise tools can track untrusted input from the moment it enters your app to the moment it hits a dangerous sink (a minimal example just below).
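For a concrete sense of what source-to-sink means, here’s a minimal sketch of the classic finding. The Flask route and table are hypothetical; the point is the unsanitized request parameter flowing into a SQL string:

```python
import sqlite3
from flask import Flask, request  # hypothetical app, for illustration only

app = Flask(__name__)

@app.route("/users")
def get_user():
    # SOURCE: untrusted input enters the application
    username = request.args.get("username")

    conn = sqlite3.connect("app.db")
    # SINK: tainted value interpolated into SQL -- a taint-tracking
    # engine flags this path as SQL injection every single time
    rows = conn.execute(
        f"SELECT * FROM users WHERE name = '{username}'"
    ).fetchall()

    # The fix it expects: a parameterized query breaks the taint path
    # rows = conn.execute(
    #     "SELECT * FROM users WHERE name = ?", (username,)
    # ).fetchall()
    return {"users": [list(r) for r in rows]}
```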
Where SAST falls down:
- The false positive rate is brutal. Rigid rules with no context mean a lot of noise. Developer fatigue is real, and once your team starts ignoring scanner output, you’ve lost the game.
- It can’t see your business logic. A SAST tool has no idea what your application is supposed to do, so it can’t tell you when the logic itself is broken.
- Comprehensive scans are slow. Hours on large codebases isn’t unusual, though Semgrep has been doing good work on this front.
AI PR Agents are the Peer Reviewer
Tools like CodeRabbit, Qodo, Greptile, GitHub Copilot Code Review, Cursor Bugbot, and Claude Code (set up as a review skill) plug into your version control and read the PR diff with the surrounding code context. They behave less like a scanner and more like a colleague who actually read your changes.
The focus is developer productivity, code quality, logic bugs, and contextual feedback.
What they do well:
- They understand intent. LLMs can reason about why the code is changing, not just whether it matches a rule. That’s a different category of feedback.
- The signal-to-noise ratio is good. When an AI flags something, it usually comes with an explanation that makes sense. Less noise, more useful comments.
- They suggest fixes. Not just “this is wrong” but “here’s a diff you can apply.” That’s huge for actually closing the loop on review feedback.
- The scope is broader. Architecture, performance, style, security, all in one pass.
Where they fall down:
- They’re non-deterministic. Same vulnerability, two PRs, two different outcomes. That’s not a bug, that’s how LLMs work, and it’s why auditors don’t trust them.
- They don’t satisfy compliance. No auditor is going to accept “the AI looked at it” as a substitute for a formal scanner.
- Hallucinations happen. Invented issues, misread intent, suggestions that refactor things that didn’t need refactoring. You still need a human filtering the output.
The Quick Comparison
| Feature | SAST | AI PR Review |
| --- | --- | --- |
| Primary Goal | Security & Compliance | Code Quality & Productivity |
| Analysis Method | Deterministic rules & AST | Non-deterministic LLMs |
| Business Logic | Blind | Context-aware |
| False Positives | Often high | Usually low |
| Compliance Proof | Accepted as evidence | Not accepted |
| Feedback Loop | Dashboard / CI output | PR comments / chat |

The Lines Are Starting to Blur
The interesting thing happening right now is convergence from both directions.
On the SAST side, tools like DryRun Security are pitching themselves as “AI-native SAST,” trying to keep the deterministic backbone while using LLMs to filter out the false positives that make traditional scanners painful to live with.
On the AI agent side, CodeRabbit and Greptile keep getting better at catching real security vulnerabilities, not just style issues. They’re slowly creeping into territory that used to belong exclusively to SAST.
This is going somewhere, but it’s not there yet.
Where to Start Your Evaluation
Treat them as complementary, not competitive.
For SAST, evaluate against your audit footprint, the languages in your codebase, and how much false-positive triage your team can absorb. Semgrep, SonarQube, Checkmarx, and Fortify all sit in different price-and-friction zones, and the right one depends on what your business actually needs to prove.
For AI PR review, evaluate based on how it fits your existing review workflow, what languages and frameworks it understands well, and the signal-to-noise ratio in practice on your codebase. CodeRabbit, Qodo, Greptile, Copilot Code Review, Bugbot, and a Claude Code review skill all approach the problem differently.
If you pick one category and skip the other, you’re either passing compliance with mediocre code review, or getting great review feedback while failing your next audit. Neither is a win.
The AI tools aren’t replacing SAST. They’re filling in the gap SAST was never designed to cover.
-
Your Data Lake's Vulnerability Problem Is Really an Identity Problem
I’ve been reading through the post-mortems on the last few years of data lake breaches, and the pattern is depressing. We keep blaming the platforms. We should be blaming ourselves.
Let me give you an example.
The Snowflake Breach Wasn’t a Snowflake Breach
In mid-2024, at least 165 organizations got hit through their Snowflake instances. AT&T lost over 50 billion call records. Ticketmaster, Santander, Advance Auto Parts. The headlines wrote themselves: Snowflake hacked.
Except Snowflake wasn’t hacked. Mandiant, CrowdStrike, and Snowflake all reached the same conclusion in their forensics. No zero-day. No flaw in the cryptographic platform. No internal compromise of Snowflake’s corporate network. No brute-force attacks against API limits.
What actually happened? UNC5537, a financially motivated group also tracked as Scattered Spider and ShinyHunters, walked through the front door with valid stolen credentials. Those credentials were harvested over years by commodity infostealer malware (VIDAR, LUMMA, REDLINE) running on the personal laptops of third-party contractors. The same laptops these contractors used for gaming and pirated software also held the keys to their clients' enterprise data lakes.
One contractor laptop. Multiple enterprise environments compromised. That’s the actual story.
79.7% of the accounts UNC5537 used had prior credential exposure. Some had been valid and un-rotated since November 2020.
The Two Doors They Walked Through
The first attack vector was the SSO side door. Plenty of victim organizations had a perfectly fine enterprise IdP enforcing strong passwords and MFA. They just forgot to make SSO mandatory. A local authentication pathway was left active alongside it. Attackers logged in directly with stolen local credentials, completely bypassing the IdP, and the MFA requirement never fired.
The second was credential stuffing against inactive, orphaned, and demo accounts belonging to former employees. Nobody audits those. Nobody enforces MFA on those. So they don’t get protected by the controls that exist on the production accounts.
Once inside, the kill chain was almost boring.
- `SHOW TABLES` to enumerate.
- `CREATE TEMPORARY STAGE` to make an ephemeral staging area that disappears when the session ends, erasing forensic evidence.
- `COPY INTO` with `GZIP` compression to keep the payload small enough that volumetric alarms didn’t trigger.
- `GET` to pull it down to a VPS in some offshore jurisdiction. Done.

No IP allowlisting was in place anywhere. The connections from Mullvad and PIA exit nodes were treated with the same trust as an employee on the corporate VPN.
The Bucket Problem Hasn’t Gone Away Either
Alongside the identity attacks, the boring stuff keeps working. Misconfigured S3 buckets are still the most reliable way to expose a data lake. In late 2024, an open bucket used as a shared network drive was found containing raw customer data, cryptographic keys, and secrets. In 2025, a US healthcare provider left millions of patient records readable for weeks before anyone noticed.
Then there’s Codefinger. In January 2025, that group used compromised AWS credentials to access S3 buckets and then weaponized AWS’s own Server-Side Encryption with Customer-Provided Keys (SSE-C) to ransomware the data in place. They didn’t even need to exfiltrate it. They just encrypted it with a key the victim didn’t have and demanded Bitcoin.
That’s a native cloud feature being turned against you because somebody granted too many permissions to a service account.
The Boring Conclusions Are the Important Ones
Identity is the perimeter now. The encryption-at-rest story we’ve been telling ourselves for a decade is irrelevant when the attacker authenticates as a real user. Stop treating SSO as optional. Stop leaving local auth paths open next to it. Enforce MFA on every account, including the demo and service accounts you forgot about.
Your data lake should not be reachable from the public internet. Route everything through PrivateLink or the equivalent in your cloud. Allowlist the IPs that should be touching analytical workloads, and don’t make exceptions for “just this one contractor.”
And as you start handing access to AI agents, remember that static roles aren’t going to cut it. Just-in-time entitlements and contextual access control are the only way you’re going to keep up with autonomous systems making queries on your behalf.
The data lake industry spent years arguing about table formats, vendor lock-in, and egress fees. Meanwhile, attackers were just collecting passwords from gaming laptops and walking in.
Fix the doors first.
Sources
- UNC5537 Targets Snowflake Customer Instances (Mandiant / Google Cloud) — Forensic analysis, kill chain, infostealer attribution
- Snowflake Data Breach: Lessons Learned (AppOmni) — SSO side door, MFA bypass mechanics
- Major AWS S3 Bucket Breach Exposes Data (NHIMG) — Codefinger SSE-C ransomware tactic
- Misconfigured Cloud Assets: How Attackers Find Them (CybelAngel) — Recent open-bucket exposure incidents
- 5 Key Lessons from the Snowflake Data Breach (Tanium) — Defensive posture summary
-
Your Data Lake Has a Permissions Problem
Consolidating every business unit’s data into one giant lakehouse sounds like a win until you realize the security model from your old data warehouse can’t scale to it. You took ten silos, each with their own access rules, and merged them into one location. Now everyone wants in, and your security team is the bottleneck.
Let me walk through three places where the cracks usually show up.
RBAC Falls Over Faster Than You Think
Role-Based Access Control is the model most teams start with. Permissions are tied to a job function. Sales reps get read access to sales tables, data engineers get write access to staging, and so on. It works fine when you have ten roles.
It does not work when you have a thousand.
Say your sales reps should only see accounts in their territory, and only accounts they personally manage. Under pure RBAC, you need a unique role for every territory-by-account-owner combination. That’s role explosion, and it’s how compliance audits become impossible and legitimate access slows to a crawl. The roles list grows faster than anyone can review it, which means stale permissions sit there forever.
The answer is Attribute-Based Access Control. Instead of asking “what role is this user in,” the system asks “what attributes does this user have, what attributes does this data have, and what’s the policy at this exact moment.” Tag a column as `PII`. Tag a schema as `HR`. Write one policy that says anyone outside the HR compliance group sees masked data when they touch a PII column. Done. That single policy replaces hundreds of bespoke roles.

This is what Unity Catalog and Starburst Galaxy are built around, and it’s the model that will scale with the data.
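If it helps to see the shape of the evaluation, here’s a rough sketch in plain Python. The attribute and group names are invented; real engines like Unity Catalog express this as tag-driven policies, but the decision logic is the same:

```python
# Hypothetical attribute-based check: one policy, no per-territory roles.
def can_see_unmasked(user: dict, column: dict) -> bool:
    # Data attribute: is this column tagged as PII?
    if "PII" not in column.get("tags", []):
        return True
    # User attribute: only the HR compliance group sees raw PII
    return "hr-compliance" in user.get("groups", [])

user = {"name": "sales-rep-042", "groups": ["sales"]}
column = {"name": "ssn", "tags": ["PII"]}

value = "123-45-6789"
print(value if can_see_unmasked(user, column) else "***-**-" + value[-4:])
# -> ***-**-6789 : masked because attributes matched the policy, not a role
```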
Column and Row Security Should Be Boring
Once you have ABAC and a real metadata catalog, column-level masking and row-level filtering become a non-event. You write a SQL expression that masks the first five digits of an SSN for lower-privileged roles. You write a row filter that silently appends `WHERE region = 'user_region'` to every executive’s `SELECT *`.

The key word is silently. The user doesn’t see a different table. They don’t have a sanitized copy. The policy is enforced at the catalog layer, so it works the same whether they’re querying through Spark, Trino, a BI dashboard, or a pipeline. One source of truth, one policy, every engine.
If you’re still maintaining separate “sanitized” copies of tables for different audiences, you’re doing it the 2015 way and you’re going to drift.
The IAM Default Problem
Most cloud services ship with default IAM roles, and a surprising number of those defaults attach `AmazonS3FullAccess` or something equally permissive. SageMaker does it. The Ray autoscaler role does it. There are more.
Picture the failure mode. An attacker compromises some peripheral app, maybe a forgotten Jupyter notebook, maybe a misconfigured Lambda. That workload has an IAM role attached because that’s how cloud workloads talk to S3 without hardcoded credentials. The attacker inherits the role. And because the role has full S3 access, they’re not constrained to the bucket the application actually uses. They can enumerate every bucket in the entire account.
That’s how a single compromised container becomes a full data lake breach. Researchers call it a bucket monopoly attack. I call it the most predictable incident in the industry.
The fix is not glamorous. Stop using `s3:*` in any policy. Write resource-scoped policies that name the exact buckets and prefixes a workload needs. Audit the default roles every cloud service hands you and replace them. Use Security Lake or Detective to flag cross-service API calls that don’t match normal patterns. None of this is fun. All of it is necessary.
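Here’s a hedged sketch of what resource-scoped means in practice, using boto3’s `put_role_policy`. The role name, bucket, and prefix are placeholders:

```python
import json
import boto3

# Hypothetical workload role and bucket; scope to the exact prefix it needs
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::acme-app-data/uploads/*",
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::acme-app-data",
            "Condition": {"StringLike": {"s3:prefix": "uploads/*"}},
        },
    ],
}

boto3.client("iam").put_role_policy(
    RoleName="acme-app-workload-role",
    PolicyName="scoped-s3-access",
    PolicyDocument=json.dumps(policy),
)
# No s3:*, no wildcard resource: a compromised pod can't enumerate the account
```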
And Then There’s the Agent Problem
The new wrinkle is that humans are no longer the primary consumers of your data. Autonomous agents are. They issue more queries, hit more tables, and move faster than any human team.
Long-lived credentials and static roles don’t fit that workload. The pattern emerging is Just-In-Time entitlements, where an agent gets a narrow, ephemeral permission for the duration of a single execution thread, then loses it. Pair that with declarative policy metadata baked into the data assets themselves, so the agent knows what it’s allowed to do with a dataset before it ever runs the query.
We’re early on this. Most organizations are still working through the basics, and that’s fine. But if you’re designing access controls today, design them assuming the next thing hitting your lake isn’t a person.
What to Actually Do
If you’re auditing your own data lake security, the order I’d work in:
- Find every IAM role with a wildcard permission. Replace them.
- Move from RBAC to ABAC at the catalog layer. Stop creating new roles.
- Pull your data lake off the public internet. PrivateLink, private endpoints, IP allowlists for the legacy stuff that can’t move.
- Then start thinking about agents.
The lakehouse pitch is unification. The lakehouse reality is that unification multiplies the cost of every bad permission. Get the basics right before you bolt on anything fancy.
Sources
- AWS Default IAM Roles Found to Enable Lateral Movement (The Hacker News) — SageMaker / Ray autoscaler default roles, bucket monopoly attacks
- What Is Fine-Grained Data Access Control? (TrustLogix) — RBAC role explosion, ABAC fundamentals
- Core concepts for ABAC (Databricks Unity Catalog docs) — Tag-driven policy enforcement
- Top 12 Data Governance Predictions for 2026 (Hyperight) — Just-in-time entitlements, declarative policy metadata
-
The Real Cost of Your Data Lake (It's Not the Storage)
If you’re sketching out a data platform on a whiteboard right now, I want you to do something. Stop calculating storage costs. They’re not the bill.
I pulled the public pricing for AWS, Azure, GCP, Databricks, and Snowflake and stacked them next to each other. Storage is the cheap part. The expensive part is everything that moves the data, and the expensive part is the part you’re least likely to model correctly when you’re picking a vendor.
Let me walk through what actually shows up on the invoice.
Raw Object Storage Is Basically Free
For hot, frequently accessed data, the big three are within a rounding error of each other:
- Azure Blob (LRS, Hot): $0.018 per GB/month
- Google Cloud Standard: $0.020 per GB/month
- AWS S3 Standard: $0.023 per GB/month (first 50 TB)
Drop into the cool tiers and AWS S3 takes the lead at $0.0125 per GB. Drop into deep archive and you’re paying $0.00099 per GB on either AWS Glacier Deep Archive or Azure Archive. That’s a tenth of a cent per gigabyte, per month, for data you almost never touch.
Good for you, but I think anyone leading with “per-GB storage cost” in a procurement deck is selling you a story. Storage capacity is roughly five percent of a typical Databricks bill. Five. The other 95% is the part nobody wants to talk about.
The Egress Trap
Ingress is free. Always. The cloud providers want your data in.
Getting it back out is where they collect.
- Azure Blob: $0.087/GB external egress
- AWS S3: $0.090/GB
- Google Cloud: $0.120/GB (but free if you stay inside Google’s ecosystem, which is the whole point of that pricing)
Then layer on API operations. A million GET requests on S3 costs about $0.40. The same million GETs on Google Cloud Storage can run closer to $5.00 because they classify operations differently. If your analytics workload is hammering small files, those API calls add up faster than the storage they’re reading.
Storing 10 TB? Maybe $200 a month. Storing 500 TB? You’re at $10,000 a month before a single byte leaves the region or a single query fires.
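The arithmetic is simple enough to sanity-check yourself. A rough sketch using the rates quoted above and hypothetical volumes:

```python
# Back-of-envelope monthly cost model, USD, using the rates quoted above
S3_STORAGE_PER_GB = 0.023   # S3 Standard, first 50 TB
S3_EGRESS_PER_GB = 0.090    # external egress
S3_GET_PER_MILLION = 0.40   # GET request pricing

def monthly_cost(stored_tb: float, egress_tb: float, gets_millions: float) -> float:
    storage = stored_tb * 1024 * S3_STORAGE_PER_GB
    egress = egress_tb * 1024 * S3_EGRESS_PER_GB
    requests = gets_millions * S3_GET_PER_MILLION
    return storage + egress + requests

# 500 TB stored, 50 TB leaving the region, 200M GETs of small-file analytics
print(f"${monthly_cost(500, 50, 200):,.2f}")
# Storage is ~$11.8k; egress alone adds ~$4.6k. Scale egress up and it wins.
```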
Databricks: Two Bills, One Headache
Databricks uses what’s commonly called a Two-Bill Model. You get one invoice from your cloud provider for the actual VMs and storage, and a separate invoice from Databricks for the software, measured in DBUs (Databricks Units).
In a typical mid-sized deployment around $18,000/month, the breakdown looks like this:
- VM compute from the cloud provider: ~55%
- Databricks DBU fees: ~30%
- Storage: ~5%
- Network egress: ~5%
The DBU rate changes based on what you’re doing. Automated jobs start at $0.15/DBU. Interactive notebooks for analysts start at $0.40/DBU. That’s not an accident. Databricks wants you running production workloads on cheap job clusters, not on the expensive all-purpose clusters your data scientists love to leave running over a weekend.
If you’re not actively pushing teams toward job clusters and ARM-based instances, you’re leaving real money on the table.
Snowflake: The Hidden Storage Multiplier
Snowflake’s pricing pitch sounds clean. Pass-through storage at $40/TB/month on-demand, dropping to $23/TB/month with a capacity commitment. Compute as Credits. Done.
Except it isn’t done. Snowflake stores data in immutable 16MB micro-partitions. Immutable. You can’t change them in place. Update a single row in a 1 TB table and Snowflake writes a new file and keeps the old one around.
Why keep the old one? Two features:
- Time Travel: query historical states of your data for up to 90 days
- Fail-Safe: a 7-day disaster recovery window you cannot turn off
This is the part that gets people. A 1 TB table that’s getting updated multiple times a day can balloon to 25 TB of billed storage because Snowflake is retaining every prior version of every micro-partition you’ve touched. Your dashboard says “1 TB table.” Your invoice says otherwise.
And compute? Virtual Warehouses bill per second, but with a 60-second minimum every single time you resume or resize. Aggressive auto-suspend sounds like a cost optimization. It’s not. If you’re spinning a warehouse up and down every 30 seconds, you’re paying the 60-second minimum every time and quietly multiplying your bill.
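The arithmetic on that minimum is worth seeing once. A rough sketch, with hypothetical burst numbers:

```python
# Hypothetical: an X-Small warehouse (1 credit/hour) resumed for
# 30-second bursts, 120 times a day
MINIMUM_BILLED_SECONDS = 60   # charged on every resume, even for shorter work

burst_seconds = 30
bursts_per_day = 120

actual = burst_seconds * bursts_per_day            # 3,600s of real work
billed = MINIMUM_BILLED_SECONDS * bursts_per_day   # 7,200s on the invoice

print(f"used {actual / 3600:.1f} credit-hours, billed {billed / 3600:.1f}")
# -> used 1.0 credit-hours, billed 2.0 -- auto-suspend just doubled the bill
```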
What I’d Actually Do
A few things I’d put on the wall before signing anything:
- Model egress, not storage. Run your worst-case query pattern through the calculator. Storage is noise.
- Lifecycle everything. Cool tier and archive pricing are 10x to 100x cheaper. If your data is older than 90 days and nobody’s queried it, it shouldn’t be in hot storage.
- For Databricks: push every recurring workload to job compute. Audit interactive cluster usage monthly.
- For Snowflake: if you have high-frequency update patterns, profile your actual storage footprint, not your logical table size. The gap will surprise you.
- For multi-cloud: don’t. Egress will eat the savings before you finish the architecture diagram.
The vendors all have a story about why their model is the cheap one. Read past the per-GB number on the slide. The bill is somewhere else.
Happy modeling.
Sources
- Databricks Pricing Explained (Dawiso) — Two-Bill Model, DBU breakdown
- Snowflake Pricing Explained (SELECT.dev) — Time Travel storage multiplier, micro-partition behavior
- Cloud & AI Storage Pricing Comparison 2026 (Finout) — AWS / Azure / GCP per-GB and tier pricing
- S3 vs GCS vs Azure Blob Storage (ai-infra-link) — Egress and API operation pricing
- Snowflake Pricing in 2026 (CloudZero) — Virtual Warehouse 60-second minimum behavior
-
AI Code Reviewers Won't Save You
Dropping an AI reviewer into your pull request pipeline is just a band-aid. Tools like CodeRabbit or Greptile are great for catching syntax errors or basic anti-patterns, but they can’t assess architectural intent or domain-specific business logic. They’re spell-checkers for code. Useful, sure. But nobody ever said “our codebase is solid because we run spell check.”
AI doesn’t change your engineering baseline. It just accelerates it. If your foundational guardrails are weak, agentic tools will help your team generate technical debt at unprecedented speeds. So the real question isn’t “how do we review AI code?” It’s “how do we build systems that prevent slop from ever reaching production?”
Shift Left, Hard
When engineers use agents to scaffold a new Go service or spin up a SvelteKit frontend, they’re inevitably pulling in generated dependencies or utilizing unfamiliar libraries. Models hallucinate packages. They suggest insecure patterns with total confidence.
Your CI pipeline needs to be ruthless before a human ever looks at the code. Aggressive SAST and SCA should automatically block PRs that introduce vulnerable dependencies or hardcoded secrets. If the agent generates slop, the pipeline rejects it instantly. No discussion.
Make the Agents Write the Tests
Agents are incredibly eager to generate feature code, but humans are historically lazy about writing the tests for it. The influx of AI-generated code means human reviewers can’t possibly step through every logic branch manually.
So flip the script. Use the agentic tools to build the guardrails themselves. Mandate that any generated feature code must be accompanied by generated, human-verified unit tests. If an agent writes a sprawling TypeScript function, the build should fail if the test coverage doesn’t meet a strict threshold. You’re already using AI to write the code. Use it to prove the code works, too.
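As one hedged sketch of such a gate, assuming pytest with the pytest-cov plugin and a hypothetical `myapp` package:

```python
# ci_coverage_gate.py -- hypothetical CI step; assumes pytest + pytest-cov
import sys
import pytest

# --cov-fail-under makes the run exit non-zero below the threshold,
# which fails the pipeline before a human ever reviews the PR
exit_code = pytest.main(["--cov=myapp", "--cov-fail-under=90", "tests/"])
sys.exit(int(exit_code))
```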
Context Boundaries Matter
Bloated AI output often happens because the model is given too much context or allowed to generate too much at once. Heavyweight IDEs with aggressive multi-file auto-completion can easily create cascading messes across a codebase.
Define strict architectural boundaries and API contracts upfront. Agents should be tasked with solving small, well-defined, modular problems. “Write a function that parses this specific JSON schema” is a good prompt. “Build the backend” is not. The tighter the scope, the less room for generated nonsense.
Observability Is Your Safety Net
You can’t catch all generated slop at the PR level. Some of it only reveals itself under load. An agent might write a technically correct query that causes an N+1 database issue, or introduce a subtle memory leak that passes all unit tests.
Your ultimate safety net is what happens at runtime. You need an airtight observability stack to trust the velocity AI brings. Logs, distributed tracing, metrics, all feeding into dashboards your team actually watches. When generated code hits staging, you need the immediate telemetry to spot performance regressions before they reach production.
Redefine the Human Review
Because AI makes the “typing” part of coding trivial, the human code review needs to fundamentally shift. Reviewers should no longer be looking for missing semicolons. They should be asking: “Does this component fit our architecture?” and “Did the agent over-engineer this solution?”
Train your senior engineers to review for intent and systemic impact. That’s the stuff AI genuinely can’t do yet. Leave the syntax checking to the robots.
-
What Is a Runbook and Why Should You Care?
If you’ve ever been woken up at 3 AM by a pager and stared at your screen trying to remember how the database failover works, you already know why runbooks matter. You just might not have had one yet.
A runbook is a step-by-step guide for handling a specific operational scenario. Database goes down? There’s a runbook for that. Failed deployment needs a rollback? Runbook. Routine certificate rotation? You get the idea. They range from simple markdown files to fully automated scripts where a human only needs to click “approve.”
That’s the idea, anyway. The impact of having good runbooks versus not having them can be massive.
Why They Matter
When something breaks in production, your brain is not at its best. Adrenaline kicks in, Slack is blowing up, and suddenly you can’t remember if you’re supposed to restart the service first or check the connection pool. A runbook takes the thinking out of the equation. You follow the steps. You restore the service. You go back to sleep.
This directly lowers your Mean Time To Recovery (MTTR). Instead of spending twenty minutes in a group call debating what to try next, you open the runbook and start executing.
Runbooks also solve the consistency problem. If five different engineers respond to the same alert five different ways, you’re rolling the dice every time. One of those approaches might cause a secondary outage. A runbook ensures everyone follows the same diagnostic and remediation path, which means fewer surprises.
And then there’s the tribal knowledge issue. Every team has that one senior engineer who knows exactly how to fix the weird thing that happens once a quarter. What happens when they’re on vacation? Or they leave the company? A runbook gets that knowledge out of their head and into a document the whole team can use.
It also makes onboarding way faster. New engineers can start handling on-call rotations with confidence instead of hoping nothing breaks on their watch.
Treat Them Like Code
This is the part a lot of teams get wrong. Runbooks shouldn’t live in a random Confluence page that hasn’t been updated since 2023. They should live in version control. Sometimes that means the same repo as the code, sometimes a separate one; where to put them is the team’s call.
If a developer changes how a service authenticates or connects to a database, the associated runbook needs to be updated in the same pull request. An outdated runbook is worse than no runbook at all. It sends engineers down the wrong path during an outage, which burns time and trust.
Share Early, Share Often
A runbook sitting in someone’s private folder is doing exactly nothing for your team.
Start during the draft phase. Have someone who didn’t write the runbook try to follow it. If they get confused or stuck, the runbook needs work. This is the cheapest way to find gaps.
When a new service is heading to production, the runbook should be part of the readiness review. I’d argue a service shouldn’t go live without one. And after an incident, if the runbook was wrong or didn’t exist, creating or fixing it should be a mandatory action item from the post-mortem.
One more thing. Practice them. Run game days where the team actually walks through runbooks before a real emergency happens. The worst time to discover your runbook has a missing step is when production is on fire.
So Here We Are
Runbooks aren’t glamorous. Nobody’s giving a conference talk about the beautiful runbook they wrote last quarter. But they’re the difference between a calm, methodical incident response and a panicked Slack thread full of guesses. Write them, version them, share them, and practice them. Your future self will thank you.
-
What Temporal Actually Does (And Why You'd Want It)
Building a multi-step process across microservices usually goes something like this. You wire up a message queue, add retry logic, build a state machine backed by a Postgres `status` column, throw in some cron jobs, and pray. It sounds complicated because it is.

Temporal is an open-source “durable execution” system that replaces all of that duct tape with a single, opinionated framework. Let’s break it down.
Workflows and Activities
Temporal splits your application into two concepts:
- Workflows are your business logic, written in standard code (Go, Python, TypeScript) using a Temporal SDK. They must be deterministic. They define the order of operations, branching, loops, and error handling.
- Activities are the actual tasks your services perform. HTTP requests, database writes, external API calls. Activities are where the non-deterministic, real-world work happens.
When a workflow runs, it executes on your own worker services. Every time it schedules an activity, starts a timer, or completes a step, the Temporal Server records that event internally. If the worker crashes, another worker picks it up, replays the workflow’s event history to the exact point of failure, and resumes. No data loss. No half-finished state.
Everything you would otherwise have to build yourself, simplified into a framework that handles it for you.
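Here’s roughly what that shape looks like with Temporal’s Python SDK. The workflow and activity names are hypothetical; the decorators, retry policy, and durable sleep are drawn from the SDK’s documented API, but treat this as a sketch rather than production code:

```python
from datetime import timedelta
from temporalio import activity, workflow
from temporalio.common import RetryPolicy

@activity.defn
async def charge_card(order_id: str) -> str:
    # Non-deterministic, real-world work lives in activities
    return f"charged:{order_id}"

@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self, order_id: str) -> str:
        # Deterministic orchestration: retries and timeouts are config,
        # not hand-written try/except loops
        receipt = await workflow.execute_activity(
            charge_card,
            order_id,
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(maximum_attempts=5),
        )
        # Durable sleep: the worker is freed, the workflow resumes in 30 days
        await workflow.sleep(timedelta(days=30))
        return receipt
```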
What It Replaces
Without something like Temporal, teams generally land in one of two camps:
- Choreography (event-driven): Services emit and listen to events through a message broker like Kafka or RabbitMQ. Highly decoupled, sure. But in practice it turns into a pinball machine. There’s no single place to understand the flow of a business transaction. Debugging becomes detective work across dozens of services and topics.
- Ad-hoc orchestration: You build a custom state machine with a database, message queues, background workers, and cron jobs. Then you write a ton of boilerplate for retries, dead-letter queues, and idempotency. Every team ends up building a slightly different version of this, and none of them are great.
Temporal gives you the reliability of a custom state machine without making you build and maintain one.
Why It’s Worth Looking At
A few things stand out:
- Durable sleep. A workflow can execute `sleep(30_DAYS)`. Temporal suspends the execution, frees the worker’s resources, and wakes it back up a month later exactly where it left off. Hard to do with a cron job.
- Built-in resiliency. Exponential backoffs, timeouts, and retry policies are configured on the activity invocation. You’re not writing custom `while` loops and `try/catch` blocks to handle network jitter.
- Centralized observability. Instead of piecing together distributed traces or searching through logs to figure out why step 4 of 7 failed, the Temporal UI shows the exact execution state of every workflow. Inputs, outputs, errors, all in one place.
- Code over configuration. Unlike AWS Step Functions or YAML-heavy tools like Airflow, you write workflows in a real programming language. You can unit test them, store them in version control, and run them through your normal CI/CD pipeline.
That last point is worth reading and thinking through again. If your orchestration logic lives in code, it gets all the benefits code gets. Reviews, tests, refactoring, IDE support. Visual workflow builders look great in demos, but they don’t scale the way code does.
Should You Use It?
Temporal isn’t free in terms of operational complexity. You’re running the Temporal Server (or paying for Temporal Cloud), and your team needs to understand the replay model and determinism constraints. It’s not something you bolt on to a simple CRUD app.
But if you’re managing distributed transactions with queues, cron jobs, and hand-rolled state machines, Temporal is worth a serious look. It takes the hardest parts of that problem and makes them someone else’s. Durability, retries, observability. All handled.
-
pgvector vs Pinecone: You Probably Don't Need a Separate Vector Database
Every time someone starts building a RAG pipeline, the same question comes up: do I need a “real” vector database like Pinecone, or can I just use pgvector with the Postgres I already have?
I can imagine teams agonizing over this decision for weeks. So maybe this will save you some time?
The Case for Staying Put
If you already have a PostgreSQL instance in your stack, adding `pgvector` is almost always the right first move.

You manage one stateful service instead of two. Your existing backup strategy, monitoring, and security all stay the same. Your vector embeddings live next to your metadata, so you get ACID compliance and standard SQL joins. No syncing between two data stores. No eventual consistency headaches.
Performance? From what I found, for datasets under a few million vectors, `pgvector` with HNSW indexes is fast. Really fast. It satisfies the latency requirements of most applications without breaking a sweat.

And you’re not paying for another SaaS subscription…
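Here’s a minimal sketch of that setup, assuming psycopg 3 and the pgvector Python adapter; the table, column, and dimensions are placeholders:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector  # pgvector's psycopg 3 adapter

conn = psycopg.connect("dbname=app", autocommit=True)
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)

conn.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        body text,
        embedding vector(1536)
    )
""")
# The HNSW index is the part that keeps similarity search fast at this scale
conn.execute(
    "CREATE INDEX IF NOT EXISTS documents_embedding_idx "
    "ON documents USING hnsw (embedding vector_cosine_ops)"
)

# Nearest neighbours by cosine distance, joinable like any other SQL
query_embedding = np.zeros(1536)  # stand-in for a real model output
rows = conn.execute(
    "SELECT id, body FROM documents ORDER BY embedding <=> %s LIMIT 5",
    (query_embedding,),
).fetchall()
```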
When Pinecone Actually Makes Sense
Pinecone is a purpose-built vector database designed for high-dimensional data at massive scale. It’s serverless and fully managed.
If you’re dealing with hundreds of millions or billions of vectors, a specialized engine handles memory and disk I/O for similarity searches more efficiently than Postgres can. Pinecone also gives you native namespace support, metadata filtering optimized for vector search, and live index updates that are faster than re-indexing a large Postgres table.
Those are real advantages. At a certain scale.
The Decision Is Simpler Than You Think
Stay with Postgres + pgvector if:
- You want to minimize infra sprawl and moving parts
- Your vector dataset is under 5 to 10 million records
- You rely on relational joins between vectors and other business data
- You have existing observability and DBA expertise for Postgres
Consider Pinecone if:
- Your Postgres instance needs massive, expensive vertical scaling just to keep the vector index in memory
- You don’t want to tune HNSW parameters, `mmap` settings, or vacuuming schedules for large vector tables
- You need sub-millisecond similarity search at a scale where Postgres starts to struggle
That is what I would use to make that decision.
Most teams are probably nowhere near the scale where Pinecone becomes necessary. They have a few hundred thousand vectors, maybe a million or two. Postgres handles that without flinching. Adding a separate managed vector database at that point is just adding operational complexity for no measurable benefit.
The trap is thinking you need to “plan ahead” for scale you don’t have yet. You can always migrate later if you actually hit the ceiling. Moving from pgvector to Pinecone is a well-documented path. But moving from two services back to one because you overengineered your stack? That’s a conversation nobody wants to have.
Start with what you have. Add complexity when the numbers force you to, not when a vendor’s marketing page makes you nervous.
-
How Kong Actually Works in Kubernetes
At some point with microservices in Kubernetes, basic Ingress routing stops being enough. Kong is an interesting router that I’d like to try in the future.
It’s an API Gateway built on top of NGINX and OpenResty. It operates at the infrastructure layer, managing the actual HTTP traffic flowing into your cluster. Drop it into a Kubernetes environment and it acts as an Ingress Controller. It does that job really well.
The Ingress Controller Problem
First, a quick review of what an Ingress controller is, in case you’re unfamiliar with its job in Kubernetes. An `Ingress` resource is just a set of routing rules: “send traffic for `api.example.com/v1` to the `user-service` pod.” Kubernetes doesn’t actually route traffic itself. It needs a controller to read those rules and move the packets.

The Kong Ingress Controller (KIC) runs as a pod inside your cluster. It watches the Kubernetes API server for changes to Ingress resources, Services, and Endpoints. When someone deploys a new app and creates an Ingress rule, KIC picks it up, translates the Kubernetes config into Kong’s native format, and reloads the proxy. No manual intervention.
How Traffic Actually Flows
When external traffic hits your cluster, the path looks like this:
- External Load Balancer forwards traffic to the Kong proxy pods
- Kong evaluates the incoming request against its routing table (headers, paths, hostnames)
- Plugins execute before routing, handling cross-cutting concerns at the edge instead of inside your application code
- Upstream routing sends traffic directly to Pod IPs, bypassing `kube-proxy` for better performance
That plugin step is where Kong really earns its keep. Rate limiting, API key auth, mTLS, request transformation. All of that happens at the gateway layer so your services don’t have to think about it.
CRDs Make It Actually Useful
Standard Kubernetes Ingress is pretty limited. Host-based routing, path-based routing, and that’s about it. Kong extends this with Custom Resource Definitions:
- KongPlugin lets you attach behaviors to routes or services. Deploy a manifest to enforce rate limits, require API keys, or add mTLS to a specific endpoint.
- KongConsumer manages user identities and credentials directly in Kubernetes, so you can tie routing rules or rate limits to specific clients.
This means your API gateway configuration lives right alongside your application manifests. Version controlled, reviewable, deployable through your normal CI/CD pipeline.
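To make the CRD idea concrete, here’s a hedged sketch of a `KongPlugin` enforcing a rate limit and an Ingress that opts into it via annotation. Names, limits, and hosts are placeholders; check the KIC docs for your version:

```yaml
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: rate-limit-basic
plugin: rate-limiting
config:
  minute: 60        # 60 requests per minute per client
  policy: local
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: user-service
  annotations:
    konghq.com/plugins: rate-limit-basic   # attach the plugin to this route
spec:
  ingressClassName: kong
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /v1
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
```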
Skip the Database
Kong used to require PostgreSQL or Cassandra to store its routing config. In modern Kubernetes deployments, you almost always run it in DB-less mode instead.
Why? Kubernetes already has `etcd` as its source of truth for cluster state. Running a second database just for the API gateway adds overhead and failure modes you don’t need. In DB-less mode, Kong stores its configuration entirely in memory. The Ingress Controller reads state from Kubernetes and pushes updates to the proxy dynamically.

This is one of those decisions that sounds minor but changes everything about how you operate Kong. No database backups to worry about. No schema migrations. Your gateway config is just Kubernetes manifests managed through GitOps.
Observability at the Edge
Sitting at the edge of the cluster, Kong is perfectly positioned to capture metrics, logs, and traces. With the right plugins, it exports traffic data (latency, status codes, request volumes) directly into whatever observability stack you’re running.
You get visibility across your entire microservice architecture without instrumenting every individual service.
Kong isn’t the only Ingress controller out there, but the combination of plugin architecture, DB-less mode, and CRD-based configuration makes it a solid choice if you need more than basic routing. If you’re already running Kubernetes and find yourself writing the same auth and rate-limiting logic across multiple services, moving that to the gateway layer is worth your time.
-
Agentic Development Trends: What's Changed in Early 2026
I’ve been following the agentic development space around Claude Code and similar tools and the last couple months have been interesting. Here’s what I’m seeing as we move through March and April 2026.
From Solo Agents to Coordinated Teams
The biggest shift is that more people are moving away from trying to build one agent that does everything. Instead, we’re seeing coordinated teams of specialized agents managed by an orchestrator, often running tasks in parallel. I think this is the right way to use these systems, and it’s great to see the community arriving here.
If you’re curious about the different levels of working with agentic software development, I created an agentic maturity model on GitHub that goes into more detail on this progression.
Long-Running Autonomous Workflows
Early on, agents handled what were essentially one-shot tasks. Now in 2026, agents can be configured to work for days at a time, requiring only strategic oversight at key decision points. Doesn’t that sound fun? You’re still the bottleneck, but at least now you’re a strategic bottleneck.
Graph-Based Orchestration
Frameworks like LangGraph and AutoGen are converging on graph-based state management to handle the complex logic of multi-agent workflows. I think this makes sense when you consider the branching and conditional logic of real-world tasks could map naturally to graphs.
MCP Is Everywhere
MCP (Model Context Protocol) has become the industry standard for tool integration. All the major vendors support it, and there’s no sign of that slowing down. Every week there are new MCP servers popping up for connecting agents to different services and tools.
Unified Agentic Stacks
The developer tooling is becoming more consistent. Cursor is becoming more like Claude Code, and Codex is becoming more like Claude Code. Maybe you see a pattern there… might tell you something about who’s setting the pace.
Also notable: people are experimenting with using different tools for different parts of the workflow. You might use Cursor to build the interface, Claude Code for the reasoning and main logic, and Codex for specific isolated tasks. Mix and match based on strengths.
Scheduled Agents and Routines
Claude Code recently released routines: scheduled or trigger-based automations that can run 24/7 on cloud infrastructure without needing your laptop. Microsoft is working on similar capabilities for GitHub Copilot, and Cursor had something like this a while back too.
Security Gets Serious
Two things are happening here. First, people are getting better at leveraging agents for security reviews and monitoring, tasks that previously required highly specialized InfoSec expertise. You no longer need to be a hacker to find vulnerabilities; you can let your AI try to hack you.
However, the same capabilities that harden defenses can also be used for offensive attacks. We’re seeing a major push for security-first architecture as a requirement for all new applications, specifically to defend against the rise of agentic offensive attacks. Red team and blue team are both getting AI-pilled.
FinOps: Watching the Bill
Last on the list is financial operations. Inference costs now account for over half of AI cloud spending according to recent estimates. Organizations are prioritizing frameworks that offer explicit cost monitoring and cost-per-task alerts. Getting granular about how much you’re spending to solve specific problems and optimizing at the task level. I think that’s pretty interesting and something we’ll see a lot more tooling around.
The common thread across all of these trends is maturity. We’re past the “wow, an AI wrote code” phase and into “how do we make this reliable, secure, and cost-effective at scale.” That’s a good place to be.
-
What Companies Are Actually Paying for Application Security
In the Application Security Testing (AST) market, Static Application Security Testing (SAST) and Software Composition Analysis (SCA) represent the two most critical pillars of preventative cyber defense.
So as part of that, we should talk about the thing people normally can’t or don’t talk about, and that is cost. Vendors like to hide their pricing behind “contact sales” buttons, and buyers end up negotiating based on hard-to-find information.
So here’s an unofficial look at what companies are actually paying, pulled from a Deep Research report provided by Gemini. At the very end there is a list of resources where you can learn more about these subjects. It’s also worth mentioning that there are not many viable options for the home/hobby market.
What the Market Looks Like
| Vendor | Avg Mid-Market / SMB Spend (Annual) | Avg Large Enterprise Spend (Annual) | Economic Dynamics & Negotiation Factors |
| --- | --- | --- | --- |
| Snyk | ~$47,428 | ~$222,516 | Costs scale rapidly with developer headcount. Highly susceptible to volume discounting. Total cost includes separate quoting for onboarding and services. |
| Black Duck (Coverity) | $60,000 – $120,000 (50–100 devs) | $150,000 – $300,000+ (150+ devs) | Full platform deployments (SAST + SCA) often range from $300k to $600k+. Volume discounts and custom enterprise agreements are typical. Premium support adds 20–30%. |
| Checkmarx | $35,000 – $75,000 | $100,000 – $250,000+ | Pricing is considered complex. Hidden costs include mandatory professional services, premium support, and infrastructure overhead, adding 15–35% to year-one totals. |
| Veracode | $40,000 – $80,000 | $100,000 – $250,000+ | Application-based pricing feels predictable until microservice architectures cause application counts to explode. Discounts are heavily available for SAST+DAST+SCA bundles. |
| SonarQube | $30,000 – $50,000 (up to 5M LOC) | $80,000 – $180,000 (5M – 20M+ LOC) | Highly predictable LOC model. However, self-managed deployments incur separate infrastructure and administrative overhead costs not reflected in the software license. |
| HCL AppScan | $50,000+ | $100,000 – $500,000+ | Unified platform pricing for large deployments can easily exceed $1M. Implementations often require months of setup and heavy professional service fees. |

Official Licensing Models and Published Structures
| Vendor / Platform | Primary Pricing Metric | Published Entry-Level / Standard Tier Pricing | Enterprise Pricing Status | Key Inclusions & Pricing Caveats |
| --- | --- | --- | --- | --- |
| Snyk | Per Contributing Developer | Team Tier: ~$52–$98 per developer/month ($624–$1,176/year) | Custom / Unpublished | Includes Snyk Code (SAST) and Open Source (SCA). Enterprise plans drop per-seat costs at high volume but require minimum seat counts. |
| SonarQube | Lines of Code (LOC) Analyzed | Developer Edition: ~$15,000 for 1M LOC; smaller tiers available (e.g., ~$2,500 for 100k LOC) | Annual pricing; talk to sales | Prices scale strictly by the largest branch of private projects. Enterprise Edition adds legacy languages. Advanced Security is an add-on. |
| GitHub Advanced Security | Per Active Committer | $19/user/month (Secrets) + $30/user/month (Code) = $49/user/month | Custom / add-on to Enterprise ($21/user base) | GHAS is strictly an add-on to the GitHub Enterprise plan. Tied directly to commit activity within a 30-day window. |
| Mend.io | Per Contributing Developer | AppSec Platform: up to $1,000 per developer/year | Included in upper bound limit | Includes SAST, SCA, Renovate, and AI Inventory. No limits on LOC, scans, or applications. AI Premium is an extra $300/dev. |
| Checkmarx | Custom (historically per app or node) | Team plans: ~$1,188/year base; enterprise base starts ~$6,850/year | Custom / Unpublished | Highly modular pricing based on developer count, module selection (SAST, SCA, DAST), and deployment model. |
| Veracode | Per Application or Per Scan | Basic plans start at ~$15,000/year for up to 100 applications | Custom / Unpublished | Pricing heavily depends on application count, scan frequency, and support levels. SCA alone starts around $12,000/year. |
| Black Duck (Coverity) | Per Team Member / Custom | Coverity SAST: $800–$1,500 per team member annually | Custom / Unpublished | Pricing scales with user access. Often bundled. Perpetual licenses with 18–22% annual maintenance fees exist for legacy deployments. |
| Contrast Security | Custom (GiB hour / usage) | Essential tier: $119/mo; Advanced: $359/mo; enterprise base ~$6,850/yr | Custom / Unpublished | Pricing varies by package (AST vs. Contrast One managed service) and workload throughput. |
| HCL AppScan | Per Scan / Enterprise License | SaaS: ~$313 per scan (min 5 scans); basic CodeSweep: $29.99/scan | Custom / Unpublished | Enterprise suite pricing is highly customized, often requiring significant upfront capital expenditure. |

Feature Comparison
| Feature / Capability | Snyk | Veracode | Black Duck | Checkmarx | Mend.io | GitHub (GHAS) | SonarQube | Endor Labs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Primary Strength | Developer Adoption & Speed | Enterprise Governance & Low FPs | License Compliance & Deep SAST | Unified ASPM & Repo Scanning | Automated Remediation | Native Ecosystem Integration | Code Quality & Baseline Security | Noise Reduction & Reachability |
| Reachability Analysis | Basic | No | No | No | Advanced | No | No | Full-Stack (95% reduction) |
| Automated AI Fixes | Yes (DeepCode) | Yes (Proprietary Data) | No | Yes (Limited IDE) | Yes | Yes (Copilot) | Yes (CodeFix) | Yes (Without upgrades) |
| Compilation Required | No | Yes (Binary) | Yes (Coverity) | No | No | No | No | No |
| Broad Language Support | High (14+) | Very High (100+) | High (22+) | High (35+) | Very High (200+) | Moderate | High (40) | Moderate |
| License Compliance | Moderate | Moderate | Enterprise-Grade | Moderate | Enterprise-Grade | Basic | Basic | Moderate |

Learning
-
Is There Something Better Than JSON?
Have you ever looked at a JSON file and thought, “There has to be something better than this”? I have.
JSON has served us well. It works with everything, and it’s human readable. It’s a decent default, don’t get me wrong, but the more you use it, the more painful its limitations become. So before we answer the question of whether there’s anything better, we should describe what’s actually wrong with JSON.
The Problems with JSON
First, there’s no type system. No datetimes, no real integers, no structs, no unions, no tuples. If you need types, and you almost always do, you’re on your own.
Second, JSON is simple, which sounds like a feature until you try to store anything complicated in it. You end up inventing your own schema, and the schema tooling out there (JSON Schema, etc.) gets verbose fast. Because the spec is so loose, validation can be inconsistent across implementations.
There’s more: fields can be reordered, you have to receive the entire document before you can start verifying it, and there are no comments. You can’t leave a note for the next person explaining why a config value is set a certain way. That’s a real problem for anything that lives in version control.
The Machine-Readable Alternatives
Now, there are plenty of binary serialization formats that solve some of these issues. Protobuf, Cap’n Proto, CBOR, MessagePack, BSON. They’re all interesting and have their place. But they’re machine readable, not human readable. You can’t just open one up in your editor and make sense of it. So let’s set those aside.
The question I’m more interested in is: is there something better than JSON that you can still read and edit as a text file?
It turns out there are two solid options.
Dhall
Dhall is a programmable configuration language. Think of it as JSON with all the things you wish JSON had: functions, types, and imports. You can convert JSON to Dhall and back, and it’s just a text file you can open in any editor. The name comes from a character in an old video game, and the language itself is interesting enough that it’s worth your time to explore.
CUE
CUE stands for Configure, Unify, and Execute. It’s similar to Dhall in that it fills the gaps JSON leaves behind, like types, validation, and constraints, while staying human readable. Where CUE really pulls ahead is in its feature set. You can import Protobuf definitions, generate JSON Schema, validate existing configs, and a lot more. In terms of raw capabilities, CUE has more going on than Dhall.
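To give you a taste, here’s a small hypothetical CUE config; the schema and field names are invented for illustration:

```cue
// A hypothetical server config: schema and data live side by side
#Server: {
    host: string
    port: int & >0 & <65536   // constraint, enforced by `cue vet`
    tls:  bool | *true        // default value
}

server: #Server & {
    host: "api.example.com"
    port: 8443
    // port: "8443" would fail validation: conflicting types int and string
}
```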
JSON isn’t going anywhere. But if you’re looking for something interesting to explore, check out both of these. They make great fun little side projects.
-
Multi-Repos Are Underrated
If you’re considering a monorepo, I’d like you to stop and reconsider. Monorepos cause more problems than they solve, and I think multi-repos deserve way more love than they get.
The “Shared Libraries” Argument
The pitch usually goes something like this: “If we put everything in a monorepo, we can have shared libraries across multiple applications.” Okay, sure. But let’s talk about what’s actually happening here.
This is almost always closed-source, internal code. You don’t have a public package registry to lean on. And maybe your org hasn’t approved a private package hosting service. So the monorepo becomes the path of least resistance, not because it’s the best solution, but because nobody wants to fight for the budget to host private packages.
But private package hosting for most languages doesn’t actually cost a lot. You can host private packages in GCP pretty easily, and there are several other affordable options, though it does depend somewhat on the language.
Monorepos often exist because nobody fought for the right infrastructure, not because it was the right call.
Coupling Will Eat You Alive
Probably the biggest problem with monorepos is coupling. You can very easily introduce tightly coupled dependencies across several applications. Now you can’t update your libraries safely because two completely different applications are using the same one, and nobody wants to touch it.
You know that feeling within a single application where there’s tightly coupled code without proper abstractions? Congratulations, now you have that problem across several different applications.
This is why we have packages with versions. Would you make breaking changes to an API and not version it?
Are we not engineers dedicated to a craft? Version your packages. Version your APIs.
Let the applications that pull in dependencies manage their own upgrades. If something worked on version 1.2 and breaks on 1.3, either fix your application or stay on the old version. That’s the whole point of versioning.
CI/CD Becomes a Nightmare
Monorepos make your CI/CD pipelines absolutely terrible to work on. Not only does it make things harder for everyone on the team to work with their applications day-to-day, but now your build and deploy pipelines are a tangled mess.
There are going to be undocumented parts of the monorepo tooling, like little hidden landmines waiting to kneecap you when you least expect it.
What About NX?
Yes, I’ve used NX. I don’t want to get into and re-traumatize myself, but chances are most of your team secretly hates it. I’ll use it if I’m forced to. But if it’s my decision? No-thX.
A Multi-Repo Example
From my own work: api2spec has fixture repos for Hono, Express, chi, gin, Fastify, and many more, each in a separate repository.
They test the same tool against different frameworks across many different programming languages. Putting them in a monorepo would’ve complicated things significantly. Instead, we have separate repos under the same GitHub org with a consistent naming convention. Simple, not stupid.
For the Love of All That Is Holy, Do Yourself a Favor and Stick with Multi-Repos
Multi-repos give you clear boundaries, independent versioning, simpler CI/CD, and teams that can move without stepping on each other.
Yes, the overhead of managing separate repositories is real, but it’s manageable, and with good hygiene it’s a far better path than a never-ending battle with your own tooling.
The monorepo pitch sounds great in a meeting. The reality is coupling, pipeline complexity, and a team that’s afraid to merge.
-
I switched to mise for version management a month ago. No regrets. No more `brew upgrade` breaking Python. The built-in task runner replaced Makefiles in several of my projects. Still juggling nvm + pyenv + rbenv?
/ DevOps / Programming / Tools
-
When to Use Python Over Bash
When to use python over bash is really a question of when to use bash. Python is a general-purpose language that can handle just about anything you throw at it. Bash, on the other hand, has a very specific sweet spot. Once you understand that sweet spot, the decision makes itself.
What Bash Actually Is
Bash is an interactive command interpreter and scripting language, created in 1989 for the GNU project as a free software alternative to the Bourne shell. It pulled in advanced features from the Korn shell and C shell, and it’s been commonly used by Unix and Linux systems ever since.
What makes Bash unique is its approach to data-flow programming. Files, directories, and system processes are treated as first-class objects, and Bash is designed to take advantage of utilities that almost always exist on Unix-based systems: tools like `awk`, `sed`, `grep`, `cat`, and `curl`. To write effective Bash scripts, you also need to understand the pipe operator and how I/O redirection works.

A good Bash script will look something like this:

```bash
#!/bin/bash
set -euo pipefail

LOG_DIR="/var/log/myapp"
DAYS_OLD=30

find "$LOG_DIR" -name "*.log" -mtime +"$DAYS_OLD" -print0 | xargs -0 gzip -9
echo "Compressed logs older than $DAYS_OLD days"
```

Simple, portable, does one thing well. That’s Bash at its best.
Where Bash Falls Short
Bash isn’t typed. There’s no real object orientation. Error handling is basically `set -e` and hoping for the best. There’s no `try`/`catch`, no structured exception handling. When things go wrong in a Bash script, they tend to go wrong quietly or spectacularly, with not much in between.

Python, by contrast, is object-oriented and optionally typed via type hints. If you want to manipulate a file or a system process in Python, you wrap that system entity inside a Python object. That adds some overhead, sure, but in exchange you get something that’s more predictable, more secure, and scales well from simple scripts to complex logic.
Here’s that same log compression task in Python:

```python
from pathlib import Path
import gzip
import shutil
from datetime import datetime, timedelta

log_dir = Path("/var/log/myapp")
cutoff = datetime.now() - timedelta(days=30)

for log_file in log_dir.glob("*.log"):
    if datetime.fromtimestamp(log_file.stat().st_mtime) < cutoff:
        with open(log_file, "rb") as f_in:
            with gzip.open(f"{log_file}.gz", "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)
        log_file.unlink()
```

More verbose? Absolutely. But also more explicit about what’s happening, easier to extend, and much easier to add error handling to.
The Performance Question
In some cases, performance genuinely matters. Think high-frequency trading platforms, edge devices, or massive clusters. Bash scripts excel here because there’s almost zero startup overhead. Compare that to Python, which needs to load up the interpreter before it can start executing code. You’re going from microseconds to milliseconds, and sometimes that matters.
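You can see the startup gap for yourself with a quick, unscientific measurement; the numbers vary wildly by machine, so treat this as a sketch:

```bash
# No-op in each interpreter: the difference is pure startup cost
time bash -c ':'
time python3 -c 'pass'
```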
But startup time is just one factor. When you compare the actual work being done, Python can pull ahead. String manipulation on structured data? Python wins. Parsing JSON, YAML, or any structured format? Python’s core libraries are written in C and optimized for exactly this kind of work. If you find yourself reaching for `jq` or `yq` in a Bash script, that’s a strong signal you should be using Python instead.

The Guidelines People Throw Around
You’ll see a common guideline online: if your script exceeds 100 lines of Bash, rewrite it in Python. But a lot of veterans in the industry feel like that cutoff is way too generous. Experienced engineers often put it at 50 lines, or even 25.
Another solid indicator: nested `if` statements. Some people say “deeply nested” if statements, but let’s be honest, more than one level of nesting in Bash is already getting painful. Python handles complex branching logic far more gracefully, and you’ll thank yourself when you come back to maintain the script six months later.

Unit Testing Tells the Story
You can do unit testing with Bash. BATS (Bash Automated Testing System) exists, and ShellCheck is useful as a lightweight linter for catching bad practices. But despite these tools, Python’s testing ecosystem is on another level entirely. It’s fully mature with multiple frameworks, excellent mocking capabilities, and the ability to simulate network calls, external APIs, or system binaries. Complex mocking that would be difficult or impossible in Bash is straightforward in Python.
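For reference, here’s roughly what a BATS test looks like, assuming the earlier log-compression script is saved as compress_logs.sh (a name I’m making up for this sketch):

```bash
#!/usr/bin/env bats

@test "compresses old logs and reports success" {
  run ./compress_logs.sh                  # `run` captures exit code and output
  [ "$status" -eq 0 ]                     # exit code lands in $status
  [[ "$output" == *"Compressed logs"* ]]  # stdout/stderr land in $output
}
```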
If your script needs solid testing or if it’s doing anything important, that’s a strong vote for Python.
Bash’s Biggest Win: Portability
So what does Bash actually win at? Portability. When you think about all the dependencies Python needs to run, Bash is the clear winner. You’re distributing a single `.sh` file. That’s it.

With Python, you have to ask: Does Python exist on this machine? Is it the right version? You’ll need a virtual environment so you don’t pollute the system Python. You need third-party libraries installed via a package manager, and please, friends, remember that we don’t let friends use pip. Use Poetry or uv. Pip is so bad that I’d honestly argue Bash having no package manager at all is better than Python having pip. At least Bash doesn’t pretend to manage dependencies well.
If you want something simple, something that can run on practically any Unix-based machine without setup, Bash is your answer. Even Windows can handle it these days through WSL, though you’re jumping through a few hoops.
TLDR
The decision is actually pretty straightforward:
- Use Bash when you’re gluing together system commands, the logic is linear, it’s under 50 lines, and portability matters.
- Use Python when you’re parsing structured data, need error handling, have branching logic, want proper tests, or the script is going to grow.
If you’re reaching for `jq`, writing nested `if` statements, or the script is getting long enough that you’re losing track of what it does… it’s time for Python.

In a future post we might look at when Go makes sense over Bash; there’s a lot to cover there about compiled binaries. For now, hopefully this helps you make the call the next time you’re wondering which language to start scripting in.
/ DevOps / Programming / Python / Bash / Scripting
-
Local Secrets Manager - Dotenv Encrypter
I built a thing to solve a problem. It has helped me; maybe it will help you too.
It all starts with a question.
Why isn’t there a good local secrets manager that encrypts your secrets at rest? I imagine a lot of people, like me, have a number of local applications. I don’t want to pay per-seat pricing just to keep my sensitive data from sitting in plaintext on my machine.
I built an app called LSM (Local Secrets Manager) to solve that problem. The core idea is simple: encrypt your `.env` files locally and only decrypt them when you need them (sometimes only at runtime).

The Problem
If you’ve got a bunch of projects on your machine, each has its own `.env` or `.env.local` file full of API keys you’re definitely not rotating every 90 days. Those files just sit there in plaintext. Any process on your system can read them. And with AI agents becoming part of our dev workflows, the attack surface for leaking secrets is only getting bigger.

ThE CLAW EnteRed ChaT
I started looking at Doppler specifically for OpenCLAW. Their main selling feature is injecting secrets into your runtime so they never touch the filesystem. Cool. I also like that Doppler stores everything remotely. The only catch was the cost: $10–20 a month didn’t make sense to me for this set of features right now.
So what else is there?
Well, GCP Secret Manager has its own set of issues.
You can’t have duplicate secret names within a project, so something as common as `NODE_ENV` across multiple apps becomes more work than you want to deal with. Some wrapper script that injects prefixes? No thanks. I imagine there are a thousand and one homegrown solutions to this problem. Again, no thanks.

So what else is there?
You Find A Solution
AWS Secrets Manager
A Problem for Your Solution
AWS IAM
🫣
I have a lot more to say on this subject, but I’ll save it for another post. Subscribe if you want to see the next one.
The Solution
The workflow is straightforward:
- `lsm init` — Run this once from anywhere. It generates your encryption key file.
- `lsm link <app-name>` — Run this inside your project directory. It creates a config entry in `~/.lsm/config.yaml` for that application.
- `lsm import` — Takes your existing `.env` or `.env.local` and creates an encrypted version.
- `lsm clean` — Removes the plaintext `.env` files so they’re not just sitting around.
- `lsm dump` — Recreates the `.env` files if you need them back.
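Strung together, a first run looks something like this. A sketch using only the commands above; the app name and path are made up:

```bash
lsm init          # once per machine: generates your encryption key file
cd ~/code/my-api  # hypothetical project
lsm link my-api   # registers the app in ~/.lsm/config.yaml
lsm import        # encrypts the existing .env
lsm clean         # deletes the plaintext .env
lsm dump          # ...and brings it back later if you ever need it
```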
But wait, there’s more.
Runtime Injection with `lsm exec`

Remember that cool thing I just told you about? Instead of dumping secrets back to disk, you run:

```bash
lsm exec -- pnpm dev
```

I feel like a family man from Jersey who don’t mess around. Aye, you got runtime injection? I got that.

Anyway, that’s `lsm exec`. It decrypts your secrets and injects them directly into the runtime environment of whatever command follows the `--`. Your secrets exist in memory for the duration of that process and nowhere else. No plaintext files hanging around for other processes to sniff.

Credit to Doppler for the idea. The difference is that here, your encrypted files stay local.
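If you haven’t seen this pattern before, here’s the shape of the technique in plain shell, with gpg standing in for lsm’s encryption. This is not lsm’s actual implementation, just a sketch of the idea:

```bash
set -a                                    # auto-export everything sourced below
source <(gpg --quiet --decrypt .env.gpg)  # decrypt to a stream, never to a file
set +a
pnpm dev                                  # the child process inherits the secrets
```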
What’s Next
I’ve got some possible ideas of improvements to try building.
- Separate encrypt/decrypt keys — You create secrets with one key, deploy the encrypted file to a server, and use a read-only key to decrypt at runtime. The server never has write access to your secrets.
- Time-based derivative keys — Imagine keys that expire or rotate automatically.
- Secure sharing — Right now you’d have to decrypt and drop the file into a password manager to share it. There’s room to make that smoother.
I’m not sure how to do all of that yet, but we’re making progress.
Why Not Just Use Doppler?
There are genuinely compelling reasons to use Doppler or similar services. Besides the remote storage, there are access controls and audit logs. There’s a lot to love.
For local development across a bunch of personal projects? I don’t think you should need a SaaS subscription to keep your secrets encrypted.
LSM is still early, but the core workflow is there and it works.
Give it a try if you’re tired of plaintext `.env` files scattered across your machine.
/ DevOps / Programming / Tools / security
-
Doppler | Centralized cloud-based secrets management platform
Doppler’s secrets management platform helps teams secure, sync, and automate their secrets across environments and infrastructure. Experience enhanced security, agility, and automation with our cloud platform.
/ DevOps / links / platform / security / cloud security
-
Claude Code Now Has Two Different Security Review Tools
If you’re using Claude Code, you might have noticed that Anthropic has been quietly building out security tooling. There are now two distinct features worth knowing about. They sound similar but do very different things, so let’s break it down.
The /security-review Command
Back in August 2025, Anthropic added a `/security-review` slash command to Claude Code. This one is focused on reviewing your current changes. Think of it as a security-aware code reviewer for your pull requests. It looks at what you’ve modified and flags potential security issues before you merge.

It’s useful, but it’s scoped to your diff. It’s not going to crawl through your entire codebase looking for problems that have been sitting there for months.
The New Repository-Wide Security Scanner
Near the end of February 2026, Anthropic announced something more ambitious: a web-based tool that scans your entire repository and operates more like a security researcher than a linter. This is the thing that will help you identify and fix security issues across your entire codebase.
To understand why that matters, we first need to look at what already exists.
SAST tools — Static Application Security Testing. SAST tools analyze your source code without executing it, looking for known vulnerability patterns. They’re great at catching things like SQL injection, hardcoded credentials, or buffer overflows based on pattern matching rules.
If a vulnerability doesn’t match a known pattern, it slips through. SAST tools also tend to generate a lot of false positives, which means teams start ignoring the results.
What Anthropic built is different. Instead of pattern matching, it uses Claude to actually reason about your code the way a security researcher would. It can understand context, follow data flows across files, and identify logical vulnerabilities that a rule-based scanner would never catch. Think things like:
- Authentication bypass through unexpected code paths
- Authorization logic that works in most cases but fails at edge cases
- Business logic flaws that technically “work” but create security holes
- Race conditions that only appear under specific timing
These are the kinds of issues that usually require a human security expert to find… or a real attacker.
SAST tools aren’t going away, and you should still use them. They’re fast, they catch the common stuff, and they integrate easily into CI/CD pipelines.
Also, the new repository-wide security scanner isn’t out yet, so stick with what you’ve got until it’s ready.
/ DevOps / AI / Claude-code / security
-
You move fast. Cloud development cycles do not.
Mixing and matching optimization strategies won’t fix your slow development loop. LocalStack streamlines your feedback loop, bringing the cloud directly to your laptop. Same production behavior. Faster feedback. Fully under your control.
/ DevOps / Development / links / localcloud / docker
-
Stop Using pip. Seriously.
If you’re writing Python in 2026, I need you to pretend that pip doesn’t exist. Use Poetry or uv instead.
Hopefully you’ve read my previous post on why testing matters. If you haven’t, go read that first. Back? Hopefully you are convinced.
If you’re writing Python, you should be writing tests, and you can’t do that properly with pip. It’s an unfortunate but true state of Python right now.
In order to write tests, you need dependencies, which is how we get to the root of the issue.
The Lock File Problem
The closest thing pip has to a lock file is `pip freeze > requirements.txt`, and it just doesn’t cut the mustard. It’s a flat list of pinned versions.

A proper lock file captures the resolution graph, the full picture of how your dependencies relate to each other. It distinguishes between direct dependencies (the packages you asked for) and transitive dependencies (the packages they pulled in). A `requirements.txt` doesn’t do any of that.

“Okay, so?” you might be asking yourself.
It means you can’t guarantee that running `pip install -r requirements.txt` six months, or six minutes, from now will give you the same copy of all your dependencies.

It’s not repeatable. It’s not deterministic. It’s not reliable.

The one constant in code is that it changes. Without a lock file, you’re rolling the dice every time.
Everyone Else Figured This Out
Every other modern language ecosystem solved this problem years ago:

- JavaScript has `package-lock.json` (npm) and `pnpm-lock.yaml` (pnpm)
- Rust has `Cargo.lock`
- Go has `go.sum`
- Ruby has `Gemfile.lock`
- PHP has `composer.lock`
Python’s built-in package manager just… doesn’t have this.
That’s a real problem when you’re trying to build reproducible environments, run tests in CI, or deploy with any confidence that what you tested locally is what’s running in production.
What to Use Instead
Both Poetry and uv solve the lock file problem and give you reproducible environments. They’re more alike than different — here’s what they share:
- Lock files with full dependency resolution graphs
- Separation of dev and production dependencies
- Virtual environment management
- `pyproject.toml` as the single config file
- Package building and publishing to PyPI
Poetry is the more established option. It’s at version 2.3 (released January 2026), supports Python 3.10–3.14, and has been the go-to alternative to pip for years. It’s stable, well-documented, and has a large ecosystem of plugins.
uv is the newer option from Astral (the team behind Ruff). It’s written in Rust and is 10–100x faster than pip at dependency resolution. It can also manage Python versions directly, similar to mise or pyenv. It’s currently at version 0.10, so it hasn’t hit 1.0 yet, but it’s gaining adoption fast.
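To make that concrete, here’s roughly what the day-to-day looks like with uv. A sketch only; uv is pre-1.0, so double-check the docs if a flag has moved:

```bash
uv init myproject && cd myproject
uv add requests       # resolves the full graph and writes uv.lock
uv add --dev pytest   # dev dependencies stay separate from production ones
uv sync               # recreate the exact locked environment anywhere
uv run pytest         # run inside the managed virtual environment
```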
You can’t go wrong with either. Pick one, use it, and stop using pip.
/ DevOps / Programming / Python
-
Serverless and Edge Computing: A Practical Guide
Serverless and edge computing have transformed how we deploy and scale web applications. Instead of managing servers, you write functions that automatically scale from zero to millions of users.
Edge computing takes this further by running code geographically close to users for minimal latency. Let’s break down how these technologies work and when you’d actually want to use them.
What is Serverless?
Serverless doesn’t mean “no servers”; it means you don’t manage them. The provider handles infrastructure, scaling, and maintenance. You just write “functions”.

The functions are stateless and auto-scaling, and you only pay for execution time. So what’s the tradeoff? There are several, but the first is cold starts: the first request after idle time is slower because the container needs to spin up.
Serverless platforms are also sticky; vendor lock-in makes it hard to move to another provider later.

Functions are stateless, meaning each invocation is independent and retains nothing between runs.

In some cases they don’t run standard Node, so they behave differently in production than when you’re building locally, which complicates development and testing.

And each request is handled by a NEW function invocation, so if your site gets a lot of traffic and makes a lot of requests, expect expensive hosting bills, or you playing the pauper on social media.
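You can feel the cold-start tradeoff yourself with a crude probe against any idle function endpoint (the URL below is a placeholder):

```bash
# First hit after idle pays the cold start...
time curl -s -o /dev/null https://my-function.example.com/api/hello

# ...the immediate second hit lands on a warm instance
time curl -s -o /dev/null https://my-function.example.com/api/hello
```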
Traditional vs. Serverless vs. Edge
Think of it this way:
- Traditional servers are always running, always costing you money, and you handle all the scaling yourself. Great for predictable, high-traffic workloads. Lots of different options for hosting and scaling.
- Serverless (AWS Lambda, Vercel Functions, GCP Functions) spins up containers on demand and kills them when idle. Auto-scales from zero to infinity. Cold starts around 100-500ms.
- Edge (Cloudflare Workers, Vercel Edge) uses V8 Isolates instead of containers, running your code in 200+ locations worldwide. Cold starts under 1ms.
Cost Projections
Here’s how the costs break down at different scales:
| Requests / Month | ~RPS (Avg) | AWS Lambda | Cloudflare Workers | VPS / K8s Cluster | Winner |
| --- | --- | --- | --- | --- | --- |
| 1 Million | 0.4 | $0.00 (Free Tier) | $0.00 (Free Tier) | $40–$60 (Min HA Setup) | Serverless |
| 10 Million | 4.0 | ~$12 | ~$5 | $40–$100 | Serverless |
| 100 Million | 40 | ~$120 | ~$35 | $80–$150 | Tie / Workers |
| 500 Million | 200 | ~$600 | ~$155 | $150–$300 | VPS / Workers |
| 1 Billion | 400 | ~$1,200+ | ~$305 | $200–$400 | VPS / EC2 |

The Hub and Spoke Pattern
Also called the citadel pattern, this is where serverless and traditional infrastructure stop competing and start complementing each other. The idea is simple: keep a central hub (your main application running on containers or a VPS) and offload specific tasks to serverless “spokes” at the edge.
Your core API, database connections, and stateful logic stay on traditional infrastructure where they belong. But image resizing, auth token validation, A/B testing, geo-routing and rate limiting all move to edge functions that run close to the user.
When to Use Serverless
- Unpredictable or spiky traffic — APIs that go from 0 to 10,000 requests in minutes (webhooks, event-driven workflows)
- Lightweight, stateless tasks — image processing, PDF generation, sending emails, data transformation
- Low-traffic side projects — anything that sits idle most of the time where you don’t want to pay for an always-on server… and you don’t know how to set up a Coolify server.
- Edge logic — geolocation routing, header manipulation, request validation before it hits your origin
When to Use Containers / VPS
- Sustained high traffic — once you’re consistently above ~100M requests/month, a VPS is cheaper (see the table above)
- Stateful workloads — WebSocket connections, long-running processes, anything that needs to hold state between requests
- Database-heavy applications — connection pooling and persistent connections don’t play well with serverless cold starts
- Complex applications — monoliths or microservices that need shared memory, background workers, or cron jobs
The Hybrid Approach
The best architectures often use both. What’s right depends on your specific use case and requirements: your team, your budget, and the complexity of your application.

Knowing these tradeoffs is the difference between a seasoned developer and a junior. Make your decisions based on your actual needs and constraints.
Good luck and godspeed!
/ DevOps / Development / Serverless / Cloud
-
Switching to mise for Local Dev Tool Management
I’ve been making some changes to how I configure my local development environment, and I wanted to share what I’ve decided on.
Let me introduce you to mise (pronounced “meez”), a tool for managing your programming language versions.
Why Not Just Use Homebrew?
Homebrew is great for installing most things, but I don’t like using it for programming language version management. It’s too brittle. How many times has `brew upgrade` decided to switch your Python or Node version on you, breaking projects in the process? Too many, in my experience.

mise solves this elegantly. It doesn’t replace Homebrew entirely, you’ll still use that for general stuff, but for managing your programming language versions, mise is the perfect tool.
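The day-to-day is a couple of commands (versions here are just examples):

```bash
# Pin versions for the current project; mise writes them to ./mise.toml
mise use python@3.12 node@22

# Or set a global default, stored in ~/.config/mise/config.toml
mise use -g go@latest
```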
mise the Great, mise the Mighty
mise has all the features you’d expect from a version manager, plus some nice extras:
Shims support: If you want shims in your bash or zsh, mise has you covered. You’ll need to update your RC file to get them working, but once you do, you’re off to the races.
Per-project configuration: mise can work at the application directory level. You set up a `mise.toml` file that defines mise’s behavior for that specific project.

Environment management: You can set environment variables directly in the toml file, auto-configure your package manager, and even have it auto-create a virtual environment.
It can also load environment variables from a separate file if you’d rather not put them in the toml (which you probably want if you’re checking the file in).
It’s not a package manager: This is important. You still need poetry or uv for Python package management. As a reminder: don’t ever use pip. Just don’t.
A Quick Example
Here’s what a `.mise.toml` file looks like for a Python project:

```toml
[tools]
python = "3.12.1"
"aqua:astral-sh/uv" = "latest"

[env]
# uv respects this for venv location
UV_PROJECT_ENVIRONMENT = ".venv"
_.python.venv = { path = ".venv", create = true }
```

Pretty clean, right? This tells mise to use Python 3.12.1, install the latest version of uv, and automatically create a virtual environment in `.venv`.

Note on Poetry Support
I had to have mise build Python from source to get Poetry working; there’s some problem with the precompiled binaries it uses by default. You’ll want to leave that compile-from-source setting enabled.
You can install global Python packages, like Poetry, with the following command:

```bash
mise use --global poetry@latest
```

Yes, It’s Written in Rust
The programming veterans among you may have noticed the toml configuration format and thought, “Ah, must be a Rust project.” And you’d be right. mise is written in Rust, which means it’s fast! The project is stable, has a ton of GitHub stars, and is actively maintained.
Task Runner Built-In
One feature I wasn’t expecting: mise has a built-in task runner. You can define tasks right in your `mise.toml`:

```toml
[tasks."venv:info"]
description = "Show Poetry virtualenv info"
run = "poetry env info"

[tasks.test]
description = "Run tests"
run = "poetry run pytest"
```

Then run them with `mise run test` or `mise r venv:info`.

If you’ve been putting off setting up Make for a project, this is a compelling alternative. The syntax is cleaner and you get descriptions for free.
I’ll probably keep using Just for more complex build and release workflows, but for simple project tasks, mise handles it nicely. One less tool to install.
My Experience So Far
I literally just switched everything over today, and it was a smooth process. Nothing major has broken so far. I’ll report back if anything does, but the migration from my previous setup was straightforward.
Next, I need to get the other languages I use, like Go, Rust, and PHP, set up and moved over to mise. Having everything consolidated into one tool is going to be so nice.
If you’re tired of Homebrew breaking your language versions or juggling multiple version managers for different languages, give mise a try.
The documentation is solid, and the learning curve is minimal.
/ DevOps / Tools / Development / Python
-
Two Changes in Claude Code That Actually Matter
As of 2026-01-24, the stable release of Claude Code is 2.1.7, but if you’ve been following the bleeding edge, versions 2.1.15 and 2.1.16 bring some significant changes. Here’s what you need to know.
The npm Deprecation Notice
Version 2.1.15 added a deprecation notification for npm installations.
If you’ve been using Claude Code via npm or Homebrew, Anthropic will soon start nudging you toward a new installation method. You’ll want to run `claude install` or check out the official getting-started docs for the recommended approach.

This isn’t a breaking change yet, but it’s a clear signal they’re moving away from npm for releases going forward.
Built-in Task Management
Version 2.1.16 introduces something I’m genuinely excited about: a new task management system with dependency tracking.
If you’ve been using tools like beads for lightweight issue tracking within your coding sessions, this built-in system offers a similar workflow without the setup.
You can define tasks, track their status, and—here’s the key part—specify dependencies between them. Task B won’t start until Task A completes.
This is particularly useful for repositories where you don’t have beads configured or you’re working on something quick where setting up external tooling feels like overkill.
Your sub-agents can now have proper task management without anything extra.
Should You Update?
If you’re on 2.1.7 stable and everything’s working, there’s no rush. But if you’re comfortable with newer releases, the task management in 2.1.16 is worth trying, especially if you work with complex multi-step workflows or use sub-agents frequently.
The npm deprecation is something to keep on your radar regardless. Plan your migration before it becomes mandatory.
/ DevOps / Ai-tools / Claude-code
-
Security and Reliability in AI-Assisted Development
You may not realize it, but AI code generation is fundamentally non-deterministic. It’s probabilistic at its core: it predicts code rather than computing it.
And while there’s a lot of orchestration happening between the raw model output and what actually lands in your editor, you can still get wildly different results depending on how you use the tools.
This matters more than most people realize.
Garbage In, Garbage Out (Still True)
The old programming adage applies here with renewed importance. You need to be explicit with these tools. Adding predictability into how you build is crucial.
Some interesting patterns:
- Specialized agents set up for specific tasks
- Skills and templates for common operations
- Orchestrator conversations that plan but don’t implement directly
- Multiple conversation threads working on the same codebase via Git worktrees
The more structure you provide, the more consistent your output becomes.
The Security Problem
This topic doesn’t get talked about enough. All of our common bugs have snuck into the training data. SQL injection patterns, XSS vulnerabilities, insecure defaults… they’re all in there.
The model can’t always be relied upon to build it correctly the first time. Then there’s the question of trust.
Do you trust your LLM provider?
Is their primary focus on quality and reliable, consistent output? What guardrails exist before the code reaches you? Is the model specialized for coding, or is it a general-purpose model that happens to write code?
These are important engineering questions.
Deterministic Wrappers Around Probabilistic Cores
The more we can put deterministic wrappers around these probabilistic cores, the more consistent the output will be.
So, what does this look like in practice?
Testing is no longer optional. We used to joke that we’d get to testing when we had time. That’s not how it works anymore. Testing is required because it provides feedback to the models. It’s your mechanism for catching problems before they compound.
Testing is your last line of defense against garbage sneaking into the system.
AI-assisted review is essential. The amount of code you can now create has increased dramatically. You need better tools to help you understand all that code. The review step, typically done during a pull request, is now crucial for product development. Not optional. Crucial.
The model needs to review its own output, or you need a separate review process that catches what the generation step missed.
The Takeaway
We’re at an interesting point in time. These tools can dramatically increase your output, but we should only trust the result when we’ve built the right guardrails around them.
Structure your prompts. Test everything. Review systematically. Trust but verify.
The developers who figure out how to add predictability to unpredictable processes are the ones who will be shipping features instead of shitting out code.
/ DevOps / AI / Programming
-
Gitea - Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD
/ DevOps / Development / links / platform / self-hosted / code