Everything Is Eventually a Database Problem
I think there’s a saying that goes something like “code is ephemeral, but data is forever.” That’s never been more true than right now. Code is easier than ever to create: anyone can spin up a working app with an AI agent and minimal experience. But your data structure? That’s the thing that sticks around and haunts you.
Data modeling is one of those topics that doesn’t get enough attention, especially given how critical it is. You need to understand how your data is stored, how to structure it, and what tradeoffs you’re making in how you access it. Get it right early, and your code stays elegant and straightforward. Get it wrong, and your codebase becomes an endless series of workarounds.
Microservices Won’t Save You
For teams moving from a monolith to microservices, if the data stays tightly coupled, you don’t really have microservices; you have a distributed monolith with extra network hops.
Yes, data can be coupled just like code can be coupled. If all your different services are still hitting the same database with the same schema, you have a problem. You need separate data structures for your services, not a monolithic architecture hiding behind a microservices facade.
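To make "separate data structures" a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `CatalogService` and `OrderService` names, the table layouts, and the use of in-memory SQLite as a stand-in for two independently owned databases. The point it tries to show is that the order service never reads the catalog's tables; it asks the catalog through a method (which would be a network call in a real system) and keeps its own copy of the one value it needs.

```python
import sqlite3


class CatalogService:
    """Owns the product data. No other service touches this database."""

    def __init__(self) -> None:
        # Each ":memory:" connection is its own private database.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE products (sku TEXT PRIMARY KEY, price_cents INTEGER NOT NULL)"
        )

    def add_product(self, sku: str, price_cents: int) -> None:
        self._db.execute("INSERT INTO products VALUES (?, ?)", (sku, price_cents))

    def get_price_cents(self, sku: str) -> int:
        # The public interface; in a real deployment this would be an API call.
        row = self._db.execute(
            "SELECT price_cents FROM products WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None:
            raise KeyError(sku)
        return row[0]


class OrderService:
    """Owns the order data and stores a snapshot of the price it was given."""

    def __init__(self, catalog: CatalogService) -> None:
        self._catalog = catalog
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE orders ("
            "id INTEGER PRIMARY KEY, "
            "sku TEXT NOT NULL, "
            "price_cents_at_purchase INTEGER NOT NULL)"
        )

    def place_order(self, sku: str) -> int:
        # Ask the owning service instead of joining against its tables.
        price = self._catalog.get_price_cents(sku)
        cursor = self._db.execute(
            "INSERT INTO orders (sku, price_cents_at_purchase) VALUES (?, ?)",
            (sku, price),
        )
        return cursor.lastrowid

    def order_total_cents(self, order_id: int) -> int:
        row = self._db.execute(
            "SELECT price_cents_at_purchase FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        if row is None:
            raise KeyError(order_id)
        return row[0]
```

With this shape, the catalog can rename columns or change storage without breaking orders, as long as `get_price_cents` keeps its contract. The cost is duplication: the order service now carries its own copy of the price, which is a tradeoff rather than a free win.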
The Caching Trap
So what happens when you have a lot of data and your queries get slow? You’ve done all the easy stuff: optimized queries, added indexes, followed best practices. But things are still slow.
Every senior engineer’s first instinct is the same: “Let’s add Redis in front of it.” Or “more read replicas.” And sure, that works, but you have just added complexity and now you have to deal with cache invalidation.
What happens when you have stale data? How do you recache current data, and when does that happen?
Are you caching on the browser side too? Understanding where data can be cached and how to invalidate it is another genuinely difficult problem to solve. You’re just trading one set of problems for a different set of problems.
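The stale-data questions above are easier to reason about with something small in hand. This is a rough sketch of a cache-aside layer with a time-to-live and explicit invalidation on write. It is not Redis and not any particular library's API: `CacheAside`, `load_price`, `update_price`, and the `database` dict (a stand-in for the slow store) are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Serve from memory when fresh, otherwise reload from the source."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit, possibly stale relative to the source
        value = self._load(key)  # miss or expired: go back to the source
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


# Hypothetical stand-in for the slow database.
database: Dict[str, str] = {"product:1": "9.99"}


def load_price(key: str) -> str:
    return database[key]


def update_price(cache: CacheAside, key: str, value: str) -> None:
    # Write to the source first, then drop the cached copy.
    database[key] = value
    cache.invalidate(key)
```

Any write that goes through `update_price` is visible on the next read. Any write that skips it (another service, a migration script, a manual fix) stays invisible until the TTL runs out, and that gap is exactly where the stale-data bugs live.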
You Can’t Predict Every Future Question
If you’re selling things on the internet, chances are you will care about event sourcing at some point. A lot of interesting business problems don’t care about the current state of a user; they care about intent and history. And how you store intent and history is probably different from the ACID-compliant Postgres table that you’ve worked hard to normalize.
You can get your data structure perfect for displaying products and processing sales, then run into a completely new set of requirements that changes everything about how your data needs to be structured.
It’s genuinely hard to foresee all the potential questions you’ll need to answer in the future.
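One hedge against questions you can't foresee is to keep the history and derive the current state from it. Here is a minimal sketch of that idea, assuming a made-up cart domain: the `Event` type, the `"added"` and `"removed"` kinds, and both functions are hypothetical, and a real event store would also need ordering, persistence, and versioning.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """One immutable fact about what a user did."""

    user_id: str
    kind: str  # "added" or "removed" in this sketch
    sku: str


def current_cart(events: List[Event], user_id: str) -> Dict[str, int]:
    """Replay the log to rebuild the state a normal table would hold."""
    cart: Counter = Counter()
    for event in events:
        if event.user_id != user_id:
            continue
        if event.kind == "added":
            cart[event.sku] += 1
        elif event.kind == "removed" and cart[event.sku] > 0:
            cart[event.sku] -= 1
    return {sku: count for sku, count in cart.items() if count > 0}


def removal_counts(events: List[Event]) -> Dict[str, int]:
    """A question asked later: which products do people put back?"""
    return dict(Counter(e.sku for e in events if e.kind == "removed"))


log = [
    Event("u1", "added", "shoes"),
    Event("u1", "added", "hat"),
    Event("u1", "removed", "shoes"),
    Event("u2", "added", "shoes"),
]
```

Here `current_cart(log, "u1")` only knows about the hat. A table holding just that state has already thrown away the fact that the shoes were ever considered, while `removal_counts(log)` can still answer the question because the log kept it.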
Why This Matters Now
Everything you do on a computer stores data somewhere; it’s just a matter of persistence.
Which is why everything software-related is eventually a database problem.
Data modeling isn’t glamorous, but getting it right is the difference between a system that scales gracefully and one that fights you every step of the way.