Programming
-
Day 19: Leap Seconds or When a Minute Has 61
It’s week four, and this week we’re talking about the inconsistencies, the problems, the strange things about time and how we measure it. If week three was mostly about software and computer systems, week four is more about the details of how we record time and the time systems themselves.
The first crack in our concept of time is the quite irregular leap second.
UTC works?
How can a minute ever have 61 seconds? If you start at zero, you’ve got to end at 59. If you start at zero and end at 60, now you have 61 seconds. That doesn’t make sense.
If you have ever seen :60Z in a log file, that is the leap second. That minute had sixty-one seconds in it.
It exists because we are trying to do the impossible: keep two completely different definitions of the “second” in alignment, forever. There are, it turns out, two definitions.
But wait, it gets worse.
There are three time scales running in the background of our civilization right now:
- TAI, or International Atomic Time. The average of about 400 atomic clocks at standards labs around the world, all ticking off the cesium hyperfine transition we covered on Day 8. TAI is uniform. Every second is the same length as every other second. TAI does not care about the Earth.
- UT1, or Universal Time 1. Defined by the Earth’s actual rotation, measured by tracking distant quasars with radio telescopes. UT1 is wobbly. The Earth speeds up and slows down by milliseconds per day, mostly because of tidal friction (slowing it down) and core-mantle coupling (anyone’s guess on any given decade).
- UTC, or Coordinated Universal Time. The civil time on your phone. UTC is TAI minus a whole number of seconds (a fixed 10-second starting offset plus every leap second inserted since 1972), kept within 0.9 seconds of UT1.
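The relationship between TAI and UTC is plain integer arithmetic. Here is a minimal sketch, assuming the 10-second head start TAI already had when the leap second system began in 1972, plus the 27 leap seconds inserted since. The constant and function names are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumptions for illustration: TAI was already 10 s ahead of UTC when the
# leap second system began in 1972, and 27 leap seconds have been added since.
INITIAL_OFFSET_S = 10
LEAP_SECONDS_SINCE_1972 = 27
TAI_MINUS_UTC_S = INITIAL_OFFSET_S + LEAP_SECONDS_SINCE_1972


def utc_to_tai(utc: datetime) -> datetime:
    """Shift a UTC wall-clock reading onto the TAI scale.

    Only valid for instants after the most recent leap second (end of 2016);
    earlier instants would need a historical offset table.
    """
    return utc + timedelta(seconds=TAI_MINUS_UTC_S)


utc_reading = datetime(2026, 6, 8, 14, 30, 0)
tai_reading = utc_to_tai(utc_reading)

print(TAI_MINUS_UTC_S)   # total offset in seconds
print(tai_reading)       # the same instant, read off a TAI clock
```

The offset only ever changes by whole seconds, which is exactly why UTC can drift up to 0.9 seconds away from UT1 before anyone steps in.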
It’s time to stop pretending like the current version of UTC isn’t a compromise. It certainly is not the best we could come up with. It’s just what everyone could agree on.
The first of its problems is the dang leap second.
So how did we get to this mess?
The IERS, the International Earth Rotation and Reference Systems Service in Paris, watches the gap between UT1 and UTC. They announce the leap second six months in advance in an actual notice called Bulletin C. They also have their own weekly and monthly newsletters called Bulletin A and Bulletin B. I don’t know what is going on at IERS and I’m sorry to anyone working there, but this leap second thing is kinda crazy.
When the leap happens, the clock reads:
23:59:58
23:59:59
23:59:60 ← this is the leap second
00:00:00
That :60 is the part that breaks software. Most date/time libraries do not believe :60 is a real value. POSIX, the standard governing Unix systems, explicitly defines Unix time to pretend leap seconds don’t exist.
Since the system was introduced in 1972, 27 leap seconds have been inserted, although none since 2016. Also, there has never been a negative leap second. We’ve only ever needed to slow UTC down to match the Earth.
But, that may be about to change. More on that tomorrow.
Because I can’t help myself.
Here’s more stuff about… Computers.
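First, a quick look at what "libraries do not believe :60" looks like in practice. This is a small sketch using only Python's standard library. It shows one language's behavior, not a statement about every date/time library: the datetime type rejects a 60 in the seconds field, and POSIX-style arithmetic (which calendar.timegm follows) gives the leap second the same number as the midnight after it.

```python
import calendar
from datetime import datetime

# The leap second at the end of 2016, written the way it appears in UTC.
leap = (2016, 12, 31, 23, 59, 60)
after = (2017, 1, 1, 0, 0, 0)

# 1. datetime only accepts seconds in 0..59, so :60 is rejected outright.
try:
    datetime(*leap)
    accepted = True
except ValueError:
    accepted = False
print("datetime accepts :60?", accepted)

# 2. POSIX-style arithmetic (days * 86400 + seconds into the day) has no
#    slot for the extra second, so it collides with the following midnight.
leap_ts = calendar.timegm(leap)
after_ts = calendar.timegm(after)
print(leap_ts, after_ts, leap_ts == after_ts)
```

Two different UTC labels, one Unix timestamp. Every strategy below is a way of deciding what the clock should do about that collision.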
There are three ways a computer can handle the leap second arriving.
- Step. At midnight, the clock just jumps back one second. From the OS’s perspective, time briefly moves backward. Anything assuming time is monotonic, meaning it only goes forward, sees its assumption violated and may explode.
- Stall. Hold 23:59:59 for two seconds. Time doesn’t go backward, but two events get the same timestamp. Anything depending on timestamp uniqueness gets confused.
- Smear. Spread the extra second over a long window (Google originally used 20 hours centered on the leap, then standardized at 24 hours) by ticking slightly slow for the whole period. No :60 ever appears. No backward step. Just a clock whose seconds are each about 1.0000116× as long as normal (roughly 11.6 parts per million slow) for a day.
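The smear number is easy to check. Here is a sketch of a linear smear over the 24-hour window mentioned above. The function name is mine, and real implementations differ in the details (window placement, how clients are told about it).

```python
LEAP_S = 1.0                 # one inserted second
WINDOW_S = 24 * 60 * 60      # assumed smear window: 24 hours, in seconds

# Fraction by which each smeared second is stretched.
slowdown = LEAP_S / WINDOW_S


def smear_applied(elapsed_s: float) -> float:
    """Seconds of the leap already absorbed, elapsed_s into the window."""
    clamped = min(max(elapsed_s, 0.0), WINDOW_S)
    return LEAP_S * clamped / WINDOW_S


print(f"slowdown: {slowdown * 1e6:.2f} ppm")   # parts per million
print(smear_applied(WINDOW_S / 2))             # halfway through the window
print(smear_applied(WINDOW_S))                 # at the end of the window
```

Halfway through the window the smeared clock is half a second off from unsmeared UTC, which is the price of never showing a :60.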
Google introduced smearing in 2008. By the late 2010s most cloud providers (Amazon, Microsoft, Facebook) had adopted some flavor. It is now the de facto practice.
Before smearing, leap seconds were MORE dangerous.
A second is a second, not two
A leap second is like when Pluto was a Planet. It has to be a singular definition. A known amount. A standard. Software written at some of the most capable engineering organizations still took down important infrastructure, internet infrastructure. Clearly it should not be this hard to define a unit of time.
But anyways, onto the next post.
Tomorrow will cover the historic 2022 vote to abolish the leap second, the fight that produced it, and what we all agreed to.
Sources
- International Earth Rotation and Reference Systems Service (IERS) — The body responsible for monitoring Earth’s rotation and issuing Bulletin C to announce leap seconds.
- A global timekeeping problem postponed by global warming — Nature (2024). The Duncan Agnew paper detailing how melting polar ice has counteracted the Earth’s acceleration, delaying the unprecedented “negative leap second” until roughly 2029.
- POSIX.1-2017 Base Definitions: Seconds Since the Epoch — The Open Group. The formal specification demonstrating that Unix time legally ignores leap seconds.
- Time, Technology and Leaping Seconds — Google’s original 2011 blog post introducing the concept of the “leap smear.”
- The Leap Second Glitch Explained — Wired. Detailed breakdown of the 2012 Linux hrtimer bug that took down Reddit and Qantas.
- How and why the leap second affected Cloudflare DNS — Cloudflare’s excellent, candid post-mortem of their 2017 New Year’s RRDNS outage.
I’d appreciate a follow. You can subscribe with your email below. The emails go out once a week, or you can find me on Mastodon at @[email protected].
-
Day 18: DST and The Related Software Bugs
This is one of two or three DST posts in the 30 Days of Time series. Today’s angle: the software bugs.
Twice a year, in most of the developed world, the clocks jump forward in spring and back in fall. An hour vanishes in the spring, and a duplicate hour materializes in the fall. This is arguably the source of more shipped bugs than any other single phenomenon in software.
The Two Impossible Hours
The mechanics, if you’ve never thought hard about them.
Spring forward. On the transition Sunday in March (in the US), the clock reads 01:59:59 and then immediately reads 03:00:00. The hour from 2:00 to 2:59 AM does not exist. 2:30 AM on that day is not a time. If you tell a computer to do something at 2:30 AM on that day, you have asked it to do something at a time that doesn’t exist.
What it does is up to the library:
- It might silently skip.
- It might silently run at 3:30 instead.
- It might throw an exception.
- It might run at the wrong time and silently throw your reports off.
The classic landmine is a daily task scheduled in local time, for example a job set to run at 1:30 AM. If you’re using the standard Linux cron daemon, it has battle-tested, built-in logic to detect DST transitions and prevent duplicates.
The problems are usually at the application layer. If you are using an application-level scheduler or cron library that hasn’t been configured properly and blindly trusts the system clock, that 1:30 AM job can run twice in the fall, and a job set for 2:30 AM can find that its time doesn’t exist in the spring.
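If you want to catch the nonexistent case before a scheduler does something surprising, one option is a round trip through UTC. This is a sketch using Python's zoneinfo. The helper name is made up, and it assumes tzdata is available on the machine.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def exists_locally(naive: datetime, tz: ZoneInfo) -> bool:
    """True if the wall-clock reading actually occurs in the zone.

    A skipped time does not survive a round trip through UTC: zoneinfo
    maps it to a real instant whose local reading is different.
    """
    local = naive.replace(tzinfo=tz)
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive


# US spring-forward night in 2024 was March 10.
skipped = exists_locally(datetime(2024, 3, 10, 2, 30), NEW_YORK)
ordinary = exists_locally(datetime(2024, 3, 10, 3, 30), NEW_YORK)
print(skipped, ordinary)
```

What to do once you know the time is missing (skip, shift, or reject) is still a product decision. The check just makes sure it is a decision and not an accident.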
A Short Tour of Named Disasters
March 2007, United States. Congress passed the Energy Policy Act of 2005, which moved DST to begin three weeks earlier and end one week later. The change took effect in March 2007. Every system in the country running on a tz database older than mid-2006 spent three weeks in March, and one week in November, off by an hour. Banks ran payroll at the wrong time. BlackBerry calendars showed every meeting an hour off. Federal agencies had to issue advisories. The DOE later estimated the extension saved about 0.5% of electricity per day of extended DST, or roughly 1.3 TWh annually. The remediation cost across every affected piece of software in the country dwarfed that figure. (More on that later.)
New Year 2011, iOS. Non-recurring alarms set for January 1 or 2, 2011 did not fire, in any time zone. People slept through work. Apple’s official advice was to set one-time alarms as recurring until January 3. This was on top of an iOS DST bug from a few months earlier, when the fall 2010 transition shifted alarms by an hour in countries that had already changed clocks. Two months after the New Year’s bug, iOS again mis-handled the US spring DST transition. Apple released an apologetic fix and quietly rewrote the alarm subsystem.
Brazil, April 2019. Brazil canceled DST after decades of observing it, via Decree 9,764. This is fine for clocks going forward, but the cancellation was announced only a few months in advance, and the IANA tz database had to ship updates fast. Every Brazilian server running on a stale cache spent the next year an hour off, in particular for any future-scheduled event saved as “local time.”
Palestine. For about a decade running, Google Calendar shipped wrong DST data for Palestine, because the Palestinian Authority changes DST rules with short notice and the IANA volunteers don’t always learn in time. Meetings between Israeli and Palestinian colleagues would silently shift by an hour twice a year.
The Shape of the Failure
The DST bugs usually go like this:
- A piece of software was written with the assumption that local time is well-defined and monotonic.
- Local time is neither.
- The author never hit edge cases. It only happens twice a year, in certain regions, under certain settings.
- The bug ships. It runs fine for six months. Then it doesn’t.
The mitigations are well known, and they are the reason the standard advice looks the way it does.
- Store UTC. Always. The IANA zone ID goes in a separate column. Never, ever store a naked local timestamp.
- Recompute the local display every time. Treat local-time as a view, not data.
- Never schedule anything between 2 and 3 AM local. In the US, that hour does not exist one night every spring, and the hour before it happens twice one night every fall.
- Use libraries that surface the ambiguity. The older Python pytz library could raise an error when you localized an impossible or ambiguous local time (if you passed is_dst=None). The modern zoneinfo handles it silently via a fold attribute, meaning you have to manually check for ambiguity. JavaScript’s Date produces inconsistent results across engines. The Temporal API, which reached Stage 4 in March 2026 and ships in Chrome 144, Firefox 139, and Node 26, lets you explicitly reject ambiguous times. Use it as soon as you can.
- Keep tzdata current. This is a system-package problem and most teams forget about it until something breaks.
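The first two mitigations fit in a few lines. Here is a sketch in Python, where StoredEvent and store are hypothetical names standing in for whatever your table and write path actually look like.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class StoredEvent:
    """Hypothetical row shape: the instant in UTC plus the zone ID as text."""
    utc_instant: datetime
    zone_id: str

    def local_view(self) -> datetime:
        # Recomputed on every read, so the local reading is never stored.
        return self.utc_instant.astimezone(ZoneInfo(self.zone_id))


def store(naive_local: datetime, zone_id: str) -> StoredEvent:
    """Convert a user's wall-clock input into the two stored columns."""
    aware = naive_local.replace(tzinfo=ZoneInfo(zone_id))
    return StoredEvent(aware.astimezone(timezone.utc), zone_id)


event = store(datetime(2024, 7, 4, 9, 0), "America/Chicago")
print(event.utc_instant.isoformat())
print(event.local_view().isoformat())
```

The local reading only exists as the return value of local_view(). Nothing in the stored record is a naked local timestamp.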
The tricky part of software has always been that we treat wall time, the number you see on the clock every day, as if it were the actual physical passage of time. It is not. Daylight saving time is a really great example of the absurdity of our timekeeping.
Week 3 Recap
If this is the first time you are reading this series, I figured a recap is in order. Week 3 has been about the infrastructure of practical timekeeping, the layer where computers, calendars, and humans actually have to agree on what time it is.
- Day 13: Unix Time, 1,780,620,532 — The 10-digit integer counting up from 1970 that runs every computer on Earth, and the weird properties hiding behind the name.
- Day 14: The Bug That Didn’t End the World, and the One That Still Might — Y2K was a save, not a hoax, and Y2038 is the one nobody is preparing for.
- Day 15: The Man Who Synchronized the World — David Mills, NTP, and the forty-year project that keeps every networked clock on Earth within a few milliseconds of UTC.
- Day 16: How the World Agreed on a Date Format (Except the US) — The century-long campaign that produced 2026-06-08T14:30:00Z, and why a bare 05/06/26 is still an act of faith.
- Day 17: Time Zones Are a Nightmare — 38 named offsets in active use, half-hour zones, and why “what time is it there?” is the wrong question.
- Day 18 (today): DST, and the bugs that ride along with it twice a year.
The picture I want to leave you with is that every problem we’ve covered in Week 3 is a downstream consequence of a deeper one. The system isn’t fragile because of bad programmers. It’s fragile because the underlying thing, “what time is it, here, right now,” was never a single answer, and we’ve been pretending it was.
What’s Coming
Week 4 is about the cracks. What if a minute had 61 seconds? What if October had only 21 days? What if every meeting on every calendar landed on the same weekday, forever? Each of those has actually happened, or is being voted on, or was almost adopted. Week 4 covers leap seconds and their abolition, the DST fight nobody can win, and the calendars we use, almost used, and may yet use.
Sources
- Energy Policy Act of 2005 (Wikipedia). Details the US DST schedule change that took effect in 2007.
- Impact of Extended Daylight Saving Time on National Energy Consumption (US DOE). The 0.5%-per-day savings estimate from the post-2007 study.
- Apple confirms New Year’s alarm bug (AppleInsider). Coverage of the iOS 2011 non-recurring alarm bug and Apple’s workaround.
- Daylight saving time in Brazil (Wikipedia). History of DST in Brazil, including the 2019 abolition via Decree 9,764.
- Zune 30GB leap year bug (Wikipedia). The firmware loop that bricked Zunes on New Year’s Eve 2008.
- 2012 Reddit leap second outage (Wired). Write-up on the Linux hrtimer bug that took down Reddit, LinkedIn, and Qantas.
- TC39 Advances Temporal to Stage 4 (Socket). Current status of the JavaScript Temporal API.
-
Day 17 — Time Zones Are a Nightmare
Yesterday I wrote about how 2026-05-24T14:30:00Z won the format wars. That Z at the end, it turns out, is quite important. It says Zulu, it says UTC, it says: I am refusing to participate in the nightmare that is time zones.
However, today we participate.
The lie you were told in school
There are 24 time zones, one for every hour, neat 15° slices of the globe.
There are not. There are at least 38 named offsets in active use right now, and the list changes a few times a year.
Some of them are not on the hour:
- India is at UTC+5:30. The whole country, one zone, half-hour offset.
- Nepal is at UTC+5:45. Forty-five minutes. Because Nepal decided in 1986 that it wanted its civil time anchored to the meridian passing through Gauri Shankar, not Delhi.
- Newfoundland is at UTC−3:30.
- The Chatham Islands are at UTC+12:45.
The range isn’t 24 hours either. It runs from UTC−12 to UTC+14, a 26-hour spread, because Kiribati got tired of being split by the international date line in 1995 and just moved the line. One day the Line Islands woke up and it was tomorrow.
Then there’s China, which geographically spans five time zones and politically uses one (UTC+8), so in the far west of Xinjiang the sun rises at what the clock insists is 10 AM. North Korea changed its offset twice in the 2010s, from UTC+9 to UTC+8:30 in 2015 to mark liberation from Japan, then back to UTC+9 in 2018 to align with Seoul during a diplomatic thaw.
Time zones are not geography. Time zones are politics with a clock face glued to the front.
How we got here
Before about 1850, every town in the world ran on its own clock. Noon was when the sun was overhead here, which meant noon in Boston was several minutes off from noon in New York, which was off from Philadelphia, which was off from everywhere.
Nobody cared, because nobody was traveling fast enough for it to matter.
Then the railroads showed up.
When your train leaves at “noon” and arrives at “3 PM” and every station defines noon differently, you can’t print a schedule. Britain rolled out Railway Time (GMT everywhere) in the 1840s. The American railroads, bless them, didn’t wait for permission. On November 18, 1883, they unilaterally divided the United States into four zones. Newspapers called it “the day of two noons” because clocks across the country jumped, sometimes forward, sometimes back, to land on the new shared time.
The following year, in October 1884, twenty-five countries met in Washington for the International Meridian Conference and made it official. Greenwich is 0°, the universal day starts at midnight in Greenwich, every other place is some offset from that.
France abstained. France wanted Paris. France held out until 1911.
The “database” holding the world together
When your phone shows you the right time after you land in Tokyo, when your calendar correctly reschedules a meeting because Mexico canceled DST a few years ago, when your server logs all line up across a deploy in three continents, that all happens because of a single, voluntarily maintained text file.
It’s called the IANA Time Zone Database, also known as the Olson Database, after Arthur David Olson, an NIH employee who started maintaining it in 1986 as a side project. Today it lives under IANA stewardship and is primarily maintained by Paul Eggert, a UCLA computer scientist who has been doing this, mostly alone, for decades.
Every Unix system, every Linux distro, every Mac, every iPhone, every Android phone, Java, Python, Go, Rust, Postgres, browsers, every piece of software that knows what time it is, gets its time zone rules from this database. The release cadence is multiple updates per year, almost always triggered by some country’s parliament deciding to change DST rules with three months' notice.
The format is something like America/New_York, Europe/Berlin, Asia/Kolkata, Pacific/Kiritimati. Area, slash, location. Not EST, not GMT+5, because those are offsets and offsets aren’t enough. The rules are what you need, because the rules change with politics.
The whole arrangement is held together by a small group of volunteers, a mailing list, and Paul Eggert’s continued willingness to keep doing this. If he ever stops, somebody else will have to start.
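To see why a zone ID carries more than an offset, ask the same ID for its offset at two points in the year. A small sketch with Python's zoneinfo, which reads the IANA database from the system (or from the tzdata package if the system has none):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

zone = ZoneInfo("America/New_York")

winter = datetime(2024, 1, 15, 12, 0, tzinfo=zone)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=zone)

# Same zone ID, two different offsets, because the ID names a rule set.
print(winter.utcoffset(), winter.tzname())
print(summer.utcoffset(), summer.tzname())
```

If you had stored only the offset from the January reading, the July reading would be off by an hour, and nothing in the stored value would tell you so.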
Why this is one of the hardest problems in everyday software
A few categories of pain, none of them solvable, all of them shipped to production daily:
1. Ambiguous local times. When the clocks fall back in November, the hour from 1:00 to 2:00 AM happens twice. If a user schedules a meeting at “01:30 local time” on the wrong day, which 01:30 do they mean? There is no correct answer. Your software picks one and someone shows up an hour off.
2. Nonexistent local times. When the clocks spring forward in March, 2:30 AM doesn’t exist. If somebody’s medication-reminder app is set for 02:30, what does it do that morning? Skip? Run at 03:30? Run an hour early, at 01:30? There is no correct answer.
3. Future timestamps are mutable. If you store a meeting as “October 15, 2027 at 3 PM in Mexico City,” and Mexico cancels DST between now and then, which it did, in 2022, the meeting moves. The number of hours from now until that meeting changes after you saved it. The cardinal rule, the only thing that saves you, is this: store UTC and the IANA zone name separately, never store a local timestamp alone, and recompute on display.
4. JavaScript’s Date object. It’s not great, but I wrote more about it here. There is a fix called the JavaScript Temporal API. It’s technically here now, though browser support is still rolling out, and we will all be happier when it’s fully supported everywhere.
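The ambiguous case from item 1 is worth seeing with real numbers. In Python, the fold attribute picks which of the two 01:30s you mean. This is a sketch: is_ambiguous is a made-up helper, not part of zoneinfo.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

zone = ZoneInfo("America/New_York")

# US fall-back night in 2024 was November 3; 01:30 occurred twice.
first = datetime(2024, 11, 3, 1, 30, tzinfo=zone, fold=0)
second = datetime(2024, 11, 3, 1, 30, tzinfo=zone, fold=1)

first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(first_utc.isoformat())
print(second_utc.isoformat())


def is_ambiguous(naive: datetime, tz: ZoneInfo) -> bool:
    """True if the wall-clock reading happens twice in the zone."""
    earlier = naive.replace(tzinfo=tz, fold=0).utcoffset()
    later = naive.replace(tzinfo=tz, fold=1).utcoffset()
    # In a repeated hour the first pass has the larger (summer) offset.
    # Skipped times also differ by fold, but the other way around.
    return earlier > later


print(is_ambiguous(datetime(2024, 11, 3, 1, 30), zone))
```

The library will happily pick fold=0 for you if you say nothing, which is the "silently picks one" behavior described above.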
The list of lies (about time)
There’s a famous post called Falsehoods Programmers Believe About Time, and a partial sample from the time-zone section gives you the texture:
- “There are 24 time zones.” (38+, give or take, depending on the week.)
- “A day is 24 hours.” (DST transitions make some days 23 or 25.)
- “Time zones don’t change.” (They change several times a year.)
- “If I store the UTC offset I don’t need the zone ID.” (You do, for any future date.)
- “UTC is a time zone.” (UTC is a time scale. Zones are offsets from it. This distinction is going to matter more than it sounds like it should.)
All are gotchas that programmers encounter when working with time zones.
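The “a day is 24 hours” one is easy to demonstrate. Here is a sketch that measures a local calendar day in elapsed hours. The function name is mine, and the dates are the 2024 US transition days.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

zone = ZoneInfo("America/New_York")


def day_length_hours(year: int, month: int, day: int) -> float:
    """Elapsed hours between local midnight and the next local midnight."""
    start = datetime(year, month, day, tzinfo=zone)
    end = start + timedelta(days=1)   # wall-clock arithmetic: next midnight
    # Subtracting two datetimes that share a tzinfo compares wall clocks,
    # so convert to UTC first to measure real elapsed time.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed.total_seconds() / 3600


print(day_length_hours(2024, 3, 10))   # spring-forward day
print(day_length_hours(2024, 7, 4))    # ordinary day
print(day_length_hours(2024, 11, 3))   # fall-back day
```

Note the comment in the middle: subtracting the two local datetimes directly would report 24 hours every time, which is its own entry on the list of falsehoods.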
The ultimate example of technical debt
The time zone system isn’t broken, per se… it’s working exactly as designed.
It was designed by railroad executives in 1883, ratified by diplomats in 1884, and then handed off to every country on Earth to amend at will. Every president who has ever moved a DST date for political reasons, every dictator who has ever changed the national offset to flatter a neighbor, every parliament that has voted to abolish daylight saving without specifying when, has added their fingerprint to the IANA database.
It is a working international system. It is also a Rube Goldberg machine running on a tar pit, held aloft by Paul Eggert and a mailing list.
If you ever thought time zones were bad, tomorrow it gets worse. We’re going to talk about Daylight Saving Time, and the specific, named software disasters it has caused.
Sources
- Nepal Standard Time — Wikipedia. Details Nepal’s 1986 decision to anchor civil time to the Gauri Shankar meridian (UTC+5:45).
- Time in Kiribati — Wikipedia. Covers the 1995 shift of the International Date Line, creating the 26-hour global spread.
- Time in North Korea — Wikipedia. Documents the geopolitical shifts between UTC+8:30 and UTC+9.
- Day of Two Noons — Wikipedia. History of American railroads unilaterally standardizing time on November 18, 1883.
- International Meridian Conference — Wikipedia. The 1884 agreement that established Greenwich as 0° (and France’s holdout until 1911).
- IANA Time Zone Database — Wikipedia. History of the Olson database, its maintenance by Arthur David Olson and Paul Eggert, and its fundamental role in modern computing.
- Daylight saving time in Mexico — Wikipedia. Details the national abolishment of DST in October 2022.
- JavaScript Temporal API — TC39 Documentation. The modern fix for JavaScript’s notoriously broken Date object (currently rolling out to browsers).
- Falsehoods Programmers Believe About Time — Noah Sussman’s canonical post detailing the myriad ways developers misunderstand time.
-
Day 16: How the World Agreed on a Date Format (Except the US)
There is a war that has been quietly raging for about a century, and it is fought over six little digits.
05/06/26
In the United States, that is May 6, 2026. In most of Europe, it is June 5, 2026. In Japan, which uses a year-month-day order and frequently uses the imperial calendar, the 05 might be read as year 5 of the Reiwa era (2023). In Iran, the entire premise is wrong, because Iran’s official calendar is the Solar Hijri and the current year is 1405.
So here we are. A date written by one person and parsed by another is, in the general case, an act of faith.
The most consequential standards effort of the late 20th century was an attempt to end this. It succeeded, sort of. The format it produced, 2026-06-08T14:30:00Z, looks unremarkable now, but it represents a multi-decade campaign to drag the world’s date conventions into a single, unambiguous, machine-parseable shape.
That standard is ISO 8601, and the story of how it won is the story of why your API logs look the way they do.
What’s actually wrong with 05/06/26
Let me give you the tick of the tock (lay of the land). Every culture has a different intuition about which number comes first in a date, and none of them is wrong.
In the United States, the convention is month-day-year. This descends from spoken American English, “May sixth, twenty-twenty-six,” where the month comes first in speech.
In most of Europe, Latin America, and much of Asia, the convention is day-month-year. “The fifth of June, twenty-twenty-six.” The day is the first specific element.
In East Asia, the convention is year-month-day, written largest-unit-first. This reflects a linguistic preference for going general-to-specific that runs the opposite direction of the English phrasing.
Who is to say which is more correct than another? The problem is that a single string of digits separated by slashes can mean three different things depending on who wrote it, and there is no other way to tell.
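You can watch one string become three dates. This is a sketch with Python's strptime. The labels are informal, and the two-digit-year handling is Python's own convention rather than anything cultural.

```python
from datetime import datetime

raw = "05/06/26"

# Three conventions, one string. Format labels are informal.
conventions = {
    "US (month/day/year)": "%m/%d/%y",
    "Europe (day/month/year)": "%d/%m/%y",
    "East Asia (year/month/day)": "%y/%m/%d",
}

readings = {
    label: datetime.strptime(raw, fmt).date()
    for label, fmt in conventions.items()
}

for label, parsed in readings.items():
    print(f"{label}: {parsed.isoformat()}")
```

All three parses succeed without a warning, which is the whole problem: nothing in the string itself says which reading was intended.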
When the ambiguity bites
It’s not just an annoyance.
International travel figured this out the hard way. Across global passport documentation, the convention settled on a three-letter month abbreviation: 08 JUN 2026. It’s unambiguous because a month written as letters can’t be mistaken for a day number. The international passport standard (ICAO Doc 9303) mandates JAN through DEC for exactly this purpose.
In healthcare, patient safety organizations have flagged date ambiguity as a documented source of medication error: a chart that says 7/8/09 can be read as July 8 by one clinician and August 7 by another.
The shape of the problem is the same across medicine, aviation, logistics, contracts, and customs declarations. Different conventions lead to confusion and errors.
The standard
In 1988, ISO published ISO 8601:1988. The goal: pick one format, make it unambiguous, make it sort lexicographically, make it machine-parseable, and standardize the world on it.
The format they picked:
2026-06-08T14:30:00Z
The choice of YYYY-MM-DD was deliberate. Year-month-day is the East Asian convention, but it has a technical property that the other two don’t: it sorts correctly as a string. 2025-12-31 comes before 2026-01-01 whether you sort by character or by number. 12/31/25 and 01/01/26 do not. For the emerging computing industry of the late 1980s (databases, log files, file systems), this was a decisive advantage.
The capital T separates the date from the time. Not pretty, but unambiguous. The trailing Z (informally pronounced “Zulu”) means UTC. This timestamp carries no numeric offset; it is anchored directly to UTC.
What we actually use: RFC 3339
ISO 8601 is too permissive for engineering use.
It allows fractional seconds. It allows omitting components. It allows the basic form (20260608T143000Z) without separators. It allows week dates and ordinal dates. It allows 24:00:00 as midnight (this was removed in 2019, then reinstated by amendment, in one of those standards-committee compromises that satisfies no one).
So in 2002, the IETF published RFC 3339. RFC 3339 is a profile of ISO 8601, a strict subset that picks one form and forbids the rest. The basic form is disallowed. Week dates are disallowed. The time component is mandatory. The timezone designator is mandatory.
This is what every modern internet API actually uses. GitHub, AWS, Stripe, Cloudflare, OpenAI. They accept RFC 3339, not full ISO 8601. They reject 20260608T143000Z even though it’s legal ISO 8601.
What everyone calls “ISO 8601” in casual conversation is, almost always, RFC 3339.
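To make the difference concrete, here is a rough shape check for the RFC 3339 form. It is an approximation I wrote for illustration, not the grammar from the RFC: it checks layout only (it would accept February 30), and it ignores a few things the RFC tolerates, such as lowercase t and z.

```python
import re

# Approximation of the RFC 3339 "date-time" shape: extended form only,
# time required, offset required. Layout check, not calendar validation.
RFC3339_SHAPE = re.compile(
    r"\d{4}-\d{2}-\d{2}"          # date with separators
    r"T\d{2}:\d{2}:\d{2}"         # time with separators
    r"(\.\d+)?"                   # optional fractional seconds
    r"(Z|[+-]\d{2}:\d{2})"        # mandatory UTC designator or offset
)


def looks_like_rfc3339(text: str) -> bool:
    """True if the whole string has the RFC 3339 timestamp layout."""
    return RFC3339_SHAPE.fullmatch(text) is not None


print(looks_like_rfc3339("2026-06-08T14:30:00Z"))        # extended form
print(looks_like_rfc3339("2026-06-08T14:30:00+05:30"))   # numeric offset
print(looks_like_rfc3339("20260608T143000Z"))            # ISO 8601 basic form
print(looks_like_rfc3339("2026-06-08"))                  # date only
```

The last two are perfectly good ISO 8601, and both fail the check, which is roughly what happens when you send them to a strict API.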
What ISO 8601 isn’t
A few things worth being clear about:
- ISO 8601 is not UTC. UTC is a timescale. ISO 8601 is a format.
- ISO 8601 is not Unix time. Unix time is the integer 1780929000. ISO 8601 is the string 2026-06-08T14:30:00Z. They can represent the same instant. They are not the same thing.
- ISO 8601 does not solve leap seconds. The format permits :60 in the seconds field, but what to do with such a value is implementation-defined.
- ISO 8601 does not include the calendar system. It assumes the Gregorian calendar. No provision for Islamic, Hebrew, or Buddhist calendars.
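The string-versus-integer distinction is easiest to see as a round trip. A short sketch in Python, with variable names of my own choosing:

```python
from datetime import datetime, timezone

text = "2026-06-08T14:30:00Z"

# fromisoformat() only accepts a trailing "Z" on Python 3.11+, so swap in
# the equivalent numeric offset to stay portable.
instant = datetime.fromisoformat(text.replace("Z", "+00:00"))

unix_seconds = int(instant.timestamp())                        # integer form
back = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)   # and back

print(unix_seconds)
print(back.isoformat())
print(back == instant)
```

Same instant, two representations. The integer is what the machine counts, and the string is what you put on the wire so a human (or another machine) can read it.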
The civilizational payoff
There is a sense in which 2026-06-08T14:30:00Z is the most consequential string format in modern computing.
While legacy systems still cling to their own formats (HTTP headers use RFC 1123, Git and JWTs use integer Unix timestamps, and X.509 certificates use ASN.1), RFC 3339 has conquered the modern web. It is the default serialization for datetime objects in modern programming languages. It appears in the JSON payloads of almost every modern API (GitHub, Stripe, AWS, OpenAI). It is the standard format for XML’s xs:dateTime. It is written into millions of cloud infrastructure log lines every second.
It is the closest thing modern technical infrastructure has to a universal vocabulary for the question “when did this happen?”
It won because it was unambiguous and sortable, and a single committee was willing to pick one of three equally valid cultural conventions and tell the other two cultures to deal with it. Most international standards die in the negotiation. ISO 8601 survived because the technical advantages of YYYY-MM-DD were strong enough to overwhelm the political cost.
We Americans haven’t adopted it (yet). We still write 06/08/2026 on bank checks, forms and filings, but the machines we all use are on 8601 and they are doing most of the talking.
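The “sortable” half of that claim takes four lines to check. This sketch uses three arbitrary dates written both ways and a plain string sort, with no date parsing anywhere:

```python
iso_dates = ["2026-01-01", "2025-12-31", "2025-02-14"]
us_dates = ["01/01/26", "12/31/25", "02/14/25"]   # the same three days

# Plain string sort, character by character.
iso_sorted = sorted(iso_dates)
us_sorted = sorted(us_dates)

print(iso_sorted)   # chronological order falls out for free
print(us_sorted)    # ordered by month digits, so the years are scrambled
```

This is why log files and file names in YYYY-MM-DD order line up by themselves in any directory listing.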
Sources
- Japanese era name — Wikipedia — Reiwa began 1 May 2019; Reiwa 5 = 2023.
- Solar Hijri calendar — Wikipedia — year 1405 began 21 March 2026, ends 21 March 2027.
- ISO 8601 — Wikipedia — first published 1988; ISO 8601-1:2019 removed 24:00; the 2022 amendment reinstated it.
- ISO 8601-1:2019/Amd 1:2022 — the amendment that put 24:00:00 back.
- RFC 3339 — Date and Time on the Internet: Timestamps — IETF, July 2002. Profile of ISO 8601 used by most modern APIs.
- RFC 3339 vs ISO 8601 — visual map of which forms each standard accepts; basic form (20260608T143000Z) is valid ISO 8601 but not RFC 3339.
- Machine-readable passport (ICAO Doc 9303) — Wikipedia — ICAO standard requiring three-letter month abbreviations (DD MMM YYYY) in the visual inspection zone of all passports.
- ISMP List of Error-Prone Abbreviations — highlights the risk of ambiguous documentation and dates in medical records.
- RFC 1123 — Requirements for Internet Hosts — specifies the required date format for HTTP Date headers.
Tomorrow: time zones, the 38 named offsets in active use, and why “what time is it there?” is the wrong question.
-
Day 15: The Man Who Synchronized the World
David Mills, the father of internet time, wrote the protocol that synchronizes every computer on Earth. He did it as a professor at the University of Delaware, on a project he started in the early 1980s and never stopped working on.
The code lives in your laptop, your phone, your router, every cloud server you’ve ever touched, and every satellite in low Earth orbit. The protocol it implements is called NTP. The reason your computer’s clock is correct, right now, within a few milliseconds of UTC, is that Mills spent forty years of his life making sure it would be.
He once described the early ARPANET days as a “sandbox” where researchers were simply told to “do good deeds.” Part of the allure of the time-synchronization work, he told The New Yorker in 2022, was that he was just about the only one doing it. He had his own “little fief.”
For forty years, that is exactly what it was.
The problem
The early internet had a clock problem. As soon as there were enough machines on the network that “what time is it?” didn’t have a single answer, somebody was going to have to write a protocol. Each computer had its own oscillator. Each oscillator drifted at its own rate. Two machines that agreed at noon could be tens of seconds apart by midnight.
Why did this matter? For most things, it didn’t. For some things, it mattered a lot. A file saved on one machine and copied to another could look older than the version it overwrote, confusing every backup tool that assumed time moves forward. Cryptographic handshakes that expire after a few seconds could fail because the two ends disagreed on what “a few seconds ago” meant. Database replicas could apply writes in the wrong order and corrupt their own state. Email between two servers could arrive timestamped before it was sent. Debugging a multi-machine bug meant correlating log entries across clocks that didn’t agree about which event came first.
Mills decided the actual problem was that there was no protocol for negotiating the truth (in this case, time) across multiple systems. The clock on his desk was wrong. Every other clock was also wrong. The question wasn’t “who has the right time?”, it was “given that nobody has the right time and the network adds an unknown delay to every measurement, how does the system converge on a consensus that is closer to UTC than any individual node could achieve alone?”
His first NTP RFC, RFC 958, was published in September 1985. We now call that protocol NTPv0, or the prototype. In it, Mills nailed down the four-timestamp packet format and the offset/delay math that has been in every revision since. The packet format and the core algorithm haven’t meaningfully changed in forty years. That kind of staying power is rare in any field. In internet infrastructure, where the half-life of a protocol can be measured in single-digit years, it is quite commendable.
The four timestamps
NTP’s core insight is that the network delay between client and server can be measured, not just guessed, as long as both sides record their own timestamps for both legs of the conversation. Four timestamps are exchanged in a single round trip:
Client                              Server
──────                              ──────
T₁ ──── request ───────────────►    T₂
                                    T₃
T₄ ◄────────────── response ─────
- T₁: the client sends the request (client clock)
- T₂: the server receives it (server clock)
- T₃: the server sends the response (server clock)
- T₄: the client receives it (client clock)
Now the client has four numbers. T₁ and T₄ are in the client’s reference frame, T₂ and T₃ are in the server’s. From those four numbers, two things fall out: the round-trip delay (how long the conversation took, minus the time the server spent thinking) and the clock offset (how far the client’s clock is from the server’s). The client now knows how wrong it is, and by how much.
The math depends on one critical assumption: the network is symmetric. The packet takes the same time to travel in both directions.
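As a rough sketch of that arithmetic, here is the standard offset and delay calculation from the four timestamps. The function name and the sample values are illustrative assumptions, not taken from any NTP implementation.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) in seconds from the four NTP timestamps.

    t1 and t4 are read from the client clock; t2 and t3 from the server clock.
    """
    # Round trip on the wire: total elapsed time on the client,
    # minus the time the server spent between receiving and replying.
    delay = (t4 - t1) - (t3 - t2)
    # How far the server clock is ahead of the client clock,
    # assuming the outbound and return legs take equal time.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Hypothetical exchange: the client clock is 0.5 s behind the server,
# each network leg takes 0.25 s, and the server spends 0.125 s processing.
offset, delay = ntp_offset_and_delay(t1=100.0, t2=100.75, t3=100.875, t4=100.625)
print(offset, delay)  # 0.5 0.5
```

If the two legs are not equal, the offset estimate is wrong by half the difference between them, which is why the symmetry assumption matters so much.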
If you’ve been following along in the series, you know there are a lot of ways to measure time. Atomic clocks. GPS receivers. The quartz crystal in your laptop. Radio signals broadcast from government antennas. They don’t all tick at the same rate, and they don’t all agree on what the current time is. How does NTP reconcile across that much variety in time sources?
The stratum hierarchy
NTP organizes the world’s clocks into a tree, with depth measured in strata.
Stratum 0 is the reference. Cesium atomic clocks. Hydrogen masers. GPS receivers. Radio receivers tuned to WWV, DCF77, or MSF. These are not on the network; they’re physical devices wired directly to a small number of computers via PPS pulses on serial ports.
Stratum 1 is the small group of servers wired directly to Stratum 0. There are perhaps a few thousand of these globally. NIST runs some. Major universities run some. The big internet exchanges run some.
Stratum 2 servers sync with Stratum 1, Stratum 3 with Stratum 2, and so on down to Stratum 15. Stratum 16 means “unsynchronized, do not trust.”
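The numbering rule is simple enough to write down. This is a minimal sketch of the hierarchy as described here, with made-up function names; it does not model how a real daemon chooses among several upstream servers.

```python
UNSYNCHRONIZED = 16  # stratum value meaning "unsynchronized, do not trust"


def stratum_of_client(upstream_stratum):
    """Stratum a machine reports when it syncs to the given upstream."""
    # Each hop away from the reference clock adds one. Anything that
    # would land past 15 is treated as unsynchronized.
    return min(upstream_stratum + 1, UNSYNCHRONIZED)


def is_usable_server(stratum):
    """True for network servers that are synchronized (strata 1 to 15)."""
    return 1 <= stratum < UNSYNCHRONIZED


print([stratum_of_client(s) for s in (0, 1, 2, 15)])  # [1, 2, 3, 16]
```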
A typical Linux laptop syncs against Stratum 2 or 3 servers. A typical cloud VM syncs against its provider’s internal Stratum 1 fleet. Your phone syncs against whatever its carrier provides. The whole tree is held together by NTP itself, recursively.
The genius of the design is that there is no central authority. Mills did not own the protocol. There is no “official NTP server.” Anyone can run a Stratum 1 with the right hardware, and anyone can run a Stratum 2+ by syncing with a few Stratum 1s of their choice. The largest public pool, pool.ntp.org, is a volunteer effort started in 2003 by Adrian von Bidder. It currently aggregates a few thousand donated stratum-2 servers worldwide and serves several billion requests per day. Nobody is in charge of it. It just works.
The slew, not the step
There are three different times to keep track of on every synced computer. The reference time is what UTC says, the truth NTP is chasing. The tick rate is how fast the computer’s oscillator pulses. It’s supposed to produce one second of clock time per real second, but always drifts a little. The system clock is what gets reported when an application asks for the current time. Synchronizing means closing the gap between the system clock and the reference time without breaking anything that depends on the system clock being well-behaved.
NTP’s primary tool for that is the slew: it adjusts the tick rate, making each tick slightly longer or shorter than nominal, so the system clock drifts into alignment on its own. The alternative would be to jump the clock forward or backward by the full offset (a step), which is fast but can produce duplicate keys in a database, expire valid TLS sessions, or cause a logging system to mis-order events.
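To get a feel for the trade-off, the sketch below estimates how long a slew takes for a few offsets. The 500 ppm limit is an assumption, picked because it is the traditional maximum slew rate in ntpd; other daemons and configurations use different limits.

```python
MAX_SLEW_PPM = 500  # assumed slew limit, in parts per million


def seconds_to_slew(offset_seconds, max_slew_ppm=MAX_SLEW_PPM):
    """Real seconds needed to absorb an offset by slewing alone."""
    # Slewing at N ppm corrects N microseconds of error per real second.
    return abs(offset_seconds) / (max_slew_ppm / 1_000_000)


for offset in (0.010, 0.100, 1.0):
    print(f"{offset:.3f} s offset: a step is instant, "
          f"a slew takes about {seconds_to_slew(offset):.0f} s")
```

At that rate a 100 ms error takes a bit over three minutes to disappear, and a full second takes more than half an hour, which is why large offsets tend to be stepped instead.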
Mills designed ntpd to slew conservatively. A 200ms gap might take several minutes to close, and corrections larger than about 128ms would get stepped because slewing them gradually was prohibitively slow. That trade-off worked for the always-on Unix workstations of the 1980s and 90s. It works less well for the modern reality of laptops that suspend for hours and resume with a clock that hasn’t been touched since last Tuesday, or cloud VMs that get migrated between hosts. Modern variants like chrony slew more aggressively for exactly that reason. When you open your laptop lid, you want the clock right now, not after fifteen minutes of imperceptible easing.
The legacy
In a sense, NTP is the thing that made the modern internet possible.
Without well-synchronized clocks, you cannot have TLS certificates. The browser needs to know whether a cert is still inside its validity window, and if its clock is far enough off, a perfectly good cert looks expired or not yet valid and the connection is refused. The same goes for databases. No matter the type, NoSQL or otherwise, they all depend on a clock to record when an operation took place.
Without NTP, cell towers wouldn’t agree on when to hand off a call. Financial transactions wouldn’t be enforceable. And all those log files you’ll totally read one day wouldn’t make any sense. NTP is foundational to all of it. It runs as a daemon on every machine, the ones you stare at all day, the ones you don’t see, and the ones you don’t care about.
We remember Mills as the internet’s “Father Time” and the man who synchronized the world. Neither is a metaphor.
Sources
- In Memoriam: David Mills (UDaily, March 2024) — University of Delaware’s obituary; biographical detail, career timeline.
- David L. Mills — Wikipedia — congenital glaucoma from birth, vision worsening from ~2012, fully blind by 2022; UDel professor 1986–2008.
- David Mills, the internet’s Father Time, dies at 85 — The Register — death date (Jan 17, 2024), age 85.
- RFC 958 — Network Time Protocol (September 1985) — the original NTPv0 specification.
- Network Time Protocol — Wikipedia — version lineage: RFC 958 (v0, 1985), RFC 1059 (v1, 1988), RFC 1119 (v2, 1989), RFC 1305 (v3, 1992), RFC 5905 (v4, 2010), RFC 8915 (NTS, 2020).
- NTP pool — Wikipedia — Adrian von Bidder started the pool in January 2003; Ask Bjørn Hansen has run it since 2005.
- MiFID II RTS 25 clock synchronization (Meinberg) — 100µs requirement for high-frequency trading at sub-1ms gateway latency.
- A Brief History of NTP Time: Confessions of an Internet Timekeeper (Mills, PDF) — Mills' own history of NTP.
- The Thorny Problem of Keeping the Internet’s Time (The New Yorker, September 2022) — Nate Hopper’s profile of David Mills and the fragile state of NTP maintenance.
I’d appreciate a follow. You can subscribe with your email below. The emails go out once a week, or you can find me on Mastodon at @[email protected].
Tomorrow: ISO 8601, the format wars, the carnage of MM/DD vs DD/MM, and why 2026-06-07T14:30:00Z won.
-
Day 14: The Bug That Didn't End the World, and the One That Still Might
On December 31, 1999, a measurable percentage of the developed world stockpiled bottled water, withdrew cash from ATMs, and stayed up to see if the lights would go out at midnight.
They didn’t. Planes did not fall from the sky. Power grids did not collapse. Bank balances did not reset. The new millennium arrived, the champagne was opened, and by January 3rd everyone agreed it had been a hoax.
It was not a hoax. It was a save.
Y2K was a global $300+ billion engineering effort spread across roughly five years and almost every government and Fortune 500 IT department on Earth. The reason nothing happened on January 1, 2000 is that for half a decade, an enormous number of people worked very hard so that nothing would happen. The bug was real. The fix worked. Most people forgot it was ever a problem.
Twelve years from now, a structurally identical bug detonates again, but first let’s understand what happened in ‘99.
Y2K: the bug
The Y2K bug is almost embarrassingly simple. From the 1960s through the 1980s, computer storage was expensive enough that programmers had a habit of representing years with two digits, 99 instead of 1999, 73 instead of 1973. It saved two bytes per date. Across a payroll system tracking millions of employees, that mattered.
The assumption baked into that decision was: we’ll have rewritten this system long before the century rolls over.
This is the most consistently wrong assumption in software engineering. Code outlives its authors’ confidence. By the late 1990s, vast amounts of critical infrastructure (bank ledgers, airline reservation systems, hospital records, utility billing, social security disbursement, military logistics, nuclear plant monitoring) were running on COBOL programs from the 60s and 70s that had been patched but never rewritten. The language is unfamiliar to many, but the fix-it-later approach is relatable to everyone. The developers at the time all quietly assumed that the year 99 would never be followed by the year 00.
When the rollover hit, 99-12-31 + 1 day = 00-01-01 looked, mathematically, like jumping back to 1900. Interest calculations would compute negative ages. Pensioners would suddenly be billed for a century of debt. Reservation systems would mark every upcoming flight as having departed in the past. Insurance policies would expire en masse.
The reason planes did not fall is that, starting roughly in 1995, every major airline, manufacturer, FAA system, and air traffic controller began an exhaustive audit-and-fix campaign.
The reason the power grid did not collapse is that every utility company in North America and Europe ran the same campaign on their SCADA systems.
The reason your bank balance was still correct on January 1, 2000 is that someone, somewhere, spent late nights in 1997 reading printouts of code written before they were born.
The estimated total global cost: $300 to $600 billion. The amount of measurable damage on January 1, 2000: small enough that people argued for the next decade about whether the spend had been justified.
It was. The bug was real. The fix worked. The result of a successful preventive engineering campaign is that it looks, in retrospect, like the problem was never there.
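For a concrete picture of the failure, here is a minimal sketch of the arithmetic a two-digit-year system performs. The function names are hypothetical and the example is deliberately tiny; real systems buried the same subtraction inside interest, billing, and scheduling code.

```python
def age_two_digit(birth_yy, current_yy):
    """Age as computed by a system that stores only the last two digits."""
    return current_yy - birth_yy


def age_four_digit(birth_year, current_year):
    """The same calculation with full four-digit years."""
    return current_year - birth_year


# Someone born in 1973, evaluated in 1999 and then in 2000.
print(age_two_digit(73, 99))       # 26
print(age_two_digit(73, 0))        # -73, because "00" sorts before "73"
print(age_four_digit(1973, 2000))  # 27
```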
Y2038: the same bug, different number
Twelve years from now, specifically, January 19, 2038, at 03:14:07 UTC, a structurally identical bug fires for a different reason.
Unix time is stored, on a huge amount of legacy infrastructure, as a signed 32-bit integer. That gives you about 2.1 billion seconds of positive range from the 1970 epoch. 2.1 billion seconds is 68 years. 1970 + 68 = 2038.
At 03:14:07 UTC on that date, the counter hits its maximum value, 2,147,483,647. The next tick overflows. In two’s-complement signed integer arithmetic, the value rolls over to its most negative possible value: -2,147,483,648. Interpreted as a Unix timestamp, that’s December 13, 1901.
Every 32-bit Unix-derived system that hasn’t been patched will, in the span of one tick, conclude that it is now the early 20th century. The effects are the same family of effects as Y2K, but applied to a much wider deployment surface.
File modification times become nonsensical. SSL certificates appear expired, or worse, not-yet-valid. NTP synchronization fails. Filesystems with 32-bit inode timestamps lose ordering. Embedded device firmware that schedules tasks based on wall-clock time begins executing at random intervals. Industrial control systems that latch state machines on “time since last event” calculations latch on negative durations and either freeze or behave unpredictably.
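The rollover itself can be reproduced in a few lines. This sketch uses `ctypes.c_int32` to mimic how a two's-complement 32-bit field stores the value; the helper names are made up for illustration.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # 2,147,483,647


def as_int32(value):
    """Truncate a Python int to the value a signed 32-bit field would hold."""
    return ctypes.c_int32(value).value


def to_utc(seconds):
    """Convert a Unix timestamp (possibly negative) to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good = as_int32(INT32_MAX)
wrapped = as_int32(INT32_MAX + 1)

print(last_good, to_utc(last_good))  # 2147483647 2038-01-19 03:14:07+00:00
print(wrapped, to_utc(wrapped))      # -2147483648 1901-12-13 20:45:52+00:00
```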
Modern desktop and server operating systems are mostly fine. Linux gained 64-bit time_t support on 32-bit architectures in kernel 5.6 (2020), with glibc following in 2.34. macOS and Windows have been 64-bit-clean for over a decade. AWS, GCP, and Azure all run 64-bit kernels.
The problem is not where you are reading this. The problem is in the physical world that keeps everything running.
The long tail is enormous
Estimates of the number of currently deployed 32-bit embedded devices that interact with time_t in some way range from a few hundred million to several billion.
Industrial controllers, automotive ECUs, network routers, smart-meter firmware, point-of-sale terminals, medical imaging devices, GPS units, cable boxes, elevator controllers, traffic light systems, ATM internals, payment terminals, building HVAC, water-treatment SCADA, satellite firmware, oil rig control systems, and the embedded computer in your refrigerator.
Each one, depending on vintage and vendor, may or may not have been patched.
Many of these devices are not internet-connected and cannot be patched remotely. Many are running firmware whose source code has been lost. Many are running firmware whose vendor no longer exists. Many are in places where physical access is hard, a deep-sea oil platform, a satellite in geostationary orbit, a controller welded inside an industrial machine.
The Y2K fix worked because the affected systems were largely centralized: mainframes in data centers, software at named companies, code with active maintainers. You could audit it. You could rewrite it. You could ship a patch.
Y2038 is decentralized. The affected systems are everywhere.
The Buff Must Flow
In 2022, Microsoft Exchange Server stopped delivering email worldwide. The cause was a 32-bit signed integer in Exchange’s anti-malware scanner that stored the date as a long-form number. On New Year’s Day, the value tipped over the limit and the scanner refused to load. Mail queues backed up everywhere. Microsoft shipped an emergency script the next day. They called it Y2K22.
On April 6, 2019, the GPS week number counter rolled over. The failure mode was familiar, an integer designed when the engineers thought it was going to be big enough turned out, decades later, not to be. NYC’s municipal wireless network went down. KLM grounded a flight. Older car and marine GPS units showed dates in 1999.
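Both incidents come down to a counter outgrowing its field. The sketch below is illustrative only: the YYMMDDHHMM layout of the Exchange version stamp follows public reporting on the bug, the 10-bit width of the legacy GPS week field is a detail not spelled out above, and the variable names are made up.

```python
from datetime import datetime, timedelta

INT32_MAX = 2**31 - 1  # 2,147,483,647

# Exchange: a date-stamped version number in YYMMDDHHMM form
# (layout assumed from public reports of the Y2K22 bug).
last_2021 = 2112312359
first_2022 = 2201010001
print(last_2021 <= INT32_MAX)   # True
print(first_2022 <= INT32_MAX)  # False

# GPS: the legacy week counter is 10 bits wide, so it wraps every 1024 weeks.
GPS_EPOCH = datetime(1980, 1, 6)  # midnight, expressed in GPS time
WEEKS_PER_ROLLOVER = 2**10

for n in (1, 2):
    rollover = GPS_EPOCH + timedelta(weeks=n * WEEKS_PER_ROLLOVER)
    print(n, rollover.date())
```

The second rollover comes out as midnight on April 7, 2019 in GPS time. GPS time was 18 seconds ahead of UTC by then, so on a UTC clock that instant is 23:59:42 on April 6.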
Two examples of overflows hitting production and breaking real things. Y2038 will be every one of those at once, in places nobody is thinking about.
Y2038 is foreseeable. We know about it. We know what needs to be done. We have twelve years. We should get started sooner rather than later. A lot of important systems need to be replaced, and the fewer that fall through the cracks, the better.
There’s no checklist for the devices we’ve already forgotten about, but maybe there should be.
Tomorrow: The Smear, how Google, Amazon, and Meta quietly decided to stop telling the truth about leap seconds, and why everyone else followed.
Sources
- Year 2000 problem — Wikipedia
- Year 2038 problem — Wikipedia
- Microsoft Exchange year 2022 bug in FIP-FS breaks email delivery — BleepingComputer
- Microsoft Exchange Fixes Disruptive ‘Y2K22’ Bug — BankInfoSecurity
- GPS week number rollover — Wikipedia
- GPS Week Number Rollover — GPS.gov
- The impact and resolution of the GPS week number rollover of April 2019 — Geoscientific Instrumentation (Copernicus)
- Linux kernel 5.6 — 64-bit time_t support for 32-bit architectures (KernelNewbies)
- The Open Group Base Specifications: time.h
-
Day 13: Unix Time, 1,780,620,532
That’s roughly what time it is, right now, as I type this.
Not 8:48 PM. Not “Thursday.” Not “June 4th, 2026.” None of those are what your computer thinks “now” is. To your laptop, your phone, your car’s infotainment system, the streaming server pushing this page to your browser, and the ATM in the corner store, now is a number. A 10-digit integer. Counting up, one tick per second, since a fixed moment in 1970.
That number runs the world. It’s the closest thing the global computing infrastructure has to a heartbeat. And it has some weird properties, almost none of which are explained by the name it goes by.
Unix time.
The clock under every clock
Open a terminal. Type date +%s. You’ll see something like 1780620532 come back. That’s Unix time. Seconds since the Unix epoch, 1970-01-01T00:00:00 UTC.
Every modern operating system tracks time this way internally, even if it dresses up the output for you. The pretty “8:48 PM” on your menu bar is a calculation: take the current Unix timestamp, apply your timezone offset, run it through the calendar rules, format it for display. The underlying number is just 1,780,620,532-and-change, counting up.
JavaScript’s Date.now()? Unix time in milliseconds. Java’s System.currentTimeMillis()? Unix time in milliseconds. Python’s time.time()? Unix time as a float. Go’s time.Now().Unix()? Unix time. PostgreSQL’s EXTRACT(epoch FROM ...)? Unix time. SQLite’s strftime('%s', 'now')? Unix time.
It’s the lingua franca of computing. Two systems written in different languages, on different continents, with different calendars in their UIs, agree about what now means because they both agree about this one number.
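The menu-bar calculation described above is short enough to spell out. This is a sketch rather than how any particular OS does it: the fixed UTC-4 offset is an assumption standing in for a real timezone database lookup.

```python
from datetime import datetime, timedelta, timezone

unix_seconds = 1780620532

# Step 1: interpret the integer as an instant in UTC.
utc_time = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

# Step 2: apply a timezone offset. UTC-4 is assumed here (US Eastern
# Daylight Time); a real system would consult the tz database.
local_zone = timezone(timedelta(hours=-4))
local_time = utc_time.astimezone(local_zone)

# Step 3: run it through the calendar rules and format it for display.
print(utc_time.isoformat())  # 2026-06-05T00:48:52+00:00
print(local_time.strftime("%A, %B %d, %Y %I:%M %p"))
```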
Why 1970?
The honest answer is: convenience.
In the early 1970s, Ken Thompson and Dennis Ritchie were building Unix at Bell Labs. They needed a way to represent time in a 32-bit integer. Their first attempt counted 1/60 of a second per tick, and overflowed in about two and a half years. So they switched to 1 tick per second, which gave them roughly 136 years of range in a signed 32-bit integer.
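Those ranges are easy to check. A quick back-of-the-envelope sketch, using an average year of 365.25 days (my choice of convention, not something the Unix authors specified):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, leap days included


def years_of_range(bits, ticks_per_second):
    """Years until a counter with the given number of bits runs out of values."""
    return (2**bits / ticks_per_second) / SECONDS_PER_YEAR


print(f"32 bits at 60 ticks/s: {years_of_range(32, 60):.2f} years")
print(f"32 bits at 1 tick/s:   {years_of_range(32, 1):.1f} years")
print(f"31 bits at 1 tick/s:   {years_of_range(31, 1):.1f} years")
```

The last line is the positive half of a signed 32-bit counter: the 136 years are split into roughly 68 before the epoch and 68 after it.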
Then they needed a zero. They picked 1970-01-01 because:
- It was recent enough that the historical calendar mess (Julian vs. Gregorian, the dropped days in 1582, the year that started in March) was someone else’s problem.
- It was round.
- It predated every Unix system anyone cared to represent.
- It was conveniently close to UTC’s formalization a couple of years later.
That’s it. There’s no cosmological significance to 1970-01-01. It’s not aligned with any astronomical event. It’s the timestamp equivalent of git init. We’ll start counting from here, and we’ll figure the rest out later.
The “later” turned out to mean everywhere.
The thing that isn’t there: leap seconds
The computer’s time problem mostly comes from UTC.
Unix time is defined as the number of seconds since the Unix epoch. You might reasonably assume that if I have two timestamps, the difference between them is the actual number of physical seconds that elapsed between those two moments.
It is not.
Unix time does not count leap seconds. Since 1972, the IERS has inserted 27 leap seconds into UTC, extra seconds added to keep civil time aligned with Earth’s slowing rotation. Unix time pretends they never happened. The Unix clock has, over its 56-year lifetime, “lost” almost half a minute relative to reality.
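You can see the missing second directly. The sketch below takes the two UTC instants on either side of the leap second inserted at the end of 2016 and converts both to Unix time with the standard library.

```python
import calendar

# A leap second (23:59:60 UTC) was inserted at the end of 2016-12-31.
before = calendar.timegm((2016, 12, 31, 23, 59, 59))
after = calendar.timegm((2017, 1, 1, 0, 0, 0))

# Two physical seconds separate these instants, but Unix time counts one.
print(before, after, after - before)  # 1483228799 1483228800 1
```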
Even weirder: during the actual leap second, when UTC ticks 23:59:59 → 23:59:60 → 00:00:00, Unix time has to do something. POSIX doesn’t specify what. So implementations have invented three different answers:
- Repeat the second. The clock shows 23:59:59 for two real seconds and then jumps to 00:00:00. Two distinct physical moments share the same timestamp. File mtimes can collide, log entries can appear out of order.
- Insert the second. The clock briefly shows 23:59:60, which is a valid UTC string but breaks every parser that assumes seconds run 00–59. Linux kernels do this. Hilarity ensues at midnight.
- Smear it. Don’t insert the second at all. Slow every clock down by a tiny fraction over a 24-hour window so it absorbs the extra second smoothly. Google does this. Amazon does it. Facebook does it.
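The third option is the easiest to sketch. The version below assumes a purely linear smear across the 24-hour window; where the window sits relative to midnight, and whether the curve is linear at all, varies by operator, so treat the function as an illustration rather than any provider's implementation.

```python
SMEAR_WINDOW = 86_400  # seconds in the 24-hour smear window


def smear_correction(seconds_into_window, window=SMEAR_WINDOW):
    """Fraction of the leap second absorbed so far under a linear smear."""
    # Clamp so times before or after the window return 0.0 or 1.0.
    clamped = min(max(seconds_into_window, 0), window)
    return clamped / window


# Each smeared second is stretched by roughly this much.
print(f"{1 / SMEAR_WINDOW * 1e6:.1f} microseconds per second")  # 11.6

for hours in (0, 6, 12, 24):
    print(hours, smear_correction(hours * 3600))
```

Halfway through the window a smeared clock is half a second away from an unsmeared one, which is where the one-second disagreements between providers come from.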
So “Unix time” in 2026 means three subtly different things depending on whether your server is running stock Linux, smeared Google time, or one of the dozens of variants in between. Two timestamps from two providers may disagree by a second, and both are correct under their own definitions.
That’s what the spec authors call “implementation-defined behavior” and what the rest of us call “why distributed-system logs don’t line up.”
The number is also a string
Integers are easy for computers but humans expect a string. Unix time is the easiest timestamp format to compare, sort, and store because it’s an integer, but as soon as we convert to human-readable format, all that changes.
To find out which of two timestamps is earlier, subtract. To sort a million events, sort the integers. To store one efficiently, write 8 bytes. To send one over the network, send 8 bytes.
Compare this to a full ISO 8601 timestamp like 2026-06-04T16:47:23.512847+00:00. That’s a 32-character string that needs to be parsed, validated, normalized for timezone, and converted to a comparable representation before you can do anything with it. Every comparison is a parsing pass. Every storage is 4× the bytes. Every sort is a string sort with calendar rules.
Unix time is fast. It’s so fast that even systems designed to move beyond it (Google’s Spanner, Segment’s KSUIDs, Twitter’s Snowflake) embed Unix-like second or millisecond counts at their core and just append entropy bytes around them.
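The integer-versus-string difference shows up even in a toy comparison. In this sketch the second ISO string is a hypothetical value I made up: the same instant as the first, written with a +02:00 offset.

```python
import struct
from datetime import datetime

a_unix, b_unix = 1780620532, 1780620533
a_iso = "2026-06-04T16:47:23.512847+00:00"
b_iso = "2026-06-04T18:47:23.512847+02:00"  # same instant, different offset

# Integers: comparison is a subtraction, storage is a fixed 8 bytes.
print(b_unix - a_unix)                    # 1
print(len(struct.pack(">q", a_unix)))     # 8

# Strings: 32 characters each, and they must be parsed before comparing,
# because equal instants can have different text.
print(len(a_iso))                         # 32
print(a_iso == b_iso)                     # False
print(datetime.fromisoformat(a_iso) == datetime.fromisoformat(b_iso))  # True
```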
The ubiquity isn’t an accident. It’s the natural result of picking the representation that’s cheapest at every step.
The Untimes
Unix time is a convention that has eaten the world.
It’s anchored to UTC, which means it inherits UTC’s quirks. It’s embedded in controllers in cars, industrial equipment, network gear, satellite firmware, and gas pumps: pretty much every piece of modern infrastructure.
1,780,620,532 is just a number, a timestamp. It’s used by your bank for transactions and by your file system for its files, but it’s also a hack. A 56-year-old dart in the board of time that ignores leap seconds, depends on UTC, has three different definitions during the same physical second, and we built the entire internet on top of it.
Tomorrow will be on what happens when the bill comes due. Y2K and Y2038, the bug that didn’t end the world, and the bug that still might.
Sources
- Unix time — Wikipedia
- Leap second — Wikipedia
- Coordinated Universal Time — Wikipedia
- International Earth Rotation and Reference Systems Service — Wikipedia
- Leap Smear — Google Developers
- Look Before You Leap — The Coming Leap Second and AWS
- It’s time to leave the leap second in the past — Engineering at Meta
- How Precision Time Protocol handles leap seconds — Engineering at Meta
- Leap second bug cripples Linux servers at airlines, Reddit, LinkedIn — The Register
- Resolve Leap Second Issues in Red Hat Enterprise Linux
- History of Unix — Wikipedia
- Snowflake ID — Wikipedia
- ksuid — segmentio (GitHub)
-
Day 10: The Zero Point
Three epochs quietly run the world:
- The Unix epoch. Midnight, January 1, 1970. Almost every computer measures time as seconds since this instant.
- The GPS epoch. Midnight, January 6, 1980. Every GPS satellite, every navigation chip in every phone, measures time as seconds since this instant.
- The astronomical epoch (J2000.0). Noon, January 1, 2000, Terrestrial Time. Almost every star catalog, planetary orbit calculation, and space mission uses this instant.
Three different zeros. Three different conventions. None of them line up with anything you’d find on a calendar. Here is why.
You Can’t Have a Clock Without a Zero
A clock counts intervals. To tell you what time it is right now, it needs to know how many intervals have passed since something. The “since something” is the epoch: a fixed, agreed-upon instant from which all measurement runs.
Most timekeeping systems hide their epoch behind a calendar facade. “April 14, 2026” is meaningful to humans, but underneath, the computer is doing arithmetic on a single integer counted from a particular zero.
The calendar is the friendly mask.
The epoch is the actual machinery.
The Three Big Epochs
Unix Epoch: January 1, 1970, 00:00:00 UTC
Picked in the early 1970s by the engineers building Unix. They needed a zero point for the system’s internal time_t integer. 1970 was recent enough to feel current, far enough away to leave room for negative numbers (events before 1970), and round enough to remember.
I suspect they figured that if time is all relative, they could just pick an arbitrary instant and it wouldn’t matter.
It was an engineering choice, not an astronomical one, relative to an arbitrary point they decided on.
So let me say that again.
The Unix epoch has no relationship to any natural event. It is a convention that, through the pervasive nature of Unix, became the default for all modern computing.
GPS Epoch: January 6, 1980, 00:00:00 UTC
The GPS satellite constellation started broadcasting on January 6, 1980. The epoch was just the moment the system turned on.
Why January 6? Because that’s a Sunday, and GPS counts time in weeks, and its weeks start on Sunday.
The first GPS week is week zero.
GPS time has run continuously from that instant and has never had a leap second adjustment, so it is currently 18 seconds ahead of UTC, a gap that keeps growing.
But more on that in tomorrow’s post.
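In the meantime, here is a rough sketch of how a Unix timestamp maps onto GPS weeks. It assumes the 18-second offset applies, so it is only right for recent timestamps, and the function name is my own.

```python
from datetime import datetime, timezone

# Unix timestamp of the GPS epoch, 1980-01-06T00:00:00 UTC.
GPS_EPOCH_UNIX = int(datetime(1980, 1, 6, tzinfo=timezone.utc).timestamp())
GPS_MINUS_UTC = 18  # seconds; assumed constant, valid only for recent dates
SECONDS_PER_WEEK = 7 * 86_400


def unix_to_gps(unix_seconds):
    """Return (gps_week, seconds_into_week) for a recent Unix timestamp."""
    gps_seconds = unix_seconds - GPS_EPOCH_UNIX + GPS_MINUS_UTC
    return divmod(gps_seconds, SECONDS_PER_WEEK)


# An arbitrary instant in June 2026.
week, second_of_week = unix_to_gps(1780620532)
print(GPS_EPOCH_UNIX)        # 315964800
print(week, second_of_week)  # 2421 434950
```

A converter that handles older timestamps would need a table of leap-second dates instead of a single constant.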
J2000.0: January 1, 2000, 12:00:00 Terrestrial Time
This is the astronomers' epoch, and it’s the most carefully chosen of the three. Notice two things:
- It’s noon, not midnight.
- It’s in Terrestrial Time, not UTC.
Both choices have reasons.
Why noon? Astronomers observe at night. A “day” for an astronomer historically started at noon and ran through the following noon, so a single night’s observation session never straddled a date boundary.
If you started a date at midnight, half the stars you saw last night would log on one date and half on the next.
Annoying for astronomers, so they reduced their suffering by starting the day at noon.
The Julian Date system, introduced by Joseph Scaliger in 1583, runs from noon to noon for this reason.
Noon TT on January 1, 2000 was Julian Date 2,451,545.0 exactly, a perfectly round Julian-Date integer.
Why such a huge number?
Because Julian Dates count days from noon on January 1, 4713 BC, the start of Scaliger’s count.
He picked that year because three big calendar cycles (solar, lunar, and the Roman indiction) all aligned there, and because it sat well before any recorded astronomical observation, so every date in history would be a positive integer.
By noon on January 1, 2000, exactly 2,451,545 days had elapsed.
The “0” at the end of “J2000.0” is a flag for that round number, a clean integer in a counting system older than telescopes.
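The day count can be verified with the standard library. This sketch leans on Python's proleptic Gregorian ordinal and a single conversion constant; it ignores the distinction between time scales, so it is a calendar check rather than an astronomy-grade conversion.

```python
from datetime import datetime

# Offset between Python's proleptic Gregorian ordinal (day 1 is 0001-01-01)
# and the Julian Date at midnight at the start of that same day.
JD_MINUS_ORDINAL = 1_721_424.5


def julian_date(moment):
    """Julian Date of a naive datetime, treating it as a plain calendar reading."""
    day_fraction = (moment.hour * 3600 + moment.minute * 60 + moment.second) / 86_400
    return moment.toordinal() + JD_MINUS_ORDINAL + day_fraction


print(julian_date(datetime(2000, 1, 1, 12, 0, 0)))  # 2451545.0
print(julian_date(datetime(2000, 1, 1, 0, 0, 0)))   # 2451544.5
```

The half-day difference between the two results is the noon-to-noon convention at work.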
Why Terrestrial Time and not UTC? Because UTC has leap seconds and Terrestrial Time doesn’t.
TT is the smooth atomic timescale we built two days ago (TAI + 32.184 seconds). Anchor your epoch to UTC and every leap second shifts your historical observations sideways. Anchor it to TT and it stays put. That’s why the canonical zero is in TT.
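What the same instant reads on each scale follows from two offsets. A small sketch: the 32.184-second TT offset comes from the definition above, and the 32-second figure is the TAI minus UTC difference that was in effect on that date.

```python
from datetime import datetime, timedelta

TT_MINUS_TAI = timedelta(seconds=32.184)       # fixed by definition
TAI_MINUS_UTC_IN_2000 = timedelta(seconds=32)  # leap seconds in effect on that date

j2000_tt = datetime(2000, 1, 1, 12, 0, 0)
j2000_tai = j2000_tt - TT_MINUS_TAI
j2000_utc = j2000_tai - TAI_MINUS_UTC_IN_2000

for label, value in (("TT", j2000_tt), ("TAI", j2000_tai), ("UTC", j2000_utc)):
    print(f"{label:>3}: {value.isoformat(sep=' ', timespec='milliseconds')}")
```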
TAI: 2000-01-01 11:59:27.816
UTC: 2000-01-01 11:58:55.816
TT:  2000-01-01 12:00:00.000 ← this is J2000.0
Other Epochs Worth Knowing
A few more that show up in working systems:
- Modified Julian Date (MJD): November 17, 1858, midnight. Used in space-mission control because it drops the leading digits of a full Julian Date, saving bytes in old memory-constrained systems.
- TAI origin: January 1, 1958, midnight UT2. The instant the cesium-coordinated TAI scale started running.
- Year zero of the Gregorian calendar: there isn’t one. The calendar jumps from 1 BC to 1 AD with no year zero in between, breaking date arithmetic across the boundary and serving as a low-grade gotcha in historical software.
The Deep-Time Temptation
Some people, looking at this collection of arbitrary-feeling start points, ask why we don’t just pick something physically meaningful. The formation of the Earth, the formation of the solar system, the Big Bang.
The answer is precision.
We don’t know any of those instants to better than millions of years. Earth formed roughly 4.54 billion years ago, plus or minus 50 million. The solar system, 4.567 billion years ago, plus or minus 1 million. The Big Bang, 13.8 billion years ago, plus or minus 20 million.
A reference epoch that is uncertain to a million years isn’t a reference…
The astronomical zero needs to be knowable to the nanosecond, recoverable in the future from preserved records, and verifiable against real observations.
Of every candidate, J2000.0 is the best at all three.
Modern atomic clocks were running in 2000. Star positions on that day are catalogued.
The exact instant is recorded across thousands of observatories.
If civilization collapses and is rebuilt, J2000 is recoverable from physical artifacts. The formation of the Earth is not.
What the Epoch Is Doing
Pick your epoch and you pick what your system can and can’t represent.
- Unix time can’t go before 1970 without negative numbers, and there is the whole integer-overflow issue waiting in 2038 for 32-bit counters.
- GPS time started in 1980 and counts strictly forward. Nothing before is representable.
- J2000.0 sits at the present, so calculations naturally span backwards and forwards by tens of thousands of years with full precision.
The choice of epoch is often the most invisible design decision in a timekeeping system, but it shapes everything downstream.
Some of the strangest bugs in software history, Y2K, the 2038 problem, GPS week rollovers, trace back to picking a zero without thinking about the consequences.
Tomorrow we’ll see what happens when one of those choices has to deal with relativity, gravity, and the curvature of spacetime.
The Gee-Pee-Ess time, and the clocks that ship from the factory wrong on purpose.
Sources
- Unix time — Wikipedia
- GPS time — Wikipedia
- Epoch (astronomy) — Wikipedia
- Julian day — Wikipedia
- Terrestrial Time — Wikipedia
- Year 2038 problem — Wikipedia
-
A Dotfiles Manager That Snapshots Every Change
Managing dotfiles in 2026 is a solved problem in the same way that managing your own backups is a solved problem: there are five tools for it, all of them work, all of them require you to set up some plumbing first, and once you’re set up you still don’t have a great answer to “I just broke my shell config, get me back to yesterday.”
The conventional answer is some combination of: a git repo for your ~/.zshrc and friends, a symlink script (or stow, or chezmoi, or yadm), and the discipline to remember to commit after every change. The setup is a one-time hassle. The “wait, what did I change?” recovery story is not great. And if you want to sync across machines, you’ve now got opinions about remote repos, SSH keys on a fresh box, and which order things have to happen in.
I wanted something different: not a configuration framework, but a record of every change to the files I care about, in a place I can roll back from, with the lowest possible setup cost.
That’s what dfm is.
What It Does
dfm is a single static Go binary. You point it at the files you want to track (~/.zshrc, anything under ~/.config/, whatever), and every time one of them changes it takes a content-addressed snapshot. The snapshots live on disk in ~/.local/share/dotfiles/backups/. A small state database (SQLite locally, or libSQL via Turso if you want cross-machine sync) records which file maps to which snapshot at which point in time.
You can roll back. You can diff against an old snapshot. You can see when you last touched a file. And because every snapshot is content-addressed, you never re-store the same bytes twice: switching themes in ~/.zshrc ten times costs the size of two configs, not ten.
The other half is the backup story. dfm init walks you through cloning (or creating, via gh) a private GitHub repo that mirrors your tracked files plus their history. The point isn’t to make you adopt a new git workflow. It’s that pulling your config onto a fresh machine should be one command, and recovering from rm -rf should never have a “well, hopefully my last commit was recent” caveat.
Why Setup Is the Hard Part
The reason people don’t audit their dotfiles is the same reason people don’t back up their laptops: the setup is annoying, and the payoff is theoretical until it isn’t.
dfm init is a six-step interactive wizard. It detects a TURSO_DATABASE_URL env var if you’ve got one, offers sensible defaults for everything else, lets you opt in to tracking ~/.zshrc immediately, and writes a single config file with the right permissions. Re-run it on an existing config and it pre-fills every prompt with your current value, so the cost of changing your mind later is also low. The --yes flag accepts every default for scripted setup.
If that sounds boring, that’s the point. Boring is what makes a tool actually get used.
The AI Bit
There’s an optional AI integration. dfm suggest <file> asks a local AI CLI (Claude Code by default, configurable) to propose an improvement to one of your tracked files, returns the proposal as a unified diff, and stores it as a pending suggestion. dfm apply <id> reviews the diff and applies it, with a fresh snapshot first, so you can roll back if the suggestion turns out to be wrong.
I’m excited to try this feature out, because I’m sure there is something I’m doing wrong. “Look at my ~/.zshrc and tell me what I could clean up” is a useful feature that doesn’t require me to copy and paste anything or grant read or write access to my entire home directory.
Where to Get It
github.com/llbbl/dotfiles-manager. Pre-built binaries for darwin and linux on arm64/amd64. Current version, as of writing, is v1.4.0.
If you’ve been meaning to actually back up your dotfiles and the friction has stopped you, this is the post where I tell you the friction is solvable.
-
Your AI Coding Agent Can Read Every Secret on Your Machine
Every developer running an AI coding agent has handed that agent the keys to their machine. Not metaphorically. Literally. The agent runs as your user. It can read every file you can read, execute every command you can execute, and hit every API your stored credentials authorize.
For most workflows, that’s the point. You want the agent to read your code, modify your project, ship your work. But there’s a quieter implication: the agent can also read your .env files. It can invoke your secret-management tooling. It can grep for API_KEY= across your home directory. And nothing in the agent stack says “wait, you didn’t ask for this.”
Same-UID isolation isn’t isolation. It’s the absence of isolation labeled politely.
The usual answer to “keep secrets safe from your coding agent” is: don’t store them where the agent can find them. Use a cloud secret manager. Rotate aggressively. These are good practices, but for local development they’re often impractical. The agent is going to encounter secrets whether or not your security-best-practices doc approves.
So over the last week, I built an audit subsystem into lsm, my Local Secrets Manager. The whole thing is designed to answer one forensic question: did anything weird touch my secrets last night?
The Threat Model
A defense without a threat model is theater, so let me be specific.
The threat isn’t a sophisticated remote attacker. lsm is public, open-source code. The threat isn’t a buggy lsm either; bugs happen, and the user can read the source.
The threat is the agent layer running adjacent to lsm. Coding agents have legitimate access to a wide swath of your filesystem. They’re imperfect at intent inference. They sometimes get prompt-injected. They sometimes run in the background while you’re asleep. When an agent calls lsm get prod DATABASE_URL, the action is indistinguishable from you doing the same thing. The audit log’s job is to make those calls retrospectively distinguishable.
A secondary threat is an agent covering its tracks. If something reads a secret and then edits the audit log to erase the evidence, the log is worse than useless.
What Got Built
The audit subsystem records every access as a structured event: a sequence number, a timestamp, the action, the app and environment, an Actor block describing the calling process, and two cryptographic fields linking each event to the previous one.
The Actor block was the interesting design problem. It captures parent process ID, parent process name, TTY device path (or empty if there’s no terminal), current working directory, an agent marker derived from environment variables that tools like Claude Code, Cursor, Aider, and Continue set, and the calling user ID. Every field is captured every time. No omitempty. UID zero is a real, meaningful value, and silently dropping it would be a footgun.
Events land in a hash-chained JSONL file at ~/.lsm/audit.jsonl. Each row carries the SHA-256 of the previous row plus its own body. If anyone edits, inserts, or deletes a row in the middle, the next row’s prev no longer matches and lsm audit verify surfaces the break.
The chain doesn’t catch tail truncation. If you chop off the end of the file, what’s left is internally consistent. A sidecar file storing the last expected hash is the obvious fix, and I deliberately rejected it. lsm is public code. Any local attacker who knows about the sidecar can rewrite both files in lockstep. Tail-truncation detection is deferred to the off-machine path: when events ship to a remote stack, the last hash naturally lives somewhere the local attacker doesn’t control.
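To make the chaining and the truncation blind spot concrete, here is a minimal in-memory sketch. It is not lsm’s code: the row layout (seq, prev, body, hash), the function names, and the way the body is serialized are all assumptions made for illustration. Only the idea of hashing the previous row’s hash together with the current body comes from the description above.

```javascript
const crypto = require("node:crypto");

// Hash of one row: SHA-256 over the previous row's hash plus this row's body.
// The serialization (prev, newline, JSON body) is an assumption of this sketch.
function rowHash(prev, body) {
  return crypto
    .createHash("sha256")
    .update(prev + "\n" + JSON.stringify(body))
    .digest("hex");
}

// Append an event to an in-memory chain and return the new row.
function appendEvent(chain, body) {
  const prev = chain.length ? chain[chain.length - 1].hash : "";
  const row = { seq: chain.length + 1, prev, body, hash: rowHash(prev, body) };
  chain.push(row);
  return row;
}

// Walk the chain; return the seq of the first row that fails, or null if intact.
function verifyChain(chain) {
  let expectedPrev = "";
  for (const row of chain) {
    const linkOk = row.prev === expectedPrev;
    const bodyOk = row.hash === rowHash(row.prev, row.body);
    if (!linkOk || !bodyOk) return row.seq;
    expectedPrev = row.hash;
  }
  return null;
}

const chain = [];
appendEvent(chain, { action: "get", app: "demo", env: "prod" });
appendEvent(chain, { action: "set", app: "demo", env: "dev" });
appendEvent(chain, { action: "get", app: "demo", env: "dev" });
console.log(verifyChain(chain)); // null: the chain is intact

// Edit a middle row and recompute its own hash to hide the edit.
chain[1].body.env = "prod";
chain[1].hash = rowHash(chain[1].prev, chain[1].body);
console.log(verifyChain(chain)); // 3: the next row's prev no longer matches

// Tail truncation leaves a shorter chain that still verifies.
console.log(verifyChain(chain.slice(0, 1))); // null
```

The last line is the gap described above: nothing inside the file can tell a truncated chain from a short one, which is why the check has to rest on a hash stored somewhere else.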
Reading the Log
Three commands cover the read side. lsm audit tail does what you’d expect. lsm audit show <seq> prints a single event. lsm audit query is the workhorse, with every field as a filterable dimension: --app, --env, --event, --parent-comm, --agent-marker, --tty present|absent, --since, --until. Output is JSONL when piped and columnar text when interactive.
Then there’s lsm audit suspicious, which runs four hard-coded detectors in one pass:
- Outside hours. Events whose timestamps fall outside 07:00–23:00. The 3 a.m. canary.
- Burst. More than N events from a single parent process within a sliding window. The runaway-agent canary.
- New parent_comm. Process names not seen in the prior 30 days. The “what is this new thing” canary.
- Non-interactive, no agent. No TTY, no recognized agent marker. The “what is even running this” canary.
A single event can stack reasons. A 3 a.m. burst from an unknown parent is unambiguously interesting.
The detector doesn’t learn baselines, doesn’t call out to an ML model, doesn’t require a service. High-signal patterns are obvious patterns, and obvious patterns are well-served by hard-coded predicates.
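The four detectors above are simple enough to sketch as plain predicates. This is an illustration rather than lsm’s implementation: the event shape ({ ts, parentComm, tty, agentMarker }), the burst limit and window, and the use of UTC hours are all assumptions, and history is taken to mean the events recorded before the one being checked.

```javascript
// Illustrative thresholds; the real values of N and the window are not stated.
const BURST_LIMIT = 5;
const BURST_WINDOW_MS = 60 * 1000;
const NEW_PARENT_LOOKBACK_MS = 30 * 24 * 60 * 60 * 1000;

// Outside 07:00-23:00. UTC is used here only to keep the sketch deterministic.
function outsideHours(event) {
  const hour = new Date(event.ts).getUTCHours();
  return hour < 7 || hour >= 23;
}

// More than BURST_LIMIT events (including this one) from the same parent
// process inside the sliding window.
function isBurst(event, history) {
  const recent = history.filter(
    (e) =>
      e.parentComm === event.parentComm &&
      e.ts <= event.ts &&
      event.ts - e.ts <= BURST_WINDOW_MS
  );
  return recent.length + 1 > BURST_LIMIT;
}

// Parent process name not seen in the prior 30 days.
function isNewParent(event, history) {
  return !history.some(
    (e) =>
      e.parentComm === event.parentComm &&
      e.ts < event.ts &&
      event.ts - e.ts <= NEW_PARENT_LOOKBACK_MS
  );
}

// No TTY and no recognized agent marker.
function nonInteractiveNoAgent(event) {
  return !event.tty && !event.agentMarker;
}

// Collect every reason that applies, so one event can stack reasons.
function suspiciousReasons(event, history) {
  const reasons = [];
  if (outsideHours(event)) reasons.push("outside_hours");
  if (isBurst(event, history)) reasons.push("burst");
  if (isNewParent(event, history)) reasons.push("new_parent_comm");
  if (nonInteractiveNoAgent(event)) reasons.push("non_interactive_no_agent");
  return reasons;
}

const threeAm = Date.UTC(2026, 0, 10, 3, 0, 0);
const fiveDays = 5 * 24 * 60 * 60 * 1000;
const history = [
  { ts: threeAm - fiveDays, parentComm: "zsh", tty: "/dev/ttys001", agentMarker: "" },
];
const event = { ts: threeAm, parentComm: "node", tty: "", agentMarker: "" };
console.log(suspiciousReasons(event, history));
// [ 'outside_hours', 'new_parent_comm', 'non_interactive_no_agent' ]
```

Each predicate is independent of the others, which is what lets a single event collect several reasons at once.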
Shipping Events Off the Box
If you already run an observability stack, lsm can ship audit events over OTLP (the OpenTelemetry wire protocol). Three design choices matter here.
The local file sink is always authoritative. The remote sink is a mirror, not a replacement. An lsm operation never fails because the remote endpoint is down.
Redaction is allowlist-based. App and environment names are HMAC-hashed with a per-host salt before becoming labels. The TTY device path is dropped and replaced with a tty_present: true/false boolean. Secret values, cwd, hash, prev, and the schema version never leave the host. Secret names are replaced with key_present: true markers; the remote observer can see that a key was accessed, never which key.
Events whose name starts with audit. (chain failures, suspicious matches, sink drops) are always local. Telling a remote attacker that local integrity has been compromised is counterproductive.
What’s Still Open
The most important non-feature: no command in lsm emits events yet. set doesn’t log. get doesn’t log. delete doesn’t log. The plumbing is complete, the calls are not wired in. Each emit site needs careful thought about which fields are appropriate, whether the event should be local-only, and how it interacts with sensitive operations. That’s the next chunk of work.
The agent-coding era is normalizing a model where AI tools have wide-ranging access to developer machines. The premise that the agent operates as a fully-trusted local user is unlikely to change soon. Managing the risk means visibility. It means being able to answer “what touched my secrets last night” with a record the agent couldn’t silently rewrite.
The code is at github.com/llbbl/lsm. The full design lives in docs/observability.md.
-
Five Modern JavaScript Features That Make the Old Patterns Look Silly
I’ve been doing some reading on what JavaScript has been picking up over the last few releases, and the current batch is unusually good. Cleaner resource management, real Set math, lazy iterators, and a couple of small ergonomic wins that retire some genuinely tedious patterns. So this post is my attempt at summarizing five of the more interesting ones, what they replace, and where each one stands on browser support. I hope it helps if you’re trying to figure out what’s actually shipping versus what’s still a proposal.
1. Explicit Resource Management with using
If you’ve written C# or Python, this will feel familiar. The using keyword (and its async sibling await using) ensures a resource is cleaned up the moment the variable goes out of scope, even if your code throws. Under the hood it looks for Symbol.dispose or Symbol.asyncDispose on the object.
The old way meant remembering to wrap everything in try/finally:
    async function fetchUser() {
      const db = new DatabaseConnection();
      await db.connect();
      try {
        return await db.query('SELECT * FROM users WHERE id = 1');
      } finally {
        await db.close();
      }
    }
The new way:
    async function fetchUser() {
      // DatabaseConnection must implement [Symbol.asyncDispose]() for this to work
      await using db = new DatabaseConnection();
      await db.connect();
      return await db.query('SELECT * FROM users WHERE id = 1');
    }
No finally, no forgetting to close the connection on the error path. The cleanup is guaranteed.
Browser/runtime support: Chrome 134+, Firefox 141+, Node 24+. Safari is still pending.
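If your runtime doesn’t parse using yet, the protocol underneath can still be exercised by hand. The sketch below is a simplified picture of what using arranges for a single synchronous resource, not the full spec behavior (it ignores multiple resources and error suppression). TempResource and withResource are hypothetical names, and the Symbol.for fallback is only there so the example runs on runtimes without Symbol.dispose.

```javascript
// Symbol.dispose is missing on older runtimes; fall back to a registered
// symbol so this sketch still runs there.
const dispose = Symbol.dispose ?? Symbol.for("Symbol.dispose");

// A hypothetical resource that records its own lifecycle.
class TempResource {
  constructor(log) {
    this.log = log;
    this.log.push("open");
  }
  [dispose]() {
    this.log.push("closed");
  }
}

// Roughly what `using res = ...` does for one resource: run the body,
// then call the dispose method even if the body throws.
function withResource(resource, body) {
  try {
    return body(resource);
  } finally {
    resource[dispose]();
  }
}

const log = [];
try {
  withResource(new TempResource(log), () => {
    log.push("work");
    throw new Error("boom");
  });
} catch (err) {
  log.push("caught " + err.message);
}
console.log(log); // [ 'open', 'work', 'closed', 'caught boom' ]
```

The resource is closed before the error reaches the caller, which is the same ordering the old try/finally version gives you.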
2. New Set Methods
For years, JavaScript’s Set was basically a deduplicated array with a fancy name. If you wanted actual set math, you were converting to arrays and looping. Now the operations are built in and run at engine speed.
    const userRoles = new Set(['read', 'write', 'comment']);
    const adminRoles = new Set(['read', 'write', 'delete', 'ban', 'comment']);
    userRoles.intersection(adminRoles); // shared roles
    adminRoles.difference(userRoles);   // what admin has that user doesn't
    userRoles.union(adminRoles);        // everything, deduped
    userRoles.isSubsetOf(adminRoles);   // true
That’s it. That’s the whole job. No more new Set([...a].filter(x => b.has(x))) incantations. The full method set also includes symmetricDifference, isSupersetOf, and isDisjointFrom.
These are part of ES2025 and have reached Baseline. Available in Chrome 122+, Safari 17+, and Firefox 127+.
3. Iterator Helpers
Until now, .map() and .filter() only worked on arrays, and arrays load everything into memory. If you’re streaming a 50GB log file through a generator, calling Array.from() on it will introduce you to your operating system’s OOM killer.
Iterator helpers bring those same methods to iterators, operating lazily, one item at a time.
The old way:
    function* infiniteNumbers() {
      let i = 1;
      while (true) yield i++;
    }
    const evens = [];
    for (const num of infiniteNumbers()) {
      if (num % 2 === 0) {
        evens.push(num);
        if (evens.length === 3) break;
      }
    }
The new way:
    const result = infiniteNumbers()
      .filter(n => n % 2 === 0)
      .take(3)
      .toArray(); // [2, 4, 6]
It only computes what take(3) needs. You can chain on an infinite sequence and it just works.
These are part of ES2025 and are available in Chrome 122+, Firefox 131+, and Safari 18.4+.
4. Map Upsert
The naming bounced around (early proposals called it emplace, then upsert), but the final landing is getOrInsert(key, default) and getOrInsertComputed(key, callback). The idea is simple: stop doing the three-step “check, default, fetch” dance every time you group data.
The old way:
    // words is an array of strings defined elsewhere
    const wordMap = new Map();
    for (const word of words) {
      const key = word[0];
      if (!wordMap.has(key)) {
        wordMap.set(key, []);
      }
      wordMap.get(key).push(word);
    }
The new way:
    const wordMap = new Map();
    for (const word of words) {
      wordMap.getOrInsert(word[0], []).push(word);
    }
This is the kind of thing every codebase has a groupBy helper for. The proposal reached Stage 4 in January 2026, so it’s officially in the spec, but engine implementations are still in progress as of this writing. Worth knowing about, not yet safe to ship without a polyfill.
5. Import Attributes
As ES Modules took over, importing JSON natively became a real need. The catch is that just letting import pull in a .json file is a security problem. If the server quietly serves JavaScript instead of JSON, the engine would happily execute it as code.
Import attributes fix that by making you declare the type explicitly. If the file isn’t what you said it was, the engine refuses.
    import config from './config.json' with { type: 'json' };
    console.log(config.databaseHost);
No more fs.readFileSync for config, no more require hacks in otherwise-modern codebases. Just an import that’s safe by default.
If you’ve seen the assert { type: 'json' } form in older articles, that was an earlier syntax that shipped in some engines and was later replaced. The current keyword is with. Available in Chrome, Edge, Firefox, and Safari since April 2025, plus Node and Deno.
The Through-Line
What stands out across these five is that each one retires a pattern that’s been written into JavaScript codebases millions of times. The try/finally cleanup. The custom groupBy helper. The Lodash imports for set operations. The for loop with a manual counter because there was no .take() on generators. The fs.readFileSync for loading a config file in an otherwise-modern ESM project.
The language is quietly absorbing the utility belt, and the code that’s left looks a lot more like what we meant to write in the first place. Sign me up.
Sources
- Explicit Resource Management — V8
- JavaScript Set methods reach Baseline — web.dev
- Iterator helpers — V8
- Map.prototype.getOrInsert — MDN
- Import attributes — MDN
-
JavaScript Finally Gets a Real Date API
When has working with dates ever been easy? Every language has its own version of the same headaches: time zones, parsing, leap years, arithmetic that does weird things at month boundaries. JavaScript just had some quirks layered on top, extra cruft left over from when the language was first created. Third party libraries like Moment.js or date-fns had to fill in the gaps.
Those days are over. Now we have the Temporal API.
Why Date Was Broken
Let me give you the short version of why Date is the way it is. It was inspired by Java’s java.util.Date from the 90s, which Java itself eventually deprecated. JavaScript inherited the design and never let it go.
The problems are well-known at this point:
Mutability. Pass a Date to a function and that function can change it underneath you.
    const myDate = new Date('2023-01-01');
    function addDays(date, days) {
      date.setDate(date.getDate() + days); // mutates the caller's object
      return date;
    }
    addDays(myDate, 5);
    console.log(myDate.toISOString().slice(0, 10)); // '2023-01-06'. Surprise, your original is gone.
Time zone confusion. Date stores milliseconds since the Unix epoch but formats itself in the user’s local time zone. Working in any other zone means reaching for moment-timezone or date-fns-tz.
Parsing roulette. new Date("2023-01-01") and new Date("Jan 1, 2023") can return different things depending on the browser and the assumed time zone.
Math that lies. Adding a month to January 31st? In a non-leap year, Date rolls it forward to March 3rd because February doesn’t have 31 days. That’s not a bug exactly. It’s just Date being honest that it doesn’t really understand calendars.
What Temporal Actually Fixes
Temporal is a new global object designed from the ground up to address all of this. The design choices are worth walking through because they’re opinionated in the right ways.
Everything is immutable
Every operation returns a new object. Your original data stays put.
    const start = Temporal.PlainDate.from('2023-01-01');
    const end = start.add({ days: 5 });
    console.log(start.toString()); // '2023-01-01'
    console.log(end.toString()); // '2023-01-06'
Different types for different concepts
This is the part I find most interesting. Date tries to be everything at once: a timestamp, a calendar date, a wall-clock time. Temporal splits these into distinct types and forces you to pick:
- Temporal.PlainDate: a calendar date, no time, no zone. Birthdays, anniversaries.
- Temporal.PlainTime: a wall-clock time, no date.
- Temporal.PlainDateTime: date and time, no zone.
- Temporal.ZonedDateTime: fully zone-aware and calendar-aware. The one for global apps.
- Temporal.Instant: an exact point in time, like epoch milliseconds.
- Temporal.Duration: a length of time.
Making you pick the right type up front is the whole game. Half the bugs in date code come from pretending a Date is one thing when it’s actually another.
Time zones and calendars built in
Temporal natively understands IANA time zones (America/New_York, Europe/Paris) and non-Gregorian calendars (Hebrew, Islamic, Japanese). No external library needed.
Math that respects the calendar
    const t = Temporal.PlainDate.from('2023-01-31');
    const nextMonth = t.add({ months: 1 });
    console.log(nextMonth.toString()); // '2023-02-28'
It clamps to the end of the month instead of rolling over. That’s almost always what you actually wanted.
Comparisons and Diffs
A couple of quick ones, because these are the operations you do constantly.
Comparing two dates:
    const t1 = Temporal.PlainDate.from('2023-01-01');
    const t2 = Temporal.PlainDate.from('2023-01-01');
    console.log(Temporal.PlainDate.compare(t1, t2) === 0); // true
No more getTime() dance to compare primitives. There’s an actual comparison function.
Finding the difference:
    const start = Temporal.PlainDate.from('2023-01-01');
    const end = Temporal.PlainDate.from('2023-12-31');
    const diff = start.until(end, { largestUnit: 'days' });
    console.log(diff.days); // 364
No more dividing milliseconds by 1000 * 60 * 60 * 24 and hoping DST doesn’t mess you up.
Should You Use It Yet?
Check your runtime. Browser and Node support has been landing, but you’ll want to verify Temporal is available where you’re shipping, or use the official polyfill while you wait.
For most date and time work, this replaces Moment.js and date-fns entirely. Moment has been in maintenance mode for years. Temporal gives you the good parts of those libraries as a standard, immutable, well-typed API.
Date will stick around forever for backwards compatibility. But for new code, use Temporal. The API is better, the semantics are saner, and the resulting code is less bug-prone.
-
SAST vs AI PR Review: Two Tools, Different Jobs
If you have worked in DevSecOps, you might be wondering if AI pull request review tools are going to replace traditional SAST scanners. Short answer: no. Longer answer: they’re solving different problems, and if you’re picking one over the other, you might be making a mistake.
Here is how I think about it.
SAST is the Compliance Gatekeeper
Static Application Security Testing tools (think Semgrep, SonarQube, Checkmarx, Fortify) parse your source code, usually into an Abstract Syntax Tree, and hunt for known vulnerability patterns. They don’t run the code. They just read it and pattern-match against rules.
The focus here is security, compliance, and strict rule enforcement. SAST is the automated gatekeeper that makes sure your code clears the OWASP Top 10 bar before it merges.
What SAST does well:
- It’s deterministic. If a rule matches a pattern, the engine flags it every single time. Run it twice on the same code, get the same result.
- It satisfies auditors. Frameworks like PCI-DSS, SOC 2, and HIPAA expect documented secure-development practices, and a formal SAST scanner is the easiest way to produce that evidence. AI agents don’t count here, at least not yet.
- It can do real taint analysis. Enterprise tools can track untrusted input from the moment it enters your app to the moment it hits a dangerous sink.
Where SAST falls down:
- The false positive rate is brutal. Rigid rules with no context mean a lot of noise. Developer fatigue is real, and once your team starts ignoring scanner output, you’ve lost the game.
- It can’t see your business logic. A SAST tool has no idea what your application is supposed to do, so it can’t tell you when the logic itself is broken.
- Comprehensive scans are slow. Hours on large codebases isn’t unusual, though Semgrep has been doing good work on this front.
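To make the “deterministic pattern-matching” point concrete, here is a toy sketch. It is deliberately crude: it runs regular expressions over source lines instead of walking a real AST, and both rules (their ids, patterns, and messages) are invented for illustration. Real scanners are far more precise, but the shape is the same: fixed rules in, identical findings out, and no notion of what the code is for.

```javascript
// Two hypothetical rules; ids and patterns are made up for this sketch.
const RULES = [
  { id: "no-eval", pattern: /\beval\s*\(/, message: "Avoid eval()" },
  {
    id: "sql-concat",
    pattern: /query\s*\(\s*["'`].*["'`]\s*\+/,
    message: "SQL built by string concatenation",
  },
];

// Scan source text line by line and return every rule match.
function scan(source) {
  const findings = [];
  source.split("\n").forEach((line, index) => {
    for (const rule of RULES) {
      if (rule.pattern.test(line)) {
        findings.push({ rule: rule.id, line: index + 1, message: rule.message });
      }
    }
  });
  return findings;
}

const sample = [
  'const rows = db.query("SELECT * FROM users WHERE id = " + userId);',
  "const total = price * quantity; // a negative quantity is a logic bug no rule sees",
  "eval(userInput);",
].join("\n");

const first = scan(sample);
const second = scan(sample);
console.log(first.map((f) => `${f.line}:${f.rule}`)); // [ '1:sql-concat', '3:no-eval' ]
console.log(JSON.stringify(first) === JSON.stringify(second)); // true: same input, same findings
```

Line 2 is the business-logic blind spot in miniature: nothing about it matches a rule, so the scanner stays silent.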
AI PR Agents are the Peer Reviewer
Tools like CodeRabbit, Qodo, Greptile, GitHub Copilot Code Review, Cursor Bugbot, and Claude Code (set up as a review skill) plug into your version control and read the PR diff with the surrounding code context. They behave less like a scanner and more like a colleague who actually read your changes.
The focus is developer productivity, code quality, logic bugs, and contextual feedback.
What they do well:
- They understand intent. LLMs can reason about why the code is changing, not just whether it matches a rule. That’s a different category of feedback.
- The signal-to-noise ratio is good. When an AI flags something, it usually comes with an explanation that makes sense. Less noise, more useful comments.
- They suggest fixes. Not just “this is wrong” but “here’s a diff you can apply.” That’s huge for actually closing the loop on review feedback.
- The scope is broader. Architecture, performance, style, security, all in one pass.
Where they fall down:
- They’re non-deterministic. Same vulnerability, two PRs, two different outcomes. That’s not a bug, that’s how LLMs work, and it’s why auditors don’t trust them.
- They don’t satisfy compliance. No auditor is going to accept “the AI looked at it” as a substitute for a formal scanner.
- Hallucinations happen. Invented issues, misread intent, suggestions that refactor things that didn’t need refactoring. You still need a human filtering the output.
The Quick Comparison
| Feature | SAST | AI PR Review |
| --- | --- | --- |
| Primary Goal | Security & Compliance | Code Quality & Productivity |
| Analysis Method | Deterministic rules & AST | Non-deterministic LLMs |
| Business Logic | Blind | Context-aware |
| False Positives | Often high | Usually low |
| Compliance Proof | Accepted as evidence | Not accepted |
| Feedback Loop | Dashboard / CI output | PR comments / chat |
The Lines Are Starting to Blur
The interesting thing happening right now is convergence from both directions.
On the SAST side, tools like DryRun Security are pitching themselves as “AI-native SAST,” trying to keep the deterministic backbone while using LLMs to filter out the false positives that make traditional scanners painful to live with.
On the AI agent side, CodeRabbit and Greptile keep getting better at catching real security vulnerabilities, not just style issues. They’re slowly creeping into territory that used to belong exclusively to SAST.
This is going somewhere, but it’s not there yet.
Where to Start Your Evaluation
Treat them as complementary, not competitive.
For SAST, evaluate against your audit footprint, the languages in your codebase, and how much false-positive triage your team can absorb. Semgrep, SonarQube, Checkmarx, and Fortify all sit in different price-and-friction zones, and the right one depends on what your business actually needs to prove.
For AI PR review, evaluate based on how it fits your existing review workflow, what languages and frameworks it understands well, and the signal-to-noise ratio in practice on your codebase. CodeRabbit, Qodo, Greptile, Copilot Code Review, Bugbot, and a Claude Code review skill all approach the problem differently.
If you pick one category and skip the other, you’re either passing compliance with mediocre code review, or getting great review feedback while failing your next audit. Neither is a win.
The AI tools aren’t replacing SAST. They’re filling in the gap SAST was never designed to cover.
-
pgvector vs Pinecone: You Probably Don't Need a Separate Vector Database
Every time someone starts building a RAG pipeline, the same question will come up: do I need a “real” vector database like Pinecone, or can I just use pgvector with the Postgres I already have?
I can imagine teams agonizing over this decision for weeks. So maybe this will save you some time?
The Case for Staying Put
If you already have a PostgreSQL instance in your stack, adding pgvector is almost always the right first move.
You manage one stateful service instead of two. Your existing backup strategy, monitoring, and security all stay the same. Your vector embeddings live next to your metadata, so you get ACID compliance and standard SQL joins. No syncing between two data stores. No eventual consistency headaches.
Performance? From what I found, for datasets under a few million vectors, pgvector with HNSW indexes is fast. Really fast. It satisfies the latency requirements of most applications without breaking a sweat.
And you’re not paying for another SaaS subscription…
When Pinecone Actually Makes Sense
Pinecone is a purpose-built vector database designed for high-dimensional data at massive scale. It’s serverless and fully managed.
If you’re dealing with hundreds of millions or billions of vectors, a specialized engine handles memory and disk I/O for similarity searches more efficiently than Postgres can. Pinecone also gives you native namespace support, metadata filtering optimized for vector search, and live index updates that are faster than re-indexing a large Postgres table.
Those are real advantages. At a certain scale.
The Decision Is Simpler Than You Think
Stay with Postgres + pgvector if:
- You want to minimize infra sprawl and moving parts
- Your vector dataset is under 5 to 10 million records
- You rely on relational joins between vectors and other business data
- You have existing observability and DBA expertise for Postgres
Consider Pinecone if:
- Your Postgres instance needs massive, expensive vertical scaling just to keep the vector index in memory
- You don’t want to tune HNSW parameters, mmap settings, or vacuuming schedules for large vector tables
- You need sub-millisecond similarity search at a scale where Postgres starts to struggle
That is what I would use to make that decision.
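One rough way to see where a dataset sits relative to those thresholds is to estimate the raw size of the embeddings. The sketch below uses pgvector’s documented storage cost for its vector type (4 bytes per dimension plus 8 bytes per vector). It ignores HNSW index overhead, row headers, and metadata columns, so treat the result as a lower bound. The 1536-dimension figure and the three dataset sizes are only examples.

```javascript
// Raw storage for pgvector's `vector` type: 4 bytes per dimension + 8 bytes.
// Index structures and table overhead come on top of this.
function rawVectorBytes(vectorCount, dimensions) {
  return vectorCount * (4 * dimensions + 8);
}

function toGiB(bytes) {
  return bytes / 1024 ** 3;
}

// Example dataset sizes; 1536 dimensions is an assumed embedding width.
for (const count of [500000, 5000000, 500000000]) {
  const gib = toGiB(rawVectorBytes(count, 1536));
  console.log(`${count} vectors: about ${gib.toFixed(1)} GiB of raw embeddings`);
}
```

With these assumptions, half a million vectors is a few GiB, five million is a few tens of GiB, and five hundred million runs into the terabytes, which is where keeping an index in memory on a single Postgres box starts to hurt.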
Most teams are probably nowhere near the scale where Pinecone becomes necessary. They have a few hundred thousand vectors, maybe a million or two. Postgres handles that without flinching. Adding a separate managed vector database at that point is just adding operational complexity for no measurable benefit.
The trap is thinking you need to “plan ahead” for scale you don’t have yet. You can always migrate later if you actually hit the ceiling. Moving from pgvector to Pinecone is a well-documented path. But moving from two services back to one because you overengineered your stack? That’s a conversation nobody wants to have.
Start with what you have. Add complexity when the numbers force you to, not when a vendor’s marketing page makes you nervous.
-
LangChain and LLM Routers, the Short Version
LangChain is important to know and understand in the age of agents, and so is LLM routing. They’re related but they’re not the same thing, and the distinction matters.
So let’s break it down.
LangChain is the Plumbing
Out of the box, an LLM is a text-in, text-out engine. It only knows what it was trained on. That’s it. LangChain is an open-source framework that connects that engine to the outside world.
It gives you standardized tools to build pipelines:
- Models: Interfaces for talking to different LLMs (Gemini, Claude, OpenAI, whatever you’re using)
- Prompts: Templates for dynamically constructing instructions based on user input
- Memory: Letting the LLM remember past turns in a conversation
- Retrieval (RAG): Connecting the LLM to external databases, PDFs, or the internet so it can answer questions about your data
- Agents & Tools: Letting the LLM actually do things, like execute code, run a SQL query, or send an email
You could wire all of this up yourself, but LangChain gives you the standard pieces so you’re not reinventing the plumbing every time.
LLM Routers are the Traffic Controller
A router is an architectural pattern you build on top of that plumbing. Instead of sending every request through the same prompt to the same massive model, a router evaluates the request and directs it to the right destination. Simple concept, big impact.
Three reasons you’d want one:
- Cost: You don’t need a giant, expensive model to answer “Hello!” or look up a basic fact. Send simple queries to a smaller, cheaper model. Save the heavy model for complex reasoning.
- Specialization: Maybe you have one prompt for writing code and another for searching a company HR manual. The router makes sure the query hits the right expert system.
- Speed: Smaller models and direct database lookups are faster. Routing makes your whole application more responsive.
How Routing Actually Works
In LangChain, there are two main approaches:
Logical Routing uses a fast LLM to read the user’s prompt and categorize it. You tell the router LLM something like: “If the user asks about math, output MATH. If they ask about history, output HISTORY.” LangChain then branches to a specialized chain based on that output.
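Stripped of any framework, logical routing is a classify-then-branch step. The sketch below is plain JavaScript rather than LangChain code, and every name in it is hypothetical. In particular, classify is a keyword stand-in for the call to a small, fast LLM, used only so the example runs offline.

```javascript
// Stand-in for the router LLM. A real router would send a prompt such as
// "If the user asks about math, output MATH..." to a small model.
async function classify(question) {
  const text = question.toLowerCase();
  if (/\d|sum|multiply|equation/.test(text)) return "MATH";
  if (/war|century|empire|president/.test(text)) return "HISTORY";
  return "GENERAL";
}

// Hypothetical specialized chains, one per label.
const chains = {
  MATH: async (q) => `[math chain] ${q}`,
  HISTORY: async (q) => `[history chain] ${q}`,
  GENERAL: async (q) => `[general chain] ${q}`,
};

// Classify first, then branch to the matching chain.
async function route(question) {
  const label = await classify(question);
  const chain = chains[label] ?? chains.GENERAL;
  return chain(question);
}

route("Who led the Roman Empire in the first century?").then(console.log);
// [history chain] Who led the Roman Empire in the first century?
```

The fallback to GENERAL matters in practice, since a real model can return a label you didn’t plan for.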
Semantic Routing skips the LLM entirely for the routing decision. It converts the user’s text into a vector (an array of numbers representing the meaning of the text) and compares it to predefined routes to find the closest match. This is significantly faster and cheaper than asking an LLM to make the call.
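The same idea can be sketched for semantic routing. This is an illustration only, not LangChain’s API: embed is a toy word-count vector standing in for a real embedding model, and the vocabulary, route names, and example phrases are all made up. The part that carries over is the mechanism: embed the query, compare it to each route’s vector, take the closest.

```javascript
// Toy embedding: counts of a few vocabulary words. A real router would call
// an embedding model; this stand-in only keeps the sketch runnable.
const VOCAB = ["code", "function", "bug", "vacation", "policy", "benefits"];

function embed(text) {
  const words = text.toLowerCase().match(/[a-z]+/g) ?? [];
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Predefined routes, each described by example text that is embedded once.
const routes = [
  { name: "coding", vector: embed("fix a bug in this function code") },
  { name: "hr", vector: embed("vacation policy and benefits") },
];

// Pick the route whose vector is closest to the query; no LLM call involved.
function semanticRoute(query) {
  const q = embed(query);
  let best = { name: "fallback", score: 0 };
  for (const r of routes) {
    const score = cosine(q, r.vector);
    if (score > best.score) best = { name: r.name, score };
  }
  return best.name;
}

console.log(semanticRoute("How many vacation days does the policy allow?")); // hr
console.log(semanticRoute("Why does this function have a bug?")); // coding
console.log(semanticRoute("Hello!")); // fallback
```

Because the route vectors are computed once up front, each request costs one embedding and a handful of dot products, which is where the speed and cost advantage over an LLM call comes from.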
LangChain provides RunnableBranch in LCEL (LangChain Expression Language, their declarative syntax for chaining components) for this, basically if/then/else logic for your AI pipelines. Worth digging into if you’re building with LangChain.
Routing is what makes AI applications practical at scale. LangChain is one way to build it. They’re complementary, not interchangeable.
-
What I Learned Building My First Chrome Extension
I built a Chrome extension to navigate Letterboxd movie lists with keyboard shortcuts. Rate, like, watch, next. Here’s what I learned.
The Idea
I was going through Letterboxd lists and wanted a better way: “Top 250 Films,” curated genre lists, friends' recommendations. The flow becomes tedious: click a movie, rate it, go back to the list, find where you were, click the next one. I wanted to load a list into a queue and step through movies one by one with keyboard shortcuts.
So I did what any reasonable person would do and built a Chrome extension for it.
The Framework Graveyard
The Chrome extension ecosystem has a framework problem. CRXJS, the most popular Vite plugin for extensions, was being archived. Its successor, vite-plugin-web-extension, was deprecated in favor of WXT. WXT is solid but it’s another abstraction layer that could go the same way.
I went with plain Vite and manual configuration. Four separate Vite configs, one per entry point (content script, background worker, popup, manager page). A simple build script that runs them sequentially and copies the manifest. No framework dependency that could die on me.
For the UI I used React and TypeScript. Not because the extension needed React, most of the work is content scripts and background messaging, but the popup and settings page benefit from component structure.
Four Separate Worlds
One thing I learned: a Chrome extension isn’t one app. It’s four separate JavaScript contexts that can’t directly share state:
- Content scripts run on the webpage (letterboxd.com). They can read and modify the DOM but can’t access chrome.tabs or other extension APIs.
- Background service worker runs independently. It handles messaging, storage, and tab navigation. It can die at any time and restart.
- Popup is a tiny React app that opens and closes with the extension icon. It loses all state when closed.
- Extension page (the manager) is a full tab running your own HTML. It persists as long as the tab is open.
They communicate through `chrome.runtime.sendMessage` and `chrome.storage.local`. This is an important architectural challenge you need to be aware of. If you’ve never built an extension before, it could trip you up.
Letterboxd’s DOM Is a Moving Target
The existing open-source Letterboxd Shortcuts extension uses selectors like `.ajax-click-action.-like` to click the like button. Those selectors don’t exist anymore. Letterboxd has migrated to React components, and the sidebar buttons (watch, like, watchlist, rate) are loaded asynchronously via CSI (Client Side Includes). They’re not in the initial HTML at all.
I had to inspect the actual loaded DOM to find the current selectors: `.watch-link`, `.like-link`, `a.action.-watchlist`. The rating widget still uses the old `.rateit` pattern with a `data-rate-action` attribute and CSRF token POST.
If you’re building an extension that interacts with a third-party site’s DOM, expect the selectors to break. Build your DOM interaction layer as a thin, isolated module so you can update selectors without touching the rest of the codebase.
Service Workers Can’t Use DOMParser
My list scraper used `DOMParser` to parse HTML responses. Works fine in tests (jsdom), works fine in content scripts (browser context), fails completely in the background service worker. Service workers don’t have access to DOM APIs.
I rewrote the parser to use regex. Less elegant but it works everywhere. If I were doing it again, I’d run the parsing in a content script and message the results back to the background worker.
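To make the regex approach concrete, here’s a rough sketch, written in Python to keep it short rather than in the extension’s TypeScript. The sample markup and the `/film/...` link pattern are assumptions for illustration. Letterboxd’s real list HTML may look different, and this is not the extension’s actual parser.

```python
import re

# Hypothetical markup; the real list HTML may differ.
SAMPLE_HTML = """
<li class="poster-container"><a href="/film/seven-samurai/">Seven Samurai</a></li>
<li class="poster-container"><a href="/film/harakiri/">Harakiri</a></li>
<li class="poster-container"><a href="/film/seven-samurai/">Seven Samurai</a></li>
"""

# Match href values that look like film pages: /film/<slug>/
FILM_LINK = re.compile(r'href="(/film/[^"/]+/)"')


def extract_film_paths(html: str) -> list[str]:
    # Collect film paths in document order, skipping duplicates.
    seen = set()
    paths = []
    for path in FILM_LINK.findall(html):
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths
```

The trade-off is the usual one with regex over HTML: it needs no DOM, so it runs anywhere, but it only understands the one pattern you wrote it for.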
The Build System Is Simpler Than You Think
I expected the multi-entry-point build to be painful. It wasn’t. Each Vite config is about 20 lines. Content script and background worker build as IIFE (single file, no imports). Popup and manager build as standard React apps. The build script is 30 lines of `execFileSync` calls.
One gotcha: asset paths. Vite defaults to absolute paths (`/assets/index.js`), but extension popups and pages need relative paths (`./assets/index.js`). Adding `base: './'` to the popup and manager configs fixed it.
TDD Was Worth It (For the Right Parts)
The extension has four pure logic modules: rating double-tap behavior, auto-advance detection, queue state operations, and keyboard shortcut matching. These are the core of the extension and they’re completely testable without a browser.
Writing tests first caught edge cases I wouldn’t have thought of. What happens when you press the same rating key on a movie that was already rated in a previous session? What if the queue is empty and someone hits “next”? The tests document these decisions.
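As an illustration of the kind of pure logic that tests well, here’s a small sketch of queue state with the empty-queue case handled. It’s in Python rather than the extension’s TypeScript, and `MovieQueue` and its behavior (returning `None` when there is nothing to advance to) are my assumptions for the example, not the extension’s actual module.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MovieQueue:
    # Hypothetical queue state: a list of movie slugs and a cursor.
    items: list[str] = field(default_factory=list)
    index: int = 0

    def current(self) -> Optional[str]:
        # The movie under the cursor, or None if the queue is empty.
        if 0 <= self.index < len(self.items):
            return self.items[self.index]
        return None

    def next(self) -> Optional[str]:
        # Advance one step. On an empty queue or at the last item,
        # stay put and return None instead of raising.
        if self.index + 1 < len(self.items):
            self.index += 1
            return self.items[self.index]
        return None
```

Because nothing here touches the DOM or a browser API, the edge cases can be pinned down with plain unit tests.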
For the DOM interaction code (the Letterboxd API layer, overlays, CSI-loaded content), unit testing isn’t practical. I tested those manually.
What I’d Do Differently … or might change
Start with the DOM. I built the pure logic first and the DOM interaction last. This meant I didn’t discover the CSI loading issue, the changed selectors, or the DOMParser problem until the end. Next time I’d build a minimal content script first, verify it can interact with the target site, then build the logic on top.
Use fewer Vite configs. Four config files with duplicated path aliases are annoying. A single config with a build mode flag, or a shared config factory function, would be cleaner.
Consider the popup lifecycle earlier. Popups close when you click outside them. Any state they hold is gone. I designed around this (the popup is stateless, it queries the background on every open), but it’s easy to get wrong if you don’t plan for it.
The Result
The extension loads any Letterboxd list into a queue, navigates through movies one by one, and lets me rate/like/watch/watchlist with single keystrokes. Auto-advance moves to the next movie when I’ve completed my actions. A dark-themed manager page shows the full queue and lets me customize every shortcut.
It’s a personal tool right now, so it’s not published to the Chrome Web Store. But it has made going through movie lists pretty cool. Sometimes the best software is the kind you build for yourself!
-
NotebookLM Is Just RAG With a Nice UI
I’ve been watching AI YouTubers recommend NotebookLM integrations that involve authenticating your Claude instance with some random skill they built. “Download my thing, hook it up, trust me bro.” No details on how it works under the hood. No mention of why piping your credentials through someone else’s code might be a terrible idea. Let’s just gloss over that, I guess.
So here we are. Let me explain what NotebookLM actually is, because once you understand RAG, the magic disappears pretty quickly.
What Is RAG?
RAG stands for Retrieval Augmented Generation. It’s an AI framework that improves LLM accuracy by retrieving data from trusted sources before generating a response.
The LLM provides the reasoning and token generation. RAG provides specific, trusted context. Combining the two gives you general reasoning grounded in your actual data instead of whatever the model memorized or hallucinated from its training set.
The core pipeline looks like this:
1. Take your trusted data (docs, PDFs, YouTube transcripts, whatever)
2. Chunk it into pieces
3. Create vector embeddings from those chunks
4. Store the vectors in a database
5. When you ask a question, embed the question into the same vector space
6. Find the most similar chunks
7. Feed those chunks into the LLM as context alongside your question
That’s it. That’s NotebookLM. Steps 1 through 6 are the retrieval half. Step 7 is where the LLM synthesizes an answer. The nice UI on top doesn’t change what’s happening underneath.
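To show how little is going on, here’s a toy version of the whole pipeline in plain Python. It’s a sketch under loud assumptions: `embed` uses word counts instead of a real embedding model, the “database” is a Python list, and step 7 only builds the prompt text rather than calling an LLM. None of this is NotebookLM’s actual code.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    # Step 2: split a document into chunks of `size` words.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    # Steps 3 and 5: toy embedding (word counts), a stand-in for a real model.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs: list[str]) -> list[tuple[str, Counter]]:
    # Steps 1 to 4: chunk every document and store (chunk, vector) pairs.
    return [(piece, embed(piece)) for doc in docs for piece in chunk(doc)]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    # Steps 5 and 6: embed the question and return the k most similar chunks.
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [piece for piece, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    # Step 7: the text you would hand to an LLM (no model is called here).
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Swap `embed` for a real embedding model and the list for a vector database and the shape of the code stays the same.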
I Accidentally Built Half of It?
I was interested in the semantic embeddings portion of this pipeline and ended up building something I called Semantic Docs. It handles the retrieval half, steps 1 through 6.
You point it at a knowledge base, internal company docs, research papers, whatever you’re interested in. It chunks the content, creates vector embeddings, and stores them in a database. When you search, it creates a new embedding from your query, finds the most similar chunks, and returns those as search results.
The difference between Semantic Docs and NotebookLM is that last step. Semantic Docs gives you the relevant files and passages. It says “here’s where the answers live, go read it.” It doesn’t pipe everything through an LLM to generate a synthesized response. This is a deliberate choice, not a missing feature.
Why No Official API Is a Problem
NotebookLM doesn’t have an official API. People have reverse-engineered how it works, which means every integration you see is built on undocumented behavior that could break at any time. The AI YouTubers recommending these workflows are essentially saying “trust this unofficial thing with your data and credentials.” That should make you uncomfortable.
If you understand RAG, you can build the parts you actually need. The retrieval half is genuinely useful on its own, and you control the whole pipeline. No third-party authentication. No undocumented APIs. No wondering what happens to your data.
I’ll probably write more about RAG in the future. It’s a good topic and there’s a lot of noise to cut through. For now, just know that the next time someone tells you NotebookLM is magic, it’s really just vector search with a chat interface on top.
-
The Human's Guide to the Command Line: From Script to System
This is Part 3 of The Human’s Guide to the Command Line. If you missed Part 1 or Part 2, go check those out first; you’ll need the setup from both.
In Part 2, we built a note-taking app. It works. But running it looks like this:

    uv run ~/code/note/note.py add "call the dentist"

That’s… not great. Real command-line tools don’t make you remember where a Python file lives. You just type the name and it works. By the end of this post, you’ll be able to type:

    note add "call the dentist"

From anywhere on your computer. Let’s make that happen.
Why Can’t I Just Type `note`?
When you type a command like `ls` or `brew`, your shell doesn’t search your entire computer for it. It checks a specific list of directories, one by one, looking for a program with that name. That list is called your PATH.
You can see yours right now:

    echo $PATH

You’ll get a long string of directories separated by colons. Something like `/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin`. When you type `brew`, the shell checks each of those directories until it finds a match.
Right now, `note.py` is sitting in `~/code/note/`. That folder isn’t in your PATH, so the shell has no idea it exists. We need to turn our script into something installable, and put it somewhere the shell knows to look.
Restructuring Your Project
Python has specific expectations about how a project needs to be organized before it can be installed as a tool. Right now your project looks like this:

    note/
      pyproject.toml
      note.py

We need to reorganize it into a package. Here’s what we’re aiming for:

    note/
      pyproject.toml
      src/
        note/
          __init__.py
          cli.py

Let’s do it step by step. Open Ghostty and navigate to your project:

    cd ~/code/note

Create the new directories

    mkdir -p src/note

The `-p` flag means “create parent directories too.” This creates both `src/` and `note/` inside it in one shot.
Move your script

    mv note.py src/note/cli.py

Your note-taking code now lives at `src/note/cli.py` instead of `note.py`.
Create the package marker

    touch src/note/__init__.py

This creates an empty file. Python uses `__init__.py` to recognize a folder as a package. It can be completely empty; it just needs to exist.
Update pyproject.toml
Open your project in Zed:

    zed .

Find `pyproject.toml` in the sidebar and open it. You need to add two things. First, tell uv to treat your project as an installable package by adding this section:

    [tool.uv]
    package = true

Then add this section to define your command:

    [project.scripts]
    note = "note.cli:main"

That line is doing something specific: it’s saying “when someone types `note`, find the `note` package, go into the `cli` module, and call the `main` function.” That’s the same `main()` function at the bottom of the code we wrote in Part 2.
Your full `pyproject.toml` should look something like this (the exact version numbers may differ):

    [project]
    name = "note"
    version = "0.1.0"
    description = "A simple command-line note taker"
    requires-python = ">=3.12"
    dependencies = []

    [tool.uv]
    package = true

    [project.scripts]
    note = "note.cli:main"

Verify the structure
Run this in Ghostty to make sure everything looks right:

    ls -R src/

You should see:

    src/note:
    __init__.py  cli.py

Installing It For Real
Here’s the moment. From your project directory (`~/code/note`), run:

    uv tool install .

That `.` means “install the project in the current folder”, the same dot from Part 2 when we ran `zed .` to mean “open this folder.”
What just happened? uv created an isolated environment for your tool and dropped a link to it in `~/.local/bin/`. That directory is on your PATH (or should be), which means your shell can now find it.
Try it:

    note list

You should see your notes from Part 2 (or “No notes yet” if you cleared them). Try adding one:

    note add "I installed my first CLI tool"
    note list

That works from anywhere. Open a new terminal tab, navigate to your home directory, your Desktop, wherever… `note` just works now.
If you get “command not found”: Run `uv tool update-shell` and then restart your terminal. This adds `~/.local/bin` to your PATH in `.zshrc` so your shell knows where to find tools installed by uv.
One thing to know about editing
When you ran `uv tool install .`, uv copied your code into its own isolated environment. That means if you go back and edit `src/note/cli.py`, your changes won’t show up until you reinstall:

    uv tool install --force .

The `--force` flag tells uv to overwrite the existing installation. During development, you could also use `uv tool install -e .` (the `-e` is for “editable”) which keeps a live link to your source code so changes show up immediately without reinstalling.
You went from a Python script you had to run with `uv run note.py` to a real command that works from anywhere on your machine.
Welcome to your first CLI!
-
The revenue design company | Orb
Design, execute, and operate revenue with usage-based billing. Orb helps modern software companies adapt pricing as products, usage, and costs evolve.
-
Lisette is a New Rust-to-Go Language, So I Built It a Test Library
This morning I dove into a new programming language called Lisette. I saw it from @lmika and had to take a look. It gives you Rust-like syntax but compiles down to Go, and you can import from the Go standard library directly.
It’s early in development, so a lot of things don’t exist yet. They have a roadmap posted with plans for third-party package support, a test runner, bitwise operators, and configurable diagnostics.
So Naturally, I Built a Test Library
Anyone who reads my blog knows I care a lot about testing. So when I saw “implement a test runner” sitting on the roadmap, I did what any reasonable person would do on a Monday morning. I built a testing library for Lisette called LisUnit.
I wanted something that felt familiar if you’ve used Jest or PHPUnit. Test cases are closures that return a result, and assertions work the same way. Here’s what it looks like:

    lisunit.Suite.new("math")
      .case("add produces sum", || {
        lisunit.assert_eq_int(add(2, 3), 5)?
        Ok(())
      })
      .case("add is commutative", || {
        lisunit.assert_eq_int(add(2, 3), add(3, 2))?
        Ok(())
      })
      .run()

Define a suite, chain your test cases, run it.
Why Bother?
I don’t know exactly what direction the Lisette team is headed with their own test runner, so this is just a prototype. Building a test library turns out to be a fun way to try out a new language because you end up touching a lot of language constructs.
I’ll probably keep poking at it as Lisette evolves. Happy Monday.
-
Lisette — Rust syntax, Go runtime
Little language inspired by Rust that compiles to Go.
-
Hiding Poems Inside Images
I built a tool that hides poems inside images. Not as metadata, not as a watermark. The actual text of the poem drives the visual pattern, and you can reconstruct the poem perfectly from the image alone.
How It Works
You give it a poem. It analyzes the syllable count, rhyme scheme, and stress patterns. Then it generates a visual pattern where those poetic features drive the aesthetics: spiral width, dot placement, block size, line weight.
The text itself is encoded into the pattern in a variety of ways. The pattern isn’t just inspired by the poem, it IS the poem. Run the decoder on the image and you get back the original text, character for character, including whitespace and punctuation.
Seven Renderers, Two Approaches
There are seven different visual styles, split into two categories.
The steganographic renderers (geometric, concentric, waveform) hide text invisibly in pixel color channels using LSB encoding. The visual pattern is purely decorative. This is a well-known technique, nothing new there.
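For anyone who hasn’t seen LSB encoding before, here’s a minimal sketch of the general technique in Python. It works on a flat `bytes` buffer standing in for pixel channel values, and the 4-byte length header is my own assumption so the decoder knows where to stop. It is not this tool’s actual format.

```python
def embed_lsb(pixels: bytes, message: str) -> bytes:
    # Prefix the UTF-8 message with a 4-byte big-endian length (assumed format).
    data = message.encode("utf-8")
    payload = len(data).to_bytes(4, "big") + data
    # Flatten the payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Overwrite only the least significant bit of each channel value.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(pixels: bytes) -> str:
    def read_bytes(start_bit: int, count: int) -> bytes:
        # Rebuild `count` bytes from consecutive least significant bits.
        result = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + k] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length).decode("utf-8")
```

Each channel value changes by at most 1, which is why the hidden text is invisible to the eye.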
I wanted to build something different, so I focused on visual encoding patterns. The encoding is the art, and the art is the encoding. Everything about how the image is constructed follows repeatable algorithms so it can be decoded back:
- Nautilus draws a golden spiral where line width carries the data
- Fibonacci uses a sunflower phyllotaxis dot pattern
- Mosaic creates an adaptive block grid
- Dotline connects dots with varying line weight
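As a rough idea of what a sunflower phyllotaxis layout looks like in code, here’s a sketch of the standard golden-angle dot placement in Python. It only computes dot positions. How the tool maps characters onto dots is not described here, so nothing below should be read as its encoding scheme, and `spacing` is a made-up parameter.

```python
import math

# The golden angle in radians (about 137.5 degrees).
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))


def sunflower_points(count: int, spacing: float = 1.0) -> list[tuple[float, float]]:
    # Dot i sits at radius spacing * sqrt(i), rotated i golden angles
    # from the first dot, which gives the familiar sunflower spiral.
    points = []
    for i in range(count):
        radius = spacing * math.sqrt(i)
        angle = i * GOLDEN_ANGLE
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points
```

Because the positions are fully determined by the index, a decoder can regenerate the same layout and know exactly where to look for each dot.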
Each has different capacity. The nautilus spiral can hold about 2,200 characters, enough for a full Whitman poem. A fibonacci pattern holds about 1,600. Even the smallest renderer handles a haiku easily.
Right now I’m just having fun building different visuals, images that encode and decode. Eventually I might build an API around it, but for now it’s a side project…
Here’s a Basho haiku:

-
The Human's Guide to the Command Line: Your First CLI App
This is Part 2 of The Human’s Guide to the Command Line. If you missed Part 1, go check that out first — we got Homebrew and Ghostty set up, which you’ll need for everything here.
Now we’re going to do something that feels like a big leap: we’re going to write a real command-line application. A small one, but a real one. Before we get there, though, we need two things — a programming language and a code editor.
Picking a Language
You can write CLI apps in a lot of languages. JavaScript, Go, Rust — they all work. But if you’re new to programming, I think Python is the right starting point. It reads almost like English, it’s everywhere, and you won’t spend your first hour fighting a compiler.
Python is what we’ll use for this series. That said, Python is notoriously tricky to set up locally; there are version managers, virtual environments, and a whole ecosystem of packaging tools that can make your head spin.
First things first, we need a code editor (an IDE).
A Code Editor: Zed
Before we write any code, you need somewhere to write it. We’re going to install Zed — it’s fast, clean, and won’t overwhelm you with buttons.

    brew install --cask zed

Open Zed from your Applications folder once it finishes.
Install the `zed` Command
Here’s the move that ties everything together. Open Zed’s command palette with Cmd + Shift + P, type `cli install`, and hit Enter.
Now you can open any folder in Zed directly from Ghostty:

    zed .          # open the current folder
    zed note.py    # open a specific file

That `.` means “right here”. You’ll see it everywhere in the terminal and it always means the same thing: the folder you’re currently in.
Set Up Your Project
Time to create a home for your code. Back in Ghostty:

    mkdir ~/code
    mkdir ~/code/note
    cd ~/code/note

Three commands. You just created a `code` folder in your home directory, created a `note` folder inside it, and stepped into it. Now open it in Zed:

    zed .

Zed opens with your empty project folder in the sidebar on the left. This is where we’ll create our script.
Building a Note Taker
We’re going to build a tiny app called `note`. It does three things: add a note, list your notes, and clear them all. That’s it. No database, no accounts, no cloud. Just you and a JSON file.
Create your script in Zed’s sidebar by clicking the New File icon, name it `note.py`, and paste in this code:

    import argparse
    import json
    import sys
    from datetime import datetime
    from pathlib import Path

    # Where your notes live: ~/notes.json
    NOTES_FILE = Path.home() / "notes.json"


    def load_notes():
        if not NOTES_FILE.exists():
            return []
        with open(NOTES_FILE) as f:
            return json.load(f)


    def save_notes(notes):
        with open(NOTES_FILE, "w") as f:
            json.dump(notes, f, indent=2)


    def cmd_add(args):
        notes = load_notes()
        note = {
            "text": args.text,
            "added": datetime.now().strftime("%Y-%m-%d %H:%M"),
        }
        notes.append(note)
        save_notes(notes)
        print(f"Added: {args.text}")


    def cmd_list(args):
        notes = load_notes()
        if not notes:
            print('No notes yet. Add one with: note add "your note"')
            return
        for i, note in enumerate(notes, start=1):
            print(f"{i}. {note['text']} ({note['added']})")


    def cmd_clear(args):
        answer = input("Delete all notes? This can't be undone. (y/n): ")
        if answer.lower() == "y":
            save_notes([])
            print("All notes cleared.")
        else:
            print("Cancelled.")


    def main():
        parser = argparse.ArgumentParser(
            prog="note",
            description="A simple command-line note taker.",
        )
        subparsers = parser.add_subparsers(dest="command")

        # note add "some text"
        add_parser = subparsers.add_parser("add", help="Add a new note")
        add_parser.add_argument("text", help="The note to save")

        # note list
        subparsers.add_parser("list", help="Show all notes")

        # note clear
        subparsers.add_parser("clear", help="Delete all notes")

        args = parser.parse_args()

        if args.command == "add":
            cmd_add(args)
        elif args.command == "list":
            cmd_list(args)
        elif args.command == "clear":
            cmd_clear(args)
        else:
            parser.print_help()


    if __name__ == "__main__":
        main()

Part 3: Python Without the Mess (uv)
Your Mac already has Python on it, but don’t use it. That version belongs to macOS, which uses it internally, and if you start installing things into it you can cause yourself real headaches. We’re going to use our own Python that’s completely separate.
The tool for this is uv. It manages Python for you and keeps everything isolated so you never touch the system version.

    brew install uv

That’s the whole install. Verify it worked:

    uv --version

You should see a version number printed back. If you do, you’re good.
Create Your Project
Inside your `~/code/note` folder, run:

    uv init

This sets up a proper Python project in the current folder. You’ll see a few new files appear in Zed’s sidebar, but don’t worry. If you see a `hello.py`, which uv creates as a starter file, you can delete it.
Running Your Script
Once you have code in `note.py`, run it like this:

    uv run note.py add "call the dentist"
    uv run note.py list
    uv run note.py clear

`uv run` handles everything: it picks the right Python version, keeps it sandboxed to this project, and runs your script. You never type `python3` directly, you never activate a virtual environment, you never install packages globally. It just works.
Try it now:

    uv run note.py list

If you see `No notes yet. Add one with: note add "your note"` (or a list of the notes you just added), everything is working.
Try this: Once your script is running, try `uv run note.py --help`. You’ll get a clean description of every command, automatically. That’s one of the things `argparse` gives you for free.
-
The Human's Guide to the Command Line (macOS Edition)
If you’ve ever stared at a terminal window and felt like you were looking at the cockpit of an airplane, this post is for you. The command line doesn’t have to be scary, and I’m going to walk you through setting up a genuinely great terminal experience on your Mac, from scratch.
This is Part 1 of a Series. I will update the links here as I publish the other articles.
The Mental Model
Before we type anything, here’s the thing you need to understand: the terminal is just a text-based version of Finder.
When you see a folder in Finder, you double-click to open it. In the terminal, you type `cd` (change directory) to enter it. You can run applications from Finder, and you can run applications from the terminal. It’s just a different way to do things you already understand.
Once that clicks, everything else falls into place.
Part 1: The Foundation (Homebrew)
On a Mac, you install apps from the App Store. On the command line, we use Homebrew — it’s basically the App Store for the terminal. It’s a package manager that makes installing tools painless.
Let’s start by opening the terminal. Hit Command + Space, type “Terminal”, and hit Enter. Then paste this:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

It’ll ask for your Mac password. When you type it, nothing will appear on the screen. This is normal: it’s a security feature, not a bug. Just type your password and hit Enter.
Homebrew needs admin access because it creates system-level folders (usually `/opt/homebrew`) and changes their ownership from the system account to your user account.
Once it finishes, verify it’s working:

    which brew

If that doesn’t return a path, run these two lines to add Homebrew to your shell:

    echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
    eval "$(/opt/homebrew/bin/brew shellenv)"

Part 2: A Better Terminal (Ghostty)
Now that we have Homebrew, the first thing we’re going to install is a better terminal. The built-in Terminal app works, but Ghostty is faster, more configurable, and just nicer to use.

    brew install --cask ghostty

Once it finishes, find Ghostty in your Applications folder and open it. You can close the old Terminal app forever. Congratulations, you’ve graduated.
Part 3: Making It Beautiful (Nerd Fonts & Starship)
Standard fonts don’t have icons for folders, git branches, or cloud status. We need a Nerd Font so the terminal can speak in pictures.
Install the Font

    brew install --cask font-fira-code-nerd-font

Then go to Ghostty’s settings (`Cmd + ,`), find the Text section, and set your font to `FiraCode Nerd Font`.
Install Starship
Starship is the paint job that turns a boring `$` prompt into something colorful and actually helpful. It shows you what folder you’re in, what git branch you’re on, and more…

    brew install starship

While we’re at it, let’s install two plugins that make typing in the terminal way more pleasant. One whispers suggestions based on your command history, and the other color-codes your commands so you can spot typos before you hit Enter.

    brew install zsh-autosuggestions zsh-syntax-highlighting

Wire It All Up
We need to tell your Mac to turn these features on every time you open a terminal. Copy and paste this entire block into Ghostty:

    # Add Starship
    grep -qq 'starship init zsh' ~/.zshrc || echo 'eval "$(starship init zsh)"' >> ~/.zshrc
    # Add Auto-suggestions
    grep -qq 'zsh-autosuggestions.zsh' ~/.zshrc || echo "source $(brew --prefix)/share/zsh-autosuggestions/zsh-autosuggestions.zsh" >> ~/.zshrc
    # Add Syntax Highlighting
    grep -qq 'zsh-syntax-highlighting.zsh' ~/.zshrc || echo "source $(brew --prefix)/share/zsh-syntax-highlighting/zsh-syntax-highlighting.zsh" >> ~/.zshrc
    # Refresh your terminal
    source ~/.zshrc

Each line checks if the setting already exists before adding it, so it’s safe to run more than once. When it finishes, your prompt should look completely different. Welcome to your CLI future.
Part 4: Basic Survival Commands
Now that your terminal looks proper, here’s how you actually move around.
| Command | Human Translation |
| --- | --- |
| `pwd` | “Where am I right now?” |
| `ls` | “Show me everything in this folder.” |
| `cd [folder]` | “Go inside this folder.” |
| `cd ..` | “Go back one folder.” |
| `clear` | “Wipe the screen and start fresh.” |
| `touch [file.txt]` | “Create a new blank file.” |

That’s about 90% of what you need to navigate around.
Two Golden Rules
1. Tab is your best friend. Type `cd Doc` and hit Tab, and the terminal will finish the word `Documents` for you. This works for files, folders, and even commands. If there are multiple matches, hit Tab twice to see all the options.
2. Ctrl + C is the panic button. If the terminal is doing something you didn’t expect, or a command is running and won’t stop, hold Ctrl and press C. It’s the emergency brake, and it’s always there for you.
A good next step: try navigating to your Desktop using only `cd` and `ls`. Find a file there, maybe create one with `touch`, and then look at it with `ls`. Once you can do that comfortably, you’ve officially mastered the basics.
-
I Benchmarked JSON Parsing in Bun, Node, Rust, and Go
I’m just going to start posting about JSON every day. Well ok, maybe not every day, but for the next few days at least. Later this week I’ve committed to writing a guide on getting started with CLIs for non-programmers, so stay tuned for that.
This morning I benchmarked JSON parsing across four runtimes: Bun, Node, Rust, and Go.
The Results
- Bun is the overall winner on large files — 307-354 MB/s, beating even Rust’s serde_json for untyped parsing
- Rust wins on small/nested data (225 MB/s small, 327 MB/s nested) due to low overhead
- Node is close behind Bun — V8’s JSON.parse is very optimized
- Go is ~3x slower than the JS runtimes on large payloads (encoding/json is notoriously slow)
- Memory: Bun reports 0 delta (likely GC reclaims before measurement), Rust’s tracking allocator shows the true heap cost (73-96MB), Go uses 52-65MB
Rust’s numbers were the most honest here since the tracking allocator catches everything. We should take Bun’s result with a grain of salt because benchmarking memory in GC’d languages is tricky.
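If you want to try this kind of measurement yourself, here’s a small sketch of the idea in Python using only the standard library. It is not the harness behind the numbers above. The payload shape, the helper names, and the best-of-N timing are all my assumptions for the example.

```python
import json
import time
import tracemalloc


def make_payload(n_items: int) -> bytes:
    # Build a synthetic JSON document (hypothetical shape) as UTF-8 bytes.
    items = [{"id": i, "name": f"item-{i}", "tags": ["a", "b", "c"]} for i in range(n_items)]
    return json.dumps(items).encode("utf-8")


def parse_throughput_mb_s(payload: bytes, runs: int = 5) -> float:
    # Time several parses and report throughput for the fastest run.
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        json.loads(payload)
        best = min(best, time.perf_counter() - start)
    if best == 0:
        return float("inf")
    return (len(payload) / 1_000_000) / best


def parse_peak_memory_mb(payload: bytes) -> float:
    # Track Python-level allocations while the parsed result is still alive.
    tracemalloc.start()
    parsed = json.loads(payload)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del parsed
    return peak / 1_000_000
```

Holding a reference to the parsed result while reading the peak matters. If the collector reclaims it first, you can end up with a near-zero reading that says more about timing than about the parser.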
The JSON parser in V8 that Node uses is the exact same one that ships in Chrome.
Here’s the full test results if you want to dig into the numbers yourself.
More JSON content coming soon. You’ve been warned.