Claude Code may accidentally give you a spinning fire Dorito

In January of 2024, I published a free coding curriculum called Internet Menace. It covered Python, SQL, Docker, Git, and the basic architectural vocabulary of cloud computing. I wanted people to understand things like why you select a certain type of database, and what a container actually is versus what people imagine a container does. It’s the sort of stuff that I think my brain takes for granted, simply because I have done this for so long now that it’s muscle memory. But if you’re new to it, it sounds like you’re reciting Harry Potter spells. This is especially true for AWS, where everything is named something absolutely nonsensical. To this day, I feel like AWS marketing people wake up every morning and think to themselves, “What crazy shit can we think up that will fuck with our customers? Let’s call the next service, ‘AWS Deez’ and have everyone just saying ‘deez nuts’ in the office all day.”

I digress…

I gave away Internet Menace for free because I believe that the fundamentals matter, and I still believe this. Heck, I believe it now more than I did two years ago, which is convenient because two years ago the prevailing wisdom was that coding was going to become this vestigial skill that you only did 30 years ago, like being able to write in cursive or the Dallas Cowboys winning Super Bowls. That’s not the way things worked at all. The argument two years ago was that large language models were going to write all the code, and the rest of us would be free to learn watercolors or something, I guess?

The last part about machines writing code has definitely not happened. What has actually happened is… weird. In my opinion, the “weirdness” is the reason that I need to address how machine-generated coding actually works.

I had a conversation this week with my co-worker Alex. We were going through markdown templates in our internal Git repo. (That’s a crazy sentence to write. Five years ago, that sentence didn’t exist, because you couldn’t prompt your way into a working application.) Somehow, this is extremely normal today. Alex said something that was profound, and I wanted to highlight why I think he was right. He said that I have “allowed Claude Code to become my hands.”

This is exactly how I’m using machine-generated code. I want to provide some clarity on what I mean by “becoming my hands” — because it sounds like the kind of thing that you say before you open an incognito tab. There’s a machine for that, but it’s not Claude Code. It’s called a Fleshlight, and no, you should not Google that from your work computer. If you don’t know… wait until you get home.

Machine-generated code is misunderstood, especially in the engineering and development industry. I think that a lot of people are getting it wrong.

Some people have called this process “specification coding.” I don’t think that’s accurate, and I also don’t believe that specifications are enough to generate reliable applications. Specifications can be open-ended. Specifications are also annoyingly absent in specificity when it comes to real engineering decisions.

I often come back to cars as an analogue for software, because it’s something I connect with personally, and I think it’s also something we have enough first-hand experience with to understand how writing a specification is often not enough. So if you will please, allow me this analogy, even if you don’t follow it directly…

If I told you that I wanted a specification for a two-seat sports car that made over 500 wheel horsepower using an internal combustion engine, and had a six-speed manual transmission, leather bucket seats and a host of other sports car-centric features, I could get some radically different outcomes.

In this sports car example, we cannot even assume this car has cylinders. For those of you unfamiliar, there is a type of engine design called a rotary engine, and it looks like a Dorito spinning around inside a circular chamber. There are these things called apex seals that tend to degrade. It consumes oil, it sounds different, and it revs to stratospheric levels. In this case, if I was using specification coding, I could tell you very specific things about engine output levels, how it must consume gasoline, and even things as specific as the requirement that it be fuel injected rather than carbureted. Unless I specifically said, “This must be a car that has cylinders for its internal combustion engine,” it is within the realm of possibility that Claude Code would metaphorically know that the Mazda RX-7 exists and could give our metaphorical car a rotary engine.

There is nothing wrong with having a rotary engine. They are amazing for their engineering specifications, their unique sound, which is like a barrel full of angry bees, and their ability to rev to 9,000 RPM. But if you were to create this specification, you would then be married to the maintenance that comes with rotary engines, which can often be painful and tedious, especially if you’re the owner of a Mazda RX-8.

So now we have to get back to the whole thing where I was talking about specifications for how we use and design applications that have been “vibe coded.” Knowing specifications is not enough, and being able to write specifications only allows you to perceive how an end user may abstract an application rather than engineer an application.

Engineering decisions are actually really fucking important. There’s a lot of shit you don’t know as an end user.

When I write a markdown file or a series of markdown files and I hand it off to a coding assistant, I am not asking the machine to figure out what I want. I am telling a machine in specific and tedious detail exactly what I want. I am telling it which framework to use. I am telling it how the database should be structured and why, what the shape of the data is, and what the model-view-controller relationship looks like in this particular application. If I’m going to deviate from any standard MVC pattern, I write down why that deviation matters, and why it will structurally change how the application should work. I even go into great detail about how things on the front end of an interface should look. Things like why the button corner radius should be a certain number of pixels, and why drop shadows should look a certain way, because enterprise and consumer-facing applications look different from each other.

Look, I don’t know why these things exist in my brain. They just do. And they’re true. They’re true in a way that Claude Code doesn’t know to be true, because Claude Code has never actually been the end user of a shitty government interface, versus a gorgeous consumer interface.

I’m spelling out in the prompt whether it’s going to use a REST API or a GraphQL API, and how role- and policy-based access control will be enforced at the middleware layer rather than just bolted on at the controller layer, like some sort of afterthought. I’m telling it about any third-party software vendors that it’s going to be using, what quirks to expect with that software, and sometimes why the documentation in third-party software does not match its actual capability.
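To make the middleware-versus-controller distinction concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and made up for illustration: the `POLICY` table, the role names, the `/contracts` route, and the dictionary-shaped request are not from any real framework or from my actual application. The only point is that the permission check lives in one wrapper that runs before any controller does.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical policy table: (method, path) -> roles allowed to call it.
POLICY: Dict[Tuple[str, str], Set[str]] = {
    ("GET", "/contracts"): {"viewer", "editor", "admin"},
    ("POST", "/contracts"): {"editor", "admin"},
    ("DELETE", "/contracts"): {"admin"},
}

Request = Dict[str, str]    # e.g. {"method": "GET", "path": "/contracts", "role": "viewer"}
Response = Tuple[int, str]  # (status code, body)
Handler = Callable[[Request], Response]


def contracts_controller(request: Request) -> Response:
    """Controller with no permission logic in it at all."""
    return 200, f"{request['method']} {request['path']} ok"


def access_control_middleware(handler: Handler) -> Handler:
    """Wrap a handler so every request is checked before the controller runs."""
    def wrapped(request: Request) -> Response:
        allowed = POLICY.get((request["method"], request["path"]))
        # Deny by default: a route missing from the policy table is forbidden.
        if allowed is None or request.get("role") not in allowed:
            return 403, "forbidden"
        return handler(request)
    return wrapped


app = access_control_middleware(contracts_controller)

print(app({"method": "GET", "path": "/contracts", "role": "viewer"}))
print(app({"method": "DELETE", "path": "/contracts", "role": "viewer"}))
```

The viewer can read but cannot delete, and the controller never had to know roles exist. If the check lived inside each controller instead, every new route would be one forgotten `if` statement away from being wide open.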

The reason I know to do all these things is because I have spent 20-plus years writing code and building systems for the private sector and the public sector. I know when you ship something that doesn’t work, people will whine about it, because shipping bad code makes you a shitty engineer. This is what the fundamentals are for. The fundamentals are not a quaint prerequisite, like when your parents tell you that you need to learn to drive a manual before you drive an automatic. The fundamentals are the vocabulary you use to instruct the machine. Without the fundamentals, you are not programming. You are not vibe coding. You are not writing specifications.

You are wishing.

Wishing is not the way production applications have ever been made or will ever be made.

There is an impossibly large cohort of people online right now who have decided that the correct response to large language models is something they call vibe coding. Vibe coding, as far as I understand it, is the practice of typing a vague desire into a chat window and hoping that somehow, magically, it ships. This is like standing in front of your stove and saying, “I’d like dinner, please.” You haven’t told it anything about what you want for dinner, what ingredients are available, or things that you may take for granted, like how to use the stovetop or which foods will burn and which will not.

Sometimes vibe coding works, but usually it does not. When it does not work, people who vibe code blame the model for hallucinating, which is a verbal sleight of hand that makes absolutely no sense. The model is not automatically hallucinating because you didn’t get it to do what you wanted it to do. The model is filling in gaps that you left because you did not know what to ask for, how to ask for it, or when to ask for it. A model does not know how to invent what you want. This is not a flaw. This is just how large language models work.

I bought a new dishwasher this week. It’s called the Bosch 800 series. It is the Rolls-Royce of dishwashers.

This is a very hard pivot in the middle of an essay, and you’re just going to have to deal with it. I promise it makes sense in the context of how we use machine-generated code. I’m going to make this about dishwashers. But if you know me, you know that I have a particular affinity for dishwashers. I did a whole podcast episode about dishwashers because I contain multitudes and I also care about dishwashers in an unhealthy way.

A dishwasher is a genuinely elegant machine. It is a sealed tub that fills with about 2.5 gallons of water in the bottom and then recycles that water through pressurized jets using detergents that combine long-chain surfactants and enzymes designed specifically to break apart the proteins, starches, and fats that make up food residue. It is more water efficient than hand washing, and it is more time efficient than hand washing. It is, in almost every measurable way, better than hand washing, but it is only better than hand washing if you load it correctly.

If you put a cup in right side up instead of upside down, the way you might set a cup on a counter to fill it with liquid, it will fill with water. If you put plastic Tupperware on the bottom rack, it will come out wet because the drying cycle depends upon the thermal mass of your ceramic plates and glasses to retain the heat and evaporate residual water. Since plastic has very little thermal mass, it just doesn’t work.

If you do not put your forks and spoons and knives into the silverware basket in organized rows, you will lose minutes on the other end when unloading by having to sort them later, which eats the entire efficiency gain the dishwasher provided you in the first place by giving you organized baskets. I don’t really want to talk about the points up versus points down debate right now because it’s a distraction, but there is a correct way to do it.

My point is that a dishwasher is exactly as effective as the person loading it, and most people load dishwashers badly, and then they complain about the dishwasher.

People also use the incorrect soap. You need to use powdered soap.

Machine-generated code is a dishwasher. You have to load it correctly.

The loading is what the fundamentals are for. The fundamentals are how you know that the spoons go in the silverware basket. The fundamentals are how you know that the wine glasses must be angled a certain way in order for the stems and the interiors to both get clean. If you do not understand at a working level what a foreign key constraint does, you cannot write a prompt that produces a database schema worth deploying. If you do not understand what a load balancer is, you cannot articulate why the application needs to be stateless. If you have never built a front-end interface and lived with the consequences of shitty front-end design, you cannot describe a user flow in enough detail for a machine to render it correctly, because you will not know which details matter and which details are errant and sometimes stupid.
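If the foreign key example sounds abstract, here is a small sketch of what the constraint actually does, using Python’s built-in `sqlite3` module. The `vendors` and `contracts` tables and their columns are hypothetical, invented for this illustration, and are not my real schema. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is turned on, which is exactly the kind of detail you have to know to ask for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: every contract must point at a vendor that exists.
conn.execute("CREATE TABLE vendors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE contracts (
        id INTEGER PRIMARY KEY,
        vendor_id INTEGER NOT NULL REFERENCES vendors(id),
        title TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO vendors (id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO contracts (vendor_id, title) VALUES (1, 'Hosting')")


def insert_orphan(connection: sqlite3.Connection) -> bool:
    """Try to insert a contract for a vendor that does not exist."""
    try:
        connection.execute(
            "INSERT INTO contracts (vendor_id, title) VALUES (99, 'Ghost')"
        )
    except sqlite3.IntegrityError:
        # The database refused the row; the constraint did its job.
        return False
    return True


orphan_inserted = insert_orphan(conn)
row_count = conn.execute("SELECT COUNT(*) FROM contracts").fetchone()[0]
print(orphan_inserted, row_count)
```

The contract for vendor 99 is rejected by the database itself, not by application code that somebody might forget to write. Leave the constraint out of the schema and that ghost row goes in without complaint, and you find out months later when a report breaks.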

I’m viewing the current moment through the same lens I used to view the Industrial Revolution. This is probably something that makes me insufferable as a software engineer, but bear with me.

The early 19th century produced the same anxiety we are producing now. The looms were going to replace the weavers. The weavers did not need to learn how to weave anymore because the machines would do it. This was both true and also incomplete. What actually happened was that the people who understood textiles, who understood what made a good shirt versus a bad shirt, who understood thread count and weave pattern and fabric weight, those people designed the mills and ran the mills and got rich. The people who thought the machines would think for them became the people who worked on the machines 14 hours a day and lost fingers.

The Industrial Revolution did not eliminate expertise. It concentrated it and amplified it — and it made expertise more valuable than it had ever been before, because the person with expertise and a machine could now pair up to do the work of a hundred people who had no expertise.

This is the thing that I want everyone to understand about agentic code: I am not replacing senior engineers. I am concentrating senior engineering. A mid-level engineer who knows what they are doing can now do the work of ten mid-level engineers who know what they are doing. A senior engineer with a voice-to-text app exactly like the one I’m using right now (shoutout to Wispr Flow) and a well-structured markdown file can build things that would have taken an entire team six months to build. I know this because I am doing it actively right now in my day-to-day activities. My team is doing it. A three-person team is doing the work of a twenty-person team.

This week I wrote a prompt that consisted of three top-level markdown files and four subdirectories. In each subdirectory, I had another four markdown files. The total was roughly 9,000 lines of markdown files. I did not type 9,000 lines. I dictated them into Wispr Flow, which translated my speech to text, which I then fed into a planning model, which restructured the plan, which then fed the plan into another agent, which generated the code. That is one abstraction on top of another abstraction on top of another abstraction on top of another abstraction, and none of this works if I do not know at the very bottom of the stack exactly what I want the machine to do.

Right now, my team is building an application for internal contract management at our company. This is a very narrow, very specific task, and it does exactly one thing for a very small part of our business. Two years ago, this project simply would not have existed. It would have been solved, or probably more generously half-solved, by buying a SaaS subscription from a vendor who built a contract management tool for a generic version of our company.

SaaS is buying off the rack. You buy it and you wear it, and it’s fine. Mostly the sleeves are okay, maybe a little long, and maybe the shoulders are a little too wide, but it works. It’s simply fine. What we are building instead of SaaS is a bespoke suit. It is cut exactly to the measurements of our exact actual workflow. Historically, the only way to get a bespoke suit was to pay an enormous premium to a craftsman, a tailor who has decades of experience and can make a perfect suit for you. With machine-generated coding, that premium is almost gone because the tailor is a machine and the machine will make whatever you can describe.

The catch is that most people absolutely suck at describing what they want because most people never had to describe what they want. This is not a skill that most people have because you didn’t need the skill of describing what you wanted before. SaaS trained an entire generation of workers to simply accept approximation as the default condition for software. You could ask for a feature, and then the vendor might say no, and then you would just adjust your workflow until every adjustment became the memory of what you wanted in the first place.

We call these work-arounds. I’d also like to clarify that these are not called “reach-arounds.” Recently, I was part of an engineering meeting where someone (non-technical) said that they were doing “reach-arounds” — and I know they meant “work-arounds.” I think they thought these two terms were interchangeable or the same. They are not interchangeable. A “reach-around” is not a “work-around”.

But it was a very entertaining mistake, and I did not bother correcting them, in the same way that Claude Code will sometimes allow you to make mistakes because technically the thing you’re saying does exist. Claude Code will give you a reach-around if you ask for a reach-around.

I hope the person that I did not correct continues using this phrase in meetings well into the future.

Now, I need to continue…

Bespoke software requires you to actually know your requirements. It means specific measurements. It means tight specifications, and it requires the kind of clarity that most software engineers are not used to asking for. Most organizations have spent the better part of the past 30 years actively destroying specificity in the name of agile flexibility, which is why most organizations are discovering right now that their flexibility has made them structurally incapable of taking advantage of this one new technology that actually rewards being granular and specific.

I want to summarize this with a piece of vocabulary, and it was a piece of vocabulary that I picked up from a friend of mine who was a mechanic. This is one of those dudes who worked at Ford for like 30 years as a mechanic, and he taught me about AM and FM.

AM stood for actual machines.

FM stood for fucking magic.

The idea is that when something broke, you could either fix it through an actual machine, which required understanding the mechanism, diagnosing the failure, and applying the correct intervention to fix the thing. Alternatively, you could fix it through fucking magic, which is what you tell your customer when you have no idea why it started working again for no reason at all.

Claude Code is not fucking magic. Cursor is not fucking magic. Codex is not fucking magic. None of these are fucking magic. They are actual machines. They are, underneath all the marketing copy and the breathless LinkedIn threads, enormous piles of linear algebra making probabilistic guesses about the next token in a sequence. Sometimes the next token is correct; sometimes the next token is plausible, which is not the same thing, and that is the reason a senior engineer still has to go in and fix things where the machine chose plausible instead of correct.
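Here is a toy sketch of what “plausible instead of correct” looks like in numbers. The scores below are completely made up for illustration and do not come from any real model; the scenario (picking the token after `ON DELETE` in a schema) and the assumption that `RESTRICT` is the right answer for this imaginary application are mine. The softmax function is just the standard way to turn raw scores into probabilities.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


# Invented scores for the token that follows "ON DELETE " in a schema.
logits = {"RESTRICT": 2.0, "CASCADE": 1.6, "SET NULL": 0.5}
probs = softmax(logits)

# Assumption for this example: the business rule says contracts must never
# be silently deleted along with a vendor, so RESTRICT is the correct token.
correct = "RESTRICT"
p_wrong = 1.0 - probs[correct]

for token, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{token:10s} {p:.3f}")
print(f"chance of a plausible-but-wrong token: {p_wrong:.3f}")
```

With these invented numbers the correct token is the single most likely one, and a sampler would still land on a wrong-but-reasonable-looking alternative a bit under half the time. Nothing in the machine knows about the business rule unless you wrote it down.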

This is why senior engineers still have to hand finish everything. This is the part of the suit where the tailor sits down with a needle and does the last stitches of all the buttons and the sleeves. The tailor makes sure that everything fits perfectly, because the machine is incapable of doing those last eight stitches. The last eight stitches are not about logic; they are about specification, and they are about the particularly irrational requirements of the people who use the software or, to use the metaphor, wear the suit.

The rule of AM-FM is that you do not get to skip the AM in order to pretend FM is real. You can’t outsource your understanding of a problem to a machine and then blame the machine when the understanding is missing. The machine cannot think for you. The machine was never going to think for you. The machine is a dishwasher. It is a loom. In the final accounting, it is a product that is a faster way for a person who knows what they want to get what they want. Everyone else is just going to have cups full of water.