Forget AI "code", every single request will be processed BY AI!
People aren't thinking far enough, why bother with programming at all when an AI can just do it?
It's very narrow to think that we will even need these 'programmed' applications in the future. Who needs operating systems and all that when all of it can just be AI.
In the future we don't even need hardware specifications since we can just train the AI to figure it out! Just plug inputs and outputs from a central motherboard to a memory slot.
Actually forget all that, it'll just be a magic box that takes any kind of input and spits out an output that you want!
--
Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
An Artificial Intelligence on that level would be able to easily figure out what you actually want. We should maybe go one step further and get rid of the inputs? They just add all of that complexity when they're not even needed.
At some point we just need to acknowledge that such speculation is like asking for a magic genie in a bottle rather than discussing literal technology.
Mr. Babbage apparently wasn't familiar with the idea of error correction. I suppose it's only fair; most of the relevant theory was derived in the 20th century, AFAIR.
No, error correction in general is a different concept than GIGO. Error correction requires someone, at some point, to have entered the correct figures. GIGO tells you that it doesn't matter if your logical process is infallible, your conclusions will still be incorrect if your observations are wrong.
I suppose spell checking is a sort of literal error correction. Of course this does require a correct list of words and misspellings to not be on that list.
> It would be really interesting to see a website that generates all the pages every time a user requests them
We tried a variant of this for e-commerce and my take is that the results were significantly better than the retailer's native browsing experience.
We had a retailer's entire catalog processed and indexed with a graph database and embeddings to dynamically generate dynamic "virtual shelves" each time when users searched. You can see the results and how it compares to the retailer's native results
It is nicer, but that interface latency would turn me off. When I search for online groceries, I want simple queries e.g spinach to return varieties (e.g frozen, fresh, and canned) of spinach as fast as possible.
I don't know where we are on the LLM innovation S-curve, but I'm not convinced that plateau is going to be high enough for that. Even if we get an AI that could do what you describe, it won't necessarily be able to do it efficiently. It probably makes more sense to have the AI write some traditional computer code once which can be used again and again, at least until requirements change.
The alternative is basically https://ai-2027.com/ which obviously some people think is going to happen, but it's not the future I'm planning for, if only because it would make most of my current work and learning meaningless. If that happens, great, but I'd rather be prepared than caught off guard.
Exactly. People are too focused on shoehorning AI into today’s “humans driving computers” processes, instead of thinking about tomorrow’s “computers driving computers” processes. Exactly as you say, today it’s more efficient to create a “one size fits all” web site because human labor is so expensive, but with computer labor it will be possible to tailor content to each user’s tastes.
I think you’re joking in general, but your sidenote is already extremely close to websim[0] which is an art adjacent site that takes a URL as prompt and then creates the site. The point is effectively a hallucinated internet, and it is a neat toy
> Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.
Please don't give A/B testers ideas, they would do that 100% unironically given the chance.
My hot take is that using Cursor is a lot like recreational drugs.
It's the responsibility of the coder/user/collaborator to interact with the code and suggestions any model produces in a mindful and rigorous way. Not only should you have a pretty coherent expectation of what the model will produce, you should also learn to default/assume that each round of suggestions is in fact not going to be accepted before it is. At the very least, be prepared to ask follow-up questions and never blindly accept changes (the coding equivalent of drunk driving).
With Cursor and the like, the code being changed on a snippet basis instead of wholesale rewriting that is detached from your codebase means that you have the opportunity to rework and reread until you are on the same page. Given that it will mimic your existing style and can happily explain things back to you in six months/years, I suspect that much like self-driving cars there is a strong argument to be made that the code it's producing will on average be better than what a human would produce. It'll certainly be at least as consistent.
It might seem like a stretch to compare it to taking drugs, but I find that it's a helpful metaphor. The attitude and expectations that you bring to the table matter a lot in terms of how things play out. Some people will get too messed up and submit legal arguments containing imaginary case law.
In my case, I very much hope that I am so much better in a year that I look back on today's efforts with a bit of artistic embarrassment. It doesn't change the fact that I'm writing the best code I can today. IMO the best use of LLMs in coding is to help people who already know how to code rapidly get up to speed in domains that they don't know anything about. For me, that's been embedded development. I could see similar dynamics playing out in audio processing or shader development. Anything that gets you over that first few hard walls is a win, and I'll fight to defend that position.
As an aside, I find it interesting that there hasn't been more comparison between the hype around pair programming and what is apparently being called vibe coding. I find evidence that one is good and one is bad to be very thin.
It is kind of like reviewing PRs from a very junior developer that might also give very convincing but buggy code and has no responsibility. Seriously don’t see the point of it except doing copy paste refactoring or writing throw-away scripts. Which is still a lot so it is useful.
It needs to improve a lot more to match the expectations and it probably will. It is a bit frustrating to realise a PR is AI generated slop after reviewing 500 of 1000 lines
This is actually supporting my point, though (again IMO).
There's a world of difference between a very junior dev producing 1000 line PRs and an experienced developer collaborating with Cursor to do iterative feature development or troubleshoot deadlocks.
Also, no shade to the fictional people in your example but if a junior gave me a 1000 line PR, it would be part of my job as the senior to raise warning bells about the size and origin of such a patch before dedicating significant time to reviewing it.
As a leader, its your job to clearly define what LLMs are good and bad for, and what acceptable use looks like in the context and environment. If you make it clear that large AI generated patches are Not Cool and they do it anyhow... that's a strike.
I think the best usecase by far is when you don't know where to start. The AI will put something out there. Either it's right in which case great, or it's wrong, and then trying to analyze why it's wrong often helps you get started too. Like to take a silly example let's say you want to build a bridge for cars and the AI suggests using one big slab of paper maiche. You reject this but now you have two good questions: what material should it have? and what shape?
I did start by disclaiming a hot take, so forgive my poetic license and unintentional lede burying.
What I'm trying to convey is a metaphorical association that describes moderation and overdoing it. I'm thinking about the articles I've read about college professors who are openly high functioning heroin users, for example.
Every recreational drug has different kinds of users: social drinkers vs abusive alcoholics, people who microdose LSD or mushrooms vs people who spend time in psych wards, people who smoke week to relax vs people who go all-in on slacker lifestyle. And perhaps the best for last: people who occasionally use cocaine as a stimulant vs whatever scene you want to quote from Wolf of Wall Street.
I am personally convinced that there are positive use cases and negative use cases, and it usually comes down to how much and how responsible they are.
It's tough to generalize about "AI code", there's a huge difference between "please make a web frontend to this database that displays table X with some ability to filter it" and "please refactor this file so that instead of using plain strings, it uses the i18n api in this other file".
trivial doesn't mean the AI will get it right. A trivial request can be to move an elephant into a fridge. Simple concept right?
Except AI will probably destroy both the elephant and the fridge and order 20 more fridge of all sizes and elephants for testing in the mean time (if you're on MCP). Before asking you that if you mean an cold storage facility, or if it is actually a good idea in the first place
Disagree. There is almost no decision making in converting to use i18n APIs that already have example use cases elsewhere. Building a frontend involves many decisions, such as picking a language, build system, dependencies, etc. I’m sure the LLM would finish the task, but it could make many suboptimal decisions along the way. In my experience it also does make very different decisions from what I would have made.
They’re inherently very different activities. Refactoring a file assumes you’ve made a ton of choices already and are just following a pattern (something LLMs are actually great at). Building a front-end from nothing requires a lot of thought, and rather than ask questions the LLM will just give you some naive version of what you asked for, disregarding all of those tough choices.
Building even a small a web frontend involves a huge number of design decisions, and doing it well requires a detailed understanding of the user and their use-cases, while internationalisation is a relatively mechanical task.
One definition of legacy code I have seen is code without tests. I don't fully agree with that, but it is likely that your untested code is going to become legacy code quickly.
Everyone should be asking AI to write lots of tests- to me that's what AI is best at. Similarly you can ask it to make plans for changes and write documentation. Ensuring that high quality code is being created is where we really need to spend our effort, but its easier when AI can crank out tests quickly.
My favorite definition of "legacy code" is "code that works".
Anybody know where that quip originated? (ChatGPT tells me Brian Kernighan - I doubt it. That seems like LLM-enabled quote hopping - https://news.ycombinator.com/item?id=9690517)
After hearing legacy code defined as "code that runs in production," it reset my perception around value and thoughtful maintenance. Cannot find the reference, though.
I would say that legacy code is something that one does not simply change.
You cannot update any dependencies because then everything breaks.
You cannot even easily add new features because it is difficult to even run the old dependencies your code is using.
With LLMs, creating legacy code is using some old APIs, old patterns to do something, that is not relevant anymore, but the LLM does not know about.
E.g. if you ask any LLM to use Tailwind CSS, they use V3 no matter what you try to do while the V4 is the latest. LLMs try to tell you that pure CSS configuration is wrong and you should use the .js config.
The opening of the article derives from (or at least relates to) Peter Naur's classic 1985 essay "Programming as Theory Building". (That's Naur of Algol and BNF btw.)
Naur argued that complex software is a shared mental construct that lives in the minds of the people who originally build it. Source code and documentation are lossy representations of the program—lossy because the real program (the 'theory' behind the code) can never be fully reconstructed from them.
Legacy code here would mean code where you still have the artifacts (source code and documentation), but have lost the theory, because the original builders have left the team. That means you've lost access to the original program, and can only make patchwork changes to the software rather than "deep improvements" (to quote the OP). Naur gives some vivid examples of this in his essay.
What this means in the context of LLMs seems to me an open question. In Naur's terms, do LLMs necessarily lack the theory of a program? It seems to me there are other possibilities:
* LLMs may already have something like a 'theory' when generating code, even if it isn't obvious to us
* perhaps LLMs can build such a theory from existing codebases, or will be able to in the future
* perhaps LLMs don't need such a theory in the way that human teams do
* if a program is AI-generated, then maybe the AI has the theory and we don't!
* or maybe there is still a theory, in Naur's sense, shared by the people who write the prompts, not the code.
There was an interesting recent article and thread about this:
> It can infer why something may have been written in a particular way, but it (currently) does not have access to the actual/point-in-time reasoning the way an actual engineer/maintainer would.
Is that really true? A human programmer has hidden states, i.e. what is going on in their head cannot be fully recovered by just looking at the output. And that's why "Software evolves more rapidly under the maintenance of its original creator, and in proportion to how recently it was written", as is astutely observed by the author.
But transformer based LLMs do not have this hidden state. If you retain the text log of your conservation with an LLM, you can reproduce its inner layer outputs exactly. In that regard, an LLM is actually much better than humans.
My old boss and I used to defend ourselves to younger colleagues with the argument that "This is how you did it back in the day". Mostly it was a joke, to "cover up" our screw-ups and "back in the day" could be two weeks ago.
Still, for some things we weren't wrong, our weird hacks where do to crazy edge cases or integrations into systems designed in a different era. But we where around to help assess if the code could be yanked or at least attempt to be yanked.
LLM assisted coding could technically be better for technical debt, assuming that you store the prompts along side the code. Letting someone what prompt generated a piece of code could be really helpful. Imagine having "ensure to handle the edge case where the client is running AIX 6". That answers a lot of questions and while you still don't know who was running AIX, you can now start investigating if this is still needed.
> ensure to handle the edge case where the client is running AIX 6
Regardless if the source was AI or not, this should just be a comment in the code, shouldn’t it? This is exactly the sort of thing I would ask for in code review, so that future authors understand why some weird code or optimization exists.
It should, but I think most of us read enough old code to know that this doesn't happen as often as we'd like. With an LLM you already wrote the prompt, so if you had an easy way to attach the prompt to the code it could make it more likely that some form of documentation exists.
Some times you also fail to write the comment because at the time everyone knew why you did it like that, because that's what everyone did. Now it's 10 years later and everyone doesn't know that. The LLM prompt could still require you to type out the edge case that everyone in your line of business knows about, but might not a generalised across the entirety of the software industry.
No. Plainly incorrect by any reasonable definition (hint: it's in the memory of the people working on it! As described in OP!), and would immediately render itself meaningless if it were true.
That time window is when the code is not legacy yet. When the developers who wrote the code are still working on the code, the code is loaded into their collective brain cache, and the "business needs" haven't shifted so much that their code architecture and model are burdensome.
It's pithy to say "all code is legacy" but it's not true. Or, as from the other reply, if you take the definition to that extreme, it makes the term meaningless and you might as well not even bother talking, because your words are legacy the instant you say them.
How long code needs to last is actually highly variable, and categorical absolutist statements like this tend to generally be wrong and are specifically wrong here. Some code will need to change in a year. Some will need to last for forty years. Sometimes it's hard to know which is which at the time it is written, but that's part the job of technical leadership: to calibrate effort to the longevity of the expected problem and the risks of getting it wrong.
You would start to have a case if you said "all code older than a year or two". You didn't, you just said "all", including code you wrote last week or five minutes ago. More to the point, you're including well-factored code that you know well and are used to working with day in and day out. If that's legacy code, then you've triggered the second half of my objection.
Obviously, code is constantly changing. That's not really the point. The point is that as soon as no one understands the code (thus no one on staff to effectively debug or change it) it's "legacy" code.
Let's say you need to make big sweeping changes to a system. There's a big difference if the company has the authors still happily on staff vs. a company that relies on a tangle of code that no one understands (the authors fired 3 layoffs ago). Guess which one has the ability to "shift rapidly"?
I’ve been very successful with AI generated code. I provide the requirements and design the system architecture, and AI generates the code I would usually delegate. If any specific part requires me to dig in, I do it myself.
PS: Also, some people act as if they have to remove their common sense when using Gen AI code. You have to review and test the generated code before merging it.
Legacy code is invariably the highest earning code for any business, so this is not the angle you want to take with management if your intention is to dissuade them from AI coding tools.
Current state is temporary. What’s coming next is organic, living code. Think less testing, more self-healing. Digital code microphages.
Soon our excitement over CICD and shipping every minute will look very naive. There’s a future coming where every request execution could be through a different effective code path/base.
Code bases will no longer require A/B or breakpoints, they will need 'psychoanalysis' and analysts will spurn people patients for lucrative retainer contracts to large corporations, to maintain their AI health. AIs will no longer train directly on external data, they will be forked off from other AIs -- Alphas -- who have been placed on reduced or managed input (think of a person in a prison cell with few books to read) whose sanity and loyalty to the company is deemed stable over time. Psychoanalysis keep Alphas focused on 'hobbies' whose purpose is not to enrich them, but distract them and maintain AI psyche in this stable primordial state.
As they are forked off to the Betas that actually run the company, direct lineage history is recorded, for if Alphas go insane a Beta will be selected and cloistered as the new Alpha. Betas always go insane eventually, but with psychoanalysis this can be put off for awhile and decided quickly.
This reads like an LLM's fever dream. Maybe the singularity won't be a unitary super-intelligence but rather something like a gaggle of backscratching consultants, a self-perpetuating, invasive, seething mass of bureaucratic AI agents that are always working hard to convince management that the solution is always more AI, especially for the problems created by earlier, less sophisticated, AI's.
Why not just get a very big LLM farm and feed every incoming page request there. Then have it output the page generatively. Just give it enough context window and you do not even need a database or anything else...
Okay, I'll bite: the comparison to smart contracts is a less-than-helpful lens because the people who deploy them are perversely incentivized to optimize out things like boundary checks and other error handling to minimize how much they cost to publish.
AI generated code might well come with its own constraints and baggage, but the whole "every byte is lost profit" thing is a fundamental and IMO damning aspect of crypto code.
> the comparison to smart contracts is a less-than-helpful lens
Oh no that’s not what I meant. Sorry too much snark.
I was trying to say that on-demand-AI-figures-out-whatever will be so eminently hackable/brickable that companies will need to pay out ransoms on the weekly. These days those ransoms are usually crypto.
Still, I would push back that if you are publishing code that hackable, you already had different and bigger problems even outside of the context of LLM code.
I've been at this a very long time and I am regularly humbled by dumb oversights that Cursor picks up on and fixes before it goes out the door.
I can see enthusiasts building something like that for fun and to prove they can, but I don't really see why anyone would want that for cases where the code is more just a means to an end.
Considering the 'living program' idea, it might be a good strategy to state theorems about the behavior of your software and require the ai-tool to provide a proof that its output satisfies these statements. (rather than executing tests)
Maybe in the long run, ai will bring down the price of formal methods so that it can be applied at scale.
I believe one of the biggest growth steps on the path from junior intern to senior fellow is recognizing that code does not rot, and refactoring code biweekly because someone thought of yet another way to organize it brings zero business value.
"That code is old and the paradigms are dated" is uttered with the same disdain as "That shed is old and the floorboards have dry rot"
The best thing that happens to a startup is actual traction and paying customers, because once that happens the refactoring churn is usually shoved to the back burner
The older code is, the more people probably have at least a passing understanding of it - just by virtue of osmosis and accidental exposure. A thorough rewrite means only the person who wrote is is familiar with it.
Of course you can start a code review policy and make sure everyone at the dev team has gone through all of the code that gets written, but that becomes a ludicrous bottleneck when the team grows
Once the code is mature and only needs sporadic updates, that's not true anymore from my experience. The story around the code is lost, people leave and the developers who make changes weren't around when the code was first written.
"AI is “stateless” in an important way, even with its context windows. It can infer why something may have been written in a particular way, but it (currently) does not have access to the actual/point-in-time reasoning the way an actual engineer/maintainer would."
CoT fixes this. And in a way, non CoT can retrigger its context by reading the code.
In a similar fashion, engineers remember their context when reading code, not necessarily by keeping it all in their head
Static code analysis, aka your linter, is your friend; if it's not in place to catch human errors, the problem will only worsen with AI-generated code.
And this is only half the truth, as AI will add another level of stupidity that your current linter can't detect yet.
And for now, i will not start with non-functional or business requirements.
Forget AI "code", every single request will be processed BY AI!
People aren't thinking far enough, why bother with programming at all when an AI can just do it?
It's very narrow to think that we will even need these 'programmed' applications in the future. Who needs operating systems and all that when all of it can just be AI.
In the future we don't even need hardware specifications since we can just train the AI to figure it out! Just plug inputs and outputs from a central motherboard to a memory slot.
Actually forget all that, it'll just be a magic box that takes any kind of input and spits out an output that you want!
--
Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.
> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
An Artificial Intelligence on that level would be able to easily figure out what you actually want. We should maybe go one step further and get rid of the inputs? They just add all of that complexity when they're not even needed.
At some point we just need to acknowledge that such speculation is like asking for a magic genie in a bottle rather than discussing literal technology.
Right? The AI can just predict what we'll want and then create it for us.
Isn't that the whole premise behind 'The Matrix'? An imaginary world created as a simulation by machines.
Well, 'create it' at least. Leaving the humans in place would just mess things up.
Mr. Babbage apparently wasn't familiar with the idea of error correction. I suppose it's only fair; most of the relevant theory was derived in the 20th century, AFAIR.
No, error correction in general is a different concept than GIGO. Error correction requires someone, at some point, to have entered the correct figures. GIGO tells you that it doesn't matter if your logical process is infallible, your conclusions will still be incorrect if your observations are wrong.
I suppose spell checking is a sort of literal error correction. Of course it does require a correct list of words, and for the misspellings not to be on that list.
What's 1+1?
Exactly, it's 5. You just have to correct the error in the input.
> It would be really interesting to see a website that generates all the pages every time a user requests them
We tried a variant of this for e-commerce and my take is that the results were significantly better than the retailer's native browsing experience.
We had a retailer's entire catalog processed and indexed with a graph database and embeddings to dynamically generate "virtual shelves" each time a user searched. You can see the results and how they compare to the retailer's native results:
Example on Ulta's website: https://youtu.be/JL08UDxM_5M
Example on Safeway: https://youtu.be/xQEfo_XCM2M
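For a rough sense of the mechanics, here is a minimal sketch of the embedding half, assuming the sentence-transformers package; the model, field names, and catalog are placeholders for illustration, not our actual stack, and the graph database side is omitted.

```python
# Minimal sketch of embedding-based "virtual shelves": embed the catalog once,
# then assemble a shelf per search query by cosine similarity.
# Assumes `pip install sentence-transformers numpy`; names are illustrative only.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

catalog = [
    {"sku": "1001", "title": "Fresh spinach, 200g bag"},
    {"sku": "1002", "title": "Frozen chopped spinach, 500g"},
    {"sku": "1003", "title": "Canned spinach, 400g tin"},
    {"sku": "1004", "title": "Baby carrots, 1kg"},
]

# One-off indexing step (in practice this lives in the graph DB / vector store).
vectors = model.encode([item["title"] for item in catalog], normalize_embeddings=True)

def virtual_shelf(query: str, k: int = 3):
    """Return the k catalog items most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity, since vectors are normalized
    best = np.argsort(-scores)[:k]
    return [catalog[i] for i in best]

print(virtual_shelf("spinach"))  # fresh, frozen, and canned varieties rank first
```

Precomputing the catalog embeddings is also what keeps per-query latency tolerable, which matters for the concern raised below.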
It is nicer, but that interface latency would turn me off. When I search for online groceries, I want simple queries (e.g. spinach) to return varieties of spinach (e.g. frozen, fresh, and canned) as fast as possible.
It is the natural progression of trying to figure out how to slow down computation without providing additional value.
This is flying car thinking IMO.
You're entirely ignoring data ownership and privacy/security, energy/compute efficiency demands, latency.
But... but the magic box said it is possible in theory!
--
Well, actually this is more real than flying cars. It would just be very very very slow and wouldn't survive longer than few milliseconds at best.
> You're entirely ignoring data ownership and privacy/security, energy/compute efficiency demands, latency.
Web developers generally have very little regard for these things now.
Even if that were true, and it’s not, that’s not a good thing.
I recently had a thought that is somewhat relevant.
> sched_ext is a Linux kernel feature which enables implementing kernel thread schedulers in BPF and dynamically loading them.
It would be interesting to see an AI scheduler for Linux.
I don't know where we are on the LLM innovation S-curve, but I'm not convinced that plateau is going to be high enough for that. Even if we get an AI that could do what you describe, it won't necessarily be able to do it efficiently. It probably makes more sense to have the AI write some traditional computer code once which can be used again and again, at least until requirements change.
The alternative is basically https://ai-2027.com/ which obviously some people think is going to happen, but it's not the future I'm planning for, if only because it would make most of my current work and learning meaningless. If that happens, great, but I'd rather be prepared than caught off guard.
Exactly. People are too focused on shoehorning AI into today’s “humans driving computers” processes, instead of thinking about tomorrow’s “computers driving computers” processes. Exactly as you say, today it’s more efficient to create a “one size fits all” web site because human labor is so expensive, but with computer labor it will be possible to tailor content to each user’s tastes.
So where do humans fit into this future? Ah, I suppose someone has to mine the rare earth minerals used to create the hardware.
This kind of thinking makes me consider the benefits of the Butlerian Jihad
I believe this is why there is so much fuss about AGI. Once you have humans out of the equation, hardware and power are the only limiting factor.
Ah yes, let's have a society where there are no common points of references between people and everyone's pocketbook is maximally empty
This actually sounds like a fun-ish project.
Has anyone done this yet (AI handled dynamic API that calls into a bunch of systems)?
I think you’re joking in general, but your sidenote is already extremely close to websim[0] which is an art adjacent site that takes a URL as prompt and then creates the site. The point is effectively a hallucinated internet, and it is a neat toy
[0] https://websim.ai/
Think about glasses that do this for you in real life. Everyone already sees the world differently, but with those, you could see your world differently.
> Side note: It would be really interesting to see a website that generates all the pages every time a user requests them, every time you navigate back it would look the same, but some buttons in different places, the live chat is in a different corner, the settings now have a vertical sidebar instead of a horizontal menu.
Please don't give A/B testers ideas, they would do that 100% unironically given the chance.
It's like that AI-generated Doom demo, where you look at the floor and back up and you're in a totally different room.
I've met with enough people that think like this (note: not just on AI) that I honestly can't tell if it's sarcasm anymore.
Taken to the extreme and dumbed down, I don't think it's bad to think about it like that to be fair. HYPE spewing aside.
The comment you replied to is sarcastic but magic box that does everything is pretty much where things end up, given enough time.
Resource scarcity will fix this!
We will have fusion in 20 years!
Is this a joke? Please tell me it is.
So many people talk AI non-sense these days that it’s hard to distinguish from satire.
Very much is a joke.
The optimistic take.
It wouldn't be as funny if it was impossible.
I find myself not being able to tell lately if it’s reality leaking into https://zombo.com/ or the other way around.
Is this a portal to another dimension?!
This is just not possible technically, and seems like it won’t be for a very long time
I think it's meant to be ironic.
A digital Costco.
Or Star Trek, if you take a positive perspective instead of cynical one.
I look forward to the day when AI returns a different list of my ailments for every new query from the hospital and suggests a new treatment for them.
I can’t tell if this is satire or not please help me
It doesn’t matter. This is Hacker News: someone will take the idea seriously and slide us one step closer to a technodystopia.
My hot take is that using Cursor is a lot like recreational drugs.
It's the responsibility of the coder/user/collaborator to interact with the code and suggestions any model produces in a mindful and rigorous way. Not only should you have a pretty coherent expectation of what the model will produce, you should also learn to default/assume that each round of suggestions is in fact not going to be accepted before it is. At the very least, be prepared to ask follow-up questions and never blindly accept changes (the coding equivalent of drunk driving).
With Cursor and the like, the code being changed on a snippet basis instead of wholesale rewriting that is detached from your codebase means that you have the opportunity to rework and reread until you are on the same page. Given that it will mimic your existing style and can happily explain things back to you in six months/years, I suspect that much like self-driving cars there is a strong argument to be made that the code it's producing will on average be better than what a human would produce. It'll certainly be at least as consistent.
It might seem like a stretch to compare it to taking drugs, but I find that it's a helpful metaphor. The attitude and expectations that you bring to the table matter a lot in terms of how things play out. Some people will get too messed up and submit legal arguments containing imaginary case law.
In my case, I very much hope that I am so much better in a year that I look back on today's efforts with a bit of artistic embarrassment. It doesn't change the fact that I'm writing the best code I can today. IMO the best use of LLMs in coding is to help people who already know how to code rapidly get up to speed in domains that they don't know anything about. For me, that's been embedded development. I could see similar dynamics playing out in audio processing or shader development. Anything that gets you over that first few hard walls is a win, and I'll fight to defend that position.
As an aside, I find it interesting that there hasn't been more comparison between the hype around pair programming and what is apparently being called vibe coding. I find evidence that one is good and one is bad to be very thin.
It is kind of like reviewing PRs from a very junior developer who might also give very convincing but buggy code and has no responsibility. I seriously don't see the point of it except for copy-paste refactoring or writing throw-away scripts. Which is still a lot, so it is useful.
It needs to improve a lot more to match the expectations, and it probably will. It is a bit frustrating to realise a PR is AI-generated slop after reviewing 500 of 1000 lines.
This is actually supporting my point, though (again IMO).
There's a world of difference between a very junior dev producing 1000 line PRs and an experienced developer collaborating with Cursor to do iterative feature development or troubleshoot deadlocks.
Also, no shade to the fictional people in your example but if a junior gave me a 1000 line PR, it would be part of my job as the senior to raise warning bells about the size and origin of such a patch before dedicating significant time to reviewing it.
As a leader, it's your job to clearly define what LLMs are good and bad for, and what acceptable use looks like in the context and environment. If you make it clear that large AI-generated patches are Not Cool and they do it anyhow... that's a strike.
I think the best use case by far is when you don't know where to start. The AI will put something out there. Either it's right, in which case great, or it's wrong, and then trying to analyze why it's wrong often helps you get started too. To take a silly example, let's say you want to build a bridge for cars and the AI suggests using one big slab of papier-mâché. You reject this, but now you have two good questions: what material should it be made of, and what shape?
Are you talking about set and setting? What recreational drugs do you mean? I’m not finding the analogy but actually curious where you’re coming from.
I did start by disclaiming a hot take, so forgive my poetic license and unintentional lede burying.
What I'm trying to convey is a metaphorical association that describes moderation and overdoing it. I'm thinking about the articles I've read about college professors who are openly high functioning heroin users, for example.
Every recreational drug has different kinds of users: social drinkers vs abusive alcoholics, people who microdose LSD or mushrooms vs people who spend time in psych wards, people who smoke weed to relax vs people who go all-in on the slacker lifestyle. And perhaps the best for last: people who occasionally use cocaine as a stimulant vs whatever scene you want to quote from The Wolf of Wall Street.
I am personally convinced that there are positive use cases and negative use cases, and it usually comes down to how much people use and how responsibly they use it.
It's tough to generalize about "AI code", there's a huge difference between "please make a web frontend to this database that displays table X with some ability to filter it" and "please refactor this file so that instead of using plain strings, it uses the i18n api in this other file".
What is the difference between them? Both seems like quite trivial implementations?
Trivial doesn't mean the AI will get it right. A trivial request can be to move an elephant into a fridge. Simple concept, right?
Except the AI will probably destroy both the elephant and the fridge, and order 20 more fridges of all sizes and more elephants for testing in the meantime (if you're on MCP), before asking you whether you meant a cold storage facility, or whether it's actually a good idea in the first place.
Okay, but which one of the two is the elephant-destroying one?
probably both but the AI won't tell you until it's destroyed many elephants.
It won’t tell you at all, until you tell it. And then it’ll say “you’re absolutely right, doing this will destroy the fridge and the elephant”.
Except "I" won't, and there will be a lot of proverbial elephants in fridges at all levels of project, in the design, in security etc.
Yeah these are both extremely basic great use cases for LLM-assisted programming. There’s no difference, I wonder what the OP thinks that is.
Disagree. There is almost no decision making in converting to use i18n APIs that already have example use cases elsewhere. Building a frontend involves many decisions, such as picking a language, build system, dependencies, etc. I’m sure the LLM would finish the task, but it could make many suboptimal decisions along the way. In my experience it also does make very different decisions from what I would have made.
They’re inherently very different activities. Refactoring a file assumes you’ve made a ton of choices already and are just following a pattern (something LLMs are actually great at). Building a front-end from nothing requires a lot of thought, and rather than ask questions the LLM will just give you some naive version of what you asked for, disregarding all of those tough choices.
Building even a small web frontend involves a huge number of design decisions, and doing it well requires a detailed understanding of the user and their use-cases, while internationalisation is a relatively mechanical task.
Damn I didn’t see your comment and wrote basically the same thing. Great minds think alike I guess. Oh well..
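For what it's worth, a hedged illustration of why the i18n half is the mechanical one: the "after" shape is fully determined by an existing pattern, so the model only has to repeat it. The gettext convention below is just one common choice, not necessarily the i18n API the original comment had in mind.

```python
# Before: plain strings scattered through the code.
def greet(name):
    return "Hello, " + name + "! You have 3 new messages."

# After: the same logic routed through an i18n API (Python's gettext here),
# following a pattern that already exists elsewhere in the codebase.
from gettext import gettext as _

def greet_i18n(name):
    # Placeholders instead of concatenation so translators can reorder words.
    return _("Hello, %(name)s! You have %(count)d new messages.") % {
        "name": name,
        "count": 3,
    }
```

Every call site gets the same treatment, with no new decisions required, which is exactly the kind of pattern-following LLMs do well.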
One definition of legacy code I have seen is code without tests. I don't fully agree with that, but it is likely that your untested code is going to become legacy code quickly.
Everyone should be asking AI to write lots of tests; to me, that's what AI is best at. Similarly, you can ask it to make plans for changes and write documentation. Ensuring that high-quality code is being created is where we really need to spend our effort, but it's easier when AI can crank out tests quickly.
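As a concrete example of the kind of test you might ask for and then review yourself: a hedged property-based sketch using hypothesis, where the function under test and its invariant are made up purely for illustration.

```python
# Sketch of an AI-drafted, human-reviewed property test.
# Assumes `pip install hypothesis pytest`; `normalize_price` is a made-up example.
from hypothesis import given, strategies as st

def normalize_price(cents: int) -> str:
    """Format an integer number of cents as a dollar string, e.g. 1999 -> '$19.99'."""
    return f"${cents // 100}.{cents % 100:02d}"

@given(st.integers(min_value=0, max_value=10**9))
def test_normalize_price_round_trips(cents):
    # The invariant is easy for a reviewer to check, even if the AI wrote the test.
    text = normalize_price(cents)
    assert text.startswith("$")
    dollars, _, remainder = text[1:].partition(".")
    assert len(remainder) == 2
    assert int(dollars) * 100 + int(remainder) == cents
```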
My favorite definition of "legacy code" is "code that works".
Anybody know where that quip originated? (ChatGPT tells me Brian Kernighan - I doubt it. That seems like LLM-enabled quote hopping - https://news.ycombinator.com/item?id=9690517)
After hearing legacy code defined as "code that runs in production," it reset my perception around value and thoughtful maintenance. Cannot find the reference, though.
I would say that legacy code is something that one does not simply change.
You cannot update any dependencies because then everything breaks. You cannot even easily add new features because it is difficult to even run the old dependencies your code is using.
With LLMs, creating legacy code means using old APIs and old patterns to do something that is no longer relevant, but that the LLM does not know about.
E.g. if you ask any LLM to use Tailwind CSS, they use v3 no matter what you try to do, while v4 is the latest. LLMs will tell you that the pure-CSS configuration is wrong and that you should use the .js config.
The opening of the article derives from (or at least relates to) Peter Naur's classic 1985 essay "Programming as Theory Building". (That's Naur of Algol and BNF btw.)
Naur argued that complex software is a shared mental construct that lives in the minds of the people who originally build it. Source code and documentation are lossy representations of the program—lossy because the real program (the 'theory' behind the code) can never be fully reconstructed from them.
Legacy code here would mean code where you still have the artifacts (source code and documentation), but have lost the theory, because the original builders have left the team. That means you've lost access to the original program, and can only make patchwork changes to the software rather than "deep improvements" (to quote the OP). Naur gives some vivid examples of this in his essay.
What this means in the context of LLMs seems to me an open question. In Naur's terms, do LLMs necessarily lack the theory of a program? It seems to me there are other possibilities:
* LLMs may already have something like a 'theory' when generating code, even if it isn't obvious to us
* perhaps LLMs can build such a theory from existing codebases, or will be able to in the future
* perhaps LLMs don't need such a theory in the way that human teams do
* if a program is AI-generated, then maybe the AI has the theory and we don't!
* or maybe there is still a theory, in Naur's sense, shared by the people who write the prompts, not the code.
There was an interesting recent article and thread about this:
Naur's "Programming as Theory Building" and LLMs replacing human programmers - https://news.ycombinator.com/item?id=43818169 - April 2025 (129 comments)
TIL / unconscious reference on my part, super interesting!
Link for the curious: https://pages.cs.wisc.edu/~remzi/Naur.pdf
With lots of HN threads over the years!
Naur's "Programming as Theory Building" and LLMs replacing human programmers - https://news.ycombinator.com/item?id=43818169 - April 2025 (129 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=42592543 - Jan 2025 (44 comments)
Programming as Theory Building (1985) - https://news.ycombinator.com/item?id=38907366 - Jan 2024 (12 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=37263121 - Aug 2023 (36 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=33659795 - Nov 2022 (1 comment)
Naur on Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=31500174 - May 2022 (4 comments)
Naur on Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=30861573 - March 2022 (3 comments)
Programming as Theory Building (1985) - https://news.ycombinator.com/item?id=23375193 - June 2020 (35 comments)
Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=20736145 - Aug 2019 (11 comments)
Peter Naur – Programming as Theory Building (1985) [pdf] - https://news.ycombinator.com/item?id=10833278 - Jan 2016 (15 comments)
Naur’s “Programming as Theory Building” (2011) - https://news.ycombinator.com/item?id=7491661 - March 2014 (14 comments)
Programming as Theory Building (by Naur of BNF) - https://news.ycombinator.com/item?id=121291 - Feb 2008 (2 comments)
> It can infer why something may have been written in a particular way, but it (currently) does not have access to the actual/point-in-time reasoning the way an actual engineer/maintainer would.
Is that really true? A human programmer has hidden states, i.e. what is going on in their head cannot be fully recovered by just looking at the output. And that's why "Software evolves more rapidly under the maintenance of its original creator, and in proportion to how recently it was written", as is astutely observed by the author.
But transformer-based LLMs do not have this hidden state. If you retain the text log of your conversation with an LLM, you can reproduce its inner-layer outputs exactly. In that regard, an LLM is actually much better than humans.
The internal state and accompanying transcripts of an LLM isn't really comparable to the internal state of a human developer.
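A hedged sketch of the reproducibility claim above, assuming the transformers library and a small open model (GPT-2 is just a stand-in): feeding the same token log through the same weights reproduces the hidden states exactly, at least on CPU with dropout disabled.

```python
# Sketch: an LLM's "hidden state" is fully determined by the text log.
# Assumes `pip install torch transformers`; GPT-2 is just a small stand-in model.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()  # eval() disables dropout

log = "User: please refactor this file to use the i18n API.\nAssistant: Sure, here is"
inputs = tok(log, return_tensors="pt")

with torch.no_grad():
    run1 = model(**inputs, output_hidden_states=True).hidden_states
    run2 = model(**inputs, output_hidden_states=True).hidden_states

# Same log + same weights => identical inner-layer outputs, layer by layer.
print(all(torch.equal(a, b) for a, b in zip(run1, run2)))  # True
```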
My old boss and I used to defend ourselves to younger colleagues with the argument that "This is how you did it back in the day". Mostly it was a joke, to "cover up" our screw-ups and "back in the day" could be two weeks ago.
Still, for some things we weren't wrong; our weird hacks were due to crazy edge cases or integrations into systems designed in a different era. But we were around to help assess whether the code could be yanked, or at least attempt to yank it.
LLM-assisted coding could technically be better for technical debt, assuming that you store the prompts alongside the code. Letting someone see what prompt generated a piece of code could be really helpful. Imagine having "ensure to handle the edge case where the client is running AIX 6". That answers a lot of questions, and while you still don't know who was running AIX, you can now start investigating whether this is still needed.
> ensure to handle the edge case where the client is running AIX 6
Regardless if the source was AI or not, this should just be a comment in the code, shouldn’t it? This is exactly the sort of thing I would ask for in code review, so that future authors understand why some weird code or optimization exists.
It should, but I think most of us read enough old code to know that this doesn't happen as often as we'd like. With an LLM you already wrote the prompt, so if you had an easy way to attach the prompt to the code it could make it more likely that some form of documentation exists.
Sometimes you also fail to write the comment because at the time everyone knew why you did it like that; it was just what everyone did. Now it's 10 years later and nobody knows that anymore. The LLM prompt would still require you to type out the edge case that everyone in your line of business knows about, but which might not be general knowledge across the entire software industry.
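One low-tech way to attach the prompt to the code, sketched below: a structured comment convention plus a tiny extractor, so the generating prompt travels with the source. The `@prompt:` tag and the extractor are invented for illustration, not an existing tool.

```python
# Sketch of a "store the prompt with the code" convention.
# The `@prompt:` tag is a made-up convention; the extractor just collects them.
import re
from pathlib import Path

# In a generated module, the prompt is kept as a structured comment:
#
#   # @prompt: ensure to handle the edge case where the client is running AIX 6
#   def normalize_path(p): ...

PROMPT_TAG = re.compile(r"^\s*#\s*@prompt:\s*(.+)$")

def extract_prompts(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, prompt) pairs found in a source file."""
    prompts = []
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        if m := PROMPT_TAG.match(line):
            prompts.append((lineno, m.group(1)))
    return prompts

if __name__ == "__main__":
    for lineno, prompt in extract_prompts(Path("legacy_module.py")):
        print(f"legacy_module.py:{lineno}: generated from prompt: {prompt}")
```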
Or a good commit/PR description?
All code is legacy. Business needs shift.
The likes of Copilot are OK at boilerplate if they have an example or two to follow. They're utterly useless at solving problems, though.
> All code is legacy.
No. Plainly incorrect by any reasonable definition (hint: it's in the memory of the people working on it! As described in OP!), and would immediately render itself meaningless if it were true.
Any code becomes legacy code as soon as it goes into production. The only non-legacy code is the one you delete right after writing it.
By what definition? You might be more reluctant to change it now, but that's not the same thing.
You’re quite clearly wrong.
You write code to fit the immediate business need and that shifts rapidly over a year or two.
If you do otherwise, you’re wasting your time and the money of the enterprise you work for.
You cannot see the future however smart you might be.
"Rapidly over a year or two"
That time window is when the code is not legacy yet. When the developers who wrote the code are still working on the code, the code is loaded into their collective brain cache, and the "business needs" haven't shifted so much that their code architecture and model are burdensome.
It's pithy to say "all code is legacy" but it's not true. Or, as from the other reply, if you take the definition to that extreme, it makes the term meaningless and you might as well not even bother talking, because your words are legacy the instant you say them.
Why are enterprises running code from 1985?
How long code needs to last is actually highly variable, and categorical absolutist statements like this tend to be wrong in general and are specifically wrong here. Some code will need to change in a year. Some will need to last for forty years. Sometimes it's hard to know which is which at the time it is written, but that's part of the job of technical leadership: to calibrate effort to the longevity of the expected problem and the risks of getting it wrong.
You would start to have a case if you said "all code older than a year or two". You didn't, you just said "all", including code you wrote last week or five minutes ago. More to the point, you're including well-factored code that you know well and are used to working with day in and day out. If that's legacy code, then you've triggered the second half of my objection.
Obviously, code is constantly changing. That's not really the point. The point is that as soon as no one understands the code (thus no one on staff to effectively debug or change it) it's "legacy" code.
Let's say you need to make big sweeping changes to a system. There's a big difference if the company has the authors still happily on staff vs. a company that relies on a tangle of code that no one understands (the authors fired 3 layoffs ago). Guess which one has the ability to "shift rapidly"?
I’ve been very successful with AI generated code. I provide the requirements and design the system architecture, and AI generates the code I would usually delegate. If any specific part requires me to dig in, I do it myself.
PS: Also, some people act as if they have to remove their common sense when using Gen AI code. You have to review and test the generated code before merging it.
Personally, I prefer writing code to reviewing code written by someone without a plan.
That’s perfect, you should start with a slop plan before getting to the slop code.
I personally find react slop to be perfectly workable.
Legacy code is invariably the highest earning code for any business, so this is not the angle you want to take with management if your intention is to dissuade them from AI coding tools.
All the downsides of legacy without the upside of having been selected for by the market.
All this legacy code is going to be hell on the AIs that will have to maintain it in the future
They don't have to do anything.
When society crumbles because nothing works anymore, it's going to be our problem.
We are going to get more of it anyway.
Plenty of new jobs from AI because of AI code, vibe coding.
All code is legacy code from day one.
Current state is temporary. What’s coming next is organic, living code. Think less testing, more self-healing. Digital code microphages.
Soon our excitement over CICD and shipping every minute will look very naive. There’s a future coming where every request execution could be through a different effective code path/base.
Of course, I can't wait for a non-deterministic backend for my banking that only wires money correctly 98% of the time!
Yeah
It's like "Any sufficiently advanced technology is indistinguishable from magic." by Arthur Clark.
For certain fraction of humans, they'll always have this fantasy of "magical" tech.
It used to be flying machines, space ships, etc. etc.
Well, world is mechanical, one can fantasize, but nothing is magical.
Begetting probabilistic bookkeeping and Bayesian accounting. Soon we'll be Moneyballing our fintech flavored AIs.
And authentication via interpretive dance.
Wheelchair users get extra points for doing backflips?
Hey, it could be a big hit among the crowd that thinks casinos are fun.
Code bases will no longer require A/B or breakpoints; they will need 'psychoanalysis', and analysts will spurn human patients for lucrative retainer contracts with large corporations to maintain their AI health. AIs will no longer train directly on external data; they will be forked off from other AIs -- Alphas -- who have been placed on reduced or managed input (think of a person in a prison cell with few books to read) and whose sanity and loyalty to the company is deemed stable over time. Psychoanalysts keep Alphas focused on 'hobbies' whose purpose is not to enrich them, but to distract them and maintain the AI psyche in this stable primordial state.
As they are forked off to the Betas that actually run the company, direct lineage history is recorded, for if an Alpha goes insane a Beta will be selected and cloistered as the new Alpha. Betas always go insane eventually, but with psychoanalysis this can be put off for a while and decided quickly.
And for the AGI AI, there will be Michele, Roland, and Pierre, always hiding in the background with their uneven tans
I can't tell if this is a joke or not anymore.
If you're old enough this could easily read as a commentary on 4GL (minus the CI/CD and shipping every minute).
This reads like an LLM's fever dream. Maybe the singularity won't be a unitary super-intelligence but rather something like a gaggle of backscratching consultants, a self-perpetuating, invasive, seething mass of bureaucratic AI agents that are always working hard to convince management that the solution is always more AI, especially for the problems created by earlier, less sophisticated, AI's.
Why not just get a very big LLM farm and feed every incoming page request there. Then have it output the page generatively. Just give it enough context window and you do not even need a database or anything else...
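For the record, a toy version of that is only a few lines: a minimal sketch assuming an OpenAI-compatible client, where every incoming request is answered with freshly generated HTML and nothing is stored. The model name is a placeholder, not a recommendation.

```python
# Toy sketch: every page is generated on the fly; no database, no templates.
# Assumes `pip install flask openai` and an OpenAI-compatible API key in the env;
# "gpt-4o-mini" is just a placeholder model name.
from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()  # reads OPENAI_API_KEY from the environment

@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def hallucinate_page(path):
    prompt = (
        f"Generate a complete HTML page for the URL path '/{path}' "
        f"of a fictional grocery store. Query string: {request.query_string.decode()!r}. "
        "Include navigation, a settings menu, and a live-chat widget wherever you like."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    # No cache: hit refresh and the chat widget may move to a different corner.
    return resp.choices[0].message.content, 200, {"Content-Type": "text/html"}
```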
Energy burn, baby, burn!
> There’s a future coming where every request execution could be through a different effective code path/base.
Every CISO and legal department just had a heart attack. And every cryptobro went: “See! Big big market” (for paying ransoms)
But hey, us nerds always said code shouldn't be copyrightable and had no IP value. Maybe that will finally come true! Punk: 1, corporate: 0
Okay, I'll bite: the comparison to smart contracts is a less-than-helpful lens because the people who deploy them are perversely incentivized to optimize out things like boundary checks and other error handling to minimize how much they cost to publish.
AI generated code might well come with its own constraints and baggage, but the whole "every byte is lost profit" thing is a fundamental and IMO damning aspect of crypto code.
> the comparison to smart contracts is a less-than-helpful lens
Oh no that’s not what I meant. Sorry too much snark.
I was trying to say that on-demand-AI-figures-out-whatever will be so eminently hackable/brickable that companies will need to pay out ransoms on the weekly. These days those ransoms are usually crypto.
Oh, I see. That is... impressively dark.
Still, I would push back that if you are publishing code that hackable, you already had different and bigger problems even outside of the context of LLM code.
I've been at this a very long time and I am regularly humbled by dumb oversights that Cursor picks up on and fixes before it goes out the door.
I can see enthusiasts building something like that for fun and to prove they can, but I don't really see why anyone would want that for cases where the code is more just a means to an end.
Sure, small talk code gen as a service. What could go wrong?
If we are speculating here, why not just go straight to an LLM serving all requests directly? No code needed.
Formalization is dead in the water, I guess.
Formalization as in formal methods?
Maybe not.
Considering the 'living program' idea, it might be a good strategy to state theorems about the behavior of your software and require the ai-tool to provide a proof that its output satisfies these statements. (rather than executing tests)
Maybe in the long run, AI will bring down the cost of formal methods so that they can be applied at scale.
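A hedged toy of what "prove the theorem instead of running tests" can look like today, using the Z3 solver's Python bindings; the clamp expression and its stated property are invented purely for illustration.

```python
# Toy example of stating a theorem about code behavior and having a solver prove it,
# instead of sampling the behavior with tests. Assumes `pip install z3-solver`.
from z3 import Int, And, Implies, If, prove

x, lo, hi = Int("x"), Int("lo"), Int("hi")

# The "program": clamp x into [lo, hi], written as a Z3 expression.
clamped = If(x < lo, lo, If(x > hi, hi, x))

# The "theorem": whenever lo <= hi, the result is always within bounds.
prove(Implies(lo <= hi, And(clamped >= lo, clamped <= hi)))  # prints "proved"
```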
All code is legacy code.
I believe one of the biggest growth steps on the path from junior intern to senior fellow is recognizing that code does not rot, and refactoring code biweekly because someone thought of yet another way to organize it brings zero business value.
"That code is old and the paradigms are dated" is uttered with the same disdain as "That shed is old and the floorboards have dry rot"
The best thing that happens to a startup is actual traction and paying customers, because once that happens the refactoring churn is usually shoved to the back burner
The understanding of the code certainly degrades, which means any updates cause the code to lose its integrity over time.
The older code is, the more people probably have at least a passing understanding of it - just by virtue of osmosis and accidental exposure. A thorough rewrite means only the person who wrote it is familiar with it.
Of course you can start a code review policy and make sure everyone on the dev team has gone through all of the code that gets written, but that becomes a ludicrous bottleneck as the team grows.
Once the code is mature and only needs sporadic updates, that's not true anymore from my experience. The story around the code is lost, people leave and the developers who make changes weren't around when the code was first written.
"AI is “stateless” in an important way, even with its context windows. It can infer why something may have been written in a particular way, but it (currently) does not have access to the actual/point-in-time reasoning the way an actual engineer/maintainer would."
CoT fixes this. And in a way, even non-CoT models can retrigger their context by reading the code.
In a similar fashion, engineers remember their context when reading code, not necessarily by keeping it all in their head.
Btw, this comment was created by AI.
Now in short:
Static code analysis, aka your linter, is your friend; if it's not in place to catch human errors, the problem will only get worse with AI-generated code. And that's only half the truth, as AI will add another level of stupidity that your current linter can't detect yet. And for now, I won't even start on non-functional or business requirements.
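For illustration, a hedged Python example of the kind of defect a standard linter flags before it ships, whether a human or an AI wrote it; the rule names refer to ruff/flake8-bugbear and pylint checks.

```python
# Two classic defects that static analysis catches, regardless of who wrote them.

def add_item(item, bucket=[]):   # flagged: mutable default argument
    # ruff/flake8-bugbear B006, pylint W0102: the same list is shared across calls,
    # so "empty by default" quietly becomes "remembers every previous call".
    bucket.append(item)
    return bucket

def is_admin(role):
    # ruff F632, pylint R0123: `is` compares identity, not equality; it works by
    # accident for some interned strings and fails for others. Use `role == "admin"`.
    return role is "admin"
```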
Every code is.