I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
The plan-build-test-reflect loop is equally important when using an LLM to generate code, as anyone who's seriously used the tech knows: if you yolo your way through a build without thought, it will collapse in on itself quickly. But if you DO apply that loop, you get to spend much more time on the part I personally enjoy, architecting the build and testing the resultant experience.
> While the LLMs get to blast through all the fun, easy work at lightning speed, we are then left with all the thankless tasks
This is, to me, the root of one disagreement I see playing out in every industry where AI has achieved any level of mastery. There's a divide between people who enjoy the physical experience of the work and people who enjoy the mental experience of the work. If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting. But if you like the doing, the typing, fiddling with knobs and configs, etc etc, all AI does is take the good part away.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
A software engineer's primary job isn't producing code, but producing a functional software system. Most important to that is the extremely hard to convey "mental model" of how the code works and an expertise in the domain it works in. Code is a derived asset of this mental model. And you will never know code as well as a reader as you would have as the author, for anything larger than a very small project.
There are other consequences of not building this mental model of a piece of software. Reasoning at the level of syntax is proving to have limits that LLM-based coding agents are having trouble scaling beyond.
> And you will never know code as well as a reader as you would have as the author, for anything larger than a very small project.
This feels very true - but also consider how much code exists for which many of the current maintainers were not involved in the original writing.
There are many anecdotal rules out there about how much time is spent reading code vs writing. If you consider the industry as a whole, it seems to me that the introduction of generative code-writing tools is actually not moving the needle as far as people are claiming.
We _already_ live in a world where most of us spend much of our time reading and trying to comprehend code written by others from the past.
What's the difference between a messy codebase created by a genAI, and a messy codebase where all the original authors of the code have moved on and aren't available to ask questions?
> What's the difference between a messy codebase created by a genAI, and a messy codebase where all the original authors of the code have moved on and aren't available to ask questions?
The difference is the hope of getting out of that situation. If you've inherited a messy and incoherent code base, you recognize that as a problem and work on fixing it. You can build an understanding of the code through first reading and then probably rewriting some of it. This over time improves your ability to reason about that code.
If you're constantly putting yourself back into that situation by relegating the reasoning about code to a coding agent, then you won't develop a mental model. You're constantly back at Day 1 of having to "own" someone else's code.
The key point is "relegating the reasoning". The real way to think about interfacing with LLMs is "abstraction engineering". You still should fully understand the reasoning behind the code. If you say "make a form that captures X, Y, Z and passes it to this API" you relegate how it accomplishes that goal and everything related to it. Then you look at the code and realize it doesn't handle validation (check the reasoning), so you have it add validation and toasts. But you are now working on a narrower level of abstraction because the bigger goal of "make a user form" has been completed.
Where this gets exhausting is when you assume certain things that you know are necessary but don't want to verify - maybe it lets you submit an email form with no email, or validates a password as an email field for some reason, etc. But as LLMs improve their assumptions or you manage context correctly, the scale tips towards this being a useful engineering tool, especially when what you are doing is a well-trodden path.
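To make that concrete with a toy sketch (Python; the field names and rules are assumptions for illustration, not any particular framework's API), this is the kind of narrow detail you end up verifying once the bigger "make a user form" goal is done:

    import re

    EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def validate_user_form(form):
        # The sort of thing generated form code quietly skips:
        errors = []
        if not EMAIL_RE.match(form.get("email", "")):
            errors.append("email: not a valid address")
        if len(form.get("password", "")) < 8:
            errors.append("password: minimum 8 characters")
        return errors

    print(validate_user_form({"email": "", "password": "hunter2"}))
    # ['email: not a valid address', 'password: minimum 8 characters']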
I find this to be too rosy a story about using agentic coding to add to a codebase. In my experience, miss a small detail about the code and the agent can go out of control, creating a whole new series of errors that you wouldn’t otherwise have had to fix. And even if you don’t miss a detail, the agent eventually forgets because of the limited context window.
This is why I’ve constrained my use of AI agents to mostly “read-only and explain” use cases, but I have very strict conditions for letting it write. In any case, whatever productivity gains you supposedly “get” for its write scenarios, you should be subtracting your expenses to fix its output later and/or payments made for a larger context window or better reasoning. It’s usually not worth the trouble to me when I have plenty of experience and knowledge to draw from and can write the code as it should be myself.
So there’s another force at work here that to me answers the question in a different way. Agents also massively decrease the difficulty of coming into someone else’s messy code base and being productive.
Want to make a quick change or fix? The agent will likely figure out a way to do it in minutes rather than the hours it would take me to do so.
Want to get a good understanding of the architecture and code layout? Working with an agent for search and summary cuts my time down by an order of magnitude.
So while I agree there’s a lot more “what the heck is this ugly pile of if else statements doing?” and “why are there three modules handling transforms?”, there is a corresponding drop in the cost of adding features and paying down tech debt. Finding the right balance is a bit different in the agentic coding world; it’s a different mindset and set of practices to develop.
In my experience this approach is kicking the can down the road. Tech debt isn't paid down, it's being added to, and at some point in the future it will need to be collected.
When the agent can't kick the can any more who is going to be held responsible? If it is going to be me then I'd prefer to have spent the hours understanding the code.
> In the current times you’re either an agent manager or you’re in for a surprise.
This opinion seems to be popular, if only in this forum and not in general.
What I do not understand is this: in order to use LLMs to generate code, the engineer has to understand the problem well enough to formulate prompt(s) that produce usable output (code). Assuming the engineer has this level of understanding, along with knowledge of the target programming language and libraries used, how is using LLM code generation anything more than a typing saver?
> What's the difference between a messy codebase created by a genAI, and a messy codebase where all the original authors of the code have moved on and aren't available to ask questions?
Messy codebases made by humans are known to be a bad thing that causes big problems for software that needs to be maintained and changed. Much effort goes into preventing them and cleaning them up.
If you want to work with AI code systems successfully then you better apply these exact same efforts. Documentation, composition, validation, evaluation, review and so on.
> We _already_ live in a world where most of us spend much of our time reading and trying to comprehend code written by others from the past.
We also live in a world where people argue endlessly about how we don't need to write documentation or how it's possible to write self documenting code. This is where we spend so much of our time and yet so few actually invest in the efforts to decrease that time.
I bring this up because it's a solution to what you're pointing out as a problem and yet the status quo is to write even messier and harder to understand code (even before AI code). So I'm just saying, humans are really good at shooting themselves in the foot and blaming it on someone else or acting like the bullet came out of nowhere.
> What's the difference between
More so, I must be misreading, because it sounds like you're asking what's the difference between "messy" and "messier"?
If it's the same level of messiness, then sure, it's equal. But in a real world setting there's a continuous transition of people. One doesn't work on code in isolation, quit, and then a new person works on that code also in isolation. So maybe it's not the original authors but rather the original authors are a Ship of Theseus. Your premise isn't entirely accurate and I think the difference matters
>The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
You can describe what the code should do with natural language.
I've found that using literate programming with agent calls to write the tests first, then code, then the human refining the description of the code, and going back to 1 is surprisingly good at this. One of these days I'll get around to writing an emacs mode to automate it because right now it's yanking and killing between nearly a dozen windows.
Of course this is much slower than regular development but you end up with world class documentation and understanding of the code base.
> The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
Why? Code has always been the artifact. Thinking about and understanding the domain clearly and solving problems is where the intrinsic value is at (but I'd suspect that in the future this, too, will go away).
Code is the final artifact after everything is shipped. But while the development is active, it is more than that (at least for now), as you need to know implementation details even if you are really proficient at the domain knowledge.
Although I do agree that there is a possibility that we'll build a relatively reliable abstraction using LLMs at some point, so this issue will go away. There will probably be some restrictions, but I think it is possible.
Code isn't an "artifact", it's the actual product that you are building and delivering. You can use flowery language and pontificate about the importance of the problem domain if you like, but at the end of the day we are producing low-level sequences of instructions that will be executed by a real-world device. There has always been, and likely will always be, value in understanding exactly what you are asking the computer to do.
I'm familiar with "artifact" being used to describe the inconsequential and easy to reproduce output of some deterministic process (e.g. build artifact). Even given the terminology you provide here it doesn't change the content of my point above.
When I see someone dismissing the code as a small irrelevant part of the task of writing software, it's like hearing that the low-level design and physical construction of a bridge is an irrelevant side-effect of my desire to cross a body of water. Like, maybe that's true in a philosophical sense, but at the end of the day we are building a real-world bridge that needs to conform to real-world constraints, and every little detail is going to be important. I wouldn't want to cross a bridge built by someone who thinks otherwise.
In most domains, code is not the actual product. Data is. Code is how you record, modify and delete data. But it is ultimately data that has meaning and value.
This is why we have the idiom: “Don’t tell me what the code says—show me the data, and I’ll tell you what the code does.”
Reminds me of criticisms of Python decades ago: that you wouldn't understand what the "real code" was doing since you were using a scripting language. But then over the years it showed tremendous value, and many unicorns were built by focusing on higher-level details and not lower-level code.
Comparing LLMs to programming languages is a false equivalence. I don’t have to write assembly because LLVM will do that for me correctly in 100% of the cases, while AI might or might not (especially the more I move away from template CRUD apps).
That is a myth. CPU time is time your users spend waiting around while the CPU takes seconds to do something that could be instant; if you have millions of users and that happens every day, it quickly adds up to many years' worth of time (rough numbers below).
It might be true if you just look at development cost, but if you look at value as a whole it isn't. And even on development cost alone it's often not true, since time the developer spends waiting for tests to run and things to start also slows things down; taking a bit of time there to reduce CPU time is well worth it just to get things done faster.
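To put rough numbers on the "many years" claim (all figures here are assumptions, purely for illustration):

    # One million users each losing two seconds a day to something that could be instant.
    users = 1_000_000
    wasted_seconds_per_user_per_day = 2
    seconds_per_year = 60 * 60 * 24 * 365

    person_years_wasted_per_year = users * wasted_seconds_per_user_per_day * 365 / seconds_per_year
    print(round(person_years_wasted_per_year, 1))  # ~23 person-years of waiting, every year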
Yeah, it's time spent by the users. Maybe it's an inefficiency of the market because the software company doesn't feel the negative effect enough, maybe it really is cheaper in aggregate than doing 3 different native apps in C++. But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
> But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
Many of us do frequently argue for something similar. Take a look at Casey Muratori’s performance aware programming series if you care about the arguments.
> But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
That is an extreme case though, I didn't mean that all optimizations are always worth it, but if we look at marginal value gained from optimizations today the payback is usually massive.
It isn't done enough since managers tend to undervalue user and developer time. But users don't undervalue user time, if your program wastes their time many users will stop using it, users are pretty rational about that aspect and prefer faster products or sites unless they are very lacking. If a website is slow a few times in a row I start looking for alternatives, and data says most users do that.
I even stopped my JetBrains subscription since the editor got so much slower in an update, so I just use the one I can keep forever as I don't want their patched editor. If it didn't get slower I'd gladly keep it as I liked some of the new features, but it being slower was enough to make me go back.
Also, while managers can obviously agree that making developers spend less time waiting is a good thing, it is very rare for managers to tell you to optimize compilation times or such, and pretty simple optimizations there can often make that part of the work massively faster. If you profile your C++ compiler and look at which files it spends time compiling, then look at those files to figure out why it's so slow there, you can find weird things, and fixing them speeds it up 10x, so what took 30 seconds now takes 3 seconds. That is obviously very helpful, and if you are used to that sort of thing you can do it in a couple of hours.
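A rough sketch of how that kind of profiling can start (Python; the source layout and compiler flags are assumptions, and clang's -ftime-trace or gcc's -ftime-report give a finer per-file breakdown):

    # Time each translation unit separately and list the worst offenders.
    import glob
    import subprocess
    import time

    timings = []
    for src in glob.glob("src/**/*.cpp", recursive=True):
        start = time.monotonic()
        subprocess.run(
            ["g++", "-std=c++17", "-Iinclude", "-c", src, "-o", "/dev/null"],
            check=False,
        )
        timings.append((time.monotonic() - start, src))

    # The slowest few files are usually where a stray heavy #include is hiding.
    for seconds, src in sorted(timings, reverse=True)[:10]:
        print(f"{seconds:6.2f}s  {src}")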
That's not the same thing. LLMs don't just obscure low-level technical implementation details like Python does, they also obscure your business logic and many of its edge cases.
Letting a Python interpreter manage your memory is one thing because it's usually irrelevant, but you can't say the same thing about business logic. Encoding those precise rules and considering all of the gnarly real-world edge cases is what defines your software.
There are no "higher level details" in software development, those are in the domain of different jobs like project managers or analysts. Once AI can reliably translate fuzzy natural language into precise and accurate code, software development will simply die as a profession. Our jobs won't morph into something different - this is our job.
But working with AI isn’t really a higher level of abstraction. It’s a completely different process. I’m not hating on it, I love LLMs and use em constantly, but it doesn’t go assembly > C > python > LLMs
> The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
In any of my teams with moderate to significant code bases, we've always had to lean very hard into code comments and documentation, because a developer will forget in a few months the fine details of what they've previously built. And further, any org with turnover needs to have someone new come in and be able to understand what's there.
I don't think I've met a developer that keeps all of the architecture and design deeply in their mind at all times. We all often enough need to go walk back through and rediscover what we have.
Which is to say... if the LLM generator were instead a colleague or neighboring team, you'd still need to keep up with them. If you can adapt those habits to the generated code, then it doesn't seem to be a big leap.
What is "understanding code", mental model of the problem? These are terms for which we all have developed a strong & clear picture of what they mean. But may I remind us all that used to not be the case before we entered this industry - we developed it over time. And we developed it based on a variety of highly interconnected factors, some of which are e.g.: what is a program, what is a programming language, what languages are there, what is a computer, what software is there, what editors are there, what problems are there.
And as we mapped out this landscape, weren't there countless situations where things felt dumb and annoying, and then situations in which they sometimes became useful, and sometimes remained dumb? Things you thought were actively making you lose brain cells as you did them, because you were doing them wrong?
Or are you to claim that every hurdle you cross, every roadblock you encounter, every annoyance you overcome has pedagogical value to your career? There are so many dumb things out there. And what's more, there's so many things that appear dumb at first and then, when used right, become very powerful. AI is that: Something that you can use to shoot yourself in the foot, if used wrong, but if used right, it can be incredibly powerful. Just like C++, Linux, CORS, npm, tcp, whatever, everything basically.
I can imagine an industry where we describe business rules to apply to data in natural language, and the AI simply provides an executable without source at all.
The role of the programmer would then be to test if the rules are being applied correctly. If not, there are no bugs to fix, you simply clarify the business rules and ask for a new program.
I like to imagine what it must be like for a non-technical business owner who employs programmers today. There is a meeting where a process or outcome is described, and a few weeks / months / years later a program is delivered. The only way to know if it does what was requested is to poke it a bit and see if it works. The business owner has no mental model of the code and can't go in and fix bugs.
update: I'm not suggesting I believe AI is anywhere near being this capable.
Not really, it's more a case of "potentially can" rather than "will". This dynamic has always been there with the whole junior/senior dev split; it's not a new problem. You 100% can use it without losing this, and in an ideal world you can even go so far as to not worry about the understanding for parts that are inconsequential.
>> The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
All code is temporary and should be treated as ephemeral. Even if it lives for a long time, at the end of the day what really matters is data. Data is what helps you develop the type of deep understanding and expertise of the domain that is needed to produce high quality software.
In most problem domains, if you understand the data and how it is modeled, the need to be on top of how every single line of code works and the nitty-gritty of how things are wired together largely disappears. This is the thought behind the idiom “Don’t tell me what the code says—show me the data, and I’ll tell you what the code does.”
It is therefore crucial to start every AI-driven development effort with data modeling, and have lots of long conversations with AI to make sure you learn the domain well and have all your questions answered. In most cases, the rest is mostly just busywork, and handing it off to AI is how people achieve the type of productivity gains you read about.
Of course, that's not to say you should blindly accept everything the AI generates. Reading the code and asking the AI questions is still important. But the idea that the only way to develop an understanding of the problem is to write the code yourself is no longer true. In fact, it was never true to begin with.
> The article sort of goes sideways with this idea, but pointing out that AI coding robs you of a deep understanding of the code it produces is a valid and important criticism of AI coding.
No it isn't. There's literally nothing about the process that forces you to skip understanding. Any such skips are purely due to the lack of will on the developer's side. This lack of will to learn will not change the outcomes for you regardless of whether you're using an LLM. You can spend as much time as you want asking the LLM for in-depth explanations and examples to test your understanding.
So many of the criticisms of coding with LLMs I've seen really do sound like they're coming from people who already started with a pre-existing bias, fiddled with it for a short bit (or worse, never actually tried it at all) and assumed their limited experience is the be-all end-all of the subject. Either that, or they're typical skill issues.
> There's literally nothing about the process that forces you to skip understanding.
There's nothing about C that "forces" people to write buffer overflows. But, when writing C, the path of least resistance is to produce memory-unsafe code. Your position reminds me of C advocates who say that "good developers possess the expertise and put in the effort to write safe code without safeguards," which is a bad argument because we know memory errors do show up in critical code regardless of what a hypothetical "good C dev" does.
If the path of least resistance for a given tool involves using that tool dangerously, then it's a dangerous tool. We say chefs should work with sharp knives, but with good knife technique (claw grip, for instance) safety is the path of least resistance. I have yet to hear of an LLM workflow where skimming the generated code is made harder than comprehensively auditing it, and I'm not sure that such a workflow would feel good or be productive.
Your point of view assumes the best of people, which is naive. It may not force you to skip understanding, however it makes it much easier to than ever before.
People tend to take the path of least resistance, maybe not everyone, maybe not right away, but if you create opportunities to write poor code then people will take them - more than ever it becomes important to have strong CI, review and testing practices.
Edit: okay, maybe I am feeling a little pessimistic this morning :)
People will complain about letting the LLM code because you won't understand every nuance. Then they will turn around and pip install a dependency without even glancing at the underlying code.
> No it isn't. There's literally nothing about the process that forces you to skip understanding. Any such skips are purely due to the lack of will on the developer's side
This is the whole point. The marginal dev will go to the path of least resistance, which is to skip the understanding and churn out a bunch of code. That is why it's a problem.
You are effectively saying "just be a good dev, there's literally nothing about AI which is stopping you from being a good dev" which is completely correct and also missing the point.
The marginal developer is not going to put in the effort to wield AI in a skillful way. They're going to slop their way through. It is a concern for widespread AI coding, even if it's not a concern for you or your skill peers in particular.
To add to the above - I see a parallel to the "if you are a good and diligent developer there is nothing to stop you from writing secure C code" argument. Which is to say - sure, if you also put in extra effort to avoid all the unsafe bits that lead to use-after-free or race conditions it's also possible to write perfect assembly, but in practice we have found that using memory safe languages leads to a huge reduction of safety bugs in production. I think we will find similarly that not using AI will lead to a huge reduction of bugs in production later on when we have enough data to compare to human-generated systems. If that's a pre-existing bias, then so be it.
> The marginal developer is not going to put in the effort to wield AI in a skillful way. They're going to slop their way through. It is a concern for widespread AI coding, even if it's not a concern for you or your skill peers in particular.
My mental model of it is that coding with LLMs amplifies both what you know and what you don't.
When you know something, you can direct it productively much faster to a desirable outcome than you could on your own.
When you don't know something, the time you normally would have spent researching to build a sufficient understanding to start working on it can be replaced with evaluating the random stuff the LLM comes up with, which oftentimes works but not in the way it ought to; and since you can get to some result quickly, the trade-off of doing the research feels somehow less worth it.
Probably if you don't have any idea how to accomplish the task you need to cultivate the habit of still doing the research first. Wielding it skillfully is now the task of our industry, so we ought to be developing that skill and cultivating it in our team members.
I don't think that is a problem with AI, it is a problem with the idea that pure vibe-coding will replace knowledgeable engineers. While there is a loud contingent that hypes up this idea, it will not survive contact with reality.
Purely vibe-coded projects will soon break in unexplainable ways as they grow beyond trivial levels. Once that happens their devs will either need to adapt and learn coding for real or be PIP'd. I can't imagine any such devs lasting long in the current layoff-happy environment. So it seems like a self-correcting problem no?
(Maybe AGI, whatever that is, will change things, but I'm not holding my breath.)
The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real.
> The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real.
Learning the ropes looks different now. You used to learn by doing; now you need to learn by directing. In order to direct well, you have to first be knowledgeable. So, if you're starting work in an unfamiliar technology, a good starting point is to read whatever O'Reilly book gives a good overview, so that you understand the landscape of what's possible with the tool and can spot when the LLM is doing (now) obvious bullshit.
You can't just Yolo it for shit you don't know and get good results, but if you build a foundation first through reading, you will do a lot better.
On vibe coding being self-correcting, I would point to the growing number of companies mandating usage of AI and the quote "the market can stay irrational longer than you can stay solvent". Companies routinely burn millions of dollars on irrational endeavours for years. AI has been promised as an insane productivity booster.
I wouldn't expect things to calm down for a while, even if real-life results are worse. You can make excuses for underperformance of these things for a very long time, especially if the CEO or other executives are invested.
> The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real
I hate to say it but that's never going to happen :/
We most definitely should, especially if you're working in a team or organization bigger than a handful of people, because it's almost certain that you will need to change or interact with that code very soon in the lifetime of the project. When that happens you want to make sure the code aligns with your own mental model of how things work.
The industry has institutionalized this by making code reviews a very standard best practice. People think of code reviews mainly as a mechanism to reduce bugs, but it turns out the biggest benefits (borne out by studies) actually are better context-sharing amongst the team, mentoring junior engineers, and onboarding of new team-mates. It ensures that everyone has the same mental model of the system despite working on different parts of it (c.f. the story of the blind men and the elephant.) This results in better ownership and fewer defects per line of code.
Note, this also doesn't mean everybody reviews each and every PR. But any non-trivial PR should be reviewed by team-mates with appropriate context.
AI is not my coworker, with different tasks and responsibilities.
The comparison is only reasonable if most of your job is spent trying to understand their code and make sure it did what you wanted, and with them standing next to you, ready to answer questions, explain anything I don't understand and pull in any external, relevant parts of the codebase.
Of course not, that's a bit disingenuous. I would hope my colleagues write code that is comprehensible so it's maintainable. I think that if the code is so complex and inscrutable that only the author can understand it, then it's not good code. AI doesn't create or solve this problem.
I do think when AI writes comprehensible code you can spend as much time as necessary asking questions to better understand it. You can ask about tradeoffs and alternatives without offending anybody and actually get to a better place in your own understanding than would be possible alone.
Who is this endless cohort of developers who need to maintain a 'deep understanding' of their code? I'd argue a high % of all code written globally on any given day that is not some flavour of boilerplate, while written with good intention, is ultimately just short-lived engineering detritus, if it even gets a code review to pass.
If you're on HN there's a good chance you've self-selected into "caring about the craft and looking for roles that require more attention."
You need to care if (a) your business logic requirements are super annoyingly complex, (b) you have hard performance requirements, or (c) both. (c) is the most rare, (a) is the most common of those three conditions; much of the programmer pay disparity between the top and the middle or bottom is due to this, but even the jobs where the complexity is "only" business requirements tend to be quite a bit better compensated than the "simple requirements, simple needs" ones.
I think there's a case to be made that LLM tools will likely make it harder for people to make that jump, if they want to. (Alternately they could advance to the point where the distinction changes a bit, and is more purely architectural; or they could advance to the point where anyone can use an LLM to do anything - but there are so many conditional nuances to what the "right decision" is in any given scenario there that I'm skeptical.)
A lot of times floor-raising things don't remove the levels, they just push everything higher. Like a cheap crap movie today will visually look "better" from a technology POV (sharpness, special effects, noise, etc) than Jurassic Park from the 90s, but the craft parts won't (shot framing, deliberate shifts of focus, selection of the best takes). So everyone will just get more efficient and more will be expected, but still stratified.
And so some people will still want to figure out how to go from a lower-paying job to a higher-paying one. And hopefully there are still opportunities, and we don't just turn into other fields, picking by university reputations and connections.
> You need to care if (a) your business logic requirements are super annoyingly complex, (b) you have hard performance requirements, or (c) both. (c) is the most rare
But one of the most fun things you can do is (c): creative game development coding. Like coding world simulations etc, where you want to be very fast but the rules and interactions are very coupled and complex compared to most regular enterprise logic, which is more decoupled.
So while most of the work programmers do fits (a), the work people dream about doing is (c), and that means LLMs don't help you make fun things; they just remove the boring jobs.
In my experience the small percent of developers who do have a deep understanding are the only reason the roof doesn’t come crashing in under the piles of engineering detritus.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Here's mine. I use Cline occasionally to help me code, but more and more I find myself just coding by hand. The reason is pretty simple: with these AI tools you, for the most part, replace writing code with writing a prompt.
I look at it like this, if writing the prompt, and the inference time is less than what it would take me to write the code by hand I usually go the AI route. But this is usually for refactoring tasks where I consider the main bottleneck to be the speed at which my fingers can type.
For virtually all other problems it goes something like this: I can do X task in 10 minutes if i code it manually or I can prompt AI to do it and by the time I finish crafting the prompt and execute, it takes me about 8 minutes. Yes that's a savings of 2 minutes on that task and that's all fine and good assuming that the AI didn't make a mistake, if I have to go back and re-prompt or manually fix something, then all of a sudden the time it took me to complete that task is now 10-12 minutes with AI. Here the best case scenario is I just spent some AI credits for zero time savings and worse case is I spent AI credits AND the task was slower in the end.
With all sorts of tasks I now find myself making this calculation and for the most part, I find that doing it by hand is just the "safer" option, both in terms of code output but also in terms of time spent on the task.
GitHub copilot already does have speech to text and as my sibling comment mentions, on the Mac, it is globally available. It varies according to typing and speaking speed but speaking should be about five times faster than typing.
On a mac you can just use a hotkey to talk to an agentic CLI. It needs to be a bit more polished still IMO, like removing the hotkey requirement, with a voice command to break the agents current task.
I believe it does on newer macs (m4 has neural engine). It's not perfect, but I'm using it without issue. I suspect it'll get better each generation as Apple leans more into their AI offering.
There are also third parties like Wispr that I haven't tried, but might do a better job? No idea.
The mac one is pretty limited. I paid for a similar tool as above and the LLM backing makes the output so much better. All my industry specific jargon gets captured perfectly whereas the Apple dictation just made up nonsense.
I find myself often writing pseudo code (CLI) to express some ideas to the agent. Code can be a very powerful and expressive means of communication. You don't have to stop using it when it's the best / easiest tool for a specific case.
That being said, these agents may still just YOLO and ignore your instructions on occasion, which can be a time suck, so sometimes I still get my hands dirty too :)
> the idea that technology forces people to be careless
I don't think anyone's saying that about technology in general. Many safety-oriented technologies force people to be more careful, not less. The argument is that this technology leads people to be careless.
Personally, my concerns don't have much to do with "the part of coding I enjoy." I enjoy architecture more than rote typing, and if I had a direct way to impose my intent upon code, I'd use it. The trouble is that chatbot interfaces are an indirect and imperfect vector for intent, and when I've used them for high-level code construction, I find my line-by-line understanding of the code quickly slips away from the mental model I'm working with, leaving me with unstable foundations.
I could slow down and review it line-by-line, picking all the nits, but that moves against the grain of the tool. The giddy "10x" feeling of AI-assisted coding encourages slippage between granular implementation and high-level understanding. In fact, thinking less about the concrete elements of your implementation is the whole advantage touted by advocates of chatbot coding workflows. But this gap in understanding causes problems down the line.
Good automation behaves in extremely consistent and predictable ways, such that we only need to understand the high-level invariants before focusing our attention elsewhere. With good automation, safety and correctness are the path of least resistance.
Chatbot codegen draws your attention away without providing those guarantees, demanding best practices that encourage manually checking everything. Safety and correctness are the path of most resistance.
> The argument is that this technology leads people to be careless.
And this will always be a result of human preference optimization. There's a simple fact: humans prefer lies that they don't know are lies over lies that they do know are lies.
We can't optimize for an objective truth when that objective truth doesn't exist. So while we do our best to align our models, they simultaneously optimize their ability to deceive us. There's little to no training in that loop where outputs are deeply scrutinized, because we can't scale that type of evaluation. We end up rewarding models that are incorrect in their output.
We don't optimize for correctness, we optimize for the appearance of correctness. We can't confuse the two.
The result is: when LLMs make errors, those errors are difficult for humans to detect.
This results in a fundamentally dangerous tool, does it not? Good tools, when they error or fail, do so safely and loudly. Instead this one fails silently. That doesn't mean you shouldn't use the tool, but that you need to do so with an abundance of caution.
> I could slow down and review it line-by-line, picking all the nits, but that moves against the grain of the tool.
Actually the big problem I have with coding with LLMs is that it increases my cognitive load, not decreases it. Being overworked results in carelessness. Who among us does not make more mistakes when they are tired or hungry?
That's the opposite of lazy, so hopefully answers OP.
I use LLMs for coding and I like the way I am using them. I do not outsource thinking, and I do not expect the LLM to know what I want without giving it context for my thoughts regarding the project. I have written a 1000 LOC program in C using an LLM. It was a success. I have reviewed it "line by line" though, I do not know why I would not do this. Of course it did not spit out 1000 LOC from the get go, we started small and we built upon our foundations. It has an idea of my thinking and my preferences regarding C and the project because of our interactions that gave it context.
> I have written a 1000 LOC program in C using an LLM.
> I have reviewed it "line by line" though, I do not know why I would not do this.
1k LOC is not that much. I can easily do this in a day's project.
But it's pretty rare you're going to be able to review every line in a mature project, even if you're developing that project. Those can contain hundreds or even thousands of files with hundreds (hopefully not thousands) of LOC. While it's possible to review every line it's pretty costly in time and it's harder since the code is changing as you're doing this...
Think of it this way, did you also review all the lines of code in all the libraries you used? Why not? The reasoning will be pretty similar. This isn't to say we shouldn't spend more time exploring the code we work with nor that we likely wouldn't benefit from this, but that time is a scarce resource. So the problem is when the LLM is churning out code faster than you can review.
While coding you are hopefully also debugging and thinking. By handing coding over to the LLM you decouple this. So you reduce your time writing lines of code but increase time spent debugging and analyzing. There will be times where this provides gains but IME this doesn't happen in serious code. But yeah, my quick and dirty scripts can be churned out a lot faster. That saves time, but not 10x. At least not for me
nobody in this or any meaningful software engineering discussion is talking about software projects that are 1000, or even 10000, SLoC. these are trivial and uninteresting sizes. the discussion is about 100k+ SLoC projects.
I do not see how this is always necessarily implied. And should I seriously always assume this is the case? Where are you getting this from? None of these projects people claim to successfully (or not) written with the help from LLM have 10k LOC, let alone >100k. Should they just be ignored because LOC is not >100k?
Additionally, why is it that whenever I mention success stories accomplished with the help of LLMs, people rush to say "does not count because it is not >100k LOC". Why does it not count, why should it not count? I would have written it by hand, but I finished much faster with the help of an LLM. These are genuine projects that solve real problems. Not every significant project has to have >100k LOC. I think we have a misunderstanding of the term "significant".
> nobody in this or any meaningful software engineering discussion is talking about software projects that are 1000, or even 10000, SLoC.
Because small programs are really quick and easy to write, there was never a bottleneck making them and the demand for people to write small programs is very small.
The difficulty of writing a program scales super linearly with size, an experienced programmer in his current environment easily writes a 500 line program in a day, but writing 500 meaningful lines to an existing 100k line codebase in a day is not easy at all. So almost all developer time in the world is spent making large programs, small programs is a drop in an ocean and automating that doesn't make a big difference overall.
Small programs can help you a lot, but that doesn't replace programmers since almost no programmers are hired to write small programs, instead automatically making such small programs mostly helps replace other tasks like regular white collar workers etc whose jobs are now easier to automate.
> but writing 500 meaningful lines to an existing 100k line codebase in a day is not easy at all.
I've had plenty of instances where it's taken more than a day to write /one line/ of code! I suspect most experienced devs have also had these types of experiences.
Not because the single line was hard to write but because the context in which it needed to be written.
Typing was never the bottleneck, and I'm not sure why this is the main argument for LLMs (e.g. "LLMs save me from the boilerplate"). When typing is a bottleneck it seems more likely that the procedure is wrong. Things like libraries, scripts, and skeletons tend to be far better solutions for those problems. In tough cases abstraction can be extremely powerful, but abstraction is a difficult tool to wield.
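As a tiny sketch of the "scripts and skeletons" point (the repository shape here is made up, just to show the idea): generate the repetitive part from a template once instead of typing it, or prompting for it, every time.

    from string import Template

    # A reusable skeleton for a common shape of class.
    SKELETON = Template(
        "class ${name}Repository:\n"
        "    def __init__(self, db):\n"
        "        self.db = db\n"
        "\n"
        "    def get(self, id):\n"
        "        return self.db.fetch('${table}', id)\n"
    )

    print(SKELETON.substitute(name="User", table="users"))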
There's an argument to be made that this gap is actually highlighting design issues rather than AI limitations.
It's entirely possible to have a 100k LOC system be made up of effectively a couple hundred 500-line programs that are composed together to great effect.
That's incredibly rare but I did once work for a company who had such a system and it was a dream to work in. I have to think AIs are making a massive impact there.
> It's entirely possible to have a 100k LOC system be made up of effectively a couple hundred 500-line programs that are composed together to great effect.
I'm confused. Are you imagining a program with 100k LoC is contained in a single file? Because you'd be insane to do such a thing. It's normally a lot of files with not many LoC each, which de facto meets this criterion.
You may also wish to look at the UNIX philosophy: the idea that programs should be small and focused, each doing one thing and doing it well. But there's a generalization of this philosophy when you realize a function is a program (see the sketch below).
I do agree there's a lot of issues with design these days but I think you've vastly oversimplified the problem.
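A toy illustration of that generalization (an invented example, not from any real system): each function does one thing, takes input, returns output, and the "system" is just their composition, so every piece stays small enough to review on its own.

    def parse(lines):
        # one thing: turn raw lines into numbers, skipping blanks
        return [int(x) for x in lines if x.strip()]

    def total(values):
        # one thing: aggregate
        return sum(values)

    def report(amount):
        # one thing: format for output
        return f"total: {amount}"

    print(report(total(parse(["1", "2", " ", "3"]))))  # total: 6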
> There's a simple fact: humans prefer lies that they don't know are lies over lies that they do know are lies.
As an engineer and researcher, I prefer lies (models, simplifications), that are known to me, rather than unknown unknowns.
I don't need to know exact implementation details, knowledge of aggregate benchmarks, fault rates and tolerances is enough. A model is a nice to have.
This approach works in science (physics, chemistry, biology, ...) and in engineering (including engineering agentic and social systems - social engineering).
> As an engineer and researcher, I prefer lies (models, simplifications), that are known to me, rather than unknown unknowns.
I think you misunderstood.
I'll make a corollary to help:
~> There's a simple fact: humans prefer lies that they believe are truths over lies that they do know are lies.
I'm unsure if you: misread "lies that they don't know are lies", conflated unknown unknowns with known unknowns, or (my guess) misunderstood that I am talking about the training process, which involves a human evaluator evaluating an LLM output. That last one would require the human evaluator to preference a lie over a lie that they do not know is actually a lie. I think you can see how we can't expect such an evaluation to occur (except through accident). For the evaluator to preference the unknown unknown, they would be required to preference what they believe to be a falsehood over what they believe is truth. You'd throw out such an evaluator for not doing their job!
As a researcher myself, yes, I do also prefer known falsehoods over unknown falsehoods but we can only do this from a metaphysical perspective. If I'm aware of an unknown then it is, by definition, not an unknown unknown.
How do you preference a falsehood which you cannot identify as a falsehood?
How do you preference an unknown which you do not know is unknown?
We have strategies like skepticism to help deal with this, but they don't make the problem go away. It ends up as "everything looks right, but I'm suspicious". Digging in can be very fruitful but is more frequently a waste of time for the same reason: if a mistake exists, we have not identified the mistake as a mistake!
> I don't need to know exact implementation details, knowledge of aggregate benchmarks, fault rates and tolerances is enough.
I think this is a place where there's a divergence in science and engineering (I've worked in both fields). The main difference in them is at what level of a problem you're working on. At the more fundamental level you cannot get away with empirical evidence alone.
Evidence can only bound your confidence in the truth of some claim but it cannot prove it. The dual to this is a much simpler problem, as disproving a claim can be done with a singular example. This distinction often isn't as consequential in engineering as there are usually other sources of error that are much larger.
As an example, we all (hopefully) know that you can't prove the correctness of a program through testing. It's a non-exhaustive process. BUT we test because it bounds our confidence about its correctness and we usually write cases to disprove certain unintended behaviors. You could go through the effort to prove correctness but this is a monumental task and usually not worth the effort.
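A tiny illustration of that asymmetry (the buggy mean() is an invented example): a passing test only raises confidence, while a single counterexample settles the question the other way.

    def mean(xs):
        return sum(xs) // len(xs)      # integer division: subtly wrong

    # A passing test raises confidence but proves nothing.
    assert mean([4, 4, 4]) == 4

    # One counterexample is enough to disprove correctness.
    print(mean([1, 2]) == 1.5)         # False: 3 // 2 == 1, so the claim is disproved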
But right now we're talking about a foundational problem, and such a distinction matters here. We can't resolve the limits of methods like RLHF without considering this problem. It's quite possible that there's no way around this limitation, since there are no objective truths for the majority of tasks we give LLMs. If that's true then the consequence is that a known unknown is "there are unknown unknowns". And like you, I'm not a fan of unknown unknowns.
We don't actually know the fault rates nor tolerances. Benchmarks do not give that to us in the general setting (where we apply our tools). This is a very different case than, say, understanding the performance metrics and tolerances of an o-ring. That part is highly constrained and you're not going to have a good idea of how well it'll perform as a spring, despite those tests having a lot of related information.
> If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting.
This argument is wearing a little thin at this point. I see it multiples times a day, rephrased a little bit.
The response, "How well do you think your thinking will go if you had not spent years doing the 'practice' part?", is always followed by either silence or a non sequitur.
So, sure, keep focusing on the 'thinking' part, but your thinking will get more and more shallow without sufficient 'doing'
Separate from AI, as your role becomes more tech lead / team lead / architect you're also not really "doing" as much and still get involved in a lot of thinking by helping people get unstuck. The thinking part still builds experience. You don't need to type the code to have a good understanding of how to approach problems and how to architect systems. You just need to be making those decisions and gaining experience from them.
> You just need to be making those decisions and gaining experience from them.
The important part that everyone glosses over is the "gaining experience" part.
The experience you gained writing code led to you being tech lead / team lead / architect.
The experience you get from those roles, including "helping people get unstuck", makes you valuable because there are people involved, not just technology. IOW, that is different to the experience you get from prompting.
We have yet to see how valuable the experience from prompting will be. At this point the prompters are just guessing that their skills won't atrophy, and that their new experience won't be at the same level as vibe-coders who can't spell "Python".
As a fairly senior person myself, and an occasional user of LLMs, and someone who has tried CC in recent months, the experience I got from LLMs, while not nothing, was not recognised by me as valuable in any way - it basically put me at the same skill level as a vibe-coder.
OTOH, the experience I got mentoring very junior engineers the month before that I recognised as instantly valuable; at the end of it I had learned new strategies for dealing with people, growing them, etc.
The only "experience" you get with LLM is "put another coin into the slot and pull the lever again".
> The only "experience" you get with LLM is "put another coin into the slot and pull the lever again".
I relate it to directors on a production. It's certainly very valuable to know how to operate a camera, and especially to understand lighting, storytelling, etc. It gives you insight in how to work with the people who are actually doing these tasks. It helps you to know when someone is gaslighting you, etc.
That being said, it's kind of an insane statement to say that all a director does is pull a lever. I'm sure there are a ton of wannabe directors who try to do exactly that and proceed to fail miserably if they don't adapt quickly to reality. But having a great director is obviously a huge differentiator in output.
Do I think we'll have as many programmers in the future as we do today? Probably not. I think we're going to see a real decimation of coders, but at the same time we might (I say "might") see much greater overall production that may not otherwise exist from the less talented vibers or w/e ridiculously critical name you want. Some of that is certainly going to be interesting and maybe even radically game changing.
IMO our feelings about this are about as relevant as shaking our fist at the cosmos.
> Separate from AI, as your role becomes more tech lead / team lead / architect you're also not really "doing" as much and still get involved in a lot of thinking by helping people get unstuck
True. But those roles require you to do a lot of thinking by helping a LOT of people. You end up shuffling between multiple projects/deliverables. Here we are talking about a developer working on a single project/deliverable and then equating it to AI. Not to mention the easy-to-forget part: by the time you are a tech lead / team lead / architect you have so many hours that you know some stuff like the back of your hand.
Do you think that all managers and tech leads atrophy because they don’t spend all day “doing”? I think a good number of them become more effective because they delegate the simple parts of their work that don’t require deep thought, leaving them to continue to think hard about the thorniest areas of what they’re working on.
Or perhaps you’re asking how people will become good at delegation without doing? I don’t know — have you been “doing” multiple years of assembly? If not, how are you any good at Python (or whatever language you currently use?). Probably you’d say you don’t need to think about assembly because it has been abstracted away from you. I think AI operates similarly by changing the level of abstraction you can think at.
> Do you think that all managers and tech leads atrophy because they don’t spend all day “doing”?
People have argued for years that software architects must write code.
Regarding your second paragraph: When you write python you then debug it at the level of the abstraction. You never debug the python interpreter. You can try to treat AI like an abstraction but it immediately breaks down as soon as you go to debug. It would only be a complete abstraction if you never had to deal with the generated code.
As an IC turned temporary manager that went back to being IC, yes, absolutely my skills atrophied. This isn't even a programming thing, this is just a regular human thing with most, arguably all, things that you don't practice for a while.
Also I find the idea that most managers or technical leads are doing any kind of "deep thought" hilarious, but that's just maybe my apathy towards management speaking.
Managers 100% lose their abilities, their focus shifts to completely different concerns -- codebase health, enabling people, tracking velocity metrics, etc. They still understand high-level concerns, of course (if we are talking about strong technical background), but they'd struggle a lot if just dropped into the codebase.
Tech leads can exist in many variants, but usually they spend the majority of time in code, so they don't lose it. If they become too good at managing and change their priorities, they _will_ gradually drift away too.
I hear all the time from people who have moved into management that their engineering skills atrophy. The only antidote is to continue doing IC work while managing.
It's about as much time as I think about caching artifacts and branch mispredict latencies. Things I cared a lot about when I was doing assembly, but don't even think about really in Python (or C++).
My assembly has definitely rotted and I doubt I could do it again without some refreshing but it's been replaced with other higher-level skills, some which are general like using correct data structures and algorithms, and others that are more specific like knowing some pandas magic and React Flow basics.
I expect this iteration I'll get a lot better at systems design, UML, algorithm development, and other things that are slightly higher level. And probably reverse-engineering as well :) The computer engineering space is still vast IMHO....
I think it's more like code review, which really is the worst part of coding. With AI, I'll be doing less of the fun bits (writing, debugging those super hard customer bugs), and much much more code review.
Conflict of interest or not, he's not really wrong. Anyone shipping code in a professional setting doesn't just push to prod after 5 people say LGTM to their vibe coded PR, as much as we like to joke around with it. There are stages of tests and people are responsible for what they submit.
As someone writing lots of research code, I do get caught being careless on occasion since none of it needs to work beyond a proof of concept, but overall being able to just write out a spec and test an idea out in minutes instead of hours or days has probably made a lot of things exist that I'd otherwise never be arsed to bother with. LLMs have improved enough in the past year that I can easily 0-shot lots of ad-hoc visualization stuff or adapters or simple simulations, filters, etc. that work on the first try and with probably fewer bugs than I'd include in the first version myself. Saves me actual days and probably a carpal tunnel operation in the future.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
I think this might simply be how the human brain works. Take autonomous driving as an example: while the car drives on its own the human driver is supposed to be alert and step in if needed. But does that work? Or will the driver's mind wander off because the car has been driving properly for the last half hour? My gut feeling is that it's inevitable that we'll eventually just shut out everything that goes smoothly and by the time it doesn't it might be too late.
We are not that different from our ancestors who used to roam the forests, trying to eat before they get eaten. In such an environment there is constantly something going on, some critters crawling, some leaves rustling, some water flowing. It would drive us crazy if we could not shut out all this regular noise. It's only when an irregularity appears that our attention must spring into action. When the leaves rustle differently than they are supposed to there is a good chance that there is some prey or a predator to be found. This mechanism only works if we are alert. The sounds of the forest are never exactly the same, so there is constant stimulation to keep us on our toes. But if you are relaxing in your shelter the tension is gone.
My fear is that AI is too good, to the point where it makes us feel like being in our shelter rather than in the forest.
> My gut feeling is that it's inevitable that we'll eventually just shut out everything that goes smoothly and by the time it doesn't it might be too late.
Yes. Productivity accelerates at an exponential rate, right up until it drives off a cliff (figuratively or literally).
> AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting
It does not! If you're using interactive IDE AI, you spend your time keeping the AI on the rails, and reminding it what the original task is. If you're using agents, then you're delegating all the mid-level/tactical thinking, and perhaps even the planning, and you're left with the task of writing requirements granular enough for an intern to tackle, but this hews closer to "Business Analyst" than "Software Engineer".
Using an agentic workflow does not require you to delegate the thinking. Agents are great at taking exactly what you want to do and executing. So spend an extra few minutes and lay out the architecture YOU want, then let the AI do the work.
It's "anti-AI" from the perspective of an investor or engineering manager who assumes that 10x coding speed should 10x productivity in their organization. As a staff IC, I find it a realistic take on where AI actually sits in my workflow and how it relates to juniors.
> assumes that 10x coding speed should 10x productivity
This same error in thinking happens in relation to AI agents too. Even if the agent is perfect (not really possible) but other links in the chain are slower, the overall speed of the loop still does not increase. To increase productivity with AI you need to think of the complete loop, reorganize and optimize every link in the chain. In other words a business has to redesign itself for AI, not just apply AI on top.
The same is true for coding with AI: you can't just do your old style of manual coding but with AI, you need a new style of work. Maybe you start with constraint design, requirements, and tests, and then you let the agent loose and don't check the code; you need to automate that part with comprehensive automated testing. The LLM is like a blind force, you need to channel it to make it useful. LLM + constraints == accountable LLM, but LLM without constraints == unaccountable.
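To make that concrete, here's a minimal sketch of the constraints-first idea in Python/pytest. The `discount` module and `apply_discount` function are hypothetical stand-ins, not from any real project: the tests get written up front as the accountability layer, they fail until the agent ships an implementation, and the agent's only job is to make them pass without editing them.

```python
# tests/test_discount.py
# Hypothetical constraint layer written *before* the agent touches anything.
# These tests fail until the agent produces a discount module that satisfies them.
import pytest

from discount import apply_discount  # the module the agent is asked to write


def test_zero_percent_is_identity():
    assert apply_discount(100.0, 0) == 100.0


def test_full_discount_is_free():
    assert apply_discount(100.0, 100) == 0.0


def test_discount_never_goes_negative():
    # Over-100% discounts clamp to zero instead of producing a negative price.
    assert apply_discount(10.0, 150) == 0.0


@pytest.mark.parametrize("price", [-1.0, float("nan")])
def test_invalid_price_is_rejected(price):
    with pytest.raises(ValueError):
        apply_discount(price, 10)
```

The tests are the channel for the "blind force": the agent can iterate however it likes, but the run only counts as done when this suite is green.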
I’ve been trying to re-orient for this exact kind of workflow and I honestly can’t declare whether it’s working.
I’ve switched to using Rust because of the rich type system and pedantic yet helpful compiler errors. I focus on high level design, traits, important types - then I write integration tests and let Claude go to town. I’ve been experimenting with this approach on my side project (backend web services related to GIS - nothing terribly low level) for about 4 months now and I honestly don’t know if it’s any faster than just writing the code myself. I suspect it’s not or only marginally faster at best.
I often find that I end up in a place where the ai generated code just has too many issues collected over iterations and needs serious refactoring that the agent is incapable of performing satisfactorily. So I must do it myself and that work is substantially harder than it would have been had I just written everything myself in the first place.
At work - I find that I have a deep enough understanding of our codebase that the agents are mostly a net-loss outside of boilerplate.
Perhaps I’m holding it wrong but I’ve been doing this for a while now. I am extremely motivated to build a successful side project and try to bootstrap myself out of the corporate world. I read blogs and watch vlogs on how others build their workflows and I just cannot replicate these claims of huge productivity gains.
> in every industry where AI has achieved any level of mastery.
Which industries are those? What does that mastery look like?
> There's a divide between people ...
No, there is not. If one is not willing to figure out a couple of ffmpeg flags, comb through k8s controller code to see what is possible, and fix that boot error in their VMs, then failure at the "mental experience" part is certain too.
The most successful people I have met in this profession are the ones who absolutely do not tolerate magic and need to know what happens from the moment they press the ON button on their machine till the moment they turn it OFF again.
I never made a case against LLMs and similar ML applications in the sense that they negatively impact mental agility. The cases I made so far include, but are not limited to:
— OSS exploded on the promise that software you voluntarily contributed to remains available to benefit the public, and that a large corporation cannot simply take your work tomorrow and make it part of their product, never contributing anything back. Commercially operated LLMs threaten OSS both by laundering code and by overwhelming maintainers with massive, automatically produced patches and merge requests that are sometimes never read by a human.
— Being able to claim that any creative work is merely a product of an LLM (which is a reality now for any new artist, copywriter, etc.) removes a large motivator for humans to do fully original creative work and is detrimental to creativity and innovation.
— The ends don’t justify the means, as a general philosophical argument. Large-scale IP theft was instrumental at the beginning of this new wave of applied ML—and it is essentially piracy, except done by the powerful and wealthy against the rest of us, and for profit rather than entertainment. (They certainly had the money to license swaths of original works for training, yet they chose to scrape and to exploit the legal ambiguity that exists only because the requisite laws have not yet been written.)
— The plain old practical “it will drive more and more people out of jobs”.
— Getting everybody used to the idea that LLMs now mediate access to information increases inequality (making those in control of this tech and their investors richer and more influential, while pushing the rest—most of whom are victims of the aforementioned reverse piracy—down the wealth scale and often out of jobs) more than it levels the playing field.
— Diluting what humanity is. Behaving like a human is how we manifest our humanness to others, and how we deserve humane treatment from them; after entities that walk and talk exactly like a human would, yet which we can be completely inhumane to, become commonplace, I expect over time this treatment will carry over to how humans treat each other—the differentiator has been eliminated.
— It is becoming infeasible to operate open online communities due to bot traffic that now dwarfs human traffic. (Like much of the above, this is not a point against LLMs as technology, but rather the way they have been trained and operated by large corporate/national entities—if an ordinary person wanted to self-host their own, they would simply not have the technical capability to cause disruption at this scale.)
This is just what I could recall off the top of my head.
Good points here, particularly the ends not justifying the means.
I'm curious for more thoughts on "will drive more and more people out of jobs”. Isn't this the same for most advances in technology (e.g., steam engines, computers, automated toll plazas, etc.)? In some ways, it's motivation for making progress; you get rid of mundane jobs. The dream is that you free those people to do something more meaningful, but I'm not going to be that blindly optimistic :) still, I feel like "it's going to take jobs" is the weakest of arguments here.
It happened before, and it was an issue back then as well.
A mundane job may be mundane (though note that this is sometimes subjective), but it earns someone their bread and butter, and it always causes economic stress when the job is gone and many people have to retrain.
If we were to believe those of us who paint this technology as mind-bogglingly world-changing, that someone is now nearly everyone and unlike the previous time there is no list of jobs you could choose from (that would last longer than the time it takes to train).
If we were not to believe the hype, still: when those jobs got automated back then, people moved to jobs that are liable to become obsolete this time, except there are also just more people overall, so even purely in terms of numbers this seems to be a bigger event.
Yeah but it’s the same issue. Open source licenses (just like other laws) weren’t designed for the age of LLMs. I’m sure most people don’t care, but I bet a lot of maintainers don’t want their code fed to LLMs!
Intellectual property as a concept wasn't designed for the age of LLMs. You have to add a bunch of exceptions to copyright (fair use, first sale) to keep it from immediately leading to scenarios that don't make any intuitive sense. LLMs explode these issues because now you can mechanically manipulate ideas, and this brings to light new contradictions that intellectual property causes.
I agree that commercially operated LLMs undermine the entire idea of IP, but that is one of the problems with them, not with the concept of intellectual property, which is an approximation of something that has been an organic part of human society, motivating innovation, since forever: the benefits of being an author and a degree of ownership over intangible ideas. When societies were smaller and local, it just worked out, and you would earn respect and status if you came up with something cool, whereas in a bigger and more global society that relies on the rule of law rather than informal enforcement, legal protections are needed to keep things working in roughly the same way.
I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
> IP is an approximation of what has been organically part of human society and drove innovation since forever: benefits of being an author and degree of ownership over intangible ideas.
It is not! It's a very recent invention. Especially its application to creative works contradicts thousands of years of the development of human culture. Consider folk songs.
> I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
And the issue I'm gesturing at is that you run into different, contradicting conclusions about how LLMs should interact with copyright depending on exactly what line of logic you follow, so the courts will never be able to resolve how it should work. These are issues that can only be conclusively resolved by writing new laws to decide how it's going to work, but that will eventually only make the contradictions worse and complicate the hoops that people have to jump through as the technology evolves in new ways.
In my experience AI coding is not going to spew out a derivative of another project unless your objective is actually to build a derivative of that software. If your code doesn't do the same or look the same it doesn't really meet the criteria to be a derivative of someone else's.
I mostly use Cursor for writing test suites in Jest with TypeScript, these are so specific to my work I don't think it's possible they've infringed someone else's.
Intellectual property theft? If gp’s referring to the Books3 shadow library not having been legally bought, it’s not realistically more than 197k books worth less than $10MM. And let’s not forget Intellectual property rights only exist “ To promote the Progress of Science and useful Arts.”
There's certainly some debate to be had about ingesting a book about vampires and then writing a book about vampires.
But I think programming is much more "how to use the building blocks" and mathematics than ingesting narratives and themes. More like ingesting a dictionary and thesaurus and then writing a book about vampires.
Yes. But here we are, people ignoring all the theft that has happened. People generate images from stolen art and call themselves artists. People use it to program and call themselves programmers. Also, it seems to me that so many people just absolutely ignore all the security-related issues that come with coding agents. It's truly a dystopia. But we are on Hacker News, so obviously people will glaze about "AI" on here.
Maybe we should get upset about people using cameras to take pictures of art on the same principles. And what about that Andy Warhol guy, what a pretender!
… so I hope you can see why I don’t actually agree with your comment about who’s allowed to be an artist, and not just dismiss me as a glazer
Who is taking pictures of art and calling themselves an artist for that? People are generating images from stolen art and creating businesses off of that. People are faking being artists on social media. But I shouldn't be surprised that people with no actual talent defend all of this.
> If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting
I think this depends. I prefer the thinking bit, but it's quite difficult to think without the act of coding.
It's like how whiteboarding or writing can help you think. Being in the code helps me think, allows me to experiment, uncover new learnings, and evolve my thinking in the process.
Though maybe we're talking about thinking of different things? Are you thinking in the sense of what a PM thinks about? User features, user behavior, user edge cases, user metrics? Or do you mean thinking about what a developer thinks about: code clarity, code performance, code security, code modularization and the ability to evolve, code testability, innovative algorithms, innovative data structures, etc.?
I’m struggling to understand how they are asserting one follows from the other. I’m not a SWE, but I do a lot of adjacent types of work (infrastructure automation and scripting, but also electronics engineering, and I’m also a musician), and the “thinking” part, where I get to deploy logic and reasoning to solve novel challenges, is certainly a common feature among these activities I enjoy, and I feel it’s a core component of what I’m doing.
But the result of that thinking would hardly ever align neatly with whatever an LLM is doing. The only time it wouldn’t be working against me would be drafting boilerplate and scaffolding project repos, which I could already automate with more prosaic (and infinitely more efficient) solutions.
Even if it gets most of what I had in mind correct, the context switching between “creative thinking” and “corrective thinking” would be ruinous to my workflow.
I think the best case scenario in this industry will be workers getting empowered to use the tools that they feel work best for their approach, but the current mindset that AI is going to replace entire positions, and that individual devs should be 10x-ing their productivity is both short-sighted and counterproductive in my opinion.
I like to think of the essential/accidental complexity split. The true way to solve essential complexity in a business setting is to talk with stakeholders.
Tools, libraries and platforms are accidental complexities. If you have already learned how to use them, you can avoid the pitfalls and go straight to the solution, which is why the common advice is to use boring technologies, as the solutions are widely documented and there are a lot of case studies.
If it's something new, then you can learn as you go by starting small and refactor as you're gaining more confidence. Copy-pasta or code generation is usually bad in that case. You don't know enough to judge the long-term costs.
Code is tech debt. When people talk about software engineering, it's to make sure that this debt doesn't outweigh the benefits of using the software.
I agree with your comment's sentiment, but I believe that you, like many others, have the cycle in the wrong order. I don't fault anyone for it because it's the flow that got handed down to us from the days of waterfall development.
My strong belief after almost twenty years of professional software development is that both us and LLMs should be following the order: build, test, reflect, plan, build.
Writing out the implementation is the process of materializing the requirements, and learning the domain. Once the first version is out, you can understand the limits and boundaries of the problem and then you can plan the production system.
This is very much in line with Fred Brooks' "build one to throw away" (written ~40 years ago in "The Mythical Man-Month"; while often quoted, if you have never read the book I urge you to do so, as it's both entertaining and enlightening about our software industry), startup culture (if you remove the "move fast break things" mantra), and governmental pilot programs (the original "minimum viable").
My approach has been to "yolo" my way through the first time, yes in a somewhat lazy and careless manner, get a working version, and then build a second time more thoughtfully.
> There's a divide between people who enjoy the physical experience of the work and people who enjoy the mental experience of the work. If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting. But if you like the doing, the typing, fiddling with knobs and configs, etc etc, all AI does is take the good part away.
I don't know... that seems like a false dichotomy to me. I think I could enjoy both but it depends on what kind of work. I did start using AI for one project recently: I do most of the thinking and planning, and for things that are enjoyable to implement I still write the majority of the code.
But for tests, build system integration, ...? Well that's usually very repetitive, low-entropy code that we've all seen a thousand times before. Usually not intellectually interesting, so why not outsource that to the AI.
And even for the planning part of a project there can be a lot of grunt work too. Haven't you had the frustrating experience of attempting a refactoring and finding out midway that it doesn't work because of some edge case? Sometimes the edge case is interesting and points to some deeper issue in the design, but sometimes not. Either way it sure would be nice to get a hint beforehand. Although in my experience AIs aren't at a stage to reason about such issues upfront --- no surprise, since it's difficult for humans too --- it of course helps if your software has an oracle for whether the attempted changes are correct, i.e. it is statically typed and/or has thorough tests.
It's like folks complaining that people don't know how to code in Assembly or Machine Language.
New-fangled compiled languages...
Or who use modern, strictly-typed languages.
New-fangled type-safe languages...
As someone that has been coding since it was wiring up NAND gates on a circuit board, I'm all for the new ways, but there will definitely be a lot of mistakes, jargon, and blind alleys; just like every other big advancement.
I'm not sure if you are insinuating that the article is an anti-AI take, but in case it wasn't clear, it's not. It is about doing just what you suggested:
> Just as tech leads don't just write code but set practices for the team, engineers now need to set practices for AI agents. That means bringing AI into every stage of the lifecycle
The technology doesn't force people to be careless, but it does make it very easy to be careless, without having to pay the costs of that carelessness until later.
> There's a divide between people who enjoy the physical experience of the work and people who enjoy the mental experience of the work.
Pretty clearly that’s not the divide anyone’s talking about, right?
Your argument should maybe be something about thinking about the details vs thinking about the higher level. (If you were to make that argument, my response would be: both are valuable and important. You can only go so far working at one level. There are certainly problems that can be solved at one level, but also ones that can’t.)
My experience is that you need the “physical” coding work to get a good intuition of the mechanics of software design, the trade-offs and pitfalls, the general design landscape, and so on. I disagree that you can cleanly separate the “mental” portion of the work. Iterating on code builds your mental models, in a way that merely reviewing code does not, or only to a much more superficial degree.
Idk I feel like even without using LLMs the job is 90% thinking and planning. And it’s nice to go the last 10% on your own to have a chance to reflect and challenge your earlier assumptions.
I actually end up using LLMs in the planning phase more often than the writing phase. Cursor is super good at finding relevant bits of code in unfamiliar projects, showing me what kind of conventions and libraries are being used, etc.
Most of us do nothing but remix past solutions.
Since we don't know what else might already exist in the world without digging very deep, we fool ourselves into thinking that we're doing something very original and unique.
And even truly novel and unique things are more often than not composites of things that came before. We all stand on the shoulders of giants/priors.
I think that the problem is, at the end of the day, the engineer must specify exactly what they want the program to do.
You can do this in Python, or you can do this in English. But at the end of the day the engineer must input the same information to get the same behavior. Maybe LLMs make this a bit more efficient but even in English it is extremely hard to give exact specification without ambiguity (maybe even harder than Python in some cases).
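A toy illustration of that ambiguity (my own example, not something from the thread): the one-line English requirement "round the price to two decimals" permits at least two behaviours, and only the code pins one down.

```python
# "Round the price to two decimals" is an ambiguous English spec:
# round half up, or round half to even? The code is forced to pick one.
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

price = Decimal("2.665")

print(price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))    # 2.67
print(price.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN))  # 2.66
```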
"I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless."
I'm not impressed by AI because it generates slop. Copilot can't write a thorough working test suite to save its life. I think we need a design and test paradigm to properly communicate with AI for it to build great software.
Completely agreed. Whether it be AI or otherwise, I consider anything that gives me more time to focus on figuring out the right problem to solve or iterating on possible solutions to be good.
Yet every time someone here earnestly testifies to whatever slight but real use they’ve found for AI, an army of commentators appears ready to gaslight them into doubting themselves, always citing that study that supposedly proved that any apparent usefulness of AI is an illusion.
At this point, even just considering the domain of programming, there’s more than enough testimony to the contrary. This doesn’t say anything about whether there’s an AI bubble or overhype or anything about its social function or future. But, as you note, it means these cardboard cutout critiques of AI need to at least start from where we are.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Are you genuinely saying you never saw a critique of AI on environmental impact, or how it amplifies biases, or how it widens the economic gap, or how it further concentrates power in the hands of a few, or how it facilitates the dispersion of misinformation and surveillance, directly helping despots erode civil liberties? Or, or, or…
You don’t have to agree with any of those. You don’t even have to understand them. But to imply anti-AI arguments “hinge on the idea that technology forces people to be lazy/careless/thoughtless” is at best misinformed.
Go grab whatever your favourite LLM is and type “critiques of AI”. You’ll get your takes.
I'm not an AI zealot but I think some of these are overblown.
The energy-cost argument is meaningless unless you pin down a value-out vs. value-in ratio, and some would argue the output is highly valuable and the input cost is priced in.
I don't know if it will end up being a concentrated power. It seems like local/open LLMs will still be in the same ballpark. Despite the absurd amounts of money spent so far the moats don't seem that deep.
Baking in bias is a huge problem.
The genie is out of the bottle as far as people using it for bad. Your own usage won't change that.
The moats are incredibly deep, because the established players are being propped up by VC money. Without that VC money, it's impossible to compete, unless you have a way to sustain losses for an indefinite amount of time.
> You don’t have to agree with any of those. You don’t even have to understand them. But to imply anti-AI arguments “hinge on the idea that technology forces people to be lazy/careless/thoughtless” is at best misinformed.
We can certainly discuss some of those points, but that’s not what is in question here. The OP is suggesting there is only one type of anti-AI argument they are familiar with and that they’d “love” to see something different. But I have to question how true that is considering the myriad of different arguments that exist and how easy they are to find.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Here's a couple points which are related to each other:
1) LLMs are statistical models of text (code being text). They can only exist because huge for-profit companies ingested a lot of code under proprietary, permissive and copyleft licenses, most of which at the very least require attribution, some reserve rights of the authors, some give extra rights to users.
LLM training mixes and repurposes the work of human authors in a way which gives them plausible deniability against any single author, yet the output is clearly only possible because of the input. If you trained an LLM on only google's source code, you'd be sued by google and it would almost certainly reproduce snippets which can be tracked down to google's code. But by taking way, way more input data, the blender cuts them into such fine pieces that the source is undetectable, yet the output is clearly still based on the labor of other people who have not been paid.
Hell, GPT3 still produced verbatim snippets of inverse square root and probably other well known but licensed code. And github has a checkbox which scans for verbatim matches so you don't accidentally infringe copyright by using copilot in a way which is provable. Which means they take extra care to make it unprovable.
If I "write a book" by taking an existing book but replacing every word with a synonym, it's still plagiarism and copyright infringement. It doesn't matter if the mechanical transformation is way more sophisticated, the same rules should apply.
2) There's no opt out. I stopped writing open source over a year ago when it became clear all my code is unpaid labor for people who are much richer than me and are becoming richer at a pace I can't match through productive work because they own assets which give them passive income. And there's no license I can apply which will stop this. I am not alone. As someone said, "Open-Source has turned into a form of unpaid internship"[0]. It might lead to a complete death of open source because nobody will want to see their work fed into a money printing machine (subscription based LLM services) and get nothing in return for their work.
> But if you like the doing, the typing, fiddling with knobs and configs, etc etc, all AI does is take the good part away.
I see quite the opposite. For me, what makes programming fun is deeply understanding a problem and coming up with a correct, clear to understand, elegant solution. But most problems a working programmer has are just variations of what other programmers had. The remaining work is prompting the LLMs in the right way so that they produce this (describing the problem instead of thinking about its solutions) and debugging the bugs the LLMs generated.
A colleague vibe coded a small utility. It's useful but it's broken in so many ways: the UI falls apart when some text gets too long, labels are slightly incorrect and misleading, some text fields handle decimal numbers in weird ways, etc. With manually written code, a programmer would get these right the first time. Potential bugs become obvious as you're writing the code because you are thinking about it. But they do not occur to someone prompting an LLM. Now I can either fix them manually, which is time consuming and boring, or I can try prompting an LLM about every single one, which is less time consuming but more boring and likely to break something else.
Most importantly, using an LLM does not give me deeper understanding of the problem or the solution, it keeps knowledge locked in a black box.
OK: AI is slow when using said loop. AI is like poker. You bet with time. 60 seconds to type a prompt and generate a response. Oh, it is wrong, OK, let's gamble another 60 seconds...
At least when doing stuff the old way you learn something if you waste time.
That said AI is useful enough and some poker games are +EV.
So this is more a caution-AI take than an anti-AI one. It is more an anti-vibe-koolaid take.
This depends entirely on how you use said AI. You can have it read code, explain why something was done this or that way, and once it has the context you ask it to think about implementing feature X. There is almost no gambling involved there, at best the level of frustration you would have with a colleague. If you start from a blank context and tell it to implement a full app, you are purely gambling.
> You can have it read code, explain why was it done this or that way,
The thing is that, once you're experienced enough, it's faster to just glance at the code and have the answer right away, instead of playing the guessing game with AI.
> and once it has the context you ask to think about implementing feature X
I'm always amazed at someone using that methodology. When I think about a feature, the first step is to understand the domain, the second is which state I'm likely to start from and where all the data are. If you don't get these two steps right, what you'll have is a buggy/incomplete implementation. And if you do get these steps right, the implementation is likely trivial.
I'm not sure where the misunderstanding is, but your second paragraph is exactly why I ask the AI the questions you question in your first paragraph. I ask the AI to do the domain research, see what we are starting from, and THEN ask it to think about a feature. Those steps are not really for me, they are for the AI, so it has good context on what we are working on. As you said, the implementation is then almost trivial and the AI is less likely to mess it up.
The thing is, the domain is often more difficult than the actual implementation. And often only a subset matters (different for each task). So I’m wondering if teaching the AI the correct subdomain is indeed faster than just code the solution.
Also trivial work can benefit the coder. Like a light jog between full sprints for your brain. Reviewing code can be more taxing than writing it, as you need to retrieve the full context at once instead of in incremental steps.
I suspect the root of the disagreement is more about what kinds of work people do. There are many different kinds of programming and you can’t lump them all together. We shouldn’t expect an AI tool to be a good fit for all of them, any more than we should expect Ruby to be a good fit for embedded development or C to be a good fit for web apps.
My experience with low level systems programming is that it’s like working with a developer who is tremendously enthusiastic but has little skill and little understanding of what they do or don’t understand. Time I would have spent writing code is replaced by time spent picking through code that looks superficially good but is often missing key concepts. That may count as “thinking” but I wouldn’t categorize it as the good kind.
Where it excels for me is as a superpowered search (asking it to find places where we play a particular bit-packing game with a particular type of pointer works great and saves a lot of time) and for writing one-off helper scripts. I haven’t found it useful for writing code I’m going to ship, but for stuff that won’t ship it can be a big help.
It’s kind of like an excavator. If you need to move a bunch of dirt from A to B then it’s great. If you need to move a small amount of dirt around buried power lines and water mains, it’s going to cause more trouble than it’s worth.
I think this is one of the most cogent takes on the topic that I've seen. Thanks for the good read!
It's also been my experience that AI will speed up the easy / menial stuff. But that's just not the stuff that takes up most of my time in the first place.
The last paragraph feels more wrong the more I think about it.
Imagine an AI as smart as some of the smartest humans, able to do everything they intellectually do but much faster, cheaper, 24/7 and in parallel.
Why would you spend any time thinking? All you'll be doing is the things an AI can't do: 1) feeding it input from the real world and 2) trying out its output in the real world.
1) Could be finding customers, asking them to describe their problem, arranging meetings, driving to the customer's factory to measure stuff and take photos for the AI, etc.
2) Could be assembling the prototype, soldering, driving it to the customer's factory, signing off the invoice, etc.
None of that is what I as a programmer / engineer enjoy.
If actual human-level AI arrives, it'll do everything from concept to troubleshooting, except the parts where it needs presence in the physical world and human dexterity.
If actual human-level AI arrives, we'll become interfaces.
"AI" does not encourage real thinking. "AI" encourages hand waving grand plans that don't work, CEO style. All pro-"AI" posts focus on procedures and methodologies, which is just LARPing thinking.
Using "AI" is just like speed reading a math book without ever doing single exercise. The proponents rarely have any serious public code bases.
Well, yeah. I like getting computers to automate things and solve problems. Typing in boilerplate and syntax is just a means to that end, and not even remotely the most interesting part. I don't like managing my own memory garbage collection either, so I prefer to use tools that handle that for me.
I mean, I guess when I was really early in my career I'd get a kick out of writing a clever loop or whatever, and I drank deep from all the low level coding wisdom that was available, but the scope of what I care about these days has long since expanded outward.
I see a lot of comments like this and it reflects strongly negatively on the engineers who write it imho. As in I've been a staff level engineer at both Meta and Google and a lead at various startups in my time. I post open source projects here on HN from time to time that are appreciated. I know my shit. If someone tells me that LLMs aren't useful i think to myself "wow this person is so unable to learn new tools they can't find value in one of the biggest changes happening today".
That's not to say that LLMs as good as some of the more outrageous claims. You do still need to do a lot of work to implement code. But if you're not finding value at all it honestly reflects badly on you and your ability to use tools.
The craziest thing is i see the above type of comment on LinkedIn regularly. Which is jaw dropping. Prospective hiring managers will read it and think "Wow, you think advertising a lack of knowledge is helpful to your career?" Big tech co's are literally firing people with attitudes like the above. There's no room for people who refuse to adapt.
I put absolute LLM negativity right up there with comments like "i never use a debugger and just use printf statements". To me it just screams you never learnt the tool.
> I put absolute LLM negativity right up there with comments like "i never use a debugger and just use printf statements". To me it just screams you never learnt the tool.
To me it just feels different. Learning to use a debugger made me feel more powerful and "in control" (even though I still use a lot of print debugging; every tool has its place). Using AI assisted coding makes me feel like a manager who has to micro-manage a noob - it's exhausting.
It’s exhausting because most of us like to sit down, open an IDE, and start coding with the belief that ambiguous or incomplete aspects will be solved as they come up. The idea of writing out the spec of a feature without ambiguity, handling error states, etc., and stopping to ask if the spec is clear is boring and not fun.
To many of us coding is simply more fun. At the same time, many of us could benefit from that exercise with or without the LLM.
For pet projects, it might be less fun. For real projects, having to actually think about what I'm trying to do has been a net positive, LLM or no LLM.
Agree. I've never had the attention span to learn code, but I utilize LLM's heavily and have recently started managing my first large coding project with CC to what seems like good results.
As LLMs get better, more and more people will be able to create projects with only rudimentary language understanding. I don't think LLMs can ever be as good as some of the outrageous claims; it's a lot like that 3rd grade project where kids write instructions for making a PB&J. LLMs cannot read minds and will only follow the prompt given to them. What I'm trying to say is that eventually there will be a time where being able to manage coding agents effectively will be more externally valuable than knowing how to write code.
This isn't to say that engineering experience is not valuable. Having a deep understanding of how to design and build secure and efficient software is a huge moat between experienced engineers and vibecoders like me, and not learning how to best use the tools that are quickly changing how the world operates will leave them behind.
Why would you point out two tool obsessed companies as something positive? Meta and Google are overstaffed and produce all sorts of tools that people have to use because someone's performance evaluation depends on it.
The open source code of these companies is also not that great and definitely not bug free. Perhaps these companies should do more thinking and less tooling politics.
> That's not to say that LLMs as good as some of the more outrageous claims. You do still need to do a lot of work to implement code. But if you're not finding value at all it honestly reflects badly on you and your ability to use tools.
You are in a forum full of people that routinely claim that vibe coding is the future, that LLMs already can fully replace engineers, and if you don't think so you are just a naysayer that is doing it wrong.
Rephrasing your claim: LLMs are just moderately useful, far from being the future-defining technology the people invested in it want it to be. But you choose to rally against people not interested in marketing it further.
Given the credentials you decided to share, I find it unsurprising.
Alternatively - there's 5 million other things I could be learning and practicing to improve as a programmer before trying out the new AI codegen-du-jour. Until I'm Fabrice Bellard, focusing on my fundamental skills will make me a better programmer, faster, than focusing on the hype of the day.
I'm on a small personal project with it intentionally off, and I honestly feel I'm moving through it faster and certainly having a better time. I also have a much better feel for the code.
These are all just vibes, in the parlance of our times, but it's making me question why I'm bothering with LLM assisted coding.
Velocity is rarely the thing in my niche, and I'm not convinced babysitting an agent is all in all faster. It's certainly a lot less enjoyable, and that matters, right?
More specifically for (1), the combined set of predators, advertisers, businesses, and lazy people using it to either prey or enshittify or cheat will make up the vast majority of use cases.
First, skilled engineers using LLMs to code also think and discuss and stare off into space before the source code starts getting laid down. In fact: I do a lot, lot more thinking and balancing different designs and getting a macro sense of where I'm going, because that's usually what it takes to get an LLM agent to build something decent. But now that pondering and planning gets recorded and distilled into a design document, something I definitely didn't have the discipline to deliver dependably before LLM agents.
Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
Second, this idea that LLMs are like junior developers that can't learn anything. First, no they're not. Early-career developers are human beings. LLMs are tools. But the more general argument here is that there's compounding value to working with an early-career developer and there isn't with an LLM. That seems false: the LLM may not be learning anything, but I am. I use these tools much more effectively now than I did 3 months ago. I think we're in the very early stages of figuring how to get good product out of them. That's obvious compounding value.
Regardless of that, personally i'd really like it if they could actually learn from our interactions with them. From a user's perspective what i'd like to do is to be able to "save" the discussion/session/chat/whatever, with everything the LLM learned so far, to a file. Then later be able to restore it and have the LLM "relearn" whatever is in it. Now, you can already do this with various frontend UIs, but the important part of what i'd want is that a) this "relearn" should not affect the current context window (TBH i'd like that entire concept to be gone, but that is another aspect) and b) it should not be some sort of lossy relearning that loses information.
There are some solutions but they are all band-aids over fundamental issues. For example you can occasionally summarize whatever was discussed so far and restart the discussion. But obviously that is just some sort of lossy memory compression (i do not care that humans can do the same, LLMs are software running on computers, not humans). Or you could use some sort of RAG, but AFAIK this works via "prompt triggering" - i.e. only via your "current" interaction, so even if the knowledge is in there, if whatever you are doing now wouldn't trigger its index the LLM will be oblivious to it.
What i want is, e.g., if i tell to the LLM that there is some function `foo` used to barfize moo objects, then go on and tell it other stuff way beyond whatever context length it has, save the discussion or whatever, restore it next day, go on and tell it other stuff, then ask it about joining splarfers, it should be able to tell me that i can join splarfers by converting them to barfized moo objects even if i haven't mentioned anything about moo objects or barfization since my previous session yesterday.
(also as a sidenote, this sort of memory save/load should be explicit since i'd want to be able to start from clean slate - but this sort of clean slate should be because i want to, not as a workaround to the technology's limitations)
You want something that requires an engineering breakthrough.
Models don't have memory, and they don't have understanding or intelligence beyond what they learned in training.
You give them some text (as context), and they predict what should come after (as the answer).
They’re trained to predict over some context size, and what makes them good is that they learn to model relationships across that context in many dimensions. A word in the middle can affect the probability of a word at the end.
If you insanely scale the training and inference to handle massive contexts, which is currently far too expensive, you run into another problem: the model can’t reliably tell which parts of that huge context are relevant. Irrelevant or weakly related tokens dilute the signal and bias it in the wrong direction; the distribution flattens or just ends up in the wrong place.
That's why you have to make sure you give it relevant, carefully curated context, aka context engineering.
It won't be able to look at a 100kloc code base and figure out what's relevant to the problem at hand, and what is irrelevant. You have to do that part yourself.
Or, what some people do, is you can try to automate that part a little as well by using another model to go research and build that context. That's where the research->plan->build loop people talk about comes in. And it's best to keep to small tasks, otherwise the context needed for a big task will be too big.
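A rough sketch of that research -> plan -> build loop, assuming a generic `complete(prompt)` helper that stands in for whatever model call you actually use; the helper and the prompt wording are assumptions, not any particular tool's API.

```python
# Sketch of a research -> plan -> build loop with narrowed context at each stage.
# complete() is a placeholder, not a real library call.

def complete(prompt: str) -> str:
    raise NotImplementedError("wire this to your model of choice")


def build_feature(task: str, repo_summary: str) -> str:
    # 1. Research: pull only the context relevant to this task.
    context = complete(
        f"Task: {task}\nRepo overview:\n{repo_summary}\n"
        "List the files, functions, and conventions relevant to this task. "
        "Do not write any code yet."
    )

    # 2. Plan: turn that narrow context into small, reviewable steps.
    plan = complete(
        f"Task: {task}\nRelevant context:\n{context}\n"
        "Produce a step-by-step implementation plan. Do not write any code yet."
    )

    # 3. Build: only now ask for code, one small step at a time.
    return complete(
        f"Task: {task}\nContext:\n{context}\nPlan:\n{plan}\n"
        "Implement step 1 of the plan."
    )
```

The point is only that each stage feeds a narrowed context into the next, which is what keeps the final coding prompt small enough to stay on the rails.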
> You want something that requires an engineering breakthrough.
Basically, yes. I know the way LLMs currently work wouldn't be able to provide what i want, but what i want is a different way that does :-P (perhaps not even using LLMs).
I'm using a "memory" MCP server which basically just stores facts to a big json file and makes a search available. There's a directive in my system prompt that tells the LLM to store facts and search for them when it starts up.
It seems to work quite well and I'll often be pleasantly surprised when Claude retrieves some useful background I've stored, and seems to magically "know what I'm talking about".
Not perfect by any means and I think what you're describing is maybe a little more fundamental than bolting on a janky database to the model - but it does seem better than nothing.
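For anyone curious what that looks like stripped down, here's a minimal sketch of the "big JSON file of facts plus search" idea; this is not the actual MCP server being described, and the file name, fields, and example fact are made up.

```python
# Minimal JSON-backed fact store: append facts, retrieve them by naive search.
import json
from pathlib import Path

STORE = Path("memory.json")


def _load() -> list[dict]:
    return json.loads(STORE.read_text()) if STORE.exists() else []


def remember(topic: str, fact: str) -> None:
    facts = _load()
    facts.append({"topic": topic, "fact": fact})
    STORE.write_text(json.dumps(facts, indent=2))


def recall(query: str) -> list[str]:
    # Naive substring search; a real server might use embeddings or full-text search.
    query = query.lower()
    return [
        f["fact"]
        for f in _load()
        if query in f["topic"].lower() or query in f["fact"].lower()
    ]


if __name__ == "__main__":
    remember("auth", "Tokens are validated in middleware/verify_jwt.py")  # hypothetical path
    print(recall("auth"))
```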
>First, skilled engineers using LLMs to code also think and discuss and stare off into space before the source code starts getting laid down
Yes, and the thinking time is a significant part of overall software delivery, which is why accelerating the coding part doesn't dramatically change overall productivity or labor requirements.
if you're spending anywhere near as many engineering hours "getting code to work" as you're spending "thinking" then something is wrong in your process
This harkens back to the waterfall vs agile debates. Ideally there would be a plan of all of the architecture with all the pitfalls found out before any code is laid out.
In practice this can’t happen because 30 minutes into coding you will find something that nobody thought about.
In the micro, sure. In the macro, if you are finding architecture problems after 30 minutes, then I’m afraid you aren’t really doing architecture planning up front.
Depends on what you're building. If it's another CRUD app, sure, but if it's something remotely novel you just can't understand the landscape without walking through it at least once.
haha I always do that. I think it's a good way to have some control and understand what it is doing before the regurgitation. I don't like to write code but I love the problem solving/logic/integrations part.
Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
Copilot has Ask mode, and GPT-5 Codex has Plan/Chat mode for this specific task. They won't change any files. I've been using Codex for a couple of days and it's very good if you give it plenty of guidance.
I’ve also had success writing documentation ahead of time (keeping these in a separate repo as docs), and then referencing it for various stages. The doc will have quasi-code examples of various features, and then I can have a models stubbed in one pass, failing tests in the next, etc.
But there’s a guiding light that both the LLM and I can reference.
Sometimes I wonder if pseudocode could be better for prompting than expressive human language, because it can follow a structure and be expressive but constrained -- have you seen research on this and whether this an effective technique?
I use YOLO mode all the time with Claude Code. Start on a new branch, put it in plan mode (shift + tab twice), get a solid plan broken up in logical steps, then tell it to execute that plan and commit in sensible steps. I run that last part in "YOLO mode" with commit and test commands white listed.
This makes it move with much less scattered interactions from me, which allows focus time on other tasks. And the committing parts make it easier for me to review what it did just like I would review a feature branch created by a junior colleague.
If it's done and tests pass I'll create a pull request (assigned to myself) from the feature branch. Then thoroughly review it fully, this really requires discipline. And then let Claude fetch the pull request comments from the Github API and fix them. Again as a longer run that allows me to do other things.
YOLO-mode is helpful for me, because it allows Claude to run for 30 minutes with no oversight which allows me to have a meeting or work on something else. If it requires input or approval every 2 minutes you're not async but essentially spending all your time watching it run.
It's more about having the LLM give you a plan of what it wants to do and how it wants to do it, rather than code. Then you can mold the plan to fit what you really want. Then you ask it to actually start writing code.
Even Claude Code lets you approve each change, but it's already writing code according to a plan that you reviewed and approved.
With tools you know ahead of time that they will do the job you expect them to do with very high probability, or fail (with low probability) in some obvious way. With LLMs, there are few tasks you can trust them to do, and you also don't know their failure mode. They can fail yet report success. They work like neither humans nor tools.
An LLM behaves like a highly buggy compiler that too frequently reports success while emitting incorrect code. Not knowing where the bugs are, the only thing you can try to do is write the program in some equivalent way but with different syntax, hoping you won't trigger a bug. That is not a tool programmers often use. Learning to work with such a compiler is a skill, but it's unclear how transferable or lasting that skill is.
If LLMs advance as significantly and as quickly as some believe they will, it may be better to just wait for the buggy compiler to be fixed (or largely fixed). Presumably, much less skill will be required to achieve the same result that requires more skill today.
I think what the article gets at, but doesn't quite deliver on, is similar to this great take from Casey Muratori [1] about how programming with a learning-based mindset means that AI is inherently not useful to you.
I personally find AI code gen most useful for one-off throwaway code where I have zero intent to learn. I imagine this means that the opposite end of the spectrum where learning is maximized is one where the AI doesn't generate any code for me.
I'm sure there are some people for which the "AI-Driven Engineering" approach would be beneficial, but at least for me I find that replacing those AI coding blocks with just writing the code myself is much more enjoyable, and thus more sustainable to actually delivering something at the end.
"learning is maximized is one where the AI doesn't generate any code for me"
Obviously you have to work to learn, but to me this is a bit like saying learning is maximized when you never talk to anyone or ask for help — too strong.
I spend more time thinking now that I use Claude Code. I write feature specs that are often 400-600 word descriptions of what I want—something I never would’ve done beforehand.
That thinking does result in some big tradeoffs…I generally get better results faster but I also have a less complete understanding of my code.
But the notion that Claude Code means an experienced developer spends less thinking carefully is simply wrong. It’s possible (even likely) that a lot of people are using agents poorly…but that isn’t necessarily the agent’s fault.
1) not all coding is the same. You might be working on a production system. I might need a proof of concept
2) not everyone's use of the coding agents is the same
3) developer time, especially good developer time has a cost too
I would like to see an article that frames the tradeoffs of AI assisted coding. Specifically without assigning value judgments (ie goodness or badness). Really hard when your identity is built around writing code.
The article seems to cherry-pick Microsoft marketing to claim 10% gains, when the Harvard-run study from earlier this year showed a 10% slowdown. I'd rather articles point first at the research being done that's agnostic of any corporation trying to push its propaganda, if possible.
Every time I read stuff like this I honestly wonder if the author is using the same tools I am.
I can have Claude Code bang out everything from boilerplate to a working prototype to a complex algorithm embedded in a very complex and confusing code base. It’s not correct 100% of the time but it’s pretty damn close. And often times it comes up with algorithms I would have never thought of initially.
These things are at least a 10x multiple of my time.
The difficulty is we skeptics have read claims like yours tens of times, and our response is always, "please share a repo built this way and an example of your prompts," and I at least have never seen anyone do so.
I'd love for what you say to be possible. Comments like yours often cause me to take another crack at agentic workflows. I'm disappointed every time.
> lack in-depth knowledge of your business, codebase, or roadmap
So give them some context. I like Cline's memory bank approach https://docs.cline.bot/prompting/cline-memory-bank which includes the architecture, progress, road map etc. Some of my more complex projects use 30k tokens just on this, with the memory bank built from existing docs and stuff I told the model along the way. Too much context can make models worse but overall it's a fair tradeoff - it maintains my coding style and architecture decisions pretty well.
I also recommend in each session using Plan mode to get to a design you are happy with before generating any code.
Not in my experience. I still spend much of the time thinking before prompting. Then I spend time reviewing the AI-written code before using it. Does not feel like a trap. It mostly feels like having a super experienced pair programmer. I may be using it differently than others since I do not have it integrated into my IDE. I use it like I used Google + Stack Overflow before it.
To be fair, you can't really tell whether you've been trapped unless you test yourself without the AI agent for some time and see if there is a difference in output. If there is, you've been trapped.
I think the post, while extensive, missed one important issue.
The fact that when we read others' code, we don't remember/integrate it into our thinking as well as we do when we're the authors. So mentoring "AI juniors" provides less growth than doing the job, especially if it is mostly corrective actions.
The article assumes that AI coding is thoughtless, but prompting is writing, and writing is thinking. If you approach AI coding the same way as regular programming, your thinking phase involves crafting a prompt that describes your thoughts.
I appreciate these takes, but I can't help but think this is just the weird interim time where nothing is quite good enough, but in a year an article like this would clearly be "overthinking" the problem.
Like when those in the know could clearly see the internet's path to consuming everything but it just hasn't happened yet so there were countless articles trying to decide if it was a fad or not, a waste of money, bad investment, etc.
Jumping straight into coding is a very junior thing to do.
Using Plan mode in Cline or other agent based workflows is day and night in the outputs.
Alas, at least in Cline it seems Plan mode doesn't read files and just works off existing context, which is insane to me and hinders its usefulness. Does anyone know why that happens?
> Using Plan mode in Cline or other agent based workflows is day and night in the outputs.
Agreed. My tool-agnostic workflow is to work up a planning/spec doc in the docs/feature-plans folder. I use one chat thread to make that. First it creates the basic plan, then we pick it apart together, I manually fix bad assumptions, and then in a new chat we implement.
Before and after implementation, I run my /gilfoyle command for a constructive roast, then my /sec command for a thorough security review. After implementing this, and a bit more, the final LLM output quality is much higher.
edit: adding "make sure we are applying known patterns used in our app for our solution, don't reinvent the wheel." helped a ton. I am on mobile atm, and that might not be the exact wording.
When it tells me that I need to switch to Act mode for it to read files and create a detailed plan, I just chide it gently and ask it to read the damn files in Plan mode. Works every time. I wish I didn't have to do that.
Cline plan mode doesn't tend to read files by default but you can tell it 'read all files necessary to establish a detailed plan'. GPT5 also seems more eager to read files.
The "thinking & coding" vs "thinking & fixing" graph is interesting. I've found this to be the case recently as I've been trying out Codex. I expected to spend a lot of time fixing the AI's code. Weirdly that has led to me spending a long time fixing issues which turn out to be nothing to do with the code.
Most recently I was struggling to get an authentication setup working. I spent at least an hour combing through the code looking for the mistake. The issue turned out to be that the VM I was working on had a broken ipv6 configuration.
Broadly the critique is valid where it applies; I don't know if it accurately captures the way most people are using LLMs to code, so I don't know that it applies in most cases.
My one concrete pushback to the article is that it states the inevitable end result of vibe coding is a messy unmaintainable codebase. This is empirically not true. At this point I have many vibecoded projects that are quite complex but work perfectly. Most of these are for my private use but two of them serve in a live production context. It goes without saying that not only do these projects work, but they were accomplished 100x faster than I could have done by hand.
Do I also have vibecoded projects that went off the rails? Of course. I had to build those to learn where the edges of the model's capabilities are, and what its failure modes are, so I can compensate. Vibecoding a good codebase is a skill. I know how to vibecode a good, maintainable codebase. Perhaps this violates your definition of vibecoding; my definition is that I almost never need to actually look at the code. I am just serving as a very hands-on manager. (Though I can look at the code if I need to - I have 20 years of coding experience. But if I find that I need to look at the code, something has already gone badly wrong.)
Relevant anecdote: A couple of years ago I had a friend who was incredibly skilled at getting image models to do things that serious people asserted image models definitely couldn’t do at the time. At that time there were no image models that could get consistent text to appear in the image, but my friend could always get exactly the text you wanted. His prompts were themselves incredible works of art and engineering, directly grabbing hold of the fundamental control knobs of the model that most users are fumbling at.
Here's the thing: any one of us can now make an image that is better than anything he was making at the time. Better compositionality, better understanding of intent, better text accuracy. We do this out of the box and without any attention paid to prompting voodoo at all. The models simply got that much better.
In a year or two, my carefully cultivated expertise around vibecoding will be irrelevant. You will get results like mine by just telling the model what you want. I assert this with high confidence. This is not disappointing to me, because I will be taking full advantage of the bleeding edge of capabilities throughout that period of time. Much like my friend, I don’t want to be good at managing AIs, I want to realize my vision.
100x is such a crazy claim to me - you’re saying you can do in 4 days what would have previously taken over a year. 5 weeks and you can accomplish what would have taken you a decade without LLMs.
In most cases I would never have undertaken those projects at all without AI. One of the projects that is currently live and making me money took about 1 working day with Claude Code. It's not something I ever would have started without Claude Code, because I know I wouldn't have the time for it. I have built websites of similar complexity in the past, and since they were free-time type endeavors, they never quite crossed the finish line into commerciality even after several years of on-again-off-again work. So how do you account for that with a time multiplier? 100x? Infinite speedup? The counterfactual is a world where the product doesn't exist at all.
This is where most of the “speedup” happens. It’s more a speedup in overall effectiveness than raw “coding speed.” Another example is a web API for which I was able to very quickly release comprehensive client side SDKs in multiple languages. This is exactly the kind of deterministic boilerplate work LLMs are ideal for, and that would take a human a lot of typing, and looking up details for unfamiliar languages. How long would it have taken me to write SDKs in all those languages by hand? I don’t really know, I simply wouldn’t have done it, I would have just done one SDK in Python and said good enough.
If you really twist my arm and ask me to estimate the speedup on some task that I would have done either way, then yeah, I still think a 100x speedup is the right order of magnitude, if we're talking about Claude Code with Opus 4.1 specifically. In the past I spent about five years very carefully building a suite of tools for managing my simulation work and serving as a pre/post-processor. Obviously this wasn't full-time work on the code itself, but the development progressed across that timeframe. I recently threw all that out and replaced it with stuff I rebuilt in about a week with AI. In this case I was leveraging a lot of the learnings I gleaned from the first time I built it, so it's not a fair one-to-one comparison, but you're really never going to see a pure natural experiment for this sort of thing.
I think most people are in a professional position where they are sort of externally rate limited. They can't imagine being 100x more effective. There would be no point to it. In many cases they already sit around doing nothing all day, because they are waiting for other people or processes. I'm lucky to not be in such a position. There's always somewhere I can apply energy and see results, and so AI acts as an increasingly dramatic multiplier. This is a subtle but crucial point: if you never try to use AI in a way that would even hypothetically result in a big productivity multiplier (doing things you wouldn't have otherwise done, doing a much more thorough job on the things you need to do, and trying to intentionally speed up your work on core tasks) then you can't possibly know what the speedup factor is. People end up sounding like a medieval peasant suddenly getting access to a motorcycle and complaining that it doesn't get them to the market faster, and then you find out that they never actually ride it.
I wonder, have you sat down and tried to vibecode something with Claude Code? If so, what kind of multiplier would you find plausible?
I had the AI implement two parallel implementations of the same thing in one project. Was lots of fun when it was 'fixing' the one that wasn't being used. So yeah, it can definitely muck up your codebase.
Hah today I discovered Claude Code has been copy/pasting gigantic blocks of conditions and styling every time I ask it to add a "--new" flag or whatever in a once tiny now gigantic script I've been adding features to.
It worked fine until recently, when I'd ask it to tweak some behavior of a command with a flag and it would produce a diff of hundreds of lines. So now it's struggling to catch every place it needs to change some hardcoded duplicate values it decided to copy/paste into two dozen random places in the code.
To be fair it is doing a decent job unfucking it now that I noticed and started explicitly showing it how ridiculously cumbersome and unmaintainable it made things with specific examples and refactoring. But if I hadn't bothered to finally sit down and read through it thoroughly it would have just become more broken and inconsistent as it grew exponentially.
Yeah, it will definitely do dumb stuff if you don't keep an eye on it and intervene when you see the signs that it's heading in the wrong direction. But it's very good at course correcting, and if you end up in a truly disastrous state you can almost always fix it by reverting to the last working commit and starting a fresh context.
I think this would benefit from examples of including coding assistants in the stages enumerated; how can the agent be included in each stage? I've seen posts about successful collaboration with agents at say Google, where there is tons of upfront work among humans to agree on design, then work with the agent to build out parts of the project and ensuring thorough test suites are included.
Does including an agent at each stage of this cycle mean "context engineering"? Is this then just more text and assets to feed in at each stage of LLM usage to provide the context for the next set of tokens to generate for the next stage of the cycle? Is there something deeper that can be done to encode this level of staged development into the agent's weights/"understanding"? Is there an established process for this yet?
the flaw in the article is acting like engineers always have a choice. the writer presents the contrasts of "fair delegation vs mollycoddling" mirroring "ai-driven development vs vibe coding"... but that sacrifice for short-term gain at the expense of scale is often draconically enforced.
obviously good and experienced engineers aren't going to be vibe coders/mollycoddlers by nature. but many good and experienced engineers will be pressured to make poor decisions by impatient business leaders. and that's the root of most AI anxiety: we all know it's going to be used as irresponsibly and recklessly as possible. it's not about the tech. it's about a system with broken incentives.
This omits the deep knowledge required for traditional coding. This opens up coding to non devs, e.g. product managers.
For vibe coding you need systems thinking, planning and logic, but less craftsmanship.
For PO's the chart looks different, here the traditional flow contains: "polish concepts, make mocks, make stories, dailies, handovers, revisions, waiting for devs busy with other things"
The vibing PO has none of that.
Not saying this is sustainable for big projects already, but it is for ever-growing "small" projects (especially if it's a technical PM that can code a bit). It's just so much faster without devs -_-.
Disclaimer: I am such a PO. What I now wonder: how can I mix vibe coding and properly developed foundations (ai assisted, but not 99% vibed). One answer in my view is splitting services in vibecoded and core, but dependencies on the core already slow you down a lot. Curious to hear how others mix and match both.
What I actually do right now:
- fullstack PoC vibecoded
- specifications for core based on PoC findings
- Build a proper, vibecoded V1, mocking what will move to core, but already with a more structured vibecoding approach.
- replace mocks with actual core
Not yet done but planned: build throwaway PoC in a branch of the core so I can also vibe on top of the core directly, including modification of the core.
The deep knowledge really isn't all that deep; a couple of years in the weeds and you have it. What this really hurts is outsourced devs. In the past a non-coding person could come up with a spec and hire someone from a developing nation to build it on the cheap to that spec. It is still possible to work like that, of course, and it results in working code, compared to an LLM that might hallucinate a passing test condition you can't catch with your lack of coding chops. It's just that the AI seems "faster" and the way it is paid for is less in front of your face. Really, in practice, nothing new was gained. PMs could always hire code. Now they hire nondeterministically generated code, but they are still essentially hiring code: submitting a spec and having something else write the code.
Perhaps. Then again, most people aren't working in fields where racing to the finish is a requirement. Corporate work, I've found, is really slow-moving for a lot of reasons. Things getting backburnered and pushed back is pretty standard. It takes time for all the stakeholders to digest information and give feedback. There is also the phenomenon where no one really appreciably works after Thanksgiving, so for most American corporate workers it is like they are on a 10.5-month year as it is. All this to say that even with diminished feedback-cycle wait time and a lightened communication workload, I don't think the product is getting out any faster.
Has anyone read up on the recent paper from Meta/FAIR -- CWM: An Open-Weights LLM for Research on Code Generation with World Models?
It attempts to give the model a better "coding" understanding, instead of mere tokens and positioning, and hence improve the coding capabilities of these "brilliant but unpredictable junior engineer" coding agents.
Even outside of AI coding I have found a tremendous amount of value in using AI to produce a requirements and spec document for me to code from. The key unlock for me is asking AI to “interview” me about how this system/feature should work. As part of that process it will often ask a question that gets me thinking about interesting edge cases.
I will say I always provide an initial context document about the feature/system, to avoid us starting with trivial questions. After about 45 minutes I'll often feel I've covered enough ground and given the problem enough thought to really put pen to paper. Off the back of this I'll ask it to summarise the spec and produce a document. This can be a good point to ditch AI if you are so inclined but still get value from it.
LLM coding agents can't learn from experience on our code, but we can learn from using them on our code, in the context of our team and processes. I started creating some harnesses to help get more of what we want from these tools and less of what we'd have to spend too much work fixing: e.g., specialized agents that refactor and test code after it's been generated, bring it more in line with our standards, remove bogus tests, etc. The learning is embedded in the prompts for these agents.
I think that this approach can already get us pretty far. One thing I'm missing is tooling to make it easier to build automation on top of, eg, Claude Code, but I'm sure it's going to come (and I'm tempted to try vibe coding it; if only I had the time).
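For what it's worth, a harness like that can start as a very small script. A minimal sketch, assuming the Claude Code CLI's non-interactive print mode (claude -p) and a pytest suite to gate on; the cleanup prompt and the flow are illustrative, not a finished tool:

    # Minimal sketch of a post-generation cleanup pass. Assumes the `claude`
    # CLI accepts a non-interactive prompt via `-p` (permission/tool flags
    # vary by setup) and that the repo has a pytest suite to gate on.
    import subprocess
    import sys

    CLEANUP_PROMPT = (
        "Review the uncommitted changes in this repo. Remove bogus or "
        "tautological tests, dedupe copy-pasted logic, and align naming "
        "with our conventions. Do not change behavior."
    )

    def run(cmd):
        print("+", " ".join(cmd))
        return subprocess.run(cmd).returncode

    if __name__ == "__main__":
        # 1. Focused cleanup pass: the team's accumulated "learning" lives in the prompt.
        if run(["claude", "-p", CLEANUP_PROMPT]) != 0:
            sys.exit("cleanup pass failed")
        # 2. Hard gate: the human-owned test suite still decides what ships.
        sys.exit(run(["pytest", "-q"]))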
There is this statement in the article: LLMs are lightning fast junior engineers.
I don't know if that is right or wrong. To me, good LLMs are a lightning-fast, better version of me. I can write the same code as some LLMs, but it would take me days (if I use a language I'm not good at, for example Rust), whereas with carefully crafted prompts an LLM takes maybe half an hour or less.
This suggests a workflow where AI is used almost exclusively to speed up the writing of known, targeted code whose structure has already been thought out, and possibly as a (non-coding) sounding board during the thinking-out.
The thinking part is the same, yes, but I doubt the fixing is.
Fixing something you have written is much easier than fixing something someone else (or an AI) has written, just because you don't have the mental model for the code, which is the most important part of debugging and refactoring.
Especially with junior engineers it is helpful to ask them to provide a Loom video or other proof that they have verified a feature or bug fix works as intended. I have tried setting up Claude Code with Playwright to verify its work, but so far am not very satisfied with the results. Any tools that are helpful with this end-to-end testing for web apps using Claude Code and other AI assistants? Feel free to share your product if it is relevant.
I've seen reasonable results from letting Claude Code apply test driven development and having a good e2e test suite on top of that (with Playwright). In that setup giving it Playwright MCP to access the app and verify why e2e tests are working or not working, and for writing new tests helps.
Just giving it an MCP to test changes also didn't work for me. But the combination with e2e tests was better.
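To make that concrete, the kind of e2e check that gives the agent (and you) a hard pass/fail signal can be tiny. A hypothetical sketch using pytest-playwright's page fixture against an imaginary app at localhost:3000; the URL and selectors are made up:

    # Hypothetical e2e test (pytest-playwright). The agent can run this after
    # making changes and gets a binary verdict instead of "looks done to me".
    from playwright.sync_api import Page, expect

    def test_login_shows_dashboard(page: Page):
        page.goto("http://localhost:3000/login")  # imaginary app under test
        page.get_by_label("Email").fill("demo@example.com")
        page.get_by_label("Password").fill("correct-horse-battery-staple")
        page.get_by_role("button", name="Sign in").click()
        expect(page.get_by_role("heading", name="Dashboard")).to_be_visible()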
The post starts with some valid observation on how to help dev teams mature. And then, all of a sudden the junior devs are replaced by agents. That’s it. End of story.
AI coding is more like being a PM than an engineer. Obviously PM’s exist and don’t know as much about the tech they make as the engineers, but are nevertheless useful.
I cannot express how tired I am of seeing this beyond stupid take.
If you truly believe that, you have either only ever worked with the most piss-poor junior engineers, or you simply have never worked with junior engineers.
LLMs do not learn, LLMs do not ask clarifications, LLMs do not wonder if they're going in the wrong direction, LLMs do not have taste, LLMs do not have opinions, LLMs write absolutely illogical nonsense, LLMs do not ask themselves what is best for the customer, LLMs have no context beyond what you have explicitly fed them for that one specific task, and much more.
Looks like another BS article that tries to compare a junior dev to an AI. And it's not even close. Anyone who has actually tried to use AI tooling should know better. Feels like the CEOs that got sold this idea are force-feeding it to senior devs, and sadly senior devs are trying to gulp it down instead of rejecting the entire idea.
LLMs are more like a lightning fast pseudo-random text generator at this point. To see that though you'd need to run it a few times, test it and understand the output. Something that's a bit beyond the casual amateur ability. Maybe what we need is the AI bubble to burst first.
It's sad that we've gotten here though. Ever tried having a coffee with your preferred AI as opposed to doing it with a highly motivated junior? Maybe you're not even going to get a chance to experience this.
For now, my mind is still made up. I leave the door open to be shown any serious piece of software that is built primarily through agentic workflows. Having tried to use these tools over the past month to build a critical piece of infrastructure for my company, I agree with OP. I spent so much time wrangling back unnecessary garbage that the LLM found was important, that I wondered if just writing it in one shot would have been actually faster. Simple things like 'test all this workflow logic' resulted in the LLM inserting a non-sensical mock at the top of the test file that took me an hour or two to unwind.
Other than that, I keep hearing the same arguments - "LLMs free up more time for me to think about the 'important things'." Son, your system is not durable, your tests are misleading, and you can't reason about what's happening because you didn't write it. What important things are left to think about??
It is an interesting take. I teach programming to designers as part of a course called "Emerging technologies". Although it is fun to see what students create, it is not fun to resolve their doubts. When you are teaching the basics, the students quickly fire up ChatGPT and make up some slop. In the end, I have to fix their code. I think the learning outcome is reduced because they have not written any code. I am genuinely concerned as an educator. One thing that is missing is differentiating AI output and understanding what to keep and what to ignore: the sheer "aesthetic" of it. I feel many do not take the time to develop this very "human" skill and become very output-oriented from the very start. This, IMO, affects learning. These tools are also quite addictive due to the sense of "manufactured" certainty they offer, which hinders learning. What is the point of learning how to add when you have the cheatsheet next to you?
> While the LLMs get to blast through all the fun, easy work at lightning speed, we are then left with all the thankless tasks: testing to ensure existing functionality isn’t broken, clearing out duplicated code, writing documentation, handling deployment and infrastructure, etc.
I’ve found LLMs just as useful for the "thankless" layers (e.g. tests, docs, deployment).
The real failure mode is letting AI flood the repo with half-baked abstractions without a playbook. It's helpful to have the model review the existing code and plan out the approach before writing any new code.
The leverage may be in using LLMs more systematically across the lifecycle, including the grunt work the author says remains human-only.
That's my experience as well. The LLM is great for what I consider scaffolding. I can describe the architecture I want, some guidelines in CLAUDE.md, then let it write a bunch of stubbed out code. It saves me a ton of time typing.
It's also great for things that aren't creative, like "implement a unit test framework using Google Test and CMake, but don't actually write the tests yet". That type of thing saves me hours and hours. It's something I rarely do, so it's not like I can just start editing my CMake and test files; I'd be looking up documentation and writing a lot of code that is necessary but takes a lot of time.
With LLMs, I usually get what I want quickly. If it's not what I want, a bit of time reviewing what it did and where it went wrong usually tells me what I need to give it a better prompt.
> Test-Driven Development: generating extensive test cases prior to implementation to guide implementation and prevent regression.
I've found this concept trips CC up: assertions are backwards, confusing comments in the test, etc. Just starting a prompt with "Use TDD to…" really helps.
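For reference, the tests-first shape the article describes can be as small as this: a handful of assertions a human pins down before the implementation exists, so the assertion direction isn't left to the model. A hypothetical pytest sketch; slugify and its module path are made up:

    # Hypothetical tests written before the implementation exists, so the
    # agent has a fixed target to hit rather than inventing one.
    import pytest

    from myapp.text import slugify  # assumed module path; fails until implemented

    def test_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"

    def test_strips_punctuation():
        assert slugify("Hello, World!") == "hello-world"

    def test_rejects_blank_input():
        with pytest.raises(ValueError):
            slugify("   ")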
I do appreciate that this article moves past absolute negativity on LLMs and actually speaks to the fact that they are extremely useful for well defined programming tasks. I'm a bit sick of articles that are just pure negativity on these tools.
I will raise that LLMs are pretty good at some of the non-coding tasks too.
eg. "I'm currently creating an AI for a turn based board game. Without doing any implementation, create a plan for the steps that need to be done including training off real world game data".
The LLM creates a tasklist for iterative steps to accomplish the above. It usually needs correction specific to the business/game needs but it's a great start and i recommend doing this just so the LLM has a doc with context on what its trying to achieve in a bigger picture as you have it complete tasks.
The article is pretty interesting, perhaps some marmite takes, but the bit that chimed with me is the vibe coding vs AI-driven engineering. Senior management at my work is obsessed with vibe-coding and are constantly pushing engineers to promote vibe code to PROD. It’s dispiriting to see parts of our code base begin to fill with manager+LLM slop …
Effective AI coding is actually extremely slow if you take into account an exhaustive planning stage where the task specification is laid down in sufficient and unambiguous detail. I had to get the LLM to review my spec over twenty times, always freshly, before I thought it was good enough to be implemented well. Also, it really helps for multiple diverse LLMs to review the spec, as they all have their unique insights. In this way, AI coding also helps me avoid numerous bugs that could have left me in trouble if not for the AI.
Once the planning is done, the actual coding is very fast. The human review that follows is again slow, often also leading to minor new tickets.
A friend of mine who is newer to coding than me is worried that by using AI so much, he's losing his ability to code, and it's also killing his motivation because using AI to generate code is just not fun. We do "analog coding sessions" together where we code without LLM assistance. It's much more enjoyable!
This is why you do not use Claude Code for complete project overhaul. For example, if you are writing in Python, categorize your functions and modules very well so that when you ask Claude's help, it won't get lost and start screwing up the entire project. I use modular design with Claude Code very often, and this is the only way I have found it useful. I won't let it change the code of preexisting projects, and only make it analyze individual files/functions for improvements.
People without a true coding background get stuck after a single function because the context window is still so narrow, and code assistants do not get the full picture of the project. It reminds me of the offshore developer's work ethic: "But sir, you told me to add this button there like that, so I deleted the entire codebase", without thinking about why in general. It keeps saying, "You are absolutely right! I shouldn't have done that!" It's just like working with a crappy coder from Fiverr or another freelancer site.
This article is a great example of how the human is struggling to extrapolate what happens next. There won't be any humans anywhere near this part of the tech stack, just like no one building a SaaS writes assembly code, or has to put together their own server cluster in a datacenter (remember pre-cloud?) for their company anymore. He's dead, Jim. No one is telling anyone to ship the ML code without any testing. Human coders also make mistakes. I'll bet that in a few years product managers / QA people would rather work with an ML stack to generate code than a human engineering team. It'll not just be cheaper and faster, but a lot less hassle & more accurate. As an example, Python has roughly ~100 or so "keywords", and extensive public libraries and open source algorithms to call upon. Anyone who thinks this presents any sort of challenge for an LLM to profoundly master is delusional. They can do IMO-grade math and help prove novel theorems. They can code your YC startup just fine.
Article being discussed in this thread isn't intended to be a luddite rejection of AI. It's just a mistake I see people keep making (and have made myself) and some thoughts on how to avoid it with the tools we have today.
I don't understand why some people are so upset with AI coding - no one forces them to use it.
Now, if you say the problem is that you don't want other people's AI code inflicted on you, just enforce more meaningful tests. There has always been bad code in the past, and there always will be.
I, for one, am doing great with AI coding - and my feeling is that more emphasis on project structure is the way forward for better end to end results.
one axis that is missing from the discussion is how fast they are improving. We need ~35 years to get a senior software engineer (from birth through education to experience). These things are not even 3.5 years old. I am very interested in this space; if you are too, DM me on X: @fabmilo. I am in SF.
IMO a vibe coder who is speaking their ideas to an agent which implements them is going to have way more time to think than a hand coder who is spending 80% of their time editing text.
If you know, understand that you are in possession of a massive and temporary information asymmetry advantage, and you should run with it as hard and fast as you can to gain the biggest lead possible, before the author and the rest of the world gain that advantage too. Go, go now, go fast, do it in parallel, and don’t stop until you win. Opportunities like this are extremely rare not just in your life, but in the history of our society. Best of luck.
He is saying that there's a massive group of people declining to use AI, or writing blog posts about why it is bad, and that this is a competitive advantage for people who will use them.
The assumption he is making is that the tools are good, and that the tools will eventually be used by everyone. So the competitive advantage is time limited.
He is largely correct, although I think productivity gains are overblown.
Probably something about how using LLMs for coding is such an amazing opportunity or something judging by how he implies the author would be surpassed due to information asymmetry.
The key point is "relegating the reasoning". The real way to think about interfacing with LLMs is "abstraction engineering". You still should fully understand the reasoning behind the code. If you say "make a form that captures X, Y, Z and passes it to this API" you relegate how it accomplishes that goal and everything related to it. Then you look at the code and realize it doesn't handle validation (check the reasoning), so you have it add validation and toasts. But you are now working on a narrower level of abstraction because the bigger goal of "make a user form" has been completed.
Where this gets exhausting is when you assume certain things that you know are necessary but don't want to verify - maybe it lets you submit an email form with no email, or validates the password as an email field for some reason, etc. But as LLMs improve their assumptions or you manage context correctly, the scale tips towards this being a useful engineering tool, especially when what you are doing is a well-trodden path.
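To make that narrower abstraction level concrete: once "make a user form" is done, the follow-up ask might be a small, isolated unit like this, which you can actually read and check in a minute (entirely hypothetical sketch; the field names and rules are illustrative):

    # Hypothetical validation step added after reviewing the generated form
    # handler: the work is now one abstraction level down, on rules small
    # enough to verify by reading.
    import re
    from dataclasses import dataclass

    EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simple

    @dataclass
    class SignupForm:
        email: str
        password: str

        def validate(self):
            errors = []
            if not EMAIL_RE.match(self.email):
                errors.append("email: not a valid address")
            if len(self.password) < 12:
                errors.append("password: must be at least 12 characters")
            return errors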
I find this to be too rosy a story about using agentic coding to add to a codebase. In my experience, miss a small detail about the code and the agent can go out of control, creating a whole new series of errors that you wouldn't have had to fix. And even if you don't miss a detail, the agent eventually forgets because of the limited context window.
This is why I’ve constrained my use of AI agents to mostly “read-only and explain” use cases, but I have very strict conditions for letting it write. In any case, whatever productivity gains you supposedly “get” for its write scenarios, you should be subtracting your expenses to fix its output later and/or payments made for a larger context window or better reasoning. It’s usually not worth the trouble to me when I have plenty of experience and knowledge to draw from and can write the code as it should be myself.
So there’s another force at work here that to me answers the question in a different way. Agents also massively decrease the difficulty of coming into someone else’s messy code base and being productive.
Want to make a quick change or fix? The agent will likely figure out a way to do it in minutes rather than the hours it would take me to do so.
Want to get a good understanding of the architecture and code layout? Working with an agent for search and summary cuts my time down by an order of magnitude.
So while I agree there's a lot more "what the heck is this ugly pile of if-else statements doing?" and "why are there three modules handling transforms?", there is a corresponding drop in the cost of adding features and paying down tech debt. Finding the right balance is a bit different in the agentic coding world, but it's a different mindset and set of practices to develop.
In my experience this approach is kicking the can down the road. Tech debt isn't paid down, it's being added to, and at some point in the future it will need to be collected.
When the agent can't kick the can any more who is going to be held responsible? If it is going to be me then I'd prefer to have spent the hours understanding the code.
> You're constantly back at Day 1 of having to "own" someone else's code.
If only there were some people in software engineering in this situation before AI… oh wait.
In the current times you’re either an agent manager or you’re in for a surprise.
> In the current times you’re either an agent manager or you’re in for a surprise.
This opinion seems to be popular, if only in this forum and not in general.
What I do not understand is this:
> What's the difference between a messy codebase created by a genAI, and a messy codebase where all the original authors of the code have moved on and aren't available to ask questions?
Messy codebases made by humans are known to be a bad thing that causes big problems for software that needs to be maintained and changed. Much effort goes into preventing them and cleaning them up.
If you want to work with AI code systems successfully then you better apply these exact same efforts. Documentation, composition, validation, evaluation, review and so on.
I bring this up because it's a solution to what you're pointing out as a problem and yet the status quo is to write even messier and harder to understand code (even before AI code). So I'm just saying, humans are really good at shooting themselves in the foot and blaming it on someone else or acting like the bullet came out of nowhere.
More so, I must be misreading, because it sounds like you're asking what's the difference between "messy" and "messier"? If it's the same level of messiness, then sure, it's equal. But in a real-world setting there's a continuous transition of people. One doesn't work on code in isolation, quit, and then a new person works on that code also in isolation. So maybe it's not the original authors, but rather the original authors are a Ship of Theseus. Your premise isn't entirely accurate, and I think the difference matters.
> The article sort of goes sideways with this idea but pointing out that AI coding robs you a deep understanding of the code it produces is a valid and important criticism of AI coding.
You can describe what the code should do with natural language.
I've found that a literate programming loop (agent calls to write the tests first, then the code, then the human refining the description of the code, then back to step one) is surprisingly good at this. One of these days I'll get around to writing an Emacs mode to automate it, because right now it's yanking and killing between nearly a dozen windows.
Of course this is much slower than regular development but you end up with world class documentation and understanding of the code base.
> The article sort of goes sideways with this idea but pointing out that AI coding robs you a deep understanding of the code it produces is a valid and important criticism of AI coding.
Why? Code has always been the artifact. Thinking about and understanding the domain clearly and solving problems is where the intrinsic value is at (but I'd suspect that in the future this, too, will go away).
Code is the final artifact after everything is shipped. But while the development is active, it is more than that (at least for now), as you need to know implementation details even if you are really proficient at the domain knowledge.
Although I do agree that there is a possibility that we'll build a relatively reliable abstraction using LLMs at some point, so this issue will go away. There probably be some restrictions, but I think it is possible.
Code isn't an "artifact", it's the actual product that you are building and delivering. You can use flowery language and pontificate about the importance of the problem domain if you like, but at the end of the day we are producing a low-level sequence of instructions that will be executed by a real-world device. There has always been, and likely always will be, value in understanding exactly what you are asking the computer to do.
Product != Artifact
Artifacts are snapshots of system knowledge (code, builds, docs, configs, etc.).
The product is the living whole that emerges from these artifacts working together and delivering value.
I'm familiar with "artifact" being used to describe the inconsequential and easy to reproduce output of some deterministic process (e.g. build artifact). Even given the terminology you provide here it doesn't change the content of my point above.
When I see someone dismissing the code as a small irrelevant part of the task of writing software, it's like hearing that the low-level design and physical construction of a bridge is an irrelevant side-effect of my desire to cross a body of water. Like, maybe that's true in a philosophical sense, but at the end of the day we are building a real-world bridge that needs to conform to real-world constraints, and every little detail is going to be important. I wouldn't want to cross a bridge built by someone who thinks otherwise.
Code is just one way of representing a specification.
In most domains, code is not the actual product. Data is. Code is how you record, modify and delete data. But it is ultimately data that has meaning and value.
This is why we have the idiom: “Don’t tell me what the code says—show me the data, and I’ll tell you what the code does.”
Reminds me of criticisms of Python decades ago: that you wouldn't understand what the "real code" was doing since you were using a scripting language. But then over the years it showed tremendous value, and many unicorns were built by focusing on higher-level details and not lower-level code.
Comparing LLMs to programming languages is a false equivalence. I don't have to write assembly because LLVM will do that for me correctly in 100% of cases, while AI might or might not (especially the more I move away from template CRUD apps).
it might be functionally correct but if you wrote it yourself it could be orders of magnitude faster
We've been on the Electron era long enough to know that developer time is more expensive than CPU time.
That is a myth. CPU time is time spent waiting around by your users, as the CPU takes seconds to do something that could be instant. If you have millions of users and that happens every day, it quickly adds up to many years' worth of time.
It might be true if you just look at development cost, but if you look at value as a whole it isn't. And even on development cost alone it's often not true, since time spent by the developer waiting for tests to run and things to start also slows things down. Taking a bit of time there to reduce CPU time is well worth it just to get things done faster.
Yeah, it's time spent by the users. Maybe it's an inefficiency of the market because the software company doesn't feel the negative effect enough, maybe it really is cheaper in aggregate than doing 3 different native apps in C++. But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
Also, why
> But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
Many of us do frequently argue for something similar. Take a look at Casey Muratori’s performance aware programming series if you care about the arguments.
> But if CPU time is so valuable, why aren't we arguing for hand written C or even assembly code instead of the layers upon layers of abstraction in even native modern software?
That is an extreme case though. I didn't mean that all optimizations are always worth it, but if we look at the marginal value gained from optimizations today, the payback is usually massive.
It isn't done enough since managers tend to undervalue user and developer time. But users don't undervalue user time: if your program wastes their time, many users will stop using it. Users are pretty rational about that aspect and prefer faster products or sites unless they are very lacking. If a website is slow a few times in a row I start looking for alternatives, and data says most users do that.
I even stopped my JetBrains subscription since the editor got so much slower in an update, so I just use the one I can keep forever as I don't want their patched editor. If it didn't get slower I'd gladly keep it as I liked some of the new features, but it being slower was enough to make me go back.
Also, while managers can obviously agree that making developers spend less time waiting is a good thing, it is very rare for managers to tell you to optimize compilation times or such, and pretty simple optimizations there can often make that part of the work massively faster. Like, if you profile your C++ compiler and look at which files it spends time compiling, then look at those files to figure out why it's so slow there, you can find these weird things, and fixing them speeds it up 10x, so what took 30 seconds now takes 3 seconds. That is obviously very helpful, and if you are used to that sort of thing you can do it in a couple of hours.
That's not the same thing. LLMs don't just obscure low-level technical implementation details like Python does, they also obscure your business logic and many of its edge cases.
Letting a Python interpreter manage your memory is one thing because it's usually irrelevant, but you can't say the same thing about business logic. Encoding those precise rules and considering all of the gnarly real-world edge cases is what defines your software.
There are no "higher level details" in software development, those are in the domain of different jobs like project managers or analysts. Once AI can reliably translate fuzzy natural language into precise and accurate code, software development will simply die as a profession. Our jobs won't morph into something different - this is our job.
But working with AI isn’t really a higher level of abstraction. It’s a completely different process. I’m not hating on it, I love LLMs and use em constantly, but it doesn’t go assembly > C > python > LLMs
> The article sort of goes sideways with this idea but pointing out that AI coding robs you a deep understanding of the code it produces is a valid and important criticism of AI coding.
In any of my teams with moderate to significant code bases, we've always had to lean very hard into code comments and documentation, because a developer will forget in a few months the fine details of what they've previously built. And further, any org with turnover needs to have someone new come in and be able to understand what's there.
I don't think I've met a developer that keeps all of the architecture and design deeply in their mind at all times. We all often enough need to go walk back through and rediscover what we have.
Which is to say... if the LLM generator were instead a colleague or neighboring team, you'd still need to keep up with them. If you can adapt those habits to the generative code then it doesn't seem to be a big leap.
What is "understanding code", mental model of the problem? These are terms for which we all have developed a strong & clear picture of what they mean. But may I remind us all that used to not be the case before we entered this industry - we developed it over time. And we developed it based on a variety of highly interconnected factors, some of which are e.g.: what is a program, what is a programming language, what languages are there, what is a computer, what software is there, what editors are there, what problems are there.
And as we mapped put this landscape, hadn't there been countless situations where things felt dumb and annoying, and then situation in sometimes they became useful, and sometimes they remained dumb? Something you thought is making you actively loosing brain cells as you're doing them, because you're doing them wrong?
Or are you to claim that every hurdle you cross, every roadblock you encounter, every annoyance you overcome has pedagogical value to your career? There are so many dumb things out there. And what's more, there's so many things that appear dumb at first and then, when used right, become very powerful. AI is that: Something that you can use to shoot yourself in the foot, if used wrong, but if used right, it can be incredibly powerful. Just like C++, Linux, CORS, npm, tcp, whatever, everything basically.
I can imagine an industry where we describe business rules to apply to data in natural language, and the AI simply provides an executable without source at all.
The role of the programmer would then be to test if the rules are being applied correctly. If not, there are no bugs to fix, you simply clarify the business rules and ask for a new program.
I like to imagine what it must be like for a non-technical business owner who employs programmers today. There is a meeting where a process or outcome is described, and a few weeks / months / years later a program is delivered. The only way to know if it does what was requested is to poke it a bit and see if it works. The business owner has no mental model of the code and can't go in and fix bugs.
update: I'm not suggesting I believe AI is anywhere near being this capable.
Not really, it's more a case of "potentially can" rather than "will". This dynamic has always been there with the whole junior/senior dev split; it's not a new problem. You 100% can use it without losing this, and in an ideal world you can even go so far as to not worry about the understanding for parts that are inconsequential.
>> The article sort of goes sideways with this idea but pointing out that AI coding robs you a deep understanding of the code it produces is a valid and important criticism of AI coding.
All code is temporary and should be treated as ephemeral. Even if it lives for a long time, at the end of the day what really matters is data. Data is what helps you develop the type of deep understanding and expertise of the domain that is needed to produce high quality software.
In most problem domains, if you understand the data and how it is modeled, the need to be on top of how every single line of code works and the nitty-gritty of how things are wired together largely disappears. This is the thought behind the idiom “Don’t tell me what the code says—show me the data, and I’ll tell you what the code does.”
It is therefore crucial to start every AI-driven development effort with data modeling, and have lots of long conversations with AI to make sure you learn the domain well and have all your questions answered. In most cases, the rest is mostly just busywork, and handing it off to AI is how people achieve the type of productivity gains you read about.
Of course, that's not to say you should blindly accept everything the AI generates. Reading the code and asking the AI questions is still important. But the idea that the only way to develop an understanding of the problem is to write the code yourself is no longer true. In fact, it was never true to begin with.
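A sketch of what "start with data modeling" can look like in practice, using a made-up invoicing domain; the point is that the entities and invariants fit on one screen and get reviewed before any agent-generated plumbing exists:

    # Made-up domain model pinned down first; agent-generated storage, CRUD,
    # and API layers are then judged against these types and invariants.
    from dataclasses import dataclass, field
    from datetime import datetime
    from enum import Enum
    from typing import List, Optional

    class InvoiceStatus(Enum):
        DRAFT = "draft"
        SENT = "sent"
        PAID = "paid"

    @dataclass
    class LineItem:
        description: str
        quantity: int
        unit_price_cents: int  # money as integer cents, never floats

    @dataclass
    class Invoice:
        customer_id: str
        items: List[LineItem] = field(default_factory=list)
        status: InvoiceStatus = InvoiceStatus.DRAFT
        issued_at: Optional[datetime] = None  # set only when status becomes SENT

        @property
        def total_cents(self) -> int:
            return sum(i.quantity * i.unit_price_cents for i in self.items)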
In many cases, the code is "data". The code is a set of rules or instructions by which the data must be transformed for business cases.
You need to understand both the data, and the transformations that are being made to the data.
> The article sort of goes sideways with this idea but pointing out that AI coding robs you a deep understanding of the code it produces is a valid and important criticism of AI coding.
No it isn't. There's literally nothing about the process that forces you to skip understanding. Any such skips are purely due to the lack of will on the developer's side. This lack of will to learn will not change the outcomes for you regardless of whether you're using an LLM. You can spend as much time as you want asking the LLM for in-depth explanations and examples to test your understanding.
So many of the criticisms of coding with LLMs I've seen really do sound like they're coming from people who already started with a pre-existing bias, fiddled with it for a short bit (or worse, never actually tried it at all) and assumed their limited experience is the be-all and end-all of the subject. Either that, or they're typical skill issues.
> There's literally nothing about the process that forces you to skip understanding.
There's nothing about C that "forces" people to write buffer overflows. But, when writing C, the path of least resistance is to produce memory-unsafe code. Your position reminds me of C advocates who say that "good developers possess the expertise and put in the effort to write safe code without safeguards," which is a bad argument because we know memory errors do show up in critical code regardless of what a hypothetical "good C dev" does.
If the path of least resistance for a given tool involve using that tool dangerously, then it's a dangerous tool. We say chefs should work with sharp knives, but with good knife technique (claw grip, for instance) safety is the path of least resistance. I have yet to hear of an LLM workflow where skimming the generated code is made harder than comprehensively auditing it, and I'm not sure that such a workflow would feel good or be productive.
Your point of view assumes the best of people, which is naive. It may not force you to skip understanding, however it makes it much easier to than ever before.
People tend to take the path of least resistance, maybe not everyone, maybe not right away, but if you create opportunities to write poor code then people will take them - more than ever it becomes important to have strong CI, review and testing practices.
Edit: okay, maybe I am feeling a little pessimistic this morning :)
People will complain about letting the LLM code because you won't understand every nuance. Then they will turn around and pip install a dependency without even glancing at the underlying code.
> No it isn't. There's literally nothing about the process that forces you to skip understanding. Any such skips are purely due to the lack of will on the developer's side
This is the whole point. The marginal dev will go to the path of least resistance, which is to skip the understanding and churn out a bunch of code. That is why it's a problem.
You are effectively saying "just be a good dev, there's literally nothing about AI which is stopping you from being a good dev" which is completely correct and also missing the point.
The marginal developer is not going to put in the effort to wield AI in a skillful way. They're going to slop their way through. It is a concern for widespread AI coding, even if it's not a concern for you or your skill peers in particular.
To add to the above - I see a parallel to the "if you are a good and diligent developer there is nothing to stop you from writing secure C code" argument. Which is to say - sure, if you also put in extra effort to avoid all the unsafe bits that lead to use-after-free or race conditions it's also possible to write perfect assembly, but in practice we have found that using memory safe languages leads to a huge reduction of safety bugs in production. I think we will find similarly that not using AI will lead to a huge reduction of bugs in production later on when we have enough data to compare to human-generated systems. If that's a pre-existing bias, then so be it.
> The marginal developer is not going to put in the effort to wield AI in a skillful way. They're going to slop their way through. It is a concern for widespread AI coding, even if it's not a concern for you or your skill peers in particular.
My mental model of it is that coding with LLMs amplifies both what you know and what you don't.
When you know something, you can direct it productively much faster to a desirable outcome than you could on your own.
When you don't know something, the time you would normally have spent researching to build enough understanding to start working can instead be spent evaluating the random stuff the LLM comes up with, which often works, but not in the way it ought to. And since you can get to some result quickly, the trade-off of doing the research feels somehow less worth it.
If you don't have any idea how to accomplish the task, you probably still need to cultivate the habit of doing the research first. Wielding these tools skillfully is now the task of our industry, so we ought to be developing that skill and cultivating it in our team members.
I don't think that is a problem with AI, it is a problem with the idea that pure vibe-coding will replace knowledgeable engineers. While there is a loud contingent that hypes up this idea, it will not survive contact with reality.
Purely vibe-coded projects will soon break in unexplainable ways as they grow beyond trivial levels. Once that happens their devs will either need to adapt and learn coding for real or be PIP'd. I can't imagine any such devs lasting long in the current layoff-happy environment. So it seems like a self-correcting problem no?
(Maybe AGI, whatever that is, will change things, but I'm not holding my breath.)
The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real.
> The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real.
Learning the ropes looks different now. You used to learn by doing, now you need to learn by directing. In order to know how to direct well, you have to first be knowledgeable. So, if you're starting work in an unfamiliar technology, then a good starting point is read whatever O'Reilly book gives a good overview, so that you understand the landscape of what's possible with the tool and can spot when the LLM is doing (now) obvious bullshit.
You can't just Yolo it for shit you don't know and get good results, but if you build a foundation first through reading, you will do a lot better.
On vibe coding being self-correcting, I would point to the growing number of companies mandating usage of AI and the quote "the market can stay irrational longer than you can stay solvent". Companies routinely burn millions of dollars on irrational endeavours for years. AI has been promised as an insane productivity booster.
I wouldn't expect things to calm down for a while, even if real-life results are worse. You can make excuses for underperformance of these things for a very long time, especially if the CEO or other executives are invested.
> The real problem we should be discussing is, how do we convince students and apprentices to abstain from AI until they learn the ropes for real
I hate to say it but that's never going to happen :/
Let me ask you this: do you understand the code your colleagues produce as well as the code you write yourself?
We most definitely should, especially if you're working in a team or organization bigger than a handful of people, because it's almost certain that you will need to change or interact with that code very soon in the lifetime of the project. When that happens you want to make sure the code aligns with your own mental model of how things work.
The industry has institutionalized this by making code reviews a very standard best practice. People think of code reviews mainly as a mechanism to reduce bugs, but it turns out the biggest benefits (borne out by studies) are actually better context-sharing amongst the team, mentoring junior engineers, and onboarding of new team-mates. It ensures that everyone has the same mental model of the system despite working on different parts of it (cf. the story of the blind men and the elephant). This results in better ownership and fewer defects per line of code.
Note, this also doesn't mean everybody reviews each and every PR. But any non-trivial PR should be reviewed by team-mates with appropriate context.
AI is not my coworker, with different tasks and responsibilities.
The comparison is only reasonable if most of your job is spent trying to understand their code and making sure it did what you wanted, and with them standing next to you, ready to answer questions, explain anything you don't understand, and pull in any external, relevant parts of the codebase.
Of course not; that's a bit disingenuous. I would hope my colleagues write code that is comprehensible so it's maintainable. I think that if the code is so complex and inscrutable that only the author can understand it, then it's not good code. AI doesn't create or solve this problem.
I do think when AI writes comprehensible code you can spend as much time as necessary asking questions to better understand it. You can ask about tradeoffs and alternatives without offending anybody and actually get to a better place in your own understanding than would be possible alone.
Who is this endless cohort of developers who need to maintain a 'deep understanding' of their code? I'd argue a high % of all code written globally on any given day that is not some flavour of boilerplate, while written with good intentions, is ultimately just short-lived engineering detritus, if it even gets a code review to pass.
If you're on HN there's a good chance you've self-selected into "caring about the craft and looking for roles that require more attention."
You need to care if (a) your business logic requirements are super annoyingly complex, (b) you have hard performance requirements, or (c) both. (c) is the most rare, (a) is the most common of those three conditions; much of the programmer pay disparity between the top and the middle or bottom is due to this, but even the jobs where the complexity is "only" business requirements tend to be quite a bit better compensated than the "simple requirements, simple needs" ones.
I think there's a case to be made that LLM tools will likely make it harder for people to make that jump, if they want to. (Alternately they could advance to the point where the distinction changes a bit, and is more purely architectural; or they could advance to the point where anyone can use an LLM to do anything - but there are so many conditional nuances to what the "right decision" is in any given scenario there that I'm skeptical.)
A lot of times floor-raising things don't remove the levels, they just push everything higher. Like a cheap crap movie today will visually look "better" from a technology POV (sharpness, special effects, noise, etc) than Jurassic Park from the 90s, but the craft parts won't (shot framing, deliberate shifts of focus, selection of the best takes). So everyone will just get more efficient and more will be expected, but still stratified.
And so some people will still want to figure out how to go from a lower-paying job to a higher-paying one. And hopefully there are still opportunities, and we don't just turn into other fields, picking by university reputations and connections.
> You need to care if (a) your business logic requirements are super annoyingly complex, (b) you have hard performance requirements, or (c) both. (c) is the most rare
But one of the most fun things you can do is (c): creative game development coding. Coding world simulations and the like, you want to be very fast, but the rules and interactions are very coupled and complex compared to most regular enterprise logic, which is more decoupled.
So while most of the work programmers do fits (a), the work people dream about doing is (c), and that means the LLM doesn't help you make the fun things, it just removes the boring jobs.
In my experience the small percent of developers who do have a deep understanding are the only reason the roof doesn’t come crashing in under the piles of engineering detritus.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Here's mine. I use Cline occasionally to help me code, but more and more I find myself just coding by hand. The reason is pretty simple: with these AI tools you, for the most part, replace writing code with writing a prompt.
I look at it like this, if writing the prompt, and the inference time is less than what it would take me to write the code by hand I usually go the AI route. But this is usually for refactoring tasks where I consider the main bottleneck to be the speed at which my fingers can type.
For virtually all other problems it goes something like this: I can do task X in 10 minutes if I code it manually, or I can prompt the AI to do it, and by the time I finish crafting the prompt and execute, it takes me about 8 minutes. Yes, that's a savings of 2 minutes on that task, and that's all fine and good assuming the AI didn't make a mistake. If I have to go back and re-prompt or manually fix something, then all of a sudden the time it took me to complete that task is 10-12 minutes with AI. Here the best case scenario is I just spent some AI credits for zero time savings, and the worst case is I spent AI credits AND the task was slower in the end.
With all sorts of tasks I now find myself making this calculation and for the most part, I find that doing it by hand is just the "safer" option, both in terms of code output but also in terms of time spent on the task.
> The reason is pretty simple which is with these AI tools you for the most part replace writing code with writing a prompt
I'm convinced I spend more time typing and end up typing more letters and words when AI coding than when not.
My hands are hurting me more from the extra typing I have to do now lol.
I'm actually annoyed they haven't integrated their voice to text models inside their coding agents yet.
GitHub copilot already does have speech to text and as my sibling comment mentions, on the Mac, it is globally available. It varies according to typing and speaking speed but speaking should be about five times faster than typing.
On a mac you can just use a hotkey to talk to an agentic CLI. It needs to be a bit more polished still IMO, like removing the hotkey requirement, with a voice command to break the agents current task.
Does it use an LLM-powered voice-to-text model?
I find the generic ones like the ones I can use anywhere on Mac to be crap.
If you've used the ChatGPT voice to text model you know what I mean.
I believe it does on newer macs (m4 has neural engine). It's not perfect, but I'm using it without issue. I suspect it'll get better each generation as Apple leans more into their AI offering.
There are also third parties like Wispr that I haven't tried, but might do a better job? No idea.
Have you tried Soniox? It's really not expensive ($0.12/h, $200 free credits when you sign up) and really accurate.
https://soniox.com/
You can use it with Spokenly (free app, bring your own Soniox API key) on macOS and iOS (virtual voice keyboard)
https://spokenly.app/
Disclaimer: I've worked for Soniox
Why would I buy this if my Mac has it for free? Is it “just better”?
The mac one is pretty limited. I paid for a similar tool as above and the LLM backing makes the output so much better. All my industry specific jargon gets captured perfectly whereas the Apple dictation just made up nonsense.
It's really accurate and supports 60+ languages
I find myself often writing pseudo code (CLI) to express some ideas to the agent. Code can be a very powerful and expressive means of communication. You don't have to stop using it when it's the best / easiest tool for a specific case.
That being said, these agents may still just YOLO and ignore your instructions on occasion, which can be a time suck, so sometimes I still get my hands dirty too :)
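To make the pseudo-code idea concrete, here's a rough sketch (Python, with hypothetical names and requirements invented for illustration) of what handing the agent code-shaped intent instead of prose can look like; the signature, docstring and comments carry the constraints, and the body is deliberately left unfinished:

    # Hypothetical example: the requirements live in the signature, docstring and
    # comments; the agent is asked to fill in the body under those constraints.
    def dedupe_events(events: list[dict]) -> list[dict]:
        """Collapse events sharing (user_id, type) within a 5-minute window.

        Keep the earliest event in each window and preserve ordering by timestamp.
        """
        # agent: stdlib only, no pandas; assume each event has "user_id", "type", "ts"
        ...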
> the idea that technology forces people to be careless
I don't think anyone's saying that about technology in general. Many safety-oriented technologies force people to be more careful, not less. The argument is that this technology leads people to be careless.
Personally, my concerns don't have much to do with "the part of coding I enjoy." I enjoy architecture more than rote typing, and if I had a direct way to impose my intent upon code, I'd use it. The trouble is that chatbot interfaces are an indirect and imperfect vector for intent, and when I've used them for high-level code construction, I find my line-by-line understanding of the code quickly slips away from the mental model I'm working with, leaving me with unstable foundations.
I could slow down and review it line-by-line, picking all the nits, but that moves against the grain of the tool. The giddy "10x" feeling of AI-assisted coding encourages slippage between granular implementation and high-level understanding. In fact, thinking less about the concrete elements of your implementation is the whole advantage touted by advocates of chatbot coding workflows. But this gap in understanding causes problems down the line.
Good automation behaves in extremely consistent and predictable ways, such that we only need to understand the high-level invariants before focusing our attention elsewhere. With good automation, safety and correctness are the path of least resistance.
Chatbot codegen draws your attention away without providing those guarantees, demanding best practices that encourage manually checking everything. Safety and correctness are the path of most resistance.
(Adding to your comment, not disagreeing)
And this will always be a result of human preference optimization. There's a simple fact: humans prefer lies that they don't know are lies over lies that they do know are lies. We can't optimize for an objective truth when that objective truth doesn't exist. So while we do our best to align our models, they simultaneously optimize their ability to deceive us. There's little to no training in that loop where outputs are deeply scrutinized, because we can't scale that type of evaluation. We end up rewarding models that are incorrect in their output.
We don't optimize for correctness, we optimize for the appearance of correctness. We can't confuse the two.
The result is: when LLMs make errors, those errors are difficult for humans to detect.
This results in a fundamentally dangerous tool, does it not? Good tools, when they error or fail, do so safely and loudly. This one instead fails silently. That doesn't mean you shouldn't use the tool, but that you need to do so with an abundance of caution.
Actually the big problem I have with coding with LLMs is that it increases my cognitive load, not decreases it. Being overworked results in carelessness. Who among us does not make more mistakes when they are tired or hungry? That's the opposite of lazy, so hopefully that answers the OP.
I use LLMs for coding and I like the way I am using them. I do not outsource thinking, and I do not expect the model to know what I want without giving it context on my thoughts with regard to the project. I have written a 1000 LOC program in C using an LLM. It was a success. I reviewed it "line by line" though; I do not know why I would not do this. Of course it did not spit out 1000 LOC from the get go: we started small and built upon our foundations. It has an idea of my thinking and my preferences with regard to C and the project because of our interactions, which gave it context.
But it's pretty rare you're going to be able to review every line in a mature project, even if you're developing that project. Those can contain hundreds or even thousands of files with hundreds (hopefully not thousands) of LOC. While it's possible to review every line it's pretty costly in time and it's harder since the code is changing as you're doing this...
Think of it this way, did you also review all the lines of code in all the libraries you used? Why not? The reasoning will be pretty similar. This isn't to say we shouldn't spend more time exploring the code we work with nor that we likely wouldn't benefit from this, but that time is a scarce resource. So the problem is when the LLM is churning out code faster than you can review.
While coding you are hopefully also debugging and thinking. By handing coding over to the LLM you decouple this. So you reduce your time writing lines of code but increase time spent debugging and analyzing. There will be times where this provides gains but IME this doesn't happen in serious code. But yeah, my quick and dirty scripts can be churned out a lot faster. That saves time, but not 10x. At least not for me
nobody in this or any meaningful software engineering discussion is talking about software projects that are 1000, or even 10000, SLoC. these are trivial and uninteresting sizes. the discussion is about 100k+ SLoC projects.
I do not see how this is always necessarily implied. And should I seriously always assume this is the case? Where are you getting this from? None of the projects people claim to have successfully (or not) written with the help of an LLM have 10k LOC, let alone >100k. Should they just be ignored because the LOC count is not >100k?
Additionally, why is it that whenever I mention success stories accomplished with the help of LLMs, people rush to say "does not count because it is not >100k LOC". Why does it not count, why should it not count? I would have written it by hand, but I finished much faster with the help of an LLM. These are genuine projects that solve real problems. Not every significant project has to have >100k LOC. I think we have a misunderstanding of the term "significant".
> nobody in this or any meaningful software engineering discussion is talking about software projects that are 1000, or even 10000, SLoC.
Why?
> these are trivial and uninteresting sizes.
In terms of what exactly?
> Why?
Because small programs are really quick and easy to write, there was never a bottleneck making them and the demand for people to write small programs is very small.
The difficulty of writing a program scales super-linearly with size: an experienced programmer in their current environment easily writes a 500-line program in a day, but writing 500 meaningful lines into an existing 100k-line codebase in a day is not easy at all. So almost all developer time in the world is spent making large programs; small programs are a drop in the ocean, and automating them doesn't make a big difference overall.
Small programs can help you a lot, but that doesn't replace programmers, since almost no programmers are hired to write small programs. Automatically producing such small programs mostly helps automate other work, like that of regular white-collar workers, whose jobs are now easier to replace.
Not because the single line was hard to write, but because of the context in which it needed to be written.
Typing was never the bottleneck and I'm not sure why this is the main argument for LLMs (e.g. "LLMs save me from the boilerplate). When typing is a bottleneck it seems like it's more likely that the procedure is wrong. Things like libraries, scripts, and skeletons tend to be far better solutions for those problems. In tough cases abstraction can be extremely powerful, but abstraction is a difficult tool to wield.
The bottleneck is the thinking and analyzing.
There's an argument to be made that this gap is actually highlighting design issues rather than AI limitations.
It's entirely possible to have a 100k LOC system be made up of effectively a couple hundred 500-line programs that are composed together to great effect.
That's incredibly rare but I did once work for a company who had such a system and it was a dream to work in. I have to think AIs are making a massive impact there.
You may also wish to look at UNIX Philosophy. The idea that programs should be small and focused. A program should do one thing and do it well. But there's a generalization to this philosophy when you realize a function is a program.
I do agree there's a lot of issues with design these days but I think you've vastly oversimplified the problem.
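To make the "a function is a program" generalization concrete, here is a minimal sketch in Python (illustrative names only, not from any particular codebase): each function does one thing, and the larger behaviour is just their composition, the way a shell pipeline composes small programs.

    # A minimal sketch of the "a function is a program" idea: each piece does one
    # thing, takes data in, hands data out, and the system is their composition.
    def parse(line: str) -> dict:
        ts, level, msg = line.split(" ", 2)
        return {"ts": ts, "level": level, "msg": msg}

    def only_errors(records):
        return (r for r in records if r["level"] == "ERROR")

    def summarize(records) -> dict:
        counts: dict = {}
        for r in records:
            counts[r["msg"]] = counts.get(r["msg"], 0) + 1
        return counts

    def pipeline(lines):
        # Composed the way a shell pipe would chain small programs.
        return summarize(only_errors(parse(l) for l in lines))

    if __name__ == "__main__":
        sample = [
            "2024-01-01T00:00:00 ERROR db timeout",
            "2024-01-01T00:00:01 INFO ok",
            "2024-01-01T00:00:02 ERROR db timeout",
        ]
        print(pipeline(sample))  # {'db timeout': 2}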
> There's a simple fact: humans prefer lies that they don't know are lies over lies that they do know are lies.
As an engineer and researcher, I prefer lies (models, simplifications), that are known to me, rather than unknown unknowns.
I don't need to know exact implementation details, knowledge of aggregate benchmarks, fault rates and tolerances is enough. A model is a nice to have.
This approach works in science (physics, chemistry, biology, ...) and in engineering (including engineering agentic and social systems - social engineering).
I'll make a corollary to help:
I'm unsure if you: misread "lies that they don't know are lies", conflated unknown unknowns with known unknowns, or (my guess) misunderstood that I am talking about the training process, which involves a human evaluator judging an LLM's output. That last one would require the human evaluator to prefer a lie over a lie that they do not know is actually a lie. I think you can see how we can't expect such an evaluation to occur (except by accident). For the evaluator to prefer the unknown unknown, they would have to prefer what they believe to be a falsehood over what they believe is truth. You'd throw out such an evaluator for not doing their job! As a researcher myself, yes, I also prefer known falsehoods over unknown falsehoods, but we can only do this from a metaphysical perspective. If I'm aware of an unknown then it is, by definition, not an unknown unknown.
How do you preference a falsehood which you cannot identify as a falsehood?
How do you preference an unknown which you do not know is unknown?
We have strategies like skepticism to help deal with this, but they don't make the problem go away. It ends up as "everything looks right, but I'm suspicious". Digging in can be very fruitful, but it is more frequently a waste of time for the same reason: if a mistake exists, we have not identified the mistake as a mistake!
I think this is a place where there's a divergence between science and engineering (I've worked in both fields). The main difference is the level of a problem you're working at. At the more fundamental level you cannot get away with empirical evidence alone. Evidence can only bound your confidence in the truth of some claim; it cannot prove it. The dual to this is a much simpler problem, as disproving a claim can be done with a single example. This distinction often isn't as consequential in engineering, as there are usually other sources of error that are much larger.
As an example, we all (hopefully) know that you can't prove the correctness of a program through testing. It's a non-exhaustive process. BUT we test because it bounds our confidence about its correctness and we usually write cases to disprove certain unintended behaviors. You could go through the effort to prove correctness but this is a monumental task and usually not worth the effort.
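To put that asymmetry in code, here is a hedged illustration (assuming pytest and the hypothesis property-testing library are available; the buggy function and the property are invented for demonstration): any number of passing examples only bounds our confidence, while a single counterexample refutes the claim outright.

    # Illustration of the asymmetry: passing tests never prove buggy_sort correct,
    # but one concrete counterexample disproves it.
    from hypothesis import given, strategies as st

    def buggy_sort(xs: list[int]) -> list[int]:
        # Deliberately wrong: silently drops duplicates.
        return sorted(set(xs))

    @given(st.lists(st.integers()))
    def test_sort_preserves_length(xs):
        assert len(buggy_sort(xs)) == len(xs)

    # Thousands of passing inputs would only have raised our confidence; a single
    # generated counterexample such as xs=[0, 0] refutes the claim outright.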
But right now we're talking about a foundational problem, and such a distinction matters here. We can't resolve the limits of methods like RLHF without considering this problem. It's quite possible that there's no way around this limitation, since there are no objective truths for the majority of tasks we give LLMs. If that's true, then the consequence is that a known unknown is "there are unknown unknowns". And like you, I'm not a fan of unknown unknowns.
We don't actually know the fault rates nor tolerances. Benchmarks do not give that to us in the general setting (where we apply our tools). This is a very different case than, say, understanding the performance metrics and tolerances of an o-ring. That part is highly constrained and you're not going to have a good idea of how well it'll perform as a spring, despite those tests having a lot of related information.
> If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting.
This argument is wearing a little thin at this point. I see it multiples times a day, rephrased a little bit.
The response, "How well do you think your thinking will go if you had not spent years doing the 'practice' part?", is always followed by either silence or a non sequitur.
So, sure, keep focusing on the 'thinking' part, but your thinking will get more and more shallow without sufficient 'doing'
Separate from AI, as your role becomes more tech lead / team lead / architect you're also not really "doing" as much and still get involved in a lot of thinking by helping people get unstuck. The thinking part still builds experience. You don't need to type the code to have a good understanding of how to approach problems and how to architect systems. You just need to be making those decisions and gaining experience from them.
> You just need to be making those decisions and gaining experience from them.
The important part that everyone glosses over is the "gaining experience" part.
The experience you gained writing code led to you becoming a tech lead / team lead / architect.
The experience you get from those roles, including "helping people get unstuck", makes you valuable because there are people involved, not just technology. IOW, that is different to the experience you get from prompting.
We have yet to see how valuable the experience from prompting will be. At this point the prompters are just guessing that their skills won't atrophy, and that their new experience won't be at the same level as vibe-coders who can't spell "Python".
As a fairly senior person myself, and an occasional user of LLMs, and someone who has tried CC in recent months, the experience I got from LLMs, while not nothing, was not recognised by me as valuable in any way - it basically put me at the same skill level as a vibe-coder.
OTOH, the experience I got mentoring very junior engineers the month before that I recognised as instantly valuable; at the end of it I had learned new strategies for dealing with people, growing them, etc.
The only "experience" you get with LLM is "put another coin into the slot and pull the lever again".
> The only "experience" you get with LLM is "put another coin into the slot and pull the lever again".
I relate it to directors on a production. It's certainly very valuable to know how to operate a camera, and especially to understand lighting, storytelling, etc. It gives you insight in how to work with the people who are actually doing these tasks. It helps you to know when someone is gaslighting you, etc.
That being said, it's kind of an insane statement to say that all a director does is pull a lever. I'm sure there are a ton of wannabe directors who try to do exactly that and proceed to fail miserably if they don't adapt quickly to reality. But having a great director is obviously a huge differentiator in output.
Do I think we'll have as many programmers in the future as we do today? Probably not. I think we're going to see a real decimation of coders, but at the same time we might (I say "might") see much greater overall production that may not otherwise exist from the less talented vibers or w/e ridiculously critical name you want. Some of that is certainly going to be interesting and maybe even radically game changing.
IMO our feelings about this are about as relevant as shaking our fist at the cosmos.
> Separate from AI, as your role becomes more tech lead / team lead / architect you're also not really "doing" as much and still get involved in a lot of thinking by helping people get unstuck
True. But those roles require you to do a lot of thinking by helping a LOT of people. You end up shuffling between multiple projects/deliverables. Here we are talking about a developer working on a single project/deliverable and then equating that to AI. Not to mention the easy-to-forget part: by the time you are a tech lead / team lead / architect you have so many hours behind you that you know some stuff like the back of your hand.
Do you think that all managers and tech leads atrophy because they don’t spend all day “doing”? I think a good number of them become more effective because they delegate the simple parts of their work that don’t require deep thought, leaving them to continue to think hard about the thorniest areas of what they’re working on.
Or perhaps you’re asking how people will become good at delegation without doing? I don’t know — have you been “doing” multiple years of assembly? If not, how are you any good at Python (or whatever language you currently use?). Probably you’d say you don’t need to think about assembly because it has been abstracted away from you. I think AI operates similarly by changing the level of abstraction you can think at.
> Do you think that all managers and tech leads atrophy because they don’t spend all day “doing”?
People have argued for years that software architects must write code.
Regarding your second paragraph: When you write python you then debug it at the level of the abstraction. You never debug the python interpreter. You can try to treat AI like an abstraction but it immediately breaks down as soon as you go to debug. It would only be a complete abstraction if you never had to deal with the generated code.
> Do you think that all managers and tech leads atrophy because they don’t spend all day “doing”?
Yes? If not 100% then a number pretty close to that. Definitely 100% of all the managers/non-coding leads I’ve worked with
As an IC turned temporary manager that went back to being IC, yes, absolutely my skills atrophied. This isn't even a programming thing, this is just a regular human thing with most, arguably all, things that you don't practice for a while.
Also I find the idea that most managers or technical leads are doing any kind of "deep thought" hilarious, but that's just maybe my apathy towards management speaking.
Managers 100% lose their abilities, their focus shifts to completely different concerns -- codebase health, enabling people, tracking velocity metrics, etc. They still understand high-level concerns, of course (if we are talking about strong technical background), but they'd struggle a lot if just dropped into the codebase.
Tech leads can exist in many variants, but usually they spend the majority of time in code, so they don't lose it. If they become too good at managing and change their priorities, they _will_ gradually drift away too.
I hear all the time from people who have moved into management that their engineering skills atrophy. The only antidote is to continue doing IC work while managing.
> Do you think that all managers and tech leads atrophy because they don’t spend all day “doing"
Yes, obviously this happens.
Do you seriously think that skills don't rust when you stop using them daily?
It's about as much time as I think about caching artifacts and branch mispredict latencies. Things I cared a lot about when I was doing assembly, but don't even think about really in Python (or C++).
My assembly has definitely rotted and I doubt I could do it again without some refreshing but it's been replaced with other higher-level skills, some which are general like using correct data structures and algorithms, and others that are more specific like knowing some pandas magic and React Flow basics.
I expect this iteration I'll get a lot better at systems design, UML, algorithm development, and other things that are slightly higher level. And probably reverse-engineering as well :) The computer engineering space is still vast IMHO....
My take is just that debugging is harder than writing so I'd rather just write it instead of debugging code I didn't write.
I think it's more like code review, which really is the worst part of coding. With AI, I'll be doing less of the fun bits (writing, debugging those super hard customer bugs), and much much more code review.
well to be fair to the argument, reviewing code that you designed.
Are people really not using LLMs to debug code?
@shredprez the website in your bio appears to sell AI-driven products: "Design anything in Claude, Cursor, or VS Code".
Consider leaving a disclaimer next time. Seems like you have a vested interest in the current half-baked generation of AI products succeeding
Conflict of interest or not, he's not really wrong. Anyone shipping code in a professional setting doesn't just push to prod after 5 people say LGTM to their vibe coded PR, as much as we like to joke around with it. There are stages of tests and people are responsible for what they submit.
As someone writing lots of research code, I do get caught being careless on occasion since none of it needs to work beyond a proof of concept, but overall being able to just write out a spec and test an idea out in minutes instead of hours or days has probably made a lot of things exist that I'd otherwise never be arsed to bother with. LLMs have improved enough in the past year that I can easily 0-shot lots of ad-hoc visualization stuff or adapters or simple simulations, filters, etc. that work on the first try and with probably fewer bugs than I'd include in the first version myself. Saves me actual days and probably a carpal tunnel operation in the future.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
I think this might simply be how the human brain works. Take autonomous driving as an example: while the car drives on its own the human driver is supposed to be alert and step in if needed. But does that work? Or will the driver's mind wander off because the car has been driving properly for the last half hour? My gut feeling is that it's inevitable that we'll eventually just shut out everything that goes smoothly and by the time it doesn't it might be too late.
We are not that different from our ancestors who used to roam the forests, trying to eat before they get eaten. In such an environment there is constantly something going on, some critters crawling, some leaves rustling, some water flowing. It would drive us crazy if we could not shut out all this regular noise. It's only when an irregularity appears that our attention must spring into action. When the leaves rustle differently than they are supposed to there is a good chance that there is some prey or a predator to be found. This mechanism only works if we are alert. The sounds of the forest are never exactly the same, so there is constant stimulation to keep up on our toes. But if you are relaxing in your shelter the tension is gone.
My fear is that AI is too good, to the point where it makes us feel like being in our shelter rather than in the forest.
> My gut feeling is that it's inevitable that we'll eventually just shut out everything that goes smoothly and by the time it doesn't it might be too late.
Yes. Productivity accelerates at an exponential rate, right up until it drives off a cliff (figuratively or literally).
> AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting
It does not! If you're using interactive IDE AI, you spend your time keeping the AI on the rails and reminding it what the original task is. If you're using agents, then you're delegating all the mid-level/tactical thinking, and perhaps even the planning, and you're left with the task of writing requirements granular enough for an intern to tackle, but this hews closer to "Business Analyst" than "Software Engineer".
From my experience, current AI models stay on the rails pretty well. I don't need to remind them of the task at hand.
Using an agentic workflow does not require you to delegate the thinking. Agents are great at taking exactly what you want to do and executing. So spend an extra few minutes and lay out the architecture YOU want, then let the AI do the work.
It's "anti-AI" from the perspective of an investor or engineering manager who assumes that 10x coding speed should 10x productivity in their organization. As a staff IC, I find it a realistic take on where AI actually sits in my workflow and how it relates to juniors.
> assumes that 10x coding speed should 10x productivity
This same error in thinking happens in relation to AI agents too. Even if the agent is perfect (not really possible) but other links in the chain are slower, the overall speed of the loop still does not increase. To increase productivity with AI you need to think of the complete loop, reorganize and optimize every link in the chain. In other words a business has to redesign itself for AI, not just apply AI on top.
The same is true for coding with AI: you can't just do your old style of manual coding but with AI, you need a new style of work. Maybe you start with constraint design, requirements and tests, then let the agent loose and don't check the code by hand; you automate that part with comprehensive automated testing. The LLM is like a blind force, you need to channel it to make it useful. LLM + constraints == accountable LLM, but LLM without constraints == unaccountable.
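As a rough sketch of what "constraints first" could look like in practice (the phone-number task and all names below are purely illustrative): write the contract by hand, let the agent produce the implementation, and let the tests decide whether it is accepted.

    # A minimal sketch of "constraints first": the tests are the hand-written
    # contract; the function body is the part you might let the agent generate.
    import re

    def normalize_phone(raw: str) -> str:
        """Candidate implementation (agent-generated in this workflow)."""
        digits = re.sub(r"\D", "", raw)
        return "+" + digits

    # Hand-written constraints the generated code must pass before it is accepted:
    def test_strips_punctuation():
        assert normalize_phone("(555) 123-4567") == "+5551234567"

    def test_idempotent():
        once = normalize_phone("+1 555 123 4567")
        assert normalize_phone(once) == once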
I’ve been trying to re-orient for this exact kind of workflow and I honestly can’t declare whether it’s working.
I’ve switched to using Rust because of the rich type system and pedantic yet helpful compiler errors. I focus on high level design, traits, important types - then I write integration tests and let Claude go to town. I’ve been experimenting with this approach on my side project (backend web services related to GIS - nothing terribly low level) for about 4 months now and I honestly don’t know if it’s any faster than just writing the code myself. I suspect it’s not or only marginally faster at best.
I often find that I end up in a place where the ai generated code just has too many issues collected over iterations and needs serious refactoring that the agent is incapable of performing satisfactorily. So I must do it myself and that work is substantially harder than it would have been had I just written everything myself in the first place.
At work - I find that I have a deep enough understanding of our codebase that the agents are mostly a net-loss outside of boilerplate.
Perhaps I’m holding it wrong but I’ve been doing this for a while now. I am extremely motivated to build a successful side project and try to bootstrap myself out of the corporate world. I read blogs and watch vlogs on how others build their workflows and I just cannot replicate these claims of huge productivity gains.
> Maybe you start with constraint design
Oh, that's a great idea. Then you get a constraint-based language and write your constraints there!
> in every industry where AI has achieved any level of mastery.
Which industries are those? What does that mastery look like?
> There's a divide between people ...
No, there is not. If one is not willing to figure out a couple of ffmpeg flags, comb through k8s controller code to see what is possible and fix that booting error in their VMs then failure in "mental experiences" is certain.
The most successful people I have met in this profession are the ones who absolutely do not tolerate magic and need to know what happens from the moment they press ON on their machine till the moment they turn it OFF again.
I never made a case against LLMs and similar ML applications in the sense that they negatively impact mental agility. The cases I made so far include, but are not limited to:
— OSS exploded on the promise that software you voluntarily contributed to remains available to benefit the public, and that a large corporation cannot simply take your work tomorrow and make it part of their product, never contributing anything back. Commercially operated LLMs threaten OSS both by laundering code and by overwhelming maintainers with massive, automatically produced patches and merge requests that are sometimes never read by a human.
— Being able to claim that any creative work is merely a product of an LLM (which is a reality now for any new artist, copywriter, etc.) removes a large motivator for humans to do fully original creative work and is detrimental to creativity and innovation.
— The ends don’t justify the means, as a general philosophical argument. Large-scale IP theft had been instrumental at the beginning of this new wave of applied ML—and it is essentially piracy, except done by the powerful and wealthy against the rest of us, and for profit rather than entertainment. (They certainly had the money to license swaths of original works for training, yet they chose to scrape and abuse the legal ambiguity due to requisite laws not yet existing.)
— The plain old practical “it will drive more and more people out of jobs”.
— Getting everybody used to the idea that LLMs now mediate access to information increases inequality (making those in control of this tech and their investors richer and more influential, while pushing the rest—most of whom are victims of the aforementioned reverse piracy—down the wealth scale and often out of jobs) more than it levels the playing field.
— Diluting what humanity is. Behaving like a human is how we manifest our humanness to others, and how we deserve humane treatment from them; after entities that walk and talk exactly like a human would, yet which we can be completely inhumane to, become commonplace, I expect over time this treatment will carry over to how humans treat each other—the differentiator has been eliminated.
— It is becoming infeasible to operate open online communities due to bot traffic that now dwarfs human traffic. (Like much of the above, this is not a point against LLMs as technology, but rather the way they have been trained and operated by large corporate/national entities—if an ordinary person wanted to self-host their own, they would simply not have the technical capability to cause disruption at this scale.)
This is just what I could recall off the top of my head.
Good points here, particularly the ends not justifying the means.
I'm curious for more thoughts on "will drive more and more people out of jobs". Isn't this the same for most advances in technology (e.g., the steam engine, computers, automated toll plazas, etc.)? In some ways, it's motivation for making progress: you get rid of mundane jobs. The dream is that you free those people to do something more meaningful, but I'm not going to be that blindly optimistic :) Still, I feel like "it's going to take jobs" is the weakest of the arguments here.
It happened before, and it was an issue back then as well.
A mundane job may be mundane (though note that this is sometimes subjective), but it earns someone their bread and butter, and it is always an economic stress when the job is gone and many people have to retrain.
If we were to believe those of us who paint this technology as mind-bogglingly world-changing, that someone is now nearly everyone and unlike the previous time there is no list of jobs you could choose from (that would last longer than the time it takes to train).
If we were not to believe the hype, still: when those jobs got automated back then, people moved to jobs that are liable to be obsolete this time, except there are also just more people overall, so even purely in terms of numbers this seems to be a bigger event.
> — The ends don’t justify the means. IP theft that lies in the beginning of this new wave of applied ML is essentially piracy
Isn't "AI coding" trained almost entirely on open source code and published documentation?
Yeah but it’s the same issue. Open source licenses (just like other laws) weren’t designed for the age of LLMs. I’m sure most people don’t care, but I bet a lot of maintainers don’t want their code fed to LLMs!
Intellectual property as a concept wasn't designed for the age of LLM. You have to add a bunch of exceptions to copyright (fair use, first sale) to get it to not immediately lead to scenarios that don't make any intuitive sense. LLMs explode these issues because now you can mechanically manipulate ideas, and this forces to light new contradictions that intellectual property causes.
I agree that commercially operated LLMs undermine the entire idea of IP, but that is one of the problems with them, not with the concept of intellectual property, which is an approximation of something that has been an organic part of human society, motivating innovation, since forever: the benefits of being an author and a degree of ownership over intangible ideas. When societies were smaller and local, it just worked out: you would earn respect and status if you came up with something cool. In a bigger and more global society that relies on the rule of law rather than informal enforcement, legal protections are needed to keep things working in roughly the same way.
I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
> IP is an approximation of what has been organically part of human society and drove innovation since forever: benefits of being an author and degree of ownership over intangible ideas.
It is not! It's a very recent invention. Especially its application to creative works contradicts thousands of years of the development of human culture. Consider folk songs.
> I doubt anyone would consider it a problem if large-scale commercial LLM operators were required to respect licenses and negotiate appropriate usage terms. Okay, maybe with one exception: their investors and shareholders.
And the issue I'm gesturing at is that you run into different, contradicting conclusions about how LLMs should interact with copyright depending on exactly which line of logic you follow, so the courts will never be able to resolve how it should work. These are issues that can only be conclusively resolved by writing new laws deciding how it is going to work, but that will eventually only make the contradictions worse and complicate the hoops that people have to jump through as the technology evolves in new ways.
Yes, open source works whose licenses constrain derivative works.
Some licenses do.
In my experience AI coding is not going to spew out a derivative of another project unless your objective is actually to build a derivative of that software. If your code doesn't do the same or look the same it doesn't really meet the criteria to be a derivative of someone else's.
I mostly use Cursor for writing test suites in Jest with TypeScript, these are so specific to my work I don't think it's possible they've infringed someone else's.
Intellectual property theft? If gp’s referring to the Books3 shadow library not having been legally bought, it’s not realistically more than 197k books worth less than $10MM. And let’s not forget intellectual property rights only exist “To promote the Progress of Science and useful Arts.”
There's certainly some debate to be had about ingesting a book about vampires and then writing a book about vampires.
But I think programming is much more "how to use the building blocks" and mathematics than ingesting narratives and themes. More like ingesting a dictionary and thesaurus and then writing a book about vampires.
What about copylefted code?
Yes. But here we are, people ignoring all the theft that has happened. People generating images from stolen art and calling themselves artists. People using it to program and calling themselves programmers. Also, it seems to me that so many people just absolutely ignore all the security-related issues that come with coding agents. It's truly a dystopia. But we are on Hacker News, so obviously people will glaze about "AI" on here.
Maybe we should get upset about people using cameras to take pictures of art on the same principles. And what about that Andy Warhol guy, what a pretender!
… so I hope you can see why I don’t actually agree with your comment about who’s allowed to be an artist, and not just dismiss me as a glazer
Who takes pictures of art and calls themselves an artist for that? People are generating images from stolen art and creating businesses off of that. People are faking being artists on social media. But I shouldn't be surprised that people with no actual talent defend all of this.
You wanna be an artist? Put in the work.
> If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting
I think this depends. I prefer the thinking bit, but it's quite difficult to think without the act of coding.
It's how white boarding or writing can help you think. Being in the code helps me think, allows me to experiment, uncover new learnings, and evolve my thinking in the process.
Though maybe we're talking about thinking of different things? Are you thinking in the sense of what a PM thinks about ? User features, user behavior, user edge cases, user metrics? Or do you mean thinking about what a developer thinks about, code clarity, code performance, code security, code modularization and ability to evolve, code testability, innovative algorithms, innovative data-structure, etc. ?
I’m struggling to understand how they are asserting one follows from the other. I’m not a SWE, but do a lot of adjacent types of work (infrastructure automation and scripting, but also electronics engineering, and I’m also a musician), and the “thinking” part where I get to deploy logic and reasoning to solve novel challenges is certainly a common feature among these activities I certainly enjoy, and I feel it’s a core component of what I’m doing.
But the result of that thinking would hardly ever align neatly with whatever an LLM is doing. The only time it wouldn’t be working against me would be drafting boilerplate and scaffolding project repos, which I could already automate with more prosaic (and infinitely more efficient) solutions.
Even if it gets most of what I had in mind correct, the context switching between “creative thinking” and “corrective thinking” would be ruinous to my workflow.
I think the best case scenario in this industry will be workers getting empowered to use the tools that they feel work best for their approach, but the current mindset that AI is going to replace entire positions, and that individual devs should be 10x-ing their productivity is both short-sighted and counterproductive in my opinion.
I like to think of the essential/accidental complexity split. The true way to solve essential complexity in a business settings is to talk with stakeholders.
Tools, libraries and platforms are accidental complexities. If you have already learned how to use them, you can avoid the pitfalls and go straight to the solution, which is why the common advice is to use boring technologies, as the solutions are widely documented and there are a lot of case studies.
If it's something new, then you can learn as you go by starting small and refactor as you're gaining more confidence. Copy-pasta or code generation is usually bad in that case. You don't know enough to judge the long-term costs.
Code is tech debt. When people talk about software engineering, it's to make sure that this debt doesn't outweigh the benefits of using the software.
I agree with your comment sentiment, but I believe that you, like many others have the cycle in wrong order. I don't fault anyone for it because it's the flow that got handed down to us from the days of waterfall development.
My strong belief after almost twenty years of professional software development is that both us and LLMs should be following the order: build, test, reflect, plan, build.
Writing out the implementation is the process of materializing the requirements, and learning the domain. Once the first version is out, you can understand the limits and boundaries of the problem and then you can plan the production system.
This is very much in line with Fred Brooks' "build one to throw away" (written ~40 years ago in "The Mythical Man-Month"; while often quoted, if you have never read the book, I urge you to do so: it's both entertaining and enlightening about our software industry), startup culture (if you remove the "move fast, break things" mantra), and governmental pilot programs (the original "minimum viable").
My approach has been to "yolo" my way through the first time, yes in a somewhat lazy and careless manner, get a working version, and then build a second time more thoughtfully.
> There's a divide between people who enjoy the physical experience of the work and people who enjoy the mental experience of the work. If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting. But if you like the doing, the typing, fiddling with knobs and configs, etc etc, all AI does is take the good part away.
I don't know... that seems like a false dichotomy to me. I think I could enjoy both but it depends on what kind of work. I did start using AI for one project recently: I do most of the thinking and planning, and for things that are enjoyable to implement I still write the majority of the code.
But for tests, build system integration, ...? Well that's usually very repetitive, low-entropy code that we've all seen a thousand times before. Usually not intellectually interesting, so why not outsource that to the AI.
And even for the planning part of a project there can be a lot of grunt work too. Haven't you had the frustrating experience of attempting a refactoring and finding out midway that it doesn't work because of some edge case? Sometimes the edge case is interesting and points to some deeper issue in the design, but sometimes not. Either way it sure would be nice to get a hint beforehand. Although in my experience AIs aren't at a stage to reason about such issues upfront --- no surprise, since it's difficult for humans too --- of course it helps if your software has an oracle for whether the attempted changes are correct, i.e. it is statically typed and/or has thorough tests.
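As a small illustration of that "oracle" idea (a sketch with hypothetical names, not from any particular project): static type annotations give a checker something to verify before any code runs, which is exactly the kind of early warning about a failed refactoring that thorough tests also provide.

    # Sketch of types as a refactoring oracle: a checker such as mypy or pyright
    # can flag every call site the moment a signature changes.
    from datetime import date

    def days_until(deadline: date) -> int:
        return (deadline - date.today()).days

    print(days_until(date(2030, 1, 1)))

    # If a refactor loosens the signature to accept `date | None`, the checker
    # immediately reports every caller that never handled None, instead of the
    # edge case surfacing midway through a manual rewrite.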
> Usually not intellectually interesting, so why not outsource that to the AI.
Because it still needs to be correct, and AI still is not producing correct code
It's like folks complaining that people don't know how to code in Assembly or Machine Language.
New-fangled compiled languages...
Or who use modern, strictly-typed languages.
New-fangled type-safe languages...
As someone that has been coding since it was wiring up NAND gates on a circuit board, I'm all for the new ways, but there will definitely be a lot of mistakes, jargon, and blind alleys; just like every other big advancement.
I'm not sure if you are insinuating that the article is an anti-AI take, but in case it wasn't clear, it's not. It is about doing just what you suggested:
> Just as tech leads don't just write code but set practices for the team, engineers now need to set practices for AI agents. That means bringing AI into every stage of the lifecycle
The technology doesn't force people to be careless, but it does make it very easy to be careless, without having to pay the costs of that carelessness until later.
> There's a divide between people who enjoy the physical experience of the work and people who enjoy the mental experience of the work.
Pretty clearly that’s not the divide anyone’s talking about, right?
Your argument should maybe be something about thinking about the details vs thinking about the higher level. (If you were to make that argument, my response would be: both are valuable and important. You can only go so far working at one level. There are certainly problems that can be solved at one level, but also ones that can’t.)
My experience is that you need the “physical” coding work to get a good intuition of the mechanics of software design, the trade-offs and pitfalls, the general design landscape, and so on. I disagree that you can cleanly separate the “mental” portion of the work. Iterating on code builds your mental models, in a way that merely reviewing code does not, or only to a much more superficial degree.
Idk I feel like even without using LLMs the job is 90% thinking and planning. And it’s nice to go the last 10% on your own to have a chance to reflect and challenge your earlier assumptions.
I actually end up using LLMs in the planning phase more often than the writing phase. Cursor is super good at finding relevant bits of code in unfamiliar projects, showing me what kind of conventions and libraries are being used, etc.
> If the thinking bit is your favorite part, AI allows you to spend nearly all of your time there if you wish, from concept through troubleshooting...
What about if the "knowing/understanding" bit is your favorite part?
Here's one
AI can only recycle the past.
Most of us do nothing but remix the past solutions.
Since we don't know what else might already exist in the world without digging very deep, we fool ourselves into thinking that we do something very original and unique.
lol, this is the entire reason LLMs work so well. The bar is so low and most folks don't seem to realize it.
And even truly novel and unique things are more often than not composites of things that have come before. We all stand on the shoulders of giants/priors.
There's plenty of novelty out there. In fact it's in infinite supply.
The giants' shoulders just give us access to useful and lucrative spaces in which to build that novel stuff.
That doesn't contradict what I said. Just that pattern detection based on priors doesn't rule out novelty if guided by a human.
Soon AI will be used for crafting government policy. That is, for plotting a course that best preserves the safety of the ruling class.
This AI will be built from its own excretions, recursively.
That might be a hilarious trap.
The first two paragraphs are so confusing. Since Claude Code became a thing my "thinking" phase has been much, much longer than before.
I honestly don't know how one can use Claude Code (or other AI agents) in a 'coding first thinking later' manner.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
What makes you regard this as an anti-AI take? To my mind, this is a very pro-AI take.
I think that the problem is, at the end of the day, the engineer must specify exactly what they want the program to do.
You can do this in Python, or you can do this in English. But at the end of the day the engineer must input the same information to get the same behavior. Maybe LLMs make this a bit more efficient, but even in English it is extremely hard to give an exact specification without ambiguity (maybe even harder than in Python in some cases).
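To make that concrete, here's a toy illustration (my own example, not from the article): "split the bill evenly between three people" sounds like a complete English spec, but it never says who absorbs the leftover cent. The code has to commit to an answer either way.

```python
# A minimal sketch of the ambiguity: the English sentence leaves the remainder
# unspecified; this particular implementation decides that the people earliest
# in the list absorb the extra cents. The function and numbers are illustrative.
def split_bill(total_cents: int, people: int) -> list[int]:
    base, remainder = divmod(total_cents, people)
    # The first `remainder` shares get one extra cent - exactly the detail the
    # English phrasing never pinned down.
    return [base + 1 if i < remainder else base for i in range(people)]

print(split_bill(1000, 3))  # [334, 333, 333]
```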
"I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless."
I'm not impressed by AI because it generates slop. Copilot can't write a thorough working test suite to save its life. I think we need a design and test paradigm to properly communicate with AI for it to build great software.
Completely agreed. Whether it be AI or otherwise, I consider anything that gives me more time to focus on figuring out the right problem to solve or iterating on possible solutions to be good.
Yet every time that someone here earnestly testifies to whatever slight but real use they’ve found of AI, an army of commentators appears ready to gaslight them into doubting themselves, always citing that study meant to have proven that any apparent usefulness of AI is an illusion.
At this point, even just considering the domain of programming, there’s more than enough testimony to the contrary. This doesn’t say anything about whether there’s an AI bubble or overhype or anything about its social function or future. But, as you note, it means these cardboard cutout critiques of AI need to at least start from where we are.
> technology forces people to be lazy/careless/thoughtless
AI isn't a technology. (No more than asking your classmate to do your homework for you is a "technology".)
Please don't conflate AI with programming tools. AI isn't a tool, it is an oracle. There's a huge fundamental gap here that cannot be bridged.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Are you genuinely saying you never saw a critique of AI on environmental impact, or how it amplifies biases, or how it widens the economic gap, or how it further concentrates power in the hands of a few, or how it facilitates the dispersion of misinformation and surveillance, directly helping despots erode civil liberties? Or, or, or…
You don’t have to agree with any of those. You don’t even have to understand them. But to imply anti-AI arguments “hinge on the idea that technology forces people to be lazy/careless/thoughtless” is at best misinformed.
Go grab whatever your favourite LLM is and type “critiques of AI”. You’ll get your takes.
I'm not an AI zealot but I think some of these are overblown.
The energy-cost argument is nonsensical unless you pin down a value-out vs value-in ratio, and some would argue the output is highly valuable and the input cost is priced in.
I don't know if it will end up being a concentrated power. It seems like local/open LLMs will still be in the same ballpark. Despite the absurd amounts of money spent so far the moats don't seem that deep.
Baking in bias is a huge problem.
The genie is out of the bottle as far as people using it for bad. Your own usage won't change that.
The moats are incredibly deep, because the established players are being propped up by VC money. Without that VC money, it's impossible to compete, unless you have a way to sustain losses for an indefinite amount of time.
I’ll say it again:
> You don’t have to agree with any of those. You don’t even have to understand them. But to imply anti-AI arguments “hinge on the idea that technology forces people to be lazy/careless/thoughtless” is at best misinformed.
We can certainly discuss some of those points, but that’s not what is in question here. The OP is suggesting there is only one type of anti-AI argument they are familiar with and that they’d “love” to see something different. But I have to question how true that is considering the myriad of different arguments that exist and how easy they are to find.
>I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
It’s not force but simply human nature. We invent tools to do less. That’s the whole point of tools.
> I would love to see an anti-AI take that doesn't hinge on the idea that technology forces people to be lazy/careless/thoughtless.
Here's a couple points which are related to each other:
1) LLMs are statistical models of text (code being text). They can only exist because huge for-profit companies ingested a lot of code under proprietary, permissive and copyleft licenses, most of which at the very least require attribution, some reserve rights of the authors, some give extra rights to users.
LLM training mixes and repurposes the work of human authors in a way which gives them plausible deniability against any single author, yet the output is clearly only possible because of the input. If you trained an LLM on only google's source code, you'd be sued by google and it would almost certainly reproduce snippets which can be tracked down to google's code. But by taking way, way more input data, the blender cuts them into such fine pieces that the source is undetectable, yet the output is clearly still based on the labor of other people who have not been paid.
Hell, GPT3 still produced verbatim snippets of inverse square root and probably other well known but licensed code. And github has a checkbox which scans for verbatim matches so you don't accidentally infringe copyright by using copilot in a way which is provable. Which means they take extra care to make it unprovable.
If I "write a book" by taking an existing book but replacing every word with a synonym, it's still plagiarism and copyright infringement. It doesn't matter if the mechanical transformation is way more sophisticated, the same rules should apply.
2) There's no opt out. I stopped writing open source over a year ago when it became clear all my code is unpaid labor for people who are much richer than me and are becoming richer at a pace I can't match through productive work because they own assets which give them passive income. And there's no license I can apply which will stop this. I am not alone. As someone said, "Open-Source has turned into a form of unpaid internship"[0]. It might lead to a complete death of open source because nobody will want to see their work fed into a money printing machine (subscription based LLM services) and get nothing in return for their work.
> But if you like the doing, the typing, fiddling with knobs and configs, etc etc, all AI does is take the good part away.
I see quite the opposite. For me, what makes programming fun is deeply understanding a problem and coming up with a correct, easy to understand, elegant solution. But most problems a working programmer has are just variations of what other programmers have had. The remaining work is prompting the LLMs in the right way so that they produce this (describing the problem instead of thinking about its solutions) and debugging the bugs the LLMs generated.
A colleague vibe coded a small utility. It's useful but it's broken in so many ways: the UI falls apart when some text gets too long, labels are slightly incorrect and misleading, some text fields handle decimal numbers in weird ways, etc. With manually written code, a programmer would get these right the first time. Potential bugs become obvious as you're writing the code because you are thinking about it. But they do not occur to someone prompting an LLM. Now I can either fix them manually, which is time consuming and boring, or I can try prompting an LLM about every single one, which is less time consuming but more boring and likely to break something else.
Most importantly, using an LLM does not give me deeper understanding of the problem or the solution, it keeps knowledge locked in a black box.
[0]: https://aria.dog/barks/forklift-certified-license/
Strongly agree with this
OK: AI is slow when using said loop. AI is like poker. You bet with time. 60 seconds to type a prompt and generate a response. Oh, it is wrong? OK, let's gamble another 60 seconds...
At least when doing stuff the old way you learn something if you waste time.
That said AI is useful enough and some poker games are +EV.
So this is more of a caution-AI take than an anti-AI one. It is more an anti-vibe-koolaid take.
This depends entirely on how you use said AI. You can have it read code, explain why it was done this or that way, and once it has the context you ask it to think about implementing feature X. There is almost no gambling involved there, at best the level of frustration you would have with a colleague. If you start from a blank context and tell it to implement a full app, you are purely just gambling.
> You can have it read code, explain why it was done this or that way,
The thing is that, once you're experienced enough, it's faster to just glance at the code and have the answer right, instead of playing the guessing game with AI.
> and once it has the context you ask it to think about implementing feature X
I'm always amazed at someone using that methodology. When I think about a feature, the first step is to understand the domain, the second is to figure out which state I'm likely to start from and where all the data are. If you don't get these two steps right, what you'll have is a buggy/incomplete implementation. And if you do get these steps right, the implementation is likely trivial.
I'm not sure where the misunderstanding is, but your second paragraph is exactly why I ask the AI the questions you mention in your first paragraph. I ask the AI to do the domain research, see what we are starting from, and THEN ask it to think about a feature. Those questions are not really for me, they are for the AI, so it has good context on what we are working on. As you said, the implementation is then almost trivial and the AI is less likely to mess it up.
The thing is, the domain is often more difficult than the actual implementation. And often only a subset matters (different for each task). So I’m wondering if teaching the AI the correct subdomain is indeed faster than just coding the solution.
Also, trivial work can benefit the coder. Like a light jog between full sprints for your brain. Reviewing code can be more taxing than writing it, as you need to retrieve the full context at once instead of in incremental steps.
I suspect the root of the disagreement is more about what kinds of work people do. There are many different kinds of programming and you can’t lump them all together. We shouldn’t expect an AI tool to be a good fit for all of them, any more than we should expect Ruby to be a good fit for embedded development or C to be a good fit for web apps.
My experience with low level systems programming is that it’s like working with a developer who is tremendously enthusiastic but has little skill and little understanding of what they do or don’t understand. Time I would have spent writing code is replaced by time spent picking through code that looks superficially good but is often missing key concepts. That may count as “thinking” but I wouldn’t categorize it as the good kind.
Where it excels for me is as a superpowered search (asking it to find places where we play a particular bit-packing game with a particular type of pointer works great and saves a lot of time) and for writing one-off helper scripts. I haven’t found it useful for writing code I’m going to ship, but for stuff that won’t ship it can be a big help.
It’s kind of like an excavator. If you need to move a bunch of dirt from A to B then it’s great. If you need to move a small amount of dirt around buried power lines and water mains, it’s going to cause more trouble than it’s worth.
I think this is one of the most cogent takes on the topic that I've seen. Thanks for the good read!
It's also been my experience that AI will speed up the easy / menial stuff. But that's just not the stuff that takes up most of my time in the first place.
The last paragraph feels more wrong the more I think about it.
Imagine an AI as smart as some of the smartest humans, able to do everything they intellectually do but much faster, cheaper, 24/7 and in parallel.
Why would you spend any time thinking? All you'll be doing is the things an AI can't do - 1) feeding it input from the real world and 2) trying out its output in the real world.
1) Could be finding customers, asking them to describe their problem, arranging meetings, driving to the customer's factory to measure stuff and take photos for the AI, etc.
2) Could be assembling the prototype, soldering, driving it to the customer's factory, signing off the invoice, etc.
None of that is what I as a programmer / engineer enjoy.
If actual human-level AI arrives, it'll do everything from concept to troubleshooting, except the parts where it needs presence in the physical world and human dexterity.
If actual human-level AI arrives, we'll become interfaces.
"AI" does not encourage real thinking. "AI" encourages hand waving grand plans that don't work, CEO style. All pro-"AI" posts focus on procedures and methodologies, which is just LARPing thinking.
Using "AI" is just like speed reading a math book without ever doing single exercise. The proponents rarely have any serious public code bases.
A surprising conclusion to me at least is that a lot of programmers simply don’t like to write code.
Well, yeah. I like getting computers to automate things and solve problems. Typing in boilerplate and syntax is just a means to that end, and not even remotely the most interesting part. I don't like managing my own memory either, so I prefer to use tools that handle garbage collection for me.
I mean, I guess when I was really early in my career I'd get a kick out of writing a clever loop or whatever, and I drank deep from all the low level coding wisdom that was available, but the scope of what I care about these days has long since expanded outward.
> "AI" encourages hand waving grand plans that don't work
You described the current AI Bubble.
I see a lot of comments like this and it reflects strongly negatively on the engineers who write it, imho. As in, I've been a staff-level engineer at both Meta and Google and a lead at various startups in my time. I post open source projects here on HN from time to time that are appreciated. I know my shit. If someone tells me that LLMs aren't useful, I think to myself "wow, this person is so unable to learn new tools they can't find value in one of the biggest changes happening today".
That's not to say that LLMs are as good as some of the more outrageous claims. You do still need to do a lot of work to implement code. But if you're not finding value at all it honestly reflects badly on you and your ability to use tools.
The craziest thing is I see the above type of comment on LinkedIn regularly. Which is jaw dropping. Prospective hiring managers will read it and think "Wow, you think advertising a lack of knowledge is helpful to your career?" Big tech co's are literally firing people with attitudes like the above. There's no room for people who refuse to adapt.
I put absolute LLM negativity right up there with comments like "i never use a debugger and just use printf statements". To me it just screams you never learnt the tool.
> I put absolute LLM negativity right up there with comments like "i never use a debugger and just use printf statements". To me it just screams you never learnt the tool.
To me it just feels different. Learning to use a debugger made me feel more powerful and "in control" (even though I still use a lot of print debugging; every tool has its place). Using AI assisted coding makes me feel like a manager who has to micro-manage a noob - it's exhausting.
It’s exhausting because most of us like to sit down, open an IDE and start coding with the belief that ambiguous or incomplete aspects will be solved as they come up. The idea of writing out the spec of a feature without ambiguity, handling error states, etc., and stopping to ask if the spec is clear is boring and not fun.
To many of us coding is simply more fun. At the same time, many of us could benefit from that exercise, with or without the LLM.
For pet projects, it might be less fun. For real projects, having to actually think about what I'm trying to do has been a net positive, LLM or no LLM.
Agree. I've never had the attention span to learn to code, but I utilize LLMs heavily and have recently started managing my first large coding project with CC, to what seems like good results.
As LLMs get better, more and more people will be able to create projects with only rudimentary language understanding. I don't think LLMs can ever be as good as some of the outrageous claims; it's a lot like that 3rd grade project kids do on writing instructions for making a PB&J. LLMs cannot read minds and will only follow the prompt given to them. What I'm trying to say is that eventually there will be a time when being able to manage coding agents efficiently will be more externally valuable than knowing how to write code.
This isn't to say that engineering experience is not valuable. Having a deep understanding of how to design and build secure and efficient software is a huge moat between experienced engineers and vibecoders like me, and not learning how to best use the tools that are quickly changing how the world operates will leave them behind.
Why would you point out two tool obsessed companies as something positive? Meta and Google are overstaffed and produce all sorts of tools that people have to use because someone's performance evaluation depends on it.
The open source code of these companies is also not that great and definitely not bug free. Perhaps these companies should do more thinking and less tooling politics.
> That's not to say that LLMs are as good as some of the more outrageous claims. You do still need to do a lot of work to implement code. But if you're not finding value at all it honestly reflects badly on you and your ability to use tools.
You are in a forum full of people that routinely claim that vibe coding is the future, that LLMs already can fully replace engineers, and if you don't think so you are just a naysayer that is doing it wrong.
Rephrasing your claim: LLMs are just moderately useful, far from being the future-defining technology the people invested in it want it to be. But you choose to rally against people not interested in marketing it further.
Given the credentials you decided to share, I find it unsurprising.
Alternatively - there's 5 million other things I could be learning and practicing to improve as a programmer before trying out the new AI codegen-du-jour. Until I'm Fabrice Bellard, focusing on my fundamental skills will make me a better programmer, faster, than focusing on the hype of the day.
Ah yes, the hallmarks of top talent:
Violent insecurity and authoritarianism... definitely not compensating for anything there.
Most of my anti-AI takes are either:
1) Bad actors using AI at scale to do bad things
2) AI just commodifying everything and making humans into zoo animals
My anti AI take is that it's no fun.
I'm on a small personal project with it intentionally off, and I honestly feel I'm moving through it faster and certainly having a better time. I also have a much better feel for the code.
These are all just vibes, in the parlance of our times, but it's making me question why I'm bothering with LLM assisted coding.
Velocity is rarely the thing in my niche, and I'm not convinced babysitting an agent is all in all faster. It's certainly a lot less enjoyable, and that matters, right?
More specifically for (1), the combined set of predators, advertisers, businesses, and lazy people using it to either prey or enshittify or cheat will make up the vast majority of use cases.
But notice how I got automatically heavily downvoted by merely mentioning legitimate downsides of AI
Yeah you're on a forum where lots of people hope to profit from AI, it's probably unavoidable.
It's a fine post, but two canards in here:
First, skilled engineers using LLMs to code also think and discuss and stare off into space before the source code starts getting laid down. In fact: I do a lot, lot more thinking and balancing different designs and getting a macro sense of where I'm going, because that's usually what it takes to get an LLM agent to build something decent. But now that pondering and planning gets recorded and distilled into a design document, something I definitely didn't have the discipline to deliver dependably before LLM agents.
Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
Second, this idea that LLMs are like junior developers that can't learn anything. First, no they're not. Early-career developers are human beings. LLMs are tools. But the more general argument here is that there's compounding value to working with an early-career developer and there isn't with an LLM. That seems false: the LLM may not be learning anything, but I am. I use these tools much more effectively now than I did 3 months ago. I think we're in the very early stages of figuring how to get good product out of them. That's obvious compounding value.
> the LLM may not be learning anything, but I am
Regardless of that, personally I'd really like it if they could actually learn from my interactions with them. From a user's perspective, what I'd like to do is to be able to "save" the discussion/session/chat/whatever, with everything the LLM learned so far, to a file. Then later be able to restore it and have the LLM "relearn" whatever is in it. Now, you can already do this with various frontend UIs, but the important part of what I'd want is that a) this "relearn" should not affect the current context window (TBH I'd like that entire concept to be gone, but that is another aspect) and b) it should not be some sort of lossy relearning that loses information.
There are some solutions, but they are all band-aids over fundamental issues. For example you can occasionally summarize whatever was discussed so far and restart the discussion. But obviously that is just some sort of lossy memory compression (I do not care that humans can do the same, LLMs are software running on computers, not humans). Or you could use some sort of RAG, but AFAIK this works via "prompt triggering" - i.e. only via your "current" interaction, so even if the knowledge is in there but whatever you are doing now wouldn't trigger its index, the LLM will be oblivious to it.
What I want is, e.g., if I tell the LLM that there is some function `foo` used to barfize moo objects, then go on and tell it other stuff way beyond whatever context length it has, save the discussion or whatever, restore it the next day, go on and tell it other stuff, then ask it about joining splarfers, it should be able to tell me that I can join splarfers by converting them to barfized moo objects, even if I haven't mentioned anything about moo objects or barfization since my previous session yesterday.
(Also, as a sidenote, this sort of memory save/load should be explicit, since I'd want to be able to start from a clean slate - but this sort of clean slate should be because I want to, not as a workaround to the technology's limitations.)
You want something that requires an engineering breakthrough.
Models don't have memory, and they don't have understanding or intelligence beyond what they learned in training.
You give them some text (as context), and they predict what should come after (as the answer).
They’re trained to predict over some context size, and what makes them good is that they learn to model relationships across that context in many dimensions. A word in the middle can affect the probability of a word at the end.
If you insanely scale the training and inference to handle massive contexts, which is currently far too expensive, you run into another problem: the model can’t reliably tell which parts of that huge context are relevant. Irrelevant or weakly related tokens dilute the signal and bias it in the wrong direction; the distribution flattens or just ends up in the wrong place.
That's why you have to make sure you give it relevant, carefully attended-to context, aka context engineering.
It won't be able to look at a 100kloc code base and figure out what's relevant to the problem at hand, and what is irrelevant. You have to do that part yourself.
Or, what some people do is try to automate that part a little as well by using another model to go research and build that context. That's where the research->plan->build loop people talk about comes in. And it's best to keep to small tasks, otherwise the context needed for a big task will be too big.
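As a rough sketch of what that hand-curation can look like (the keyword filter, file glob and 4-chars-per-token heuristic below are illustrative assumptions, not how any particular tool does it):

```python
# Gather only files that look relevant to the task, stop at a rough token
# budget, and prepend the result to the actual prompt.
from pathlib import Path

def gather_context(repo: Path, keywords: list[str], token_budget: int = 8000) -> str:
    chunks, used = [], 0
    for path in sorted(repo.rglob("*.py")):
        text = path.read_text(errors="ignore")
        if not any(k in text for k in keywords):
            continue  # skip files with no obvious relevance to the task
        est_tokens = len(text) // 4  # crude heuristic: ~4 characters per token
        if used + est_tokens > token_budget:
            break
        chunks.append(f"# file: {path}\n{text}")
        used += est_tokens
    return "\n\n".join(chunks)

# The curated context then gets prepended to the actual task prompt, e.g.:
# prompt = gather_context(Path("."), ["auth", "session"]) + "\n\nTask: ..."
```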
> You want something that requires an engineering breakthrough.
Basically, yes. I know the way LLMs currently work wouldn't be able to provide what i want, but what i want is a different way that does :-P (perhaps not even using LLMs).
I'm using a "memory" MCP server which basically just stores facts to a big json file and makes a search available. There's a directive in my system prompt that tells the LLM to store facts and search for them when it starts up.
It seems to work quite well and I'll often be pleasantly surprised when Claude retrieves some useful background I've stored, and seems to magically "know what I'm talking about".
Not perfect by any means and I think what you're describing is maybe a little more fundamental than bolting on a janky database to the model - but it does seem better than nothing.
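For the curious, the general shape of such a bolt-on store is roughly the following - a guess at the idea, not the actual MCP server's code:

```python
# Append facts to a JSON file and do a naive keyword search over them; the
# system prompt then tells the model to store facts and search for them.
import json
from pathlib import Path

STORE = Path("memory.json")

def remember(fact: str) -> None:
    facts = json.loads(STORE.read_text()) if STORE.exists() else []
    facts.append(fact)
    STORE.write_text(json.dumps(facts, indent=2))

def recall(query: str) -> list[str]:
    if not STORE.exists():
        return []
    facts = json.loads(STORE.read_text())
    words = query.lower().split()
    # Naive relevance: keep facts sharing at least one word with the query.
    return [f for f in facts if any(w in f.lower() for w in words)]

remember("We deploy from the release branch, never from main.")
print(recall("which branch do we deploy from?"))
```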
>First, skilled engineers using LLMs to code also think and discuss and stare off into space before the source code starts getting laid down
Yes, and the thinking time is a significant part of overall software delivery, which is why accelerating the coding part doesn't dramatically change overall productivity or labor requirements.
This logic doesn't even cohere. Thinking is a significant part of software delivery. So is getting actual code to work.
Which part of the third chart do you disagree with?
Here we're really arguing about his first chart, which I agree with, and you do not.
if you're spending anywhere near as many engineering hours "getting code to work" as you're spending "thinking" then something is wrong in your process
Take that up with the author of this article!
ideally there is an order of magnitude difference between the two, and the latter is trivially delegable (where the former is not)
No there isn't.
This harkens back to the waterfall vs agile debates. Ideally there would be a plan of all of the architecture with all the pitfalls found out before any code is laid out.
In practice this can’t happen because 30 minutes into coding you will find something that nobody thought about.
In the micro, sure. In the macro, if you are finding architecture problems after 30 minutes, then I’m afraid you aren’t really doing architecture planning up front.
Oftentimes, micro pitfalls are ill omens that some bigger issue is afoot.
Depends on what you're building. If it's another crud app sure, but if its something remotely novel you just can't understand the landscape without walking through it at least once.
I have not profiled how much time I spend just coding at work, but it's not the biggest time sink.
If their job is basically to generate code to close jira tickets I can see the appeal of LLMs.
> DO NOT WRITE ANY CODE YET.
haha I always do that. I think it's a good way to have some control and understand what it is doing before the regurgitation. I don't like to write code but I love the problem solving/logic/integrations part.
I'm surprised (or maybe just ignorant) that Claude doesn't have an explicit setting for this, because it definitely tends to jump the gun a lot.
Plan mode (shift-tab twice) might be what you want.
See, I called it! Ignorance it is!
Is that not exactly plan mode?
> Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
Copilot has Ask mode, and GPT-5 Codex has Plan/Chat mode for this specific task. They won't change any files. I've been using Codex for a couple of days and it's very good if you give it plenty of guidance.
> Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
I like asking for the plan of action first - what it thinks it should do before actually doing any edits/file touching.
I’ve also had success writing documentation ahead of time (keeping these in a separate repo as docs), and then referencing it at various stages. The doc will have quasi-code examples of various features, and then I can have models stubbed in one pass, failing tests in the next, etc.
But there’s a guiding light that both the LLM and I can reference.
Sometimes I wonder if pseudocode could be better for prompting than expressive human language, because it can follow a structure and be expressive but constrained -- have you seen research on this and whether it is an effective technique?
Self-disciplined humans are few and far between; that seems to be the point of most of these anti-AI articles, and I tend to agree with them.
> figuring how to get good product out of them
What have you figured out so far, apart from explicit up-front design?
> Most of my initial prompts to agents start with "DO NOT WRITE ANY CODE YET."
I really like that on IntelliJ I have to approve all changes, so this prompt is unnecessary.
There's a YOLO mode that just changes shit without approval, that I never use. I wonder if anyone does.
I use YOLO mode all the time with Claude Code. Start on a new branch, put it in plan mode (shift + tab twice), get a solid plan broken up in logical steps, then tell it to execute that plan and commit in sensible steps. I run that last part in "YOLO mode" with commit and test commands white listed.
This makes it move with much less scattered interactions from me, which allows focus time on other tasks. And the committing parts make it easier for me to review what it did just like I would review a feature branch created by a junior colleague.
If it's done and tests pass I'll create a pull request (assigned to myself) from the feature branch. Then I review it thoroughly; this really requires discipline. And then I let Claude fetch the pull request comments from the GitHub API and fix them. Again as a longer run that allows me to do other things.
YOLO-mode is helpful for me, because it allows Claude to run for 30 minutes with no oversight which allows me to have a meeting or work on something else. If it requires input or approval every 2 minutes you're not async but essentially spending all your time watching it run.
It's more about having the LLM give you a plan of what it wants to do and how it wants to do it, rather than code. Then you can mold the plan to fit what you really want. Then you ask it to actually start writing code.
Even Claude Code lets you approve each change, but it's already writing code according to a plan that you reviewed and approved.
> LLMs are tools
With tools you know ahead of time that they will do the job you expect them to do with very high probability, or fail (with low probability) in some obvious way. With LLMs, there are few tasks you can trust them to do, and you also don't know their failure mode. They can fail yet report success. They work like neither humans nor tools.
An LLM behaves like a highly buggy compiler that too frequently reports success while emitting incorrect code. Not knowing where the bugs are, the only thing you can try to do is write the program in some equivalent way but with different syntax, hoping you won't trigger a bug. That is not a tool programmers often use. Learning to work with such a compiler is a skill, but it's unclear how transferable or lasting that skill is.
If LLMs advance as significantly and as quickly as some believe they will, it may be better to just wait for the buggy compiler to be fixed (or largely fixed). Presumably, much less skill will be required to achieve the same result that requires more skill today.
I think what the article gets at, but doesn't quite deliver on, is similar to this great take from Casey Muratori [1] about how programming with a learning-based mindset means that AI is inherently not useful to you.
I personally find AI code gen most useful for one-off throwaway code where I have zero intent to learn. I imagine this means that the opposite end of the spectrum where learning is maximized is one where the AI doesn't generate any code for me.
I'm sure there are some people for which the "AI-Driven Engineering" approach would be beneficial, but at least for me I find that replacing those AI coding blocks with just writing the code myself is much more enjoyable, and thus more sustainable to actually delivering something at the end.
[1] https://youtu.be/apREl0KmTdQ?t=4751 (relevant section is about 5 minutes long)
"learning is maximized is one where the AI doesn't generate any code for me"
Obviously you have to work to learn, but to me this is a bit like saying learning is maximized when you never talk to anyone or ask for help — too strong.
I spend more time thinking now that I use Claude Code. I write features that are often 400-600 word descriptions of what I want—-something I never would’ve done beforehand.
That thinking does result in some big tradeoffs…I generally get better results faster but I also have a less complete understanding of my code.
But the notion that Claude Code means an experienced developer spends less thinking carefully is simply wrong. It’s possible (even likely) that a lot of people are using agents poorly…but that isn’t necessarily the agent’s fault.
What these articles miss:
1) not all coding is the same. You might be working on a production system. I might need a proof of concept
2) not everyone's use of the coding agents is the same
3) developer time, especially good developer time has a cost too
I would like to see an article that frames the tradeoffs of AI assisted coding. Specifically without assigning value judgments (ie goodness or badness). Really hard when your identity is built around writing code.
This article explicitly mentions your first point.
The article seems to cherry-pick Microsoft marketing to claim 10% gains, when the Harvard-run study from earlier this year showed a 10% slowdown. I'd rather articles point first at the research being done that's agnostic of any corporation trying to push its propaganda, if possible.
Does anyone know how to make the diagrams and charts in this blog post? I liked the style.
Looks like it was done in https://excalidraw.com
Every day I think to myself: "Just fake it like you want to be here for 30 more years and then you can retire."
I have been working in machine-learning for 10 years. I am tired of the computer. I am tired of working. I just want to lay in the grass.
I’ve been grassing for 7 months. That gets old too.
It sounds like you need a sabbatical.
Every time I read stuff like this I honestly wonder if the author is using the same tools I am.
I can have Claude Code bang out everything from boilerplate to a working prototype to a complex algorithm embedded in a very complex and confusing code base. It’s not correct 100% of the time but it’s pretty damn close. And often times it comes up with algorithms I would have never thought of initially.
These things are at least a 10x multiple of my time.
The difficulty is we skeptics have read claims like yours tens of times, and our response is always, "please share a repo built this way and an example of your prompts," and I at least have never seen anyone do so.
I'd love for what you say to be possible. Comments like yours often cause me to take another crack at agentic workflows. I'm disappointed every time.
Popular != true. Galileo found that out by spending the last ~decade of his life under house arrest. Thankfully today we mostly just get downvoted.
> lack in-depth knowledge of your business, codebase, or roadmap
So give them some context. I like Cline's memory bank approach https://docs.cline.bot/prompting/cline-memory-bank which includes the architecture, progress, road map etc. Some of my more complex projects use 30k tokens just on this, with the memory bank built from existing docs and stuff I told the model along the way. Too much context can make models worse but overall it's a fair tradeoff - it maintains my coding style and architecture decisions pretty well.
I also recommend in each session using Plan mode to get to a design you are happy with before generating any code.
Not in my experience. I still spend much of the time thinking before prompting. Then I spend time reviewing the AI written code before using it. Does not feel like a trap. It mostly feels like having a super experienced pair programmer. I may be using it differently than others since I do not have it integrated to my IDE. I use it like I used google + stackoverflow before it.
To be fair you can’t really appreciate if you’ve been trapped unless you test yourself without the ai agent for some time and see if there is a difference in output. If there is, you’ve been trapped.
I think the post, while extensive, missed one important issue.
The fact that when we read others' code, we don't remember/integrate it into our thinking as well as we do when we're the authors. So mentoring "AI Juniors" provides less growth than doing the job, esp. if it is mostly corrective actions.
The article assumes that AI coding is thoughtless, but prompting is writing, and writing is thinking. If you approach AI coding the same way as regular programming, your thinking phase involves crafting a prompt that describes your thoughts.
I appreciate these takes, but I can't help to think this is just the weird interim time where nothing is quite good enough, but in a year an article like this would clearly be "overthinking" the problem.
Like when those in the know could clearly see the internet's path to consuming everything but it just hasn't happened yet so there were countless articles trying to decide if it was a fad or not, a waste of money, bad investment, etc.
Jumping straight into coding is a very junior thing to do.
Using Plan mode in Cline or other agent based workflows is day and night in the outputs.
Alas, at least in Cline it seems plan mode doesn’t read files, it just works off context, which is insane to me and hinders its usefulness. Anyone know why that happens?
> Using Plan mode in Cline or other agent based workflows is day and night in the outputs.
Agreed. My tool agnostic workflow is to work up a planning/spec doc in the docs/feature-plans folder. I use one chat thread to make that. First it creates the basic plan, then then we pick it apart together, I manually fix bad assumptions, then in a new chat, we implement.
Before and after implementation, I run my /gilfoyle command for a constructive roast, then my /sec command for a thorough security review. After implementing this, and a bit more, the final LLM output quality is much higher.
edit: adding "make sure we are applying known patterns used in our app for our solution, don't reinvent the wheel." helped a ton. I am on mobile atm, and that might not be the exact wording.
When it tells me that I need to switch to Act mode for it to read files and create a detailed plan, I just chide it gently and ask it to read the damn files in plan mode. Works every time. I wish I didn't have to do that.
Cline plan mode doesn't tend to read files by default but you can tell it 'read all files necessary to establish a detailed plan'. GPT5 also seems more eager to read files.
You can just add the files to the prompt with the @ sign.
The "thinking & coding" vs "thinking & fixing" graph is interesting. I've found this to be the case recently as I've been trying out Codex. I expected to spend a lot of time fixing the AI's code. Weirdly that has led to me spending a long time fixing issues which turn out to be nothing to do with the code.
Most recently I was struggling to get an authentication setup working. I spent at least an hour combing through the code looking for the mistake. The issue turned out to be that the VM I was working on had a broken ipv6 configuration.
Broadly the critique is valid where it applies; I don’t know if it accurately captures the way most people are using LLMs to code, so I don’t know that it applies in most cases.
My one concrete pushback to the article is that it states the inevitable end result of vibe coding is a messy unmaintainable codebase. This is empirically not true. At this point I have many vibecoded projects that are quite complex but work perfectly. Most of these are for my private use but two of them serve in a live production context. It goes without saying that not only do these projects work, but they were accomplished 100x faster than I could have done by hand.
Do I also have vibecoded projects that went off the rails? Of course. I had to build those to learn where the edges of the model’s capabilities are, and what its failure modes are, so I can compensate. Vibecoding a good codebase is a skill. I know how to vibecode a good, maintainable codebase. Perhaps this violates your definition of vibecoding; my definition is that I almost never need to actually look at the code. I am just serving as a very hands-on manager. (Though I can look at the code if I need to - I have 20 years of coding experience. But if I find that I need to look at the code, something has already gone badly wrong.)
Relevant anecdote: A couple of years ago I had a friend who was incredibly skilled at getting image models to do things that serious people asserted image models definitely couldn’t do at the time. At that time there were no image models that could get consistent text to appear in the image, but my friend could always get exactly the text you wanted. His prompts were themselves incredible works of art and engineering, directly grabbing hold of the fundamental control knobs of the model that most users are fumbling at.
Here’s the thing: any one of us can now make an image that is better than anything he was making at the time. Better compositionality, better understanding of intent, better text accuracy. We do this out of the box and without any attention paid to prompting voodoo at all. The models simply got that much better.
In a year or two, my carefully cultivated expertise around vibecoding will be irrelevant. You will get results like mine by just telling the model what you want. I assert this with high confidence. This is not disappointing to me, because I will be taking full advantage of the bleeding edge of capabilities throughout that period of time. Much like my friend, I don’t want to be good at managing AIs, I want to realize my vision.
100x is such a crazy claim to me - you’re saying you can do in 4 days what would have previously taken over a year. 5 weeks and you can accomplish what would have taken you a decade without LLMs.
Interesting point.
In most cases I would never have undertaken those projects at all without AI. One of the projects that is currently live and making me money took about 1 working day with Claude Code. It’s not something I ever would have started without Claude Code, because I know I wouldn’t have the time for it. I have built websites of similar complexity in the past, and since they were free-time type endeavors, they never quite crossed the finish line into commerciality even after several years of on-again-off-again work. So how do you account for that with a time multiplier? 100x? Infinite speedup? The counterfactual is a world where the product doesn’t exist at all.
This is where most of the “speedup” happens. It’s more a speedup in overall effectiveness than raw “coding speed.” Another example is a web API for which I was able to very quickly release comprehensive client side SDKs in multiple languages. This is exactly the kind of deterministic boilerplate work LLMs are ideal for, and that would take a human a lot of typing, and looking up details for unfamiliar languages. How long would it have taken me to write SDKs in all those languages by hand? I don’t really know, I simply wouldn’t have done it, I would have just done one SDK in Python and said good enough.
If you really twist my arm and ask me to estimate the speedup on some task that I would have done either way, then yeah I still think a 100x speedup is the right order of magnitude, if we’re talking about Claude Code with Opus 4.1 specifically. In the past I spent about five years very carefully building a suite of tools for managing my simulation work and serving as a pre/post-processor. Obviously this wasn’t full-time work on the code itself, but the development progressed across that timeframe. I recently threw all that out and replaced it with stuff I rebuilt in about a week with AI. In this case I was leveraging a lot of the learnings I gleaned from the first time I built it, so it’s not a fair one-to-one comparison, but you’re really never going to see a pure natural experiment for this sort of thing.
I think most people are in a professional position where they are sort of externally rate limited. They can’t imagine being 100x more effective. There would be no point to it. In many cases they already sit around doing nothing all day, because they are waiting for other people or processes. I’m lucky to not be in such a position. There’s always somewhere I can apply energy and see results, and so AI acts as an increasingly dramatic multiplier. This is a subtle but crucial point: if you never try to use AI in a way that would even hypothetically result in a big productivity multiplier (doing things you wouldn’t have otherwise done, doing a much more thorough job on the things you need to do, and trying to intentionally speed up your work on core tasks) then you can’t possibly know what the speedup factor is. People end up sounding like a medieval peasant suddenly getting access to a motorcycle and complaining that it doesn’t get them to the market faster, and then you find out that they never actually ride it.
I wonder, have you sat down and tried to vibecode something with Claude Code? If so, what kind of multiplier would you find plausible?
I had the AI implement two parallel implementations of the same thing in one project. Was lots of fun when it was 'fixing' the one that wasn't being used. So yeah, it can definitely muck up your codebase.
Hah today I discovered Claude Code has been copy/pasting gigantic blocks of conditions and styling every time I ask it to add a "--new" flag or whatever in a once tiny now gigantic script I've been adding features to.
It worked fine until recently where I will ask it to tweak some behavior of a command with a flag and it does a diff with like hundreds of lines. So now it's struggling to catch every place it needs to change some hardcoded duplicate values it decided to copy/paste into two dozen random places in the code.
To be fair it is doing a decent job unfucking it now that I noticed and started explicitly showing it how ridiculously cumbersome and unmaintainable it made things with specific examples and refactoring. But if I hadn't bothered to finally sit down and read through it thoroughly it would have just become more broken and inconsistent as it grew exponentially.
Yeah, it will definitely do dumb stuff if you don’t keep an eye on it and intervene if you see the signs that it’s heading in the wrong direction. But it’s very good at course correcting, and if you end up in a truly disastrous state you can almost always fix it by reverting to the last working commit and starting a fresh context.
I think this would benefit from examples of including coding assistants in the stages enumerated; how can the agent be included in each stage? I've seen posts about successful collaboration with agents at say Google, where there is tons of upfront work among humans to agree on design, then work with the agent to build out parts of the project and ensuring thorough test suites are included.
Does including an agent at each stage of this cycle mean "context engineering"? Is this then just more text and assets to feed in at each stage of LLM usage to provide the context for the next set of tokens to generate for the next stage of the cycle? Is there something deeper that can be done to encode this level of staged development into the agent's weights/"understanding"? Is there an established process for this yet?
- Specification
- Documentation
- Modular Design
- Test-Driven Development
- Coding Standard
- Monitoring & Introspection
the flaw in the article is acting like engineers always have a choice. the writer presents the contrast of "fair delegation vs mollycoddling" mirroring "ai-driven development vs vibe coding"... but that sacrifice for short-term gain at the expense of scale is often draconically enforced.
obviously good and experienced engineers aren't going to be vibe coders/mollycoddlers by nature. but many good and experienced engineers will be pressured to make poor decisions by impatient business leaders. and that's the root of most AI anxiety: we all know it's going to be used as irresponsibly and recklessly as possible. it's not about the tech. it's about a system with broken incentives.
This omits the deep knowledge required for traditional coding. This opens up coding to non-devs, e.g. product managers.
For vibe coding you need systems thinking, planning and logic, but less craftsmanship.
For PO's the chart looks different, here the traditional flow contains: "polish concepts, make mocks, make stories, dailies, handovers, revisions, waiting for devs busy with other things"
The vibing PO has none of that.
Not saying this is sustainable for big projects already, but it is for ever-growing "small" projects (especially if it's a technical PM that can code a bit). It's just so much faster without devs -_-.
Disclaimer: I am such a PO. What I now wonder: how can I mix vibe coding and properly developed foundations (AI-assisted, but not 99% vibed)? One answer in my view is splitting services into vibecoded and core, but dependencies on the core already slow you down a lot. Curious to hear how others mix and match both.
What I actually do right now:
- fullstack PoC vibecoded
- specifications for core based on PoC findings
- Build proper vibecoded V1 mocking what moves to core. But here already a more structured vibecoding approach too.
- replace mocks with actual core
Not yet done but planned: build throwaway PoC in a branch of the core so I can also vibe on top of the core directly, including modification of the core.
The deep knowledge really isn’t all that deep. A couple years in the weeds and you have it. What this really hurts is outsourced devs. In the past a non-coding person could come up with the spec and hire someone from a developing nation to build it on the cheap to that spec. It is still possible to work like this of course, resulting in working code, compared to an LLM that might hallucinate a passing test condition that you can’t appreciate with your lack of coding chops. It is just that the AI seems “faster” and the way it is paid for is less in front of your face. Really, in practice, nothing new was gained. PMs could always hire coders. Now they hire nondeterministically generated code, but they are still essentially hiring code: submitting a spec, having something else write the code.
But the feedback-cycle wait time and the communication workload are almost eliminated.
Perhaps. Then again, most people aren't working in fields where racing to the finish is a requirement. Corporate work I've found is really slow moving for a lot of reasons. Things getting backburnered and pushed back is pretty standard. It takes time for all the stakeholders to digest information and make feedback. There is also the phenomenon where no one really appreciably works after thanksgiving so for most american corporate workers it is like they are on a 10.5 month year as it is. All this to say that even with diminished feedback cycle wait time and lightened communication workload, I don't think the product is getting out any faster.
Has anyone read up on the recent paper from Meta/FAIR -- CWM: An Open-Weights LLM for Research on Code Generation with World Models?
It attempts to give the model a better "coding" understanding, beyond mere tokens and positions, and hence improve the coding capabilities of these "brilliant but unpredictable junior engineer" coding agents:
- https://ai.meta.com/research/publications/cwm-an-open-weight...
Even outside of AI coding I have found a tremendous amount of value in using AI to produce a requirements and spec document for me to code from. The key unlock for me is asking AI to “interview” me about how this system/feature should work. As part of that process it will often ask a question that gets me thinking about interesting edge cases.
I will say I always provide an initial context document about the feature/system, to avoid us starting with trivial questions. After about 45 minutes I’ll often feel I’ve covered enough ground and given the problem enough thought to really put pen to paper. Off the back of this I’ll ask it to summarise the spec and produce a document. This can be a good point to ditch AI if you are so inclined but still get value from it.
LLM coding agents can't learn from experience on our code, but we can learn from using them on our code, and in the context of our team and processes. I started creating some harnesses to help get more of what we want from these tools, and less of what we need to work too much on - eg, creating specialized agents to refactor code and test after it's been generated, and make it more in line with our standards, removing bogus tests, etc. The learning is embedded in the prompts for these agents.
I think that this approach can already get us pretty far. One thing I'm missing is tooling to make it easier to build automation on top of, eg, Claude Code, but I'm sure it's going to come (and I'm tempted to try vibe coding it; if only I had the time).
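For what it's worth, the automation I have in mind is little more than shelling out to the CLI from a script. A sketch under the assumption that the tool exposes a non-interactive `claude -p` mode; the prompt text and file handling are placeholders, not anything standard:

```python
# Run a specialized "cleanup" pass over freshly generated files by piping each
# one through the CLI with a standards-enforcing prompt.
import subprocess
import sys

CLEANUP_PROMPT = (
    "Refactor this file to match our team standards: remove bogus or "
    "tautological tests, prefer early returns, keep functions short. "
    "Reply with the full revised file only.\n\n"
)

def cleanup(path: str) -> str:
    with open(path) as f:
        source = f.read()
    result = subprocess.run(
        ["claude", "-p", CLEANUP_PROMPT + source],  # assumed CLI invocation
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    for generated_file in sys.argv[1:]:
        print(cleanup(generated_file))
```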
There is this statement in the article: LLMs are lightning fast junior engineers. I don't know if that is right or wrong. To me, good LLMs are a lightning fast better version of me. That means that I can write code like some LLMs but it'll take me days to do it (if I use a language that I'm not good at, for example Rust) but with carefully crafted prompts, LLMs take maybe half an hour or less.
This diagram (which resonated):
suggests a workflow where AI is used almost exclusively to speed up the writing of known, targeted code whose structure has already been thought out. And possibly as a (non-coding) sounding board during the thinking out.
The thinking part is the same, yes, but I doubt the fixing is.
Fixing something you have written is much easier than fixing something someone else (or an AI) has written, just because you don't have the mental model for the code, which is the most important part of debugging and refactoring.
Agree completely -- that's what my suggestion was getting at.
Especially with junior engineers it is helpful to ask them to provide a Loom video or other proof that they have verified a feature or bug fix works as intended. I have tried setting up Claude Code with Playwright to verify its work, but so far am not very satisfied with the results. Any tools that are helpful with this end-to-end testing for web apps using Claude Code and other AI assistants? Feel free to share your product if it is relevant.
I've seen reasonable results from letting Claude Code apply test-driven development with a good e2e test suite on top (using Playwright). In that setup, giving it the Playwright MCP to access the app helps it work out why e2e tests pass or fail, and helps it write new tests.
Just giving it an MCP to test changes also didn't work for me. But the combination with e2e tests was better.
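For reference, the e2e tests in question are nothing exotic; something along these lines (Python + pytest-playwright, with a made-up URL and selectors):

    # Minimal sketch of an e2e test the agent can be anchored to.
    from playwright.sync_api import Page, expect

    BASE_URL = "http://localhost:3000"  # assumption: the app is running locally

    def test_user_can_log_in(page: Page):
        page.goto(BASE_URL)
        page.get_by_label("Email").fill("demo@example.com")
        page.get_by_label("Password").fill("correct horse battery staple")
        page.get_by_role("button", name="Log in").click()
        # When this fails, the agent (via Playwright MCP) can open the live app
        # and explain why the expectation doesn't hold before touching the code.
        expect(page.get_by_role("heading", name="Dashboard")).to_be_visible()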
The post starts with some valid observations on how to help dev teams mature. And then, all of a sudden, the junior devs are replaced by agents. That's it. End of story.
AI coding is more like being a PM than an engineer. Obviously PMs exist and don't know as much about the tech they make as the engineers do, but they are nevertheless useful.
> LLMs are lightning fast junior engineers
I cannot express how tired I am of seeing this beyond stupid take.
If you truly believe that, you have either only ever worked with the most piss-poor junior engineers, or you simply have never worked with junior engineers.
LLMs do not learn, LLMs do not ask for clarification, LLMs do not wonder if they're going in the wrong direction, LLMs do not have taste, LLMs do not have opinions, LLMs write absolutely illogical nonsense, LLMs do not ask themselves what is best for the customer, LLMs have no context beyond what you have explicitly fed them for that one specific task, and much more.
> LLMs do not have taste
This is the fundamental issue with LLMs, imo.
Looks like another BS article that tries to compare a junior dev to an AI. And it's not even close. Anyone who has actually tried to use AI tooling should know better. Feels like the CEOs who got sold this idea are force-feeding it to senior devs, and sadly senior devs are trying to gulp it down instead of rejecting the entire idea.
LLMs are more like a lightning fast pseudo-random text generator at this point. To see that, though, you'd need to run one a few times, test it and understand the output. Something that's a bit beyond the casual amateur's ability. Maybe what we need is for the AI bubble to burst first.
It's sad that we've gotten here though. Ever tried having a coffee with your preferred AI, as opposed to doing it with a highly motivated junior? Maybe you're not even going to get a chance to experience this.
For now, my mind is still made up. I leave the door open to being shown any serious piece of software that was built primarily through agentic workflows. Having tried to use these tools over the past month to build a critical piece of infrastructure for my company, I agree with OP. I spent so much time wrangling back unnecessary garbage that the LLM thought was important that I wondered if just writing it myself in one go would have actually been faster. Simple things like 'test all this workflow logic' resulted in the LLM inserting a nonsensical mock at the top of the test file that took me an hour or two to unwind.
Other than that, I keep hearing the same arguments - "LLMs free up more time for me to think about the 'important things'." Son, your system is not durable, your tests are misleading, and you can't reason about what's happening because you didn't write it. What important things are left to think about??
>resulted in the LLM inserting a non-sensical mock at the top of the test file that took me an hour or two to unwind
Why didn't you just not accept the change?
> Son, your system is not durable, your tests are misleading, and you can't reason about what's happening
You just described the majority of pre-LLM enterprise software.
It is an interesting take. I teach programming to designers as part of a course called "Emerging Technologies". It is fun to see what students create, but it is not fun to resolve their doubts. When you are teaching the basics, the students will quickly fire up ChatGPT and make up some slop. In the end, I have to fix their code. I think the learning outcome is reduced because they have not written any code, and I am genuinely concerned as an educator. One thing that is missing is the ability to evaluate AI output and understand what to keep and what to ignore - the sheer "aesthetic" of it. I feel many do not take the time to develop this very "human" skill and become output-oriented from the very start. This, IMO, affects learning. These tools are also quite addictive due to the sense of 'manufactured' certainty they offer, which hinders learning. What is the point of learning how to add when you have the cheat sheet next to you?
> While the LLMs get to blast through all the fun, easy work at lightning speed, we are then left with all the thankless tasks: testing to ensure existing functionality isn’t broken, clearing out duplicated code, writing documentation, handling deployment and infrastructure, etc.
I’ve found LLMs just as useful for the "thankless" layers (e.g. tests, docs, deployment).
The real failure mode is letting AI flood the repo with half-baked abstractions without a playbook. It's helpful to have the model review the existing code and plan out the approach before writing any new code.
The leverage may be in using LLMs more systematically across the lifecycle, including the grunt work the author says remains human-only.
That's my experience as well. The LLM is great for what I consider scaffolding. I can describe the architecture I want, some guidelines in CLAUDE.md, then let it write a bunch of stubbed out code. It saves me a ton of time typing.
It's also great for things that aren't creative, like 'implement a unit test framework using google test and cmake, but don't actually write the tests yet'. That type of thing saves me hours and hours. It's something I rarely do, so rather than just diving into editing my cmake and test files, I'd be looking up documentation and writing a lot of code that is necessary but time-consuming.
With LLMs, I usually get what I want quickly. If it's not what I want, a bit of time reviewing what it did and where it went wrong usually tells me how to write a better prompt.
What I don't like about using AI is doing something without planning it first.
All software engineering is left aside in exchange for coding -> fixing -> coding -> fixing.
> Test-Driven Development: generating extensive test cases prior to implementation to guide implementation and prevent regression.
I've found that spelling this concept out trips CC up - assertions are backwards, confusing comments in the tests, etc. Just starting a prompt with "Use TDD to…" really helps instead.
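When it goes well, the first artifact looks something like this made-up pytest example, written before any implementation exists (module and function names are hypothetical):

    # Tests-first artifact for a "Use TDD to implement slugify()" prompt.
    import pytest

    # Assumption: this module doesn't exist yet; the tests are meant to drive it,
    # so the suite fails until slugify() is implemented.
    from myapp.text import slugify

    def test_lowercases_and_hyphenates():
        assert slugify("Hello World") == "hello-world"

    def test_strips_punctuation():
        assert slugify("Rock & Roll!") == "rock-roll"

    def test_rejects_empty_input():
        with pytest.raises(ValueError):
            slugify("")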
I do appreciate that this article moves past absolute negativity on LLMs and actually speaks to the fact that they are extremely useful for well defined programming tasks. I'm a bit sick of articles that are just pure negativity on these tools.
I will raise that LLMs are pretty good at some of the non-coding tasks too.
eg. "I'm currently creating an AI for a turn based board game. Without doing any implementation, create a plan for the steps that need to be done including training off real world game data".
The LLM creates a task list of iterative steps to accomplish the above. It usually needs correction specific to the business/game needs, but it's a great start, and I recommend doing this just so the LLM has a doc with context on the bigger picture of what it's trying to achieve as you have it complete tasks.
The article is pretty interesting, perhaps some marmite takes, but the bit that chimed with me is the vibe coding vs AI-driven engineering. Senior management at my work is obsessed with vibe-coding and are constantly pushing engineers to promote vibe code to PROD. It’s dispiriting to see parts of our code base begin to fill with manager+LLM slop …
Effective AI coding is actually extremely slow if you take into account an exhaustive planning stage where the task specification is laid down in sufficient and unambiguous detail. I had to get the LLM to review my spec over twenty times, each time in a fresh session, before I thought it was good enough to be implemented well. It also really helps to have multiple diverse LLMs review the spec, as they all have their own insights. In this way, AI coding also helps me avoid numerous bugs that would otherwise have left me in trouble.
Once the planning is done, the actual coding is very fast. The human review that follows is again slow, often also leading to minor new tickets.
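For the multi-model review step, a rough sketch of what I mean is below. The model ids are placeholders, the review prompt is my own, and each call starts from a fresh context, which is the point:

    # Fan a spec out to more than one model for independent review.
    import anthropic
    from openai import OpenAI

    REVIEW_PROMPT = (
        "Review the following spec for ambiguity, missing edge cases, and "
        "contradictions. Reply with a numbered list of concrete issues only.\n\n"
    )

    spec = open("spec.md").read()

    claude = anthropic.Anthropic().messages.create(
        model="claude-sonnet-4-20250514",   # placeholder model id
        max_tokens=2000,
        messages=[{"role": "user", "content": REVIEW_PROMPT + spec}],
    )
    gpt = OpenAI().chat.completions.create(
        model="gpt-4o",                     # placeholder model id
        messages=[{"role": "user", "content": REVIEW_PROMPT + spec}],
    )

    print("--- Claude review ---\n" + claude.content[0].text)
    print("--- GPT review ---\n" + gpt.choices[0].message.content)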
A friend of mine who is newer to coding than me is worried that by using AI so much he's losing his ability to code, and it's also killing his motivation because using AI to generate code is just not fun. We do "analog coding sessions" together where we code without LLM assistance. It's much more enjoyable!
This is why you do not use Claude Code for a complete project overhaul. For example, if you are writing in Python, categorize your functions and modules very well so that when you ask for Claude's help, it won't get lost and start screwing up the entire project. I use modular design with Claude Code very often, and this is the only way I have found it useful. I won't let it change the code of preexisting projects; I only have it analyze individual files/functions for improvements.
People without a true coding background get stuck after a single function because the context window is still so narrow, and code assistants do not get the full picture of the project. It reminds me of the offshore developer's work ethic: "But sir, you told me to add this button there like that, so I deleted the entire codebase", without thinking about the why. It keeps saying, "You are absolutely right! I shouldn't have done that!" It's just like working with a crappy coder from Fiverr or another freelancer site.
This article is a great example of how humans struggle to extrapolate what happens next. There won't be any humans anywhere near this part of the tech stack, just like no one building a SaaS writes assembly code, or has to put together their own server cluster in a datacenter (remember pre-cloud?) for their company anymore. He's dead, Jim. No one is telling anyone to ship the ML code without any testing. Human coders also make mistakes. I'll bet that in a few years product managers / QA people would rather work with an ML stack to generate code than a human engineering team. It'll not just be cheaper and faster, but a lot less hassle and more accurate. As an example, Python has only a few dozen keywords, plus extensive public libraries and open-source algorithms to call upon. Anyone who thinks this presents any sort of challenge for an LLM to profoundly master is delusional. They can do IMO-grade math and help prove novel theorems. They can code your YC startup just fine.
Author here - I agree and have written about this before, though focusing a bit more on how far down the stack they might go, rather than up: https://chrisloy.dev/post/2025/03/23/will-ai-replace-softwar...
The article being discussed in this thread isn't intended to be a luddite rejection of AI. It's just about a mistake I see people keep making (and have made myself), and some thoughts on how to avoid it with the tools we have today.
PM and QA people would go before devs. LLMs are already better PMs than the best PMs.
I don't understand why some people are so upset with AI coding - no one forces them to use it.
Now, if you say the problem is that you don't want other people's AI code inflicted on you, just enforce more meaningful tests. There has always been bad code in the past, and there always will be.
I, for one, am doing great with AI coding - and my feeling is that more emphasis on project structure is the way forward for better end to end results.
> I don't understand why some people are so upset with AI coding - no one forces them to use it.
You might be a little bit out of touch with the current zeitgeist -- and what's happening in a lot of corporations.
This is the first time I've had to accept cookie profiling on a personal blog. Also, the cookie banner isn't GDPR-compliant.
One axis that is missing from the discussion is how fast they are improving. We need ~35 years to get a senior software engineer (from birth through education to experience). These things are not even 3.5 years old. I am very interested in this space; if you are too, DM me on X: @fabmilo. I am in SF.
Effective coding is not code first think later.
LLMs aren't effective when used this way.
You still have to think.
IMO a vibe coder who is speaking their ideas to an agent which implements them is going to have way more time to think than a hand coder who is spending 80% of their time editing text.
A lot of times coding is thinking. It's like sketching out an idea on paper, but with code.
If you know, understand that you are in possession of a massive and temporary information asymmetry advantage, and you should run with it as hard and fast as you can to gain the biggest lead possible, before the author and the rest of the world gain that advantage too. Go, go now, go fast, do it in parallel, and don’t stop until you win. Opportunities like this are extremely rare not just in your life, but in the history of our society. Best of luck.
What are you talking about?
He is saying that there's a massive group of people declining to use AI, or writing blog posts about why it is bad, and that this is a competitive advantage for the people who will use it.
The assumption he is making is that the tools are good, and that the tools will eventually be used by everyone. So the competitive advantage is time limited.
He is largely correct, although I think productivity gains are overblown.
Could be either to keep using AI or to give up AI altogether; we will never know.
This man should probably be a poet.
Probably something about how using LLMs for coding is such an amazing opportunity, judging by how he implies the author will be surpassed due to information asymmetry.
He doesn’t know.