The approach I've taken to "vibe coding" is to just write pseudo-code and then ask the LLM to translate. It's a very nice experience because I remain the driver, instead of sitting back and acting like the director of a movie. And I also don't have to worry about trivial language details.
Here's a prompt I'd make for fizz buzz, for instance. Notice the mixing of English, Python, and Rust. I just write what makes sense to me, and I have a very high degree of confidence that the LLM will produce what I want.
    fn fizz_buzz(count):
        loop count and match i:
            % 3 => "fizz"
            % 5 => "buzz"
            both => "fizz buzz"
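For reference, a Python translation of that prompt might come out roughly like the sketch below. The prompt doesn't say where counting starts or what to emit for numbers that match neither case, so starting at 1 and falling back to the number itself are assumptions. Note that the "both" case has to be checked first, even though the prompt lists it last.

```python
def fizz_buzz(count):
    """Return the fizz buzz lines for 1..count (starting at 1 is an assumption)."""
    lines = []
    for i in range(1, count + 1):
        if i % 3 == 0 and i % 5 == 0:
            # "both" must be tested before the single-divisor cases
            lines.append("fizz buzz")
        elif i % 3 == 0:
            lines.append("fizz")
        elif i % 5 == 0:
            lines.append("buzz")
        else:
            # not in the prompt: fall back to the number itself
            lines.append(str(i))
    return lines


if __name__ == "__main__":
    print("\n".join(fizz_buzz(15)))
```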
That's a really powerful approach because LLMs are very, very strong at what is basically "style transfer", much better than they are at writing code from scratch. One of my most recent big AI wins was going the other way: I had to read some Mulesoft code in its native storage format, which is some fairly nasty XML encoding scheme, mixed with code, mixed with other weird things, but asking the AI to just "turn this into pseudocode" was quite successful. They're also very good at language-to-language transfer. Not perfect, but much better than doing it by hand. It's still important to validate the transfer, since it gets a thing or two wrong every few dozen lines, but it's still way faster than doing it from scratch and good enough to work with if you've got testing.
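One way to validate a transfer like that is to run the original and the translated version on the same inputs and collect the disagreements. The sketch below assumes both versions can be called from Python (for a cross-language port you would compare against recorded outputs instead). `original_abs` and `translated_abs` are hypothetical stand-ins, with a bug planted in the second one on purpose.

```python
def check_translation(reference, translated, inputs):
    """Return (input, expected, actual) for every input where the two disagree."""
    mismatches = []
    for value in inputs:
        expected = reference(value)
        actual = translated(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Hypothetical stand-ins: the hand-written original and the LLM's translation.
def original_abs(x):
    return x if x >= 0 else -x


def translated_abs(x):
    return x if x > 0 else x  # deliberate bug: negatives are not flipped


if __name__ == "__main__":
    for mismatch in check_translation(original_abs, translated_abs, range(-2, 3)):
        print(mismatch)
```

An empty list means the two agree on every input tried, which is evidence rather than proof.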
My mental model for LLMs is that they’re a fuzzy compiler of sorts. Any kind of specification whether that’s BNF or a carefully written prompt will get “translated”. But if you don’t have anything to translate it won’t output anything good.
I agree with that assessment but that makes me wonder if a T5-style LLM would work better than a decoder-only LLM like GPT or Claude. Has anyone tried that?
I do something similar, merely writing out the function signatures I want in code. The more concrete the idea I have in my head, the more I outline (tests, etc.).
However this is far less vibe coding and more actual work with an LLM imo.
Overall I'm not finding much value in vibe coding. The LLM will "create value" that quickly starts to become an albatross of edge cases and unreviewed code. The bugs will work their way in and prevent the LLM from making progress, and then I have to dig in to find the sanity, which is especially difficult when the LLM dug that far.
Yeah I'm nowhere near ready to loosen the leash. Show me a long-running agent that can get within 90% of its goal, then I'll be convinced. But right now we barely even have the tools to properly evaluate such agents.
Is this seriously quicker than just writing in a language that you know? I mean, you're not benefitting from syntax highlighting, autocompletion, indentation, snippets etc. This looks like more work than I do at a higher cost and insane latency.
I find it particularly useful when I would need to look up lots of library functions I don't remember. For example, in Python I recently did something like this (I just looked it up):
    for ever my file in directory 'd' ending '.capture':
        Read file
        Split every line into A=B:C
        Make a dictionary send A to [B,C]
    Return a list of pairs [filename, dict from filename]
I don't python enough to remember how to read all files in a directory, or how to split strings. I didn't even bother proofreading the English (as you can see).
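For illustration, that prompt could plausibly translate into something like the sketch below. Several details are assumptions the prompt leaves open: lines that don't match A=B:C are skipped, values stay as strings, and "filename" means the bare file name rather than the full path. `load_captures` is a made-up name.

```python
from pathlib import Path


def load_captures(directory="d"):
    """Parse every '*.capture' file in `directory` into [filename, dict] pairs."""
    results = []
    for path in sorted(Path(directory).glob("*.capture")):
        mapping = {}
        for line in path.read_text().splitlines():
            if "=" not in line or ":" not in line.split("=", 1)[1]:
                continue  # assumption: silently skip lines that are not A=B:C
            a, rest = line.split("=", 1)
            b, c = rest.split(":", 1)
            mapping[a] = [b, c]
        results.append([path.name, mapping])
    return results
```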
> Is this seriously quicker than just writing in a language that you know?
Yes. Well, it depends.
Most of the prompts specifying requirements and constraints can be reused, so you don't need to reinvent the wheel each time you prompt an LLM to do something. The same goes for test suites: you do not need to recreate a whole test suite whenever you touch a feature. You can even put together prompt files for specific types of task, such as extending test coverage (as in, don't touch project code and only append unit tests to the existing set) or refactoring work (as in, don't touch tests and only change project code).
Also, you do not need to go for miracle single-shot sessions, or purist all-or-nothing prompts. A single prompt can fill in most of the code you require to implement a feature, and nothing prevents you from tweaking the output.
It is seriously quicker because people like you and me use LLMs to speed up how the boring stuff is implemented. Guides like this are important to share some lessons on how to get LLMs to work and minimize drudge work.
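The reusable prompt files mentioned above could be as simple as a lookup of stored constraints that gets prefixed to each one-off request. This is only a sketch: `TASK_CONSTRAINTS` and `build_prompt` are hypothetical names, and the constraint wording is paraphrased from the two examples given above.

```python
# Hypothetical reusable constraint blocks, one per type of task.
TASK_CONSTRAINTS = {
    "extend_tests": "Do not modify project code. Only append unit tests to the existing suite.",
    "refactor": "Do not modify tests. Only change project code.",
}


def build_prompt(task_type, request):
    """Prefix a one-off request with the stored constraints for its task type."""
    return f"{TASK_CONSTRAINTS[task_type]}\n\n{request}"


if __name__ == "__main__":
    print(build_prompt("refactor", "Split the report module into smaller functions."))
```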
Those are just features waiting to be developed. I'm currently experimenting with building LLM-powered editor services (all the stuff you mentioned). It's not there yet, but as local models become faster and more powerful, it'll unlock.
This particular example isn't very useful, but anecdotally it feels very nice to not need perfect syntax. How many programmer hours have been wasted because of trivial coding errors?
> How many programmer hours have been wasted because of trivial coding errors?
Historically probably quite a lot, but with a decent editor and tools like gofmt that became popular in the past 10 years I'd say syntax is just not a problem any more. I can definitely recall the frustration of a missing closing bracket in HTML in the 90s, but nowadays people can turn out perfectly syntactically correct code on day 1 of a new language.
That’s fair. Not to shift the goal post but my intuition has shifted recently as to what I’d consider a “trivial” problem. API details, off-by-one errors, and other issues like that are what I’d lump into that category.
Easy way to say it is that source code requires perfection, whereas pseudo-code takes the pressure off of that last 10%, and IMO that could have significant benefits for cognitive load if not latency.
Still all hypothetical, and something I’m actively experimenting with. Not a hill I’m gonna die on, but it’s super fun to play and imagine what might be possible.
I do something like that when I get down to the function level and there is an algorithm that is either struggling for the role or poorly optimized, but the models that excel in codebase architecture have their hands held behind their back with that level of micromanaging.
The results are good because, as another replier mentioned, LLMs are good at style transfer when given a rigid ruleset -- but this technique sometimes just means extra work at the operator level to needlessly define something the model is already very aware of.
"write a fizzbuzz fn" will create a function with the same output. "write a fizzbuzz function using modulo" will get you closer to verbatim -- but my point here is that in the grand scheme of "will this get me closer to alleviating typing-caused-RSI-pain" the pseudocode usually only needs to get whipped out when the LLM does something braindead at the function level.
But "write a fizzbuzz fn" has one important assumption / limitation: the LLM should have seen a ton of fizzbuzz implementations already to be able to respond.
Hence, LLMs can be helpful to produce boilerplate / glue code, the kind that has already been written in many variations, but cannot be directly reused. Anything novel you should rather outline at a more detailed level.
> It’s a bit of a catch-22 to say “anyone can code with AI” and then make such statements.
Also makes it very much not "vibe" coding. The term keeps expanding into "any coding activity with AI assistance" but the whole idea of "vibe" coding is that you don't actually understand the generated code, and likely don't know how to program at all, you're just prompting AI to do everything.
Once you step into reviewing & understanding, you're no longer vibe coding you're just...coding.
The first self-announced vibe coder was a founder of OpenAI. So he had the knowledge.
He started to trust the code written by the models to a point where he didn't always read the code. This happens when you have learned the exact limits of your models.
It's like a senior coder who knows the projects the interns can handle without close supervision, and those they can't.
I've expressed this to others as much as is reasonable - but the phrase "vibe coding" shouldn't be used in any serious discourse about agentic tools. We can't control the lens under which a given person first heard the term, but that moment (combined with the mountains of memes they've consumed since) is going to color a lot of folks' personal definition of vibe coding. It's not realistic to expect everyone to have a shared definition of it, despite the inventor of the phrase immediately giving it a definition.
> I haven’t felt I thoroughly understood any code after working with C++ and reading the entries in code obfuscation contests.
Seems to me the result should be that if you aren't sure, your feedback when reviewing the code is that it needs to be more readable. Send it back to the LLM and demand they make it easier to understand.
> Always: Thoroughly review and understand the generated code
I think this is good advice actually. We do allow LLM agents where I work, but you still need to understand every line of code that you write or generate. That’s probably why we still do physical interviews as well.
It's great advice for anything AI-generated in a professional production environment. I think the question is whether it's vibe coding with that requirement in place. Or, rather, if the requirement is appropriate for how vibe coding is often used and promoted today (by non-coders).
Basically all of the suggestions on that page were good practice, and not just for code. Documenting your changes, reviewing the output of an AI (or junior person), writing meaningful commits ... all of these apply equally to code, contracts, whatever. I read this post as "If you want vibe coding to be coding you still have to do a lot of hard work and not treat it as a magic app engine." Which is true but absolutely not what a lot of vibe code-embracing middle managers want to hear.
I agree. Personally, I barely use any LLM tools professionally as a developer, and I don't use them at all in my free time. I do however have some coworkers that use them more heavily. Having a culture of proper code reviews, and a requirement that you know what the code in your PR does, ensures that we create proper solutions.
I don't think I could enjoy working at a place where people didn't know the content of the commits they made. I remember the early talks of vibe coding being that you're not even supposed to look at the code, and have been very happy that I haven't met anyone professionally that codes like that.
It's too bad the "vibe coding" definition is so strict. You could have an app that was completely AI generated, but the moment you even peek at the generated code or make a small revision, it is no longer a vibe coded app.
Nobody’s coined a fun catchy phrase for half vibe coding / half real engineering, so I will continue to refer to my AI assisted adventures as vibe coding :)
the page is an interesting display of a very large bureaucratic institution that is extremely worried about being sued, but is still utterly desperate to get in on the AI bubble before it pops
On a slightly related note... I'm kind of out of the loop wrt coding with AI. I was trying to find some youtuber working on some interesting project using AI to get a feel for how useful it could be but didn't have much luck (I didn't get past the "top 10 AI tools to use for coding" style videos). I was thinking something in the style of tsoding if you're familiar with his projects.
It's what you describe: Someone using LLMs to code stuff. I didn't really end up watching it so I can't really speak to the content but it should be the type of thing you're looking for.
I'm starting to think of "vibe coding" as "peer/pair programming". How effective it will be depends on how effective I am as the peer reviewer.
The driver is the AI who is highly capable but has a 5% chance of doing something psychotic lol. Me, the peer, can either review carefully and catch errors or just relax and "vibe" through it all. Results will of course vary based on that relationship.
"Vibe coding" was intended to mean where you don't pay attention to the work your partner creates at all. Where you just lean into their "vibe" and run with it, no matter how bad it actually is. What you describe already has a name. You even mentioned it yourself. Also calling it "vibe coding" would be a bit redundant.
Even though they are using the wrong term here, the advice throughout the file is solid. I find it funny that it doesn't even mention Kiro, which is Amazon's take on a VS Code clone that focuses on processes instead of vibes.
>Provide detailed specifications for the work to be done
I've been playing around with vibe coding for a few months and my experience doesn't really match this advice.
I used to think this was the correct way and, based on that, was creating some huge prompts for every feature. They took the form of markdown files with hundreds of lines, specifying every single detail about the implementation. It seems to be an effective technique at the start, but when things get more complex it starts to break down.
After some time I started cutting down on prompt size and things seem to have improved. But I don't really have any hard data on this.
If you want to explore this topic, one thing you can do is ask your LLM to "create a ticket" for some feature you already have, and try to play around with the format it gives you.
Every time I see these tips and tricks, it reinforces my view that it would be more productive to actually learn the abstractions of your framework and your tooling instead of wrestling with a capricious agent.
Thinking is always the bottleneck, so what you need most are:
- A reduction of complexity in your system.
- Offloading trivial and repetitive work to automated systems (testing, formatting, and code analysis)
- A good information system (documentation, clear tickets, good commits,…)
Then you can focus on thinking, instead of typing or generating copious amounts of code. Code is the realisation of a solution that can be executed by machines. If the solution is clear, the code is trivial.
I have found it better to have a stronger scope for 2nd and 3rd iteration feature sets in mind. Refactoring because you didn't think you'd be adding a certain kind of feature or filter or database scope is worse than knowing ahead of time that's where you were going.
A little different than "spec", but one-shotting things works great if that's going to get you as far as you want to go. If you're adding neat-o features after that, it can get a little messy because the initial design doesn't bend to the new idea.
Even something like adding anti-DDOS libraries towards the end, and then having to scope those down from admin features. Much better to spec that stuff at the outset.
I have found it useful to put the spec together with a model, having it try to find blind spots and write down the final take in clear and concise language.
A good next step is to have the model provide a detailed step by step plan to implement the spec.
Both steps are best done with a strong planning model like Claude Opus or ChatGPT5, having it write "for my developer", before switching to something like Claude Code.
I think it might have something to do with context rot that all LLMs experience now. Like each token used degrades the token after it, regardless of input/output.
It is mostly because it creates code that is way more complex than it needs to be.
One, admittedly silly, example is Claude trying to put a light/dark theme switcher when you are trying to refactor the style of your small webapp.
I'm not against a theme switcher, but it is certainly not a trivial feature to implement. But Claude doesn't seem to understand that. And by having simpler prompts I feel it gets easier to steer the LLM in the right direction.
You created markdown files with hundreds of lines and this was the result? I let it create task lists in md files, and review regularly. Works well for me as the scope of the tasks is well defined. Sure, sometimes it still does bad things, but I consider it just another junior dev, but with vast knowledge.
That's how I've been doing it as well. There's no guarantee that the LLM will follow your minute, detailed, description, and dumping it all at once at the start of a session has made it perform worse in my case.
And, you know, LLMs are mostly dumb typists, but sometimes they do dump something better than what I had in mind, and I feel that I lose that if I try to constrain them.
This approach also breaks down for the same reasons the Waterfall model doesn't work. A lot of information is discovered during development, which causes specs to be outdated or wrong. At that point the LLM context is deeply poisoned, whether from the specs themselves, or from the rest of the codebase. You can try to update the specs or ask for major refactors, but that often introduces new issues. And as the context grows, the chances of producing working code diminish significantly. The only way forward at that point is to dive in yourself, reviewing, fixing, and refactoring the traditional way, and wondering whether this workflow has really made you any more productive.
There are tools like context7. Apple is also starting to put markdown files summarizing/detailing APIs for inclusion in LLM context automatically, and has shipped these inside Xcode.
> "Thoroughly review and understand the generated code"
That isn't vibe coding though.
Vibe coding means you don't look at the code, you look at the front / back end and accept what you see if it meets your expectations visually, and the code doesn't matter in this case, you "see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works." [1]
If the changes are good enough, i.e. the front/backend works well, then it's good and you keep prompting.
Maybe the zeroth tip is "never go full vibe coder."
It can be tempting, but there's so much impact that even small changes to the code can have, and often in subtle ways, that it should at least be scanned and read carefully in certain critical parts. Especially as you near the point where hosting it on AWS is practical.
Even in Karpathy's original quote that you referenced he says "It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding." Maybe it should have been called vibe prompting.
AI is cool and all, but the biggest thing that makes me think that we’re in a bit of a bubble is seeing otherwise conservative organizations take “vibe coding” seriously
> otherwise conservative organizations take “vibe coding” seriously
Eh, it might be just someone wanting to jump on a trendy term, without understanding it properly. The file actually makes good points about moving from vibes to structure, which is fine.
Amazon also has their own clone of vscode (who doesn't these days) that focuses on some things mentioned here as well. They take your prompt and get it through a process of documenting - clarifying - planning, leading to solid results in the end. The problem with their approach is that it's nothing particularly "proprietary" and you can pretty much have the same experience with some slash commands and dedicated prompts in any other code assistant.
The chat apps are pretty good at internet searches without all the ads and SEO crap. So I use the chat app, it's okay at context (knows which language and framework I'm using) and can basically get me the same answers as the docs would provide in less time.
Still makes mistakes in code examples though, so I'd never trust it to actually change my code.
My mental model for LLMs is that they’re a fuzzy compiler of sorts. Any kind of specification whether that’s BNF or a carefully written prompt will get “translated”. But if you don’t have anything to translate it won’t output anything good.
Yep, exactly. "Garbage in, garbage out" still applies.
> I don't python enough to remember reading all files in a directory, or splitting strings.
Same when you need to write a short bash script a few times per year. It's really nice to not have to remember how it all works again!
> Always: Thoroughly review and understand the generated code
Rules it out for me; I haven’t felt I thoroughly understood any code after working with C++ and reading the entries in code obfuscation contests.
It’s a bit of a catch-22 to say “anyone can code with AI” and then make such statements.
This is not vibe coding at all, this is reviewing AI generated code
It's too bad the "vibe coding" definition is so strict. You could have an app that was completely AI generated, but the moment you even peek at the generated code or make a small revision, it is no longer a vibe coded app.
I think that's an overly strict definition.
A CGI film is still a CGI film even if artists at the studio hand-paint certain objects in certain frames.
A Python program is still a Python program even if it calls out to C code for a handful of key functions.
I suspect most people would consider an AI generated image to still constitute AI art even if the creator touched up a few things in Photoshop.
Nobody’s coined a fun catchy phrase for half vibe coding / half real engineering, so I will continue to refer to my AI assisted adventures as vibe coding :)
Sometimes I call it vibe coding ++.
the page is an interesting display of a very large bureaucratic institution that is extremely worried about being sued, but is still utterly desperate to get in on the AI bubble before it pops
Bro you’re harshing the vibe
On a slightly related note... I'm kind of out of the loop wrt coding with AI. I was trying to find some youtuber working on some interesting project using AI to get a feel for how useful it could be but didn't have much luck (I didn't get past the "top 10 AI tools to use for coding" style videos). I was thinking something in the style of tsoding if you're familiar with his projects.
I saw this channel from someone on hackernews if I recall correctly: https://www.youtube.com/playlist?list=PLm7RTomLsZo5PAWk0CpK0...
It's what you describe: Someone using LLMs to code stuff. I didn't really end up watching it so I can't really speak to the content but it should be the type of thing you're looking for.
I recorded myself trying it out to port some old apps of mine using Claude Code as a first time user of it.
I'm not even a youtuber and make these to keep myself accountable, so it's not that fun to watch, but it might be in the direction of your query:
https://www.youtube.com/watch?v=d9YCkWjD7WQ&list=PLEWSTtjNAw...
Looks promising! Thanks for sharing!
https://x.com/steipete
https://steipete.me/posts/2025/essential-reading-july-2025
https://steipete.me/posts/2025/the-future-of-vibe-coding
I'm starting to think of "vibe coding" as "peer/pair programming". How effective it will be depends on how effective I am as the peer reviewer.
The driver is the AI who is highly capable but has a 5% chance of doing something psychotic lol. Me, the peer, can either review carefully and catch errors or just relax and "vibe" through it all. Results will of course vary based on that relationship.
"Vibe coding" was intended to mean where you don't pay attention to the work your partner creates at all. Where you just lean into their "vibe" and run with it, no matter how bad it actually is. What you describe already has a name. You even mentioned it yourself. Also calling it "vibe coding" would be a bit redundant.
"The vibe code with detailed specifications is not the true vibe." - Laozi
I really don’t see how vibe coding has any place here. It’s just writing bad code without knowing anything it does.
Even though they are using the wrong term here, the advice throughout the file is solid. I find it funny that it doesn't even mention Kiro, which is Amazon's take on a VS Code clone, and which focuses on processes instead of vibes.
here's what i use for large projects https://getstream.io/blog/cursor-ai-large-projects/
>Provide detailed specifications for the work to be done
I've been playing around with vibe coding for a few months and my experience doesn't really match this advice.
I used to think this was the correct way and based on that was creating some huge prompts for every feature. It took the form of markdown files with hundreds of lines, specifying every single detail about the implementation. It seems to be an effective technique at the start, but when things get more complex it starts to break down.
After some time I started cutting down on prompt size and things seem to have improved. But I don't really have any hard data on this.
If you want to explore this topic one thing you can do is to ask your LLM to "create a ticket" for some feature you already have, and try to play around with the format it gives you.
Every time I see these tips and tricks, it reinforces my viewpoint that it would be more productive to actually learn the abstractions of your framework and your tooling, instead of wrestling with a capricious agent.
Thinking is always the bottleneck, so what you need most are:
- A reduction of complexity in your system.
- Offloading trivial and repetitive work to automated systems (testing, formatting, and code analysis)
- A good information system (documentation, clear tickets, good commits,…)
Then you can focus on thinking, instead of typing or generating copious amounts of code. Code is the realisation of a solution that can be executed by machines. If the solution is clear, the code is trivial.
You can do both
I have found it better to have stronger scope for 2nd and 3rd iteration feature sets in mind. Refactoring because you didn't think you'd be adding a certain kind of feature or filter or database scope is worse than knowing ahead of time that's where you were going.
A little different than "spec", but one-shotting things works great if that's going to get you as far as you want to go. If you're adding neat-o features after that, it can get a little messy because the initial design doesn't bend to the new idea.
Even something like adding anti-DDOS libraries towards the end, and then having to scope those down from admin features. Much better to spec that stuff at the outset.
I have found it useful to put the spec together with a model, having it try to find blind spots and write down the final take in clear and concise language.
A good next step is to have the model provide a detailed step by step plan to implement the spec.
Both steps are best done with a strong planning model like Claude Opus or ChatGPT5, having it write "for my developer", before switching to something like Claude Code.
Couldn't agree with this sentiment more.
I think it might have something to do with context rot that all LLMs experience now. Like each token used degrades the token after it, regardless of input/output.
How does it break down? Is it because the LLM didn't follow what you wrote down?
It is mostly because it creates code that is way more complex than it needs to be.
One, admittedly silly, example is Claude trying to put a light/dark theme switcher when you are trying to refactor the style of your small webapp.
I'm not against a theme switcher, but it is certainly not a trivial feature to implement. But Claude doesn't seem to understand that. And by having simpler prompts I feel it gets easier to steer the LLM in the right direction.
You created markdown files with hundreds of lines and this was the result? I let it create task lists in md files, and review regularly. Works well for me as the scope of the tasks is well defined. Sure, sometimes it still does bad things, but I consider it just another junior dev, but with vast knowledge.
That's how I've been doing it as well. There's no guarantee that the LLM will follow your minute, detailed, description, and dumping it all at once at the start of a session has made it perform worse in my case.
And, you know, LLMs are mostly dumb typists, but sometimes they do dump something better than what I had in mind, and I feel that I lose that if I try to constrain them.
This approach also breaks down for the same reasons the Waterfall model doesn't work. A lot of information is discovered during development, which causes specs to be outdated or wrong. At that point the LLM context is deeply poisoned, whether from the specs themselves, or from the rest of the codebase. You can try to update the specs or ask for major refactors, but that often introduces new issues. And as the context grows, the chances of producing working code diminish significantly. The only way forward at that point is to dive in yourself, reviewing, fixing, and refactoring the traditional way, and wondering whether this workflow has really made you any more productive.
The biggest issue I've had with vibe coding, by far, is missing or outdated documentation for specific APIs.
I now spend time gathering as much documentation as possible and inserting it within the prompt as a <documentation> tag, or as a cursor rule.
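A minimal sketch of the prompt side of that, assuming the docs are already gathered as plain text. The `build_prompt` helper, the `source` attribute, and the example snippet are all hypothetical, just one way to lay it out:

```python
def build_prompt(task: str, docs: dict[str, str]) -> str:
    """Put each documentation snippet in its own <documentation> tag, then the task."""
    sections = []
    for source, text in docs.items():
        # The `source` attribute is an illustrative convention, not a standard.
        sections.append(
            f'<documentation source="{source}">\n{text.strip()}\n</documentation>'
        )
    # The task goes last so it sits right after the reference material.
    sections.append(f"Task: {task.strip()}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Hypothetical snippet standing in for real API documentation.
    docs = {"items-api.md": "GET /items returns a JSON list of items."}
    print(build_prompt("Add pagination to the items client.", docs))
```

The same text could be saved as a rule file instead of being pasted into every prompt; the structure is the point, not the exact tag name.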
There are tools like context7. Apple is also starting to put markdown files summarizing/detailing APIs for inclusion in LLM context automatically, and has shipped these inside Xcode.
> "Thoroughly review and understand the generated code"
That isn't vibe coding though.
Vibe coding means you don't look at the code, you look at the front / back end and accept what you see if it meets your expectations visually, and the code doesn't matter in this case, you "see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works." [1]
If the changes are good enough, i.e. the front/backend works well, then it's good and keep prompting.
You rely on and give in to the ~vibes~. [1]
[1] https://x.com/karpathy/status/1886192184808149383
Maybe the zeroth tip is "never go full vibe coder."
It can be tempting, but there's so much impact that even small changes to the code can have, and often in subtle ways, that it should at least be scanned and read carefully in certain critical parts. Especially as you near the point where hosting it on AWS is practical.
Even in Karpathy's original quote that you referenced he says "It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding." Maybe it should have been called vibe prompting.
> "Warning
Never blindly trust code generated by AI assistants. Always:
- Thoroughly review and understand the generated code
- Verify all dependencies
- Perform necessary security checks."
This of course makes sense, but is not vibe coding.
I suppose we'll see an effort to steer people away from vibe coding nonsense by redefining the term, which makes sense.
AI is cool and all, but the biggest thing that makes me think that we’re in a bit of a bubble is seeing otherwise conservative organizations take “vibe coding” seriously
They get paid the more vibe coding occurs on their platform, so of course they have a two-pizza team dedicated to milking the latest trend.
> otherwise conservative organizations take “vibe coding” seriously
Eh, it might be just someone wanting to jump on a trendy term, without understanding it properly. The file actually makes good points about moving from vibes to structure, which is fine.
Amazon also has their own clone of vscode (who doesn't these days) that focuses on some things mentioned here as well. They take your prompt and get it through a process of documenting - clarifying - planning, leading to solid results in the end. The problem with their approach is that it's nothing particularly "proprietary" and you can pretty much have the same experience with some slash commands and dedicated prompts in any other code assistant.
I don't doubt that the tooling and documentation are fine.
"Vibe Coding" is a goofy and unserious practice though. The headline reads like "Oracle Best Practices to Hit the Griddy"
Followers gonna follow.
There is massive financial incentive for them to make it happen for AWS.
Between selling more bedrock usage or cutting their own headcount.
Best vibe coding tip: Don't.
My list:
1. Don't.
2. Don't do it.
3. Seriously, don't.
My approach to vibe coding:
The chat apps are pretty good at internet searches without all the ads and SEO crap. So I use the chat app, it's okay at context (knows which language and framework I'm using) and can basically get me the same answers as the docs would provide in less time.
Still makes mistakes in code examples though, so I'd never trust it to actually change my code.
"Vibe coding" used to be a meme and used as a derogatory term. Odd to see it adopted as the norm now.
ML pros call it "Semantic Diffusion", with a smirk, I assume.
https://simonwillison.net/tags/semantic-diffusion/
This is barely vibe coding, reads like just writing specs lol
It reads just like regular coding, except using an LLM as the compiler.