"For the longest time, I would NOT allow people to write tests because I thought that culturally, we need to have a culture of shipping fast"
Tests are how you ship fast.
If you have good tests in place you can ship a new feature without fear that it will break some other feature that you haven't manually tested yet.
This seems very post-hoc and like they're fortunate they happened to arrive at something better rather than worse.
The justification for K8s seems pretty thin. It makes me wonder if the author understands why they need it. I'm guessing it's because they've got substantial parallel, multi-tenant networking of stateful processes, which is a pretty defensible reason to use K8s. And easy to say. It seems strange to leave it out.
The argument against Temporal also seems invalid, but I'm not certain. It has been years since I used it, but wouldn't it be possible to poll for completion? It seems like you'd wind up with better observability/retryability tooling, and it's much simpler overall.
I'd also posit that you could model a lot of this using your own serializable state machines. They're in the JS ecosystem, so XState is an excellent option. You'd get incredible visibility into your orchestration, deep access to testing the semantics and logic you care most about, and the ability to have your entire architecture be containers on the fly with no blackbox orchestration.
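To make the state-machine idea a bit more concrete, here's a minimal sketch in plain TypeScript. It deliberately does not use XState's API. The job states, event names, and helper functions are all invented for illustration, and I'm assuming the orchestration can be reduced to a small job lifecycle, which may not match what they actually run.

```typescript
// Hypothetical job lifecycle; these state and event names are assumptions.
type JobState = "queued" | "running" | "succeeded" | "failed";
type JobEvent = "START" | "FINISH" | "ERROR" | "RETRY";

// The transition table is plain data, so it can be inspected and unit-tested.
const jobTransitions: Record<JobState, Partial<Record<JobEvent, JobState>>> = {
  queued: { START: "running" },
  running: { FINISH: "succeeded", ERROR: "failed" },
  succeeded: {},
  failed: { RETRY: "queued" },
};

// A snapshot is the whole machine: current state plus the accepted events.
interface JobSnapshot {
  state: JobState;
  history: JobEvent[];
}

function initialJob(): JobSnapshot {
  return { state: "queued", history: [] };
}

// Returns a new snapshot; events not allowed in the current state are ignored.
function sendEvent(snapshot: JobSnapshot, event: JobEvent): JobSnapshot {
  const next = jobTransitions[snapshot.state][event];
  if (next === undefined) {
    return snapshot;
  }
  return { state: next, history: [...snapshot.history, event] };
}

// Snapshots are plain JSON, so they can be stored and resumed elsewhere,
// for example by a different container.
function saveJob(snapshot: JobSnapshot): string {
  return JSON.stringify(snapshot);
}

function loadJob(raw: string): JobSnapshot {
  return JSON.parse(raw) as JobSnapshot;
}
```

The point of the sketch is that every transition is a pure function over serializable data, so the semantics can be tested without spinning up any infrastructure. XState gives you a much richer version of the same idea.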
Of course, I'm speculating after browsing through their website a bit and thinking about the problems they described. I'm missing a lot of context. K8s could be the clear winner.
Still, after reading this I would never use this product. I don't mean to sound unkind. I'd never trust the decision-making of the people who followed this trajectory. If I were the author I'd take this down ASAP.
It's not really hinted at in the article, which doesn't actually mention whether the rewrite was a net gain - I presume it was or they wouldn't have written the article, and the lead-in picture paints a rosy picture, but the tone at the end suggests he's not happy with how things turned out.
But one thing that used to be a common design anti-pattern was the "version 2 problem". I think I first heard about it when Netscape were talking about how NN2 was a disaster, and they were finally happy with NN3 or NN4.
Often version 1 is a hastily thrown together mess of stuff, but it works and people like it. But there are lots of bad design decisions, and you reach a limit with how far you can continue pushing that bad design before it gets too brittle to change. So you start on version 2, a complete rewrite to fix all the problems, and you end up with something that's "technically perfect" but so overengineered that it's slow and everybody hates it. Plus there are probably so many workflow hoops to jump through to get things approved that you end up not making any progress, and possibly version 2 kills the product and/or the company.
The idea is that "version 3" is a pragmatic compromise - the worst design problems from version 1 are gone, but you forgo all the unnecessary stuff that you added in version 2, and finally have a product that customers like again (assuming you can convince them to come back and try v3 out) and that you can build into future versions.
To a large degree I think this "version 2 problem" was a by-product of waterfall design. It's certainly been less common since agile development became popular in the early 2000s and tooling made large-scale refactoring easier, but even so I remember working somewhere with a v1 that the customers were using and a v2 that was a 3-year rewrite going on in parallel. None of the developers wanted to work on v1 even though that's what brought in the revenue, and v2 didn't have any of the benefit of the bug fixes accumulated over the years to fix very specific issues that were never captured in any of the scope documents.
"The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. The result, as Ovid says, is a 'big pile.'"
- Fred Brooks, 'The Mythical Man-Month' (1975)
Oh wow, it's from Mythical Man Month? I've been meaning to read that for years and still never have.
I definitely encountered this second-system effect recently. I have an app that works well because it was written to target a specific use case. Users (and I) wanted some additional features, but the original architecture just couldn't handle these new features, so I had to do a rewrite from the ground up.
As I rewrote it, I started pulling in more "nice to haves" or else opening up the design for the potential to support more and more future features. I eventually got to a point where it became unwieldy as it had too many open-ended architectural decisions and a lot of bloat.
I ended up scrapping this v2 before releasing it and worked on a v3 but with a more focused architecture, having some things open-ended but choosing not to pursue them yet as I knew that would just introduce unneeded bloat.
I was quite aware of the second-system effect when doing all this, but I still succumbed to it. Thankfully, the v3 rewrite didn't take as long since I was able to incorporate a lot of the v2 design decisions but scaled some of them back.
My adaptation of the Version 2 Problem is “any idiot can ship version 1 of a product, but it takes skill to ship version 2”.
Usually leveled at people who are so hyper-focused on shipping a so-called MVP that is really demoware that they are driving us at a brick wall and commenting the entire way about what good time we are making.
This has been my experience exactly. V1 was custom built for a single client and they loved it. As we tried to expand to multiple clients the v1 was too narrowly scoped (both in UX and code architecture) so we did a full rewrite attempting to generalize the app across more workflows. V2 definitely expanded our client pool, but all our large v1 customers absolutely hated it.
We never did a full v3 rewrite, but it took about 4 years and many v3 redesigns of various features to get our legacy customers on board.
Having a culture of not ever writing tests and actively disallowing them is so insane I can't even imagine why there's anything else in this post
And particularly the “no tests go faster”.
I feel like we keep having to reestablish known facts every two years in this field.
I think your comment is just as "insane" as the practice you are railing against. Although I wouldn't use the word "insane" - it's hyperbole. What's the right word here? I'm not sure... "dogmatic" isn't quite right.
If you are a two man startup, burning through runway and pre-product-market fit... then spending a lot of time on tests is questionable (although the cost-benefit now with AI is changing very fast).
What I find "insane", "dogmatic"... about your comment is the complete elision of this process of cost-benefit analysis, as if there should never be such an analysis.
I've worked with a lot of people like you. When a discussion begins about a choice to be made, they just stampede in with "THIS IS THE RIGHT WAY". And the discussion can't even be had.
This sort of "dogmatism" is so rife in engineering culture, I wonder if this is why the c-suite is so ready to dump us all for AI centaurs that just fucking ship features. How many of them got burned listening to engineers who refused to perform even the most basic of cost-benefit analyses with the perspective of the business as a whole in mind and forced the most unnecessary, over-engineered bullshit?
I worked at one startup where the tech lead browbeat the founders into building this enormous microservice monster that took them years. They had ONE dev team, ONE customer, and the only feature actually being used was just a single form (which was built so badly it took seconds to type a single character in a field because the React re-renders were crazy).
Now THAT's insanity.
> they just stampede in with "THIS IS THE RIGHT WAY". And the discussion can't even be had.
That's exactly what this person is railing against. They strictly forbid testing.
Again - that's a business decision that needs to be made in the context of that business. The fact that testing was forbidden isn't in itself good or bad. It depends on that business context. The post says nothing about how that decision was made, whether it was discussed, or if it was just his absolutist ideal he imposed without consideration of the broader cost-benefit.
And I still feel the original comment doesn't give this point enough weight.
Forbidding tests is not a business decision, it's a software engineering decision, and it's a remarkably poor one at that.
The truth is in the middle somewhere, regarding tests at least (yes, your microservices story is insane).
I think the author could have been happier with the no-test decision if they had treated the initial work as a prototype with the idea of throwing it away.
At the same time, writing some tests should not be seen as a waste of time, since if you're at all experienced with it, it's going to be faster than constantly reloading your browser or pressing up-up-up-up-up in a REPL to check progress (if you're doing the latter you are essentially doing a form of sorta reverse TDD).
So I dunno... I may be more in line with the idea that it's a bit insane to prevent people from writing tests BUT so many people are so bad at writing tests that ya, for a go-get-'em startup it could be the right call.
I certainly agree with your whole cost-benefit analysis paragraph.
Did I say that my way was the right way? No: what I said was actively disallowing tests in every situation was the wrong way.
There is no ability here for the cost-benefit analysis to change over time. There is only "no tests".
Did you edit the wording of your original comment slightly to emphasise the "actively disallowing them" in every situation? Anyway... if that is what you meant, then ok. It's less awful a statement than what I felt I originally read.
I'd still push back on your hyperbole though. I don't think the author was insane - and we don't know what the broader business context was when they started growing the team and decided to persist without building out the test architecture at that point. They made a call that dogfooding was going to be enough to catch issues as they grew the team. There are a lot of scenarios where that is going to be true.
One scenario where it wouldn't - the most likely - is that the team isn't actually dogfooding because they personally don't find the product useful. Leadership lambasts them to use the product more... but no one does because it sucks so much it impacts their own personal productivity.
Even there I wouldn't use the word insane... just poor leadership.
> Did you edit the wording of your original comment slightly to emphasise the "actively disallowing them" in every situation?
I did not.
He did not edit, and you're misunderstanding the meaning behind his post. Not everything needs to be pedantic and accurate. Language is flexible; this is about communicating, not being right.
What we really don't need is paragraphs of someone arguing because their own definitions differ slightly from the OP's.
Not having ANY tests means tons of manual testing is needed every time you modify code, which will rapidly consume more time than writing the tests would.
Sorry, I still don't get "no tests" as an excuse to go faster. Obviously YMMV, but you will need to test your implementation somehow, and manual testing usually takes more time than running your automated tests. There's no need to over-test, but having tests definitely doesn't mean you'll be slowed down, unless you don't know how to test, in which case that's totally up to you.
It's a big move. But I understand it.
Sometimes your code is "just" a proof of concept, a way to test the idea. Very far from a decent product.
That is the time you ditch the code, keep the ideas (both good and bad) and start over.
Pearls.
> I would NOT allow people to write tests
> now [...] we started with tests from the ground up
Two different stages of the project, not necessarily contradictory. I'm not saying this is great, but tests make a whole lot more sense when you know what you're building.
Yes. TFA author could have gone into it with this mindset and treated the initial work as a prototype with the idea of throwing it away and would have been happier about it.
> but tests make a whole lot more sense when you know what you're building.
It's very true. This is a "gotcha" a lot of anti-TDDers always bring up, and yet some talk about "prototyping == good" without ever making the connection that you can do both.
In an age of generated tests, a mandate on no tests is just dumb.
Nice use of the .io domain there.
Tests are most useful for regression detection, so it's a good instinct to not add them when you're primarily exploring. Once you've decided to switch to exploitation, though, regression will hurt. I think it's just a classic 0 to 0.1 not being the same thing as 0.1 to 1.
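For what "regression detection" looks like in practice, here's a minimal sketch in TypeScript. The `displayName` helper and the `UserRecord` shape are hypothetical, invented just for this example, and the hand-rolled `expectEqual` stands in for whatever test framework you'd actually use.

```typescript
// Hypothetical record shape: either name field may be missing (null).
interface UserRecord {
  firstName: string | null;
  lastName: string | null;
}

// Hypothetical helper: builds a display name without ever returning
// "null", "undefined", or an empty string.
function displayName(user: UserRecord | null | undefined): string {
  const parts = [user?.firstName, user?.lastName]
    .map((part) => (part ?? "").trim())
    .filter((part) => part.length > 0);
  return parts.length > 0 ? parts.join(" ") : "Anonymous";
}

// Tiny assertion helper so the sketch has no test-framework dependency.
function expectEqual(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error(`expected "${expected}" but got "${actual}"`);
  }
}

// Each check pins down a case that is easy to break in a later change.
function runRegressionChecks(): void {
  expectEqual(displayName({ firstName: "Ada", lastName: "Lovelace" }), "Ada Lovelace");
  expectEqual(displayName({ firstName: "Ada", lastName: null }), "Ada");
  expectEqual(displayName({ firstName: "  ", lastName: null }), "Anonymous");
  expectEqual(displayName(null), "Anonymous");
}
```

None of these checks help much while the shape of `displayName` is still in flux. Once it has settled, they are what tells you a later change quietly broke the null handling.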
So you started with 2023 theo.gg philosophy but now moved on to 2026 theo.gg philosophy
I wouldn’t admit to this level of, frankly, incompetence.
Wildly swinging dogmatism on how to do software development that’s so wrong you have to throw it all away - then repeating this failure loop multiple times.
It doesn’t inspire any confidence in the person. I wouldn’t get them to lead a project.
Why would you be so loud and proud about all this?
"bugs were appearing everywhere out of the blue. The codebase was a huge mess of nulls, undefined behaviour, bad error handling. It was so bad that we actually lost a client over this."
Especially wild considering their product is literally an automated bug finder lol.
I think there's a real possibility this is a "no such thing as bad publicity" stunt.
Next is such a dumpster fire. So much wasted effort due to the Node ecosystem never developing a universal batteries included framework like Rails or Django.
Which in turn were only invented because millennials would not be caught dead writing Java and JSP. We had all this shit figured out by the late nineties and 90% of what is accomplished on the web today was entirely possible and well integrated in Java app servers.
This whole business is a fashion industry.
I, for one, am grateful for LLMs because for the first time in around 30 years there is actually genuine novelty to explore in software engineering. Ruby and nodejs weren't it.
Mongodb is webscale.
Do you think it can handle 10 requests per hour? How many mongo instances will that require, and should I use micro services?