We won't get to AGI unless we get models with both larger context and 'dreaming' (i.e. distilling the important parts of their day-long context and folding those back into their neural net) at roughly the same effort/cost as inference. LLMs cannot do this and won't be able to, so unless someone comes up with a better model, AGI cannot be reached no matter how much is invested, and we will get an AI winter. So many smart minds are on this now that if anyone had an idea how to 'learn during inference', someone would have released something. No one has a clue, so I am betting the downfall will come soon. Still, we got incredible progress out of this AI boom, so it is not all bad, just money sloshing around.
Not sure there will be an AI winter, but the claims of AGI being very close are likely very optimistic. Maybe AGI requires an entirely new paradigm, maybe it requires quantum computing or biological computing (our brains work at a fraction of the cost of current LLMs, after all). Regardless, it's much further out than certain industry leaders would like you to believe.
An AI winter is unlikely to come either, though, as AI currently adds a lot of value in many places. But the profits coming out of it are unlikely to line up with the investments currently being poured in.
LLMs add value, but it remains orders of magnitude lower than the investment. So I fully expect their valuations to drop suddenly by the end of 2027 – and to take the rest of AI with them into a new AI winter.
Also, these industry leaders believe we can get there just by optimising LLMs, and that is surely a mistake: the only general intelligence we know is human, and humans add new facts to their neural nets daily, while LLMs need weeks of expensive training and then have a knowledge cutoff. It cannot happen like that.
The previous winters still saw a lot of benefit and money (for the time) flowing from the discoveries made before the winter, but there were very few fundamental advancements to AI tech itself. So just using LLMs and optimising them, while producing nothing new, is still a winter to me.
Even if energy needs inevitably come down, these companies will keep forcing energy use, driven purely by financial obligations. It's the dumbest thing since subprime. We're already seeing shades of this with Nvidia: they're giving away their GPUs in exchange for paper to keep people on this path. But energy is not free. It's finite.
As a side note, one argument I heard again and again against crypto was its energy use. That argument has fallen so far by the wayside it's incredible to me. Overcoming criticism of the tech's energy use was simply marketing. Global warming is barely an issue in the mainstream anymore; it's truly amazing.
If the LLM boom ends then bitcoin miners can eat up all the slack grid power! You can put those ASICs anywhere
Crypto mining energy is wasted relative to proof of stake, which provides the same security for essentially no energy. But AI inference has no real alternative, so that energy isn't wasted. It is unfortunate that politicians refused to allow abundant clean energy even after we told them exactly what to do, but it is what it is. The market wants AI, it is willing to pay for it, and it is getting it.
If the market demands bitcoin, then miners provide? No?
Mining doesn't create bitcoins, though. They are created on a schedule; miners just burn power on useless calculations to compete for a share of the scheduled distribution.
Didn't more modern currencies (e.g. ETH post switch to Proof-of-Stake) solve a lot of that[0]?
[0] I own no crypto and have no horse in that race; this is just what I understood from online discourse.
Ethereum made the change. Bitcoin is stuck with its current mining of small blocks due to politics; it could have moved to bigger block sizes many years ago, but the miners blocked the move since it would have been less profitable.
So Bitcoin is locked into its current wasteful state with no end in sight.
> If the LLM boom ends then bitcoin miners can eat up all the slack grid power! You can put those ASICs anywhere
Or maybe we shouldn't use that energy at all. We're still heading towards climate collapse, so frivolously wasting energy on something with zero value, like cryptocurrencies, doesn't seem like the smartest idea.
The question that really matters: is the net present value of each $1 of investment in AI capex greater than $1 (plus some spread for borrowing costs and risk)?
We'll be inference-token constrained indefinitely – i.e. inference token supply will never exceed demand – it's just that the $/token may not be able to pay back the capital investment.
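As a back-of-the-envelope illustration, the test in the parent comment can be written as a simple NPV check. Every number below is a made-up assumption for the sketch, not real data:

```python
# Hypothetical NPV check for $1 of AI capex.
# All figures are illustrative assumptions, not real data.

def npv(cash_flows, rate):
    """Discount a list of yearly cash flows (year 0 first) at `rate`."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

capex = 1.00                  # $1 invested today
yearly_token_margin = 0.35    # assumed net $/year earned per $1 of capex
useful_life_years = 5         # GPUs depreciate fast
discount_rate = 0.10          # borrowing cost + risk spread

flows = [-capex] + [yearly_token_margin] * useful_life_years
print(round(npv(flows, discount_rate), 3))  # -> 0.327, i.e. NPV > $0
```

With these toy inputs the investment clears the bar; the whole debate is over whether the real per-dollar token margin is anywhere near that high once the hardware's short useful life is priced in.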
And it may not need to pay back with enough fiscal stimulus
> it's just that the $/token may not be able to pay back the capital investment.
The loss is private, so that's OK.
A similar thing happened to internet bandwidth capacity when the dot-com bust hit: overinvestment in fibre everywhere (which came to be called dark fibre, IIRC) became superbly useful once the recovery started, even though those who built that capacity didn't make much money. They ate the losses so that the benefit could flow out.
The only time this is not OK is when the overinvestment comes from government sources and is ultimately a taxpayer-funded grift.
Every time someone says “but dark fiber”, someone else has to point out that graphics cards are not infrastructure and depreciate at a much, much higher rate. I guess it’s my turn.
Fiber will remain a valuable asset until/unless some moron snaps it with a backhoe. And it costs almost nothing to operate.
Your data center full of H100s will wear out in five years. Any cards that don't will still cost a lot to run, and may not be cost-competitive with whatever higher-performance card Nvidia releases next year.
How does the cost breakdown of all these new datacenters, cooling, and power delivery systems compare to the cost of the GPUs themselves?
There is a surprising amount of real long-term infrastructure being built beyond the quickly obsolete chips.
Poorly. GPUs are easily the bulk of the costs, and a disposable asset.
That is a fine point. However, I am not sure that replacing the GPUs themselves will be the bottleneck investment for datacenter costs; there is so much more infrastructure in a datacenter (cooling, networking). Plus, custom chips like TPUs might eventually catch up at lower cost. I think the bigger question is whether demand for compute will evaporate or not.
When the bubble pops, the labs are going to stop the hero training runs and switch the gigawatt datacenters over to inference, and then they're going to discover that milking existing GPUs is cheaper than replacing them.
Investment in dark fiber was intentional and continues to this day. Almost all of the cost of laying fiber is in getting physical access to where you want to put it underground. The fiber itself is incredibly cheap, so every time a telecom bothers to dig up mile upon mile of earth, it overprovisions massively.
The capital overhang of having more fiber than needed is so small compared to other costs I doubt the telecoms have really regretted any of the overprovisioning they've done, even when their models for future demand didn't pan out.
Softbank investment funds include teacher pension plans and things like that. Private losses attached to public savings can very quickly become too big to fail.
Nobody forced a pension plan to invest in Masa's 300 year AI vision or whatever. Why it's even legal to gamble pensioners' money like that is beyond me.
How is infrastructure building at a loss not OK for the government? That's what governments do, including things that will never be profitable.
I don't think merely building infrastructure at a loss is what's being described here - it's building infrastructure that won't get used (or used enough to be worth it). More of a bridge to nowhere situation than expecting to recoup the cost of a bridge with tolls or whatever.
Infrastructure building at a loss is very much not okay for a government and is usually the result of some form of corruption (e.g. privatize the profit), incompetence (e.g. misaligned incentives) or both.
However, the cost-benefit analysis on governmental projects typically includes non-monetary or indirect benefits.
There's a lot of retirement funds tied up in heavily AI-exposed stocks. A crash, which seems inevitable to me, will hit the public pretty hard.
This all treats the big companies' AI investment as a service to sell to others; the entire financial side of these articles is focused on investing in a service to sell.
But the moment that the AI can exceed a human programmer, at something as narrow as coding, then the company that has that AI shouldn't sell it to replace humans at other companies - it should instead use it to write programs to replace the other companies.
And the moment an AI can exceed a human generally, then the company that has that AI shouldn't sell it to replace humans at other companies - it should instead ask it how to dominate the world and replace all other companies (with side quest to ensure no competitor achieves AGI)?
Specialization tends to preclude this. Why does TSMC just fab chips for others rather than designing its own? Why do (effectively all) farmers sell grain rather than baked goods? Why don't lumber yards sell furniture?
For sure, there are some fantastically successful, vertically integrated companies. But mostly it’s less risky to sell the shovels rather than mine the gold.
Your examples are all hardware.
Software is something that can be completely automated. It's just compute budget. They could create an ERM from scratch _and_ provide all the migration scripts to move customers off the competitors, etc. It's just compute.
Not really. Even if we imagine perfect automated developers, it's also market research, marketing, tracking user feedback, tracking legal requirements, collecting and fixing issues, etc.
> But the moment that the AI can exceed a human programmer, at something as narrow as coding, then the company that has that AI shouldn't sell it to replace humans at other companies - it should instead use it to write programs to replace the other companies.
I'm not sure this is a good take. There's a reason why Microsoft employs 200,000+ people, and you can be sure most of them are not cranking out code 9-to-5.
Companies are living organisms more than they are pieces of software. In fact, the source code for Windows has been leaked a number of times over the years, and it resulted in approximately $0 revenue lost.
Making a successful software company requires a lot more than good programmers.
I mean, it remains to be seen that the demand can't be satisfied by local AI.
Qwen + your laptop + 3 years is more interesting to me than offloading AI to some hyperscale datacenter. Yes, efficiency gains can work for both, but there's a certain level below which you may as well just run the app on your own silicon. AI might never meet the threshold for "apps on tap" if every user with an i7 and 32GB of RAM is ably served locally.
There is a real possibility that local, free models keep trailing the frontier by only a few years. If that happens, the math for building out all this capacity is that you need to make a profit inside that window. Can they recoup a trillion dollars of data center capex in 2-3 years?
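To make the "recoup in 2-3 years" question concrete, here is a hypothetical sketch of the annual profit a buildout would need to repay itself within the window before free models catch up. The capex, window, and discount rate are all assumed for illustration:

```python
# Hypothetical payback check: annual profit needed to repay capex
# within the window of frontier advantage. All numbers are made up.

def required_annual_profit(capex, window_years, discount_rate):
    """Annual profit whose discounted sum over the window equals capex."""
    annuity = sum(1 / (1 + discount_rate) ** t
                  for t in range(1, window_years + 1))
    return capex / annuity

capex = 1e12   # $1 trillion buildout (illustrative)
window = 3     # assumed years before free local models catch up
rate = 0.08    # assumed cost of capital

# -> roughly $388 billion of profit needed per year
print(f"${required_annual_profit(capex, window, rate):,.0f} per year")
```

Even in this toy version, the required figure dwarfs today's entire AI revenue, which is the commenter's point.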
Doesn't that require local hardware to trail a few years behind as well? I don't see consumer hardware being on the same level as a cluster of A100s for a very long time - mostly due to form factor. No one wants a laptop that's two inches thick and liquid cooled. :)
That’s true, but models have been getting more efficient over time as well.
I find the smaller Qwen models and waiting 30 seconds are actually quite tolerable.
You can't train new models on consumer hardware.
Right, so the consumer use of LLMs is really just covering for the commercial training of LLMs.
If training slowed down by 2/3rds would consumers be that much worse off?
$20/month for a timeshare on a high-end unit, with electricity costs included, is really not a bad deal.
Buying hardware that covers 90% of my use pattern isn't going to pay for itself for 5 or 10 years – and the subscription has the added benefit that I can change my setup every month.
I strongly believe we're in a bubble, but even just buying stocks with the money seems a better investment in my situation.
I agree, except you probably don't do this for other apps. There's some threshold where, if it runs locally, you just do. I have been having lots of fun with models that fit in 16GB of RAM, and my next machine will have 32GB just to be future-proof. Worst case, I get 12 extra Chrome tabs.
I saw a coworker using ChatGPT the other day. They were in marketing, trying to use a dashboard where one of the filters took a list of IDs separated by commas. I watched as she repeatedly copied a list of IDs from a spreadsheet into ChatGPT, asked for them to be comma-separated, then pasted the result into the filter.
There are loads of use cases like this that will most definitely be absorbed by local LLMs.
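For what it's worth, that particular task doesn't need an LLM at all; joining a spreadsheet column of IDs with commas is a deterministic one-liner. A sketch (the IDs are invented examples):

```python
# Turn a column of IDs pasted from a spreadsheet (one per line)
# into the comma-separated string the dashboard filter expects.
pasted = """1042
1043
1077
2001"""

ids = ",".join(line.strip() for line in pasted.splitlines() if line.strip())
print(ids)  # -> 1042,1043,1077,2001
```

Unlike the LLM route, this can't hallucinate or drop an entry, which matters once the list gets long.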
I'm convinced that most office work exists for social reasons, not practical ones. Just like every other office technology before it, AI is going to replace workflows like that (and this: https://xkcd.com/763/), but it won't have any net effect on the total number of office workers, because they weren't really needed in the first place; their numbers ebb and flow with cycles of elite overproduction, not productivity.
> I'm convinced that most office work exists for social reasons
The powers that be are not competent/powerful enough to engineer a society-wide conspiracy like that.
A lot of waste happens (especially in big organizations, and the bigger the organization, the more capacity for waste), and the more income is detached from production (e.g. governments get tax revenue regardless of how well they spend it), the less people care about efficiency.
But for the most part, people get hired because somebody thought it made business sense to hire them.
Scary. A person with any awareness of how LLMs work would not feel comfortable doing that. The model can hallucinate values or drop entries. It is just plain silly.
If it's 95% accurate on average for tasks like this, that's probably better than a marketer getting an Excel formula right on the first try.
Can we agree that 95% accurate for such tasks is really bad? I agree that we probably don't care all that much if marketing gets something wrong, but imagine the same applied to taxes or pensions.
The issue is that it's 95% x 95% x 95%, etc. Error rates compound.
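The compounding is easy to quantify: per-step accuracy multiplied across a workflow decays quickly.

```python
# How per-step accuracy compounds across a multi-step workflow.
per_step_accuracy = 0.95

for steps in (1, 5, 10, 20):
    end_to_end = per_step_accuracy ** steps
    print(f"{steps:2d} steps -> {end_to_end:.1%} end-to-end")
    # 20 steps -> 35.8% end-to-end
```

So a "95% accurate" assistant used twenty times in a row succeeds end-to-end barely a third of the time, which is why this is fine for a marketing filter and terrifying for taxes or pensions.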
I run AI locally and it's a really useful tool, but I'm aware that it'll always be constrained to working on data embedded in the model, data I have locally, or data I can reach via some sort of API integration. That makes it much less useful for a lot of tasks than a cloud-based tool that can give me insights into a pool of data I can't see into directly. AI integrated into a SaaS app will usually have more value than a local model, simply because specific, targeted information beats general information.
Claude and ChatGPT can both search the web for you now. It’s actually quite handy as they can search for updated data on topics if prompted.
Though I agree, and I believe personal LLM agents with access to our personal data would be much more effective. Perhaps we should give LLMs a few more years to mature, and safeguards time to be created, before letting one, say, gamble your house on the newest meme coin. ;)
I can get a 90 percent solution on my local machine. I like the privacy and the cost. I still struggle to see wide adoption of a paid-for service at mass scale.
When people talk about AGI, are they talking about LLMs that will be very good? What does achieving AGI mean?
OpenAI definition:
> a highly autonomous system that outperforms humans at most economically valuable work
Thanks, that clears things up
Ponzi, fraud and extraction? Checks notes. Coming closer, yepp.
A question the capital markets should be asking themselves.
As usual buzz words mean little to nothing.
The goalpost is always moving.
Definitions change to fit the narrative each one is trying to push at the time.
It's basically a Ponzi now. They simply profit on new investments. It's just that no investor realizes he's the last one until it's too late.
A bubble isn't a Ponzi. You can't just call every case of financial mismanagement a Ponzi.
Correct. Using new investor money to pay out old investors would make it a ponzi.
At this point, from what I can see, it's all money going in, nothing coming out yet. Hence: not a Ponzi.
It may be a bubble. Or it may be smart investing. At this point it could go either way.
> demand for tokens/AI inference capacity exceeds CURRENT supply
That's not at all obvious to me; costs as a consumer are going down, rather than up. Can someone steel-man this guy's argument for me?
"Demand" for inference seems infinite because you can always think harder or insert AI for free into some workflow where the customer didn't request it (e.g. Google search). It's like a gas that expands to fill the container. Then during the crash people are going to discover that some fraction of that demand wasn't needed and they'll just turn it off.
It would be a good sign if paid demand exceeds supply but I don't know if we can measure that.
There are only so many requests people want to process with LLMs.
Even though I'm a touch sceptical as to how good LLMs are at coding, imagine developing world demand for LLMs via a cheap Android handset for $1/month.
There are a lot of use cases on the information-retrieval side, especially for people whose educational level and mindset mean well-crafted search queries were never a thing for them.
But there's also the other side of the trade. How much are suppliers willing to pay for upranking? It's a dirty business model, Enshittification 3.0+, but when did that stop anyone?
Cost to consumers is being subsidized by investors trying to grab market share - penetration pricing.
I don’t think that’s true, given the wide range of inference providers on OpenRouter. These are mostly undifferentiated and commoditized, with no chance of gaining market share at the exclusion of others.