Original author here! Thanks for (re)sharing. We previously discussed this at length in https://news.ycombinator.com/item?id=36503983.
I think this article has "aged well" in the sense that... nothing has changed for the better :( Since I wrote it, I did upgrade my machine: I now have a 24-core 13th Gen i7 laptop with a fast NVMe drive and... well, Windows 11 is _still_ visibly laggy throughout. Comparing it to KDE on the same machine is like night and day in terms of general desktop snappiness (and yes, KDE has its own bloat too, but it seems to have evolved in a more "manageable" manner).
I've also gotten an M2 laptop for work since then, and same issue there: I remember how transformative the M1 felt at launch with everything being extremely quick in macOS, but the signs of bloat are _already_ showing up. Upgrading anything takes ages because every app is a monster that weighs hundreds of MBs, and reopening apps after a reboot is painfully slow. Still, though, macOS feels generally better than Windows on modern hardware.
About the article itself, I'll say that there was a complaint back then (and I see it here now too) about my blaming of .NET rewrites being misplaced. Yes, I'll concede that; I was too quick to write that, and likely wrong. But don't let that distract you from the rest of the article. Modern Notepad is inexplicably slower than older Notepad, and for what reason? (I honestly don't know and haven't researched it.)
And finally, I'll leave you with this other article that I wrote as a follow-up to that one, with a list of things that I feel developers just don't think about when writing software, and that inevitably leads to the issues we see industry-wide: https://jmmv.dev/2023/09/performance-is-not-big-o.html
I just got a second-hand M1 Air, upgrading from a 10 year old MacBook Pro… wow. The animations on the new macOS are slow as hell though. Any way to speed them up? On iOS it is a breeze (no jailbreak): https://cowabun.ga/ (Choose: CowabungaLite)
The only slow animation I noticed is the genie effect when minimizing, but you can switch to a zoom effect and it's fastish.
I'm not 100% sure, but I think the main reason the new Win11 apps like Notepad, File Explorer, and Task Manager have a slow UI and the piece-by-piece drawing issues is that they combine UWP (the "new" tech) with the old Win32 controls; they are not a from-scratch rewrite. There seems to be some big overhead in using the two different UI frameworks together.
There was an attempt to make a Chrome OS competitor that had the entire UI rewritten in UWP, and when that got canceled it seems MS saved the new start menu + taskbar and bolted them on top of Win10, and for Explorer and other apps made this UWP+Win32 abomination. Actually, when you profile Explorer you can also see DirectUI running, which I believe is yet another UI framework, from the Office org this time (maybe it's related to OneDrive).
Btw, apps written in C# using WPF can be surprisingly fast, so it's not really a problem of .NET managed vs. native apps. For example, my favorite Git GUI (git-fork.com) is C#/WPF.
The author mentions rewriting core applications in C# on windows but I don’t think this is the problem. Write a simple hello world app in c#, compile it and see how long it takes to run vs a rust app or a python script - it’s almost native. Unity is locked to a horrifically ancient version of mono and still manages to do a lot of work in a small period of time. (If we start talking JavaScript or python on the other hand…)
I agree with him though. I recently had a machine that I upgraded from Win10 to Win11 and it was like someone kneecapped it. I don’t know if it’s modern app frameworks, or the OS, but something has gone horribly wrong on macOS and Windows (iOS doesn’t suffer from this as much for whatever reason IME)
My gut instinct is an adjustment to everything being asynchronous, combined with development on 0 latency networks in isolated environments means that when you compound “wait for windows defender to scan, wait for the local telemetry service to respond, incrementally async load 500 icon or text files and have them run through all the same slowness” with frameworks that introduce latency, context switching, and are thin wrappers that spend most of our time FFI’ing things to native languages, and then deploy them in non perfect conditions you get the mess we’re in now.
> Unity is locked to a horrifically ancient version of mono and still manages to do a lot of work in a small period of time
Unity is the great battery killer!
The last example I remember is that I could play the first xcom remake (which had a native mac version) on battery for 3-4 hours, while I was lucky to get 2 hours for $random_unity_based_indie with less graphics.
In my defence, I never said it was efficient! Games always suck for battery because they’re constantly rendering.
> Games always suck for battery because they’re constantly rendering.
Only the badly designed ones.
They don't need to. Enable v-sync.
That’s still 60 times a second on my iPhone and 120 on my Samsung
Competently done games have vsync and also a fps limiter.
They also decouple game logic and input from drawing :)
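For what it's worth, a bare-bones sketch of that decoupling (names invented, not from any particular engine): the simulation ticks at a fixed rate while rendering is capped separately, so drawing fewer frames doesn't change the gameplay:

    // Hypothetical fixed-timestep loop: logic at 60 Hz, rendering capped at 30 fps.
    using System.Diagnostics;

    static class GameLoopSketch
    {
        const double SimStep = 1.0 / 60.0;    // fixed logic rate
        const double FrameStep = 1.0 / 30.0;  // minimum time between rendered frames

        public static void Run()
        {
            var clock = Stopwatch.StartNew();
            double last = 0, simDebt = 0, nextFrame = 0;

            while (true)
            {
                double now = clock.Elapsed.TotalSeconds;
                simDebt += now - last;
                last = now;

                // Catch the simulation up in fixed steps, independent of frame rate.
                while (simDebt >= SimStep) { Update(SimStep); simDebt -= SimStep; }

                // Draw only when a frame is actually due (a real loop would sleep here).
                if (now >= nextFrame) { Render(); nextFrame = now + FrameStep; }
            }
        }

        static void Update(double dt) { /* advance game state by dt seconds */ }
        static void Render() { /* draw the current state */ }
    }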
And then we have those that think "OMG BIGGER FPS" is better and burn your video card when they render the startup menu at 1480 fps...
I’m well aware how to make games, thanks, I’m the tech director on a AAA game with 15 years working on them.
I said “constantly” rendering and you and everyone seem to have taken the least charitable interpretation of what I said and decided I couldn’t possibly know what I’m talking about.
Well then, what are you talking about? Maybe my English isn't good enough.
> incrementally async load 500 icon or text files and have them run through all the same slowness”
This really shouldn't be slower when done asynchronously compared to synchronously. I would expect, actually, that it would be faster (all available cores get used).
> I would expect, actually, that it would be faster (all available cores get used).
And I think this assumption is what's killing us. Async != parallel for a start, and parallel IO is not guaranteed to be fast.
If you write a function:

    async Task<ImageFile> LoadFile(string path)
    {
        var f = await load_file(path);
        return new ImageFile(f);
    }

and someone comes along and makes it into a batch operation:

    async Task<List<ImageFile>> LoadFiles(List<string> paths)
    {
        var results = new List<ImageFile>();
        foreach (var path in paths)
        {
            var f = await load_file(path);
            results.Add(new ImageFile(f));
        }
        return results;
    }

and provides it with 2 files instead of 1, you won't notice it. Over time 2 becomes 10, and 10 becomes 500. You're now at the mercy of whatever is running your tasks. If you yield alongside await [0] in an event loop, you introduce a loop iteration of latency in proceeding, meaning you've now introduced 500 loops of latency.

In case you say "but that's bad code", well yes, it is. But it's also very common. When I was reading for this reply, I found this stackoverflow [0] post that has exactly this problem.
[0] https://stackoverflow.com/questions/5061761/is-it-possible-t...
Why would someone put an await inside a loop?
Don't get me wrong... I believe you have seen it, I just can't understand the thought process that led to that.
In my mind, await is used when you want to use the result, not when you store it or return it.
The same can be asked about any number of things - why would someone ever not free their memory, use a dynamic allocation, not use a lock, use an orm, not use an orm.
> I just can’t understand the thought process that led to that.
Bluntly, it’s not understanding the tools and fundamentals. Your original reply assumed that it would use all cores, for example. To me that’s obviously not true but I’m sure there’s 10 things you could list off that are obvious to you but I’d get wrong -
> Your original reply assumed that it would use all cores, for example.
Sure. My original reply assumed that there wouldn't be an unnecessary `await` inserted inside a loop.
Unless I'm misunderstanding the code you posted, removing the `await` in the loop would use all cores.
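A minimal sketch of what that change looks like, reusing the hypothetical LoadFile helper from the snippet above - starting every load before awaiting any of them lets the I/O overlap instead of paying one event-loop turn per file (whether it actually spreads across cores depends on the scheduler and on what each continuation does):

    // Hypothetical batched version: no await inside the loop.
    async Task<List<ImageFile>> LoadFilesBatched(List<string> paths)
    {
        // Kick off every load first; nothing is awaited yet, so they are all in flight.
        var tasks = paths.Select(path => LoadFile(path)).ToList();

        // Await them as a batch: total latency is roughly the slowest single load,
        // not the sum of all of them.
        ImageFile[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }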
well I mean you'd use await foreach and IAsyncEnumerable equivalent... async would mean the UI would not be blocked so I agree with the original commenter you replied to.
paraphrasing another reply I left here, if everyone just wrote their code correctly we’d all be writing C and everything would be safe and fast. I’ve fixed this exact issue with this exact cause (someone wrapped an await in a loop and it passed code review because our benchmark test still passed. But the benchmark test simulated the entire stack on a local network and when we deployed it, all of a sudden it took 5 seconds to load)
I agree that these are very simple cases, and it's like benchmarking compilation or execution of helloworld.cpp
It would be interesting to see something like:
- cold boot
- load Windows
- load Office
- open a 200 page document
- create pdf from said document
- open 5000 row spreadsheet
- do a mail merge
- open 150,000 record database
- generate some goofy report
Do people still do mail merge? Not sure.
It depends how the application is written in C#. A lot of Modern C# relies on IoC frameworks. These do some reflection shenanigans and this has a performance impact.
This is literally what I said:
> The author mentions rewriting core applications in C# on windows but I don’t think this is the problem. Write a simple hello world app in c#, compile it and see how long it takes to run vs a rust app or a python script - it’s almost native <...>
> My gut instinct is an adjustment to everything being asynchronous, combined <...> with frameworks that introduce latency, context switching, and are thin wrappers that spend most of our time FFI’ing things to native languages, and then deploy them in non perfect conditions you get the mess we’re in now.
I didn't really know what that second sentence meant tbh.
Specifically with C#, reflection can have a big effect on startup time. I have seen this with almost all versions of .NET.
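For a rough illustration of the kind of cost involved (made-up type name; absolute numbers vary a lot by .NET version and JIT warm-up), constructing objects through reflection the way many containers resolve services is noticeably slower than a direct constructor call:

    // Hypothetical micro-comparison: direct construction vs. reflection-based construction.
    using System;
    using System.Diagnostics;

    class GreetingService { }   // stand-in for a typical DI-registered service

    class Program
    {
        static void Main()
        {
            const int n = 100_000;

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < n; i++) _ = new GreetingService();
            Console.WriteLine($"direct new:               {sw.ElapsedMilliseconds} ms");

            sw.Restart();
            for (int i = 0; i < n; i++) _ = Activator.CreateInstance(typeof(GreetingService));
            Console.WriteLine($"Activator.CreateInstance: {sw.ElapsedMilliseconds} ms");
        }
    }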
Back then, programmers had to care about performance. The field of programming was less accessible, so the skill needed to clear the barrier to entry was higher. So people were, on average, better programmers. The commercial incentives of today to reach market with something half-assed and then never fix it don't help.
In 2002 I ran OpenBSD on my laptop (thus sacrificing wifi). The memory footprint of running X11, a browser, a terminal, and an editor: 28MB
Browsers are the big problem. Security and compatibility push upgrades to the latest, very heavy, ones.
There was plenty of software that ran like absolute garbage “back then”, but OSes didn’t.
More than CPU speed, I think the increase in storage and RAM is to blame for the slow decay in latency. When you have only a few KB/MB of RAM and storage, you can't really afford to add much more to the software than its core feature. Your binary needs to be small, which leads to faster loading into RAM, and it has to do less, which means fewer things to run before the actual program.
When size is not an issue, it's harder to say no when the business demands a telemetry system, an auto-update system, a crash handler with automatic reporting, and a bunch of features, a lot of which need to be initialized at the start of the program, introducing significant latency at startup.
It's also complexity - added more than necessary, and at a faster pace than hardware can keep up with.
Take font rendering: in early machines, fonts were small bitmaps (often 8x8 pixels, 1 bit/pixel), hardcoded in ROM. As screen resolutions grew (and varied between devices), OSes stored fonts in different sizes. Later: scalable fonts, chosen from a selection of styles / font families, rendered to sub-pixel accuracy, sub-pixel configuration adjustable to match hw construction of the display panel.
Yeah, this is very flexible & can produce good-looking fonts (if set up correctly), which scale nicely when zooming in or out.
But it also makes rendering each single character a lot more complex. And thus eats a lot more CPU, RAM & storage than an 8x8 fixed-size, 1bpp font.
Or the must-insert-network-request-everywhere BS. No, I don't need a search engine to start searching & providing suggestions after I've typed 1 character & haven't hit "search" yet.
There are many examples like the above, I won't elaborate.
Some of that complexity is necessary. Some of it isn't, but lightweight & very useful. But much of it is just a pile of unneeded crap of dubious usefulness (if any).
Imho, software development really should return to 1st principles. Start with a minimum viable product, that only has the absolute necessary functionality relevant to end-users. Don't even bother to include anything other than the absolute minimum. Optimise the heck out of that, and presto: v1.0 is done. Go from there.
> But it also makes rendering each single character a lot more complex.
Not millions of times more complex.
Except for some outliers that mess up everything (like anything from Microsoft), almost all of the increased latency between keypress and character rendering we see on modern computers comes from optimizing for modularity and generalization instead of specialized code for handling the keyboard.
Not even our hardware reacts fast enough to give you the latency computers had in the 90s.
I'm not really sure what you're talking about ;) HW has become much much much faster. Mostly in speed of computing, but latency also dropped nicely. The latency bloat you see is 99% software (OS). I still run Win2003 on a modern desktop, and it flies! Really, booting/shutdown is quick. I'm on spinning rust, so the first start of a web browser is a bit slow, but once cached, it's like 200ms-500ms depending on version (more modern = slower).
Take a look at the latency from a keypress on your modern keyboard to when your CPU first has a chance to process the data; you'll be surprised. Depending on how your hardware is configured, it can reach 100ms there alone.
Your mouse has an equivalent issue, except that it's not usually optimized to the same level, so the worst case is way more common. Audio has a much worse issue, and can lag a large fraction of a second on hardware alone. And there's the network, that has optional features that are emulated by adding lags proportional to the bandwidth.
All of our hardware has become more capable with time, but that doesn't mean latency has decreased. Some kinds of latency have gone down, others have gone way up.
Okay, yes.. It's a mix of HW/OS issues indeed. I remember my old gaming rig I assembled more than 10 years ago. Asus motherboard, i5-760 CPU, ATI HD 6850. Win2003 as the OS. All was tuned and it was really great. Basic DPC latency was around 30-40us. Under load it increased slightly but it was always <100us.
Then a catastrophic event occurred (which I didn't know at the time). After 10 years, the internal NIC burned out. I was like, okay.. I have dozens of PCI NICs, let's plug one in and voila. And I did. But there were problems: after a while (an hour or so) I noticed audio glitches, especially when there was network activity. After more investigation and reading the mobo manual I noticed that IRQs were nicely spread across all internal mobo components + the PCIe x16 bus. The other PCI ports were always shared with one another. I could do nothing to fix it.
The PC now catches dust. I bought a used HP 8200 PC which works nicely, but it's not a gaming rig; standard DPC latency is around 2000us, which is quite large.. Still, for normal use that latency is fine. I'm very sensitive to lag and latency, so if I had issues here I would be mad.
At the end, some pics from my DPC stall fight:
http://ds-1.ovh.uu3.net/~borg/pics/DPClat.png
http://ds-1.ovh.uu3.net/~borg/pics/DPC_stall.png
I'm not sure your font rendering is a very good example here. Windows has used vector fonts since the 90s and ClearType since Windows XP. That is nearly 25 years ago. And it wasn't really much of a performance issue even back then.
Correct. Modern font rendering likely falls into that "more complex, but lightweight / useful" category.
My point was it's much more complex even though it does essentially the same thing (output character to screen).
>> There are many examples like the above
Death by a thousand cuts! There are probably much worse offenders out there that deserve to be attacked.
> Notepad had been a native app until very recently, and it still opened pretty much instantaneously. With its rewrite as a UWP app, things went downhill. The before and after are apparent, and yet… the app continues to be as unfeatureful as it had always been. This is extra slowness for no user benefit.
We now have HUGE (/s) advancements in Notepad, like tabs and uh... Copilot
Don't forget dark mode!
I remember Windows 3.1 where I could change not only the color of the buttons, but the color of the light and shadow edge.
I think it's the antivirus and security protection that makes windows slow at opening apps. As for people saying the M series macs are slow, those are the first macs that felt fast to me in many years.
My take on it - performance decays when engineering management doesn’t prioritize it.
Modern example: Laptops boot in seconds. My servers take about 5 minutes to get to Linux boot, with long stretches of time taken by various subsystems, while Coreboot (designed to be fast) boots them nearly as quickly as a laptop.
Old example: early in my career we were developing a telecom system with a 5 min per year (5 9s) downtime target. The prototype took 30 minutes to boot, and engineers didn’t care because management hadn’t told them to make it boot faster. It drove me nuts. (a moot point, as it eventually got cancelled and we all got laid off)
I suspect (in a common pattern) the main thing that blocks making performance a priority is that it equates to reordering various ranks among developers and product managers.
When performance supersedes "more features", developers are gatekeepers and manager initiatives can be re-examined. The "solution" is to make performance a non-priority and paint complainers as stale and out-of-fashion.
Probably more common is that software isn't developed with end-users as #1 priority; it's developed to advance business goals.
Those 2 goals align about as often as planets in our solar system.
To some degree this is true for open source software as well. Developers may choose to work on features they find interesting (or projects done in a language they feel comfortable with), vs. looking at user experience first. Never mind that optimizing UX is hard (& fuzzy) as it is.
Or all the work done on libre software to cater to various corporate interests. As opposed to power users fixing & improving things for their needs.
Or you can have technical managers that understand what they are managing.
I think that the author is spot on about the cause of the problem: software developers (meaning the organizations that produce software, not necessarily the individuals writing code) prioritize selfish goals like ease of development or profits over the quality of the product. This has resulted in a lot of software being quite terrible. Not just slow (though definitely slow), but also buggy and crammed full of features that users hate (but which make the developer money). The only market that seems to reliably produce quality software any more is the open source community, because they are making the software that they themselves use and their incentives are aligned with users.
> software developers ... prioritize selfish goals like ease of development or profits over the quality of the product.
An unspoken assumption in this type of complaint is that the software would still have gotten written if more of it had to be written by hand. The number of projects I've seen beached on an outdated version of Spring because they neglected to upgrade even patch releases convinces me that more effort is just not tenable.
Are mobile devices slow/unresponsive? I haven't experienced that unless I realllllly cheap out. Or after 4 years of OS updates on Apple devices, for some reason. Androids seem OK in this regard.
I switched from android back to iOS last year. There seems to be some sort of inherent latency in either android or Samsung’s UI that causes the UI thread to lag behind your inputs by a noticeable amount, and for the UI thread to block app actions in many cases.
Things like summoning a keyboard causing my 120hz galaxy phone to drop to sub 10fps playing the intro animation for GBoard were just rampant. All non existent in iOS
I do wonder if part of it is down to Android default animation speeds.... Pixel 6 here, Gboard snappy enough. Something I do on every android device I own though is go into developer settings and change all the animation durations to 0.5x. Makes stuff seem snappier. In reality I'm sure it's dropping just as many frames as it async loads garbage enterprise uncompressed asset icons or whatever, but hey it shows up on screen 2x as fast!!!!
Edit: oh, no, you have a point about the UI blocking stuff. It's fine when apps are loaded and active, but "cold booting" a UI component definitely has lags in stupid places; android UX feels like a web form sometimes due to that.... Tap a button, go on holiday for a week, come back and it's responded to the button press (while you were trying to do something completely different, and now you've pressed something else and you're not sure what, because this time the button you pressed closed the activity overlay 1ms after)
I begin to wonder if all the commenters in this thread have compromised devices. I'm on a 5 year old Samsung (the model is that old; I bought it new six months ago) - my Linux machines are fast (Gentoo), and my Windows 10 and 11 machines are fast. My kid's computer is an i3 7350k and he plays Roblox, Minecraft, and Teardown on it with no issues. That computer is a couple years older than he is at 9-10 years old. That computer's twin is my NAS backup with 10GbE running Windows - the drive array refused to work at any decent speed on Linux and I didn't want JBOD, I wanted RAID.
Some things are slow, like discord on windows takes ~12 seconds before it starts to pop in ui elements after you double click. My main computer is beefy, though, 32 thread 128GB; but no NVMe. Sata spindle and SSDs. But I have Ryzen 3600s that run windows 11 fine.
Did you watch the videos linked in the article?
> I'm on a 5 year old Samsung (the model is that old; I bought it new six months ago)
How quick is the Share menu on your samsung? On mine, it takes about 3 seconds of repainting itself before it settles down. On iOS there's about a 100ms pause and the drawer pops up, fully populated. I found [0] which is a perfect example of this sort of bloat.
> My main computer is beefy, though, 32 thread 128GB; but no NVMe. Sata spindle and SSDs. But I have Ryzen 3600s that run windows 11 fine.
My main computer is a 24 core i9 with 64GB RAM on NVMe. It runs Windows fine. But I saw exactly the same behaviour out of the box on this machine (and on the machine I replaced) as the linked article shows. I can compile, play games, do AV transcoding. But using apps like Slack or Discord is like walking through molasses, and even launching lightweight apps like Windows Terminal and Notepad has a noticeable delay from button press to the window appearing on screen. There's just something a bit broken about it.
[0] https://www.androidpolice.com/2018/05/05/google-please-fix-a...
No, I was referring only to the comments here. The share menu comes up real fast, there's no "delay"; as soon as the menu with "share" goes away, the bottom slides up and the share panel appears, icons populated.
Is Slack also Electron? A cursory search says Discord is Electron. Now, Firefox lags to start, too - compared to Edge, which is nearly instant to "launch". Brave is also <1 second to launch. I don't use "notepad", but I just ran it, and while it's slower than the old notepad.exe, it's much faster than Notepad++ to come up - but I generally launch that once and leave it running, same with Firefox.
I don't use Windows Terminal, I use pwsh 7.5.x, and it's <1 second to be ready for input. The window comes up real fast, and then the text appears about that long again later. I launched cmd.exe and it came up perhaps a bit slower than pwsh, but not noticeably so.
This is what I am talking about: I don't notice anything slow on my computer, any of my computers, really. This leads me to believe that the people who do experience slowness have a compromised system or some other issue: a 5400 RPM data/boot drive, only using Electron apps (which are slow in general), or an otherwise misconfigured machine.
I haven't used Ubuntu desktop since they tampered with the system menu/start menu, so I have no idea what that is like now. The server can be snappy, though - you just have to configure networking and the like correctly. I'm not a fan of systemd; that colors my opinion of a lot of Linux systems. I use Gentoo and Devuan because they fully support OpenRC, which I consider vastly superior for my use cases. However, I do maintain a couple of Ubuntu-based OSes for neighbors on HP EliteDesk SFF computers, and they seem alright, 3-4x as fast as an rpi3/4 at Ubuntu desktop, as far as launch lag and boot times and the like.
I think to put this to bed, we need to establish a baseline or at least a list of applications to launch while screen recording, and then literally count frames. It would be funny if this was all perception and it wasn't actually "slow", that is, "seconds to load" is actually like 2 seconds, and not 12...
> I think to put this to bed, we need to establish a baseline or at least a list of applications to launch while screen recording, and then literally count frames. It would be funny if this was all perception and it wasn't actually "slow", that is, "seconds to load" is actually like 2 seconds, and not 12...
2 seconds is an eternity. It's 15 round trips from London to New York, 5-8GB read from an NVMe drive, and 7 billion CPU cycles per core - with even the cheapest devices on the market having multiple cores. There's not really any excuse for things being that slow.
Also, I think this idea that the only number that matters is the synthetic exact frame count from interaction to window for a specific list of operations is how we ended up in the scenario where Windows Explorer takes over a second to show a window on modern OSes, despite multiple orders of magnitude more computing power being available. Nobody really cares about what the actual performance of the running system is, as long as the exact part they're responsible for doesn't trip an alert.
A few years ago I worked on a project for a batch operation that was done a few hundred times a day. It took 15 minutes to run and it was starting to get in the way of things. Another team had hired 2 engineers to spin up kafka and a k8s cluster to queue these operations up, along with a reporting dashboard and metrics for queue length etc.
I ran the code through a profiler and found a hotspot on shutdown where we were searching through an array millions of times to remove a single entry, repeatedly, until the array was finished. I replaced it with a set lookup in reverse, and changed a handful of places where we rebuilt this massive array and did the same thing to use the same data structure, and it dropped to under 15 seconds with about a week's work. I identified enough other wins to bring it under a second, which would have negated the need for the entire other project to exist, but it had already been approved and budget allocated for it, so last I heard they're still taking 15 seconds and spending hundreds of thousands a month on this absolutely gargantuan monitoring system that they don't need.
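For what it's worth, a bare-bones sketch of that shape of fix (names invented): calling Remove on a big List is a linear search (plus a shift) per call, so draining it that way is quadratic, while building a HashSet once makes each membership check effectively constant-time:

    // Hypothetical before/after of the kind of hotspot described above.
    using System.Collections.Generic;
    using System.Linq;

    static class TeardownExample
    {
        // Quadratic: each Remove scans and shifts the list, once per finished item.
        public static void DrainSlow(List<string> pending, IEnumerable<string> finished)
        {
            foreach (var id in finished)
                pending.Remove(id);
        }

        // Roughly linear: one pass over the list, with O(1) lookups against a set.
        public static List<string> DrainFast(List<string> pending, IEnumerable<string> finished)
        {
            var done = new HashSet<string>(finished);
            return pending.Where(id => !done.Contains(id)).ToList();
        }
    }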
In both cases, IMO, a vision holder of “this isn’t good enough” using the product in their own device at home is what’s needed
My share is 1s. Didn't realise it was that slow. From the UI interaction I wonder if it is on purpose to avoid an accidental share.
FWIW I updated my phone to a relatively budget samsung recently, and had a similar noticeable delay to bring up the keyboard, installing 'simple keyboard' from F-droid seems to have helped. I wouldn't be surprised if it is missing features compared to the samsung/google ones where their absence will annoy a power user, but for whatever subset I use it works fine and doesn't appear as though my phone hangs.
But but ... Samsung has bigger numbers in the spec sheet! It must be faster!
I have a top-end Pixel phone running stock Android and encounter latency all the time. App first-start time is usually a couple of seconds. Switching back to an open app is fast in many cases, but some still have to do a refresh (which I suspect involves communication with servers, although that's still not much of an excuse).
Wasn't there some common Windows 10 bug a while back where Command Prompt would take forever to load because of DNS lookups or some crap?
These days you aren't just opening a 64k executable (notepad), you're calling back to the mothership, recording usage data, blah blah
So is there any hope for improvement?
Personally I've decided to just vote with my feet and avoid using poorly performing software as much as possible, but that's frequently impractical or not worth the cost of missing out. I also doubt this will change the behavior of companies; as we see with, for example, TV advertising, they give no shits about degrading the consumer experience over the long term.
There doesn't seem to be much hope on the technical side either, as software complexity is only increasing. Maybe longer term AI has a role to play in auto-optimization?
Too many developers relying on bloatware, not enough “implement it yourself”, because in reality every hash map solves a unique problem that a plain hash map is not necessarily suited for. Embrace NIH.
Similarly (linked in a footnote): http://danluu.com/input-lag/
I’ve recently noticed this on an especially well used app I have on my iPhone 14 with a stupid animation which regularly annoys me.
Google Authenticator’s filter box: when you tap it, there is a very noticeable delay between tapping the filter box and the keyboard showing.
And what makes it worse is that if you switch away from the app, it auto clears the filter.
This isn’t a complex app and it’s slow at doing a use case easily performed millions of times a day.
This is why I do not use any Google apps and deleted my Google account. Their shitty software pisses me off.
Every time I try to use it, the UI elements shift around as I try an action as simple as activating a search box. But it is just laggy enough that you click on something a second time, since it apparently didn’t activate the first time. But by then it’s shifted everything around and you are either cancelling what you were doing, or you are taken off into some new workflow you didn’t want.
The absolute WORST is when you focus the search box and it shifts to the top of the screen. With lag, you click on the search box again, but now where it was is a list of completion suggestions and you are taken off to some search result page you never asked for.
It’s fucking infuriating and I won’t entertain it.
Startup time has always been a bit of a sketchy metric as modern OSs and languages do a lot of processing on application launch. Some scan for viruses. On Macs you have checks for x86 vs Apple silicon and loading of Rosetta if required. Managed runtime environments have various JITs that get invoked. And apps are now huge webs of dependencies, so lots of dynamically linked code being loaded. A better metric is their performance once everything is in memory. That said, I still think we’re doing poorly at that metric as well. As resources have ballooned over the last decade, we’ve become lazy and we just don’t care about writing tight code.
Why not both?
The applications are launched at startup time because they have runtime startup slowness. The applications have startup slowness because of the JIT runtime/deps/.dll.
At the end of the day, end users pay the cost of developer convenience (JIT and deps most of the time, even though there are some cases where dynamic linking is alright) because they don't do native apps.
Offloading everything at startup is a symptom IMO.
Replying to your specific point about virus scans. For some (naive) reason, I expect them to run a single time against a binary app that is never changed. So in theory it shouldn't even be a problem, but the reality says otherwise.
> Replying to your specific point about virus scans. For some (naive) reason, I expect them to run a single time against a binary app that is never changed.
Playing devil's advocate: the executable might not have changed, but the database of known virus signatures changes daily or more often. Either every single executable would have to be re-scanned every time the database updates, or the executable has to be lazily scanned on load.
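A tiny sketch of why the naive "scan once per binary" cache breaks down (hypothetical names): a cached verdict is only reusable if both the file and the signature database it was scanned against are unchanged, which is why lazy re-scanning on load is the usual compromise:

    // Hypothetical cache entry for an on-access scanner.
    record ScanVerdict(string FileHash, int SignatureDbVersion, bool Clean);

    static class ScanCache
    {
        // Reuse the old verdict only if neither the binary nor the signature DB changed.
        public static bool CanReuse(ScanVerdict? cached, string fileHash, int dbVersion) =>
            cached is not null
            && cached.FileHash == fileHash
            && cached.SignatureDbVersion == dbVersion;
    }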
> How does this all happen? It’s easy to say “Bloat!”
Bloat!
It was pretty easy indeed.
"Makes me sick, motherfucker, how far we done fell." --Det. William 'Bunk' Moreland