There are so many extra steps; clearly the CPU is designed for a legacy monolithic OS like Windows, which makes syscalls relatively rarely, and it would perform poorly with microkernels, which are much safer and better than Windows.
For example, why bother saving userspace registers? Just zero them out to prevent leaks. Ideally with a single instruction.
On a secure system (not exposed to the Internet, with only trusted local users), you can add "mitigations=off" to the kernel command line to greatly improve performance.
https://fosspost.org/disable-cpu-mitigations-on-linux
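For what it's worth, here is a minimal C sketch (my own illustration, not from that page; it only assumes the standard Linux sysfs directory /sys/devices/system/cpu/vulnerabilities, whose exact entries vary by kernel and CPU) that prints the kernel's reported status for each known vulnerability. It is handy for confirming what mitigations=off actually disabled after a reboot:

    /* Print the kernel's reported mitigation status for each CPU vulnerability. */
    #include <stdio.h>
    #include <dirent.h>

    int main(void) {
        const char *dir = "/sys/devices/system/cpu/vulnerabilities";
        DIR *d = opendir(dir);
        if (!d) { perror(dir); return 1; }

        struct dirent *e;
        char path[512], line[256];
        while ((e = readdir(d)) != NULL) {
            if (e->d_name[0] == '.') continue;       /* skip "." and ".." */
            snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
            FILE *f = fopen(path, "r");
            if (!f) continue;
            if (fgets(line, sizeof line, f))
                printf("%-20s %s", e->d_name, line); /* sysfs line ends in '\n' */
            fclose(f);
        }
        closedir(d);
        return 0;
    }

With mitigations=off, most entries report "Vulnerable" instead of describing an active mitigation.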
The article quotes the Intel docs: "Instruction ordering: Instructions following a SYSCALL may be fetched from memory before earlier instructions complete execution, but they will not execute (even speculatively) until all instructions prior to the SYSCALL have completed execution (the later instructions may execute before data stored by the earlier instructions have become globally visible)."
More detail here would be great, especially using the terms "issue" and "commit" rather than execute.
A barrier makes sense to me, but preventing instructions from issuing seems like too strict a requirement; how could anyone tell the difference?
it might have more to do with the difficulty of separating out the contexts of the two execution streams across the rings. someone may have looked at the cost and complexity of all that accounting and said 'hell no'
Is it that difficult? Just add a "ring" bit to every instruction in the instruction queue? Sorry, I've never designed an OoO CPU before.
Yeah, I would probably say the same. It is a bit strange to document this as part of the architecture (rather than leaving it open as a potential future microarchitectural optimization). Is there some advantage for an OS in knowing that the CPU flushes the pipeline on each system call?
And given Intel’s numerous speculation-related vulnerabilities, it must have been quite a rare moment!!!
Linux used to deliver relatively low syscall overhead, especially on modern, aggressively speculating CPUs.
But after the Spectre and Meltdown mitigations landed, it felt like the 1990s all over again, when syscall overhead was a huge cost relative to the MIPS available.
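For a rough feel of that overhead, here is a minimal C sketch (my own illustration, not from the article) that times a tight loop of getpid syscalls; syscall(SYS_getpid) forces a real kernel entry on each iteration, and the average cost moves noticeably depending on whether mitigations are enabled:

    /* Measure the average syscall round-trip cost on Linux. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void) {
        const long iters = 10 * 1000 * 1000;
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < iters; i++)
            syscall(SYS_getpid);      /* real kernel entry every iteration */
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ns = (end.tv_sec - start.tv_sec) * 1e9
                  + (end.tv_nsec - start.tv_nsec);
        printf("avg syscall cost: %.1f ns\n", ns / iters);
        return 0;
    }

Running it with and without mitigations=off (or comparing against perf stat output) makes the regression the parent describes easy to see on your own hardware.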