Not exactly a bug, but I was given a company-written video player that receives a video stream, decodes it via the browser's WebCodecs API, and renders via WebGL. Users complained that video was laggy and often froze on their iPhones. My task was to make it perform better - using the browser's built-in player wasn't an option.
After profiling, I found two bottlenecks. First, converting frames to RGB was happening on the CPU and was quite costly, so I moved that step to the GPU and rendered the decoded YUV frames directly. Second, our heavy UI was competing with the player for main-thread time, so I moved all playback logic off the main thread.
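For the first fix, the conversion can live in a fragment shader. A minimal sketch, not the player's actual code, assuming I420 frames whose three planes are uploaded as single-channel textures (e.g. via VideoFrame.copyTo() into a buffer, then gl.texImage2D); the uniform names are mine:

```ts
// The YUV -> RGB matrix runs per pixel on the GPU, so the CPU never
// touches pixel data after decode.
const yuvFragmentShader = `
  precision mediump float;
  varying vec2 v_texCoord;
  uniform sampler2D u_planeY; // full-resolution luma
  uniform sampler2D u_planeU; // half-resolution chroma (I420)
  uniform sampler2D u_planeV;

  void main() {
    float y = texture2D(u_planeY, v_texCoord).r;
    float u = texture2D(u_planeU, v_texCoord).r - 0.5;
    float v = texture2D(u_planeV, v_texCoord).r - 0.5;
    // BT.601 full-range coefficients; a real stream may need BT.709
    // and limited-range scaling instead.
    gl_FragColor = vec4(
      y + 1.402 * v,
      y - 0.344 * u - 0.714 * v,
      y + 1.772 * u,
      1.0
    );
  }
`;
```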
The main-thread issue was this: the player iterated through the frame buffer many times per second to select the right frame to render. When heavy UI animations ran, the main thread would block and the iteration would finish late - by then the target frame's timestamp had already passed, so that frame got skipped and only the next one was drawn, which showed up as visible stuttering.
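The worker-side scheduler ends up looking roughly like this - a sketch under assumptions: decoded frames arrive in timestamp order from VideoDecoder.output, an OffscreenCanvas was transferred into the worker, and draw() stands in for the actual WebGL upload-and-draw path:

```ts
// Runs inside a dedicated Worker, so heavy main-thread UI work can no
// longer delay frame selection.
declare function draw(frame: VideoFrame): void; // hypothetical render helper

const frameQueue: VideoFrame[] = [];  // filled by the VideoDecoder output callback
let playbackStart = 0;                // performance.now() at playback start, ms
let baseTimestamp = 0;                // first frame's timestamp, microseconds

function tick(): void {
  const targetUs = baseTimestamp + (performance.now() - playbackStart) * 1000;

  // Take the newest frame whose timestamp is due, closing any frames we
  // skip so the decoder's frame pool isn't starved.
  let due: VideoFrame | null = null;
  while (frameQueue.length > 0 && frameQueue[0].timestamp <= targetUs) {
    due?.close();
    due = frameQueue.shift()!;
  }
  if (due) {
    draw(due);
    due.close();
  }
  // requestAnimationFrame works in workers that drive an OffscreenCanvas in
  // current Chromium/Firefox; fall back to setTimeout where it doesn't.
  requestAnimationFrame(tick);
}
```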
While building GTWY, we realized stack traces stop being useful once workflows go async. So we designed things around step-level visibility and shared context instead.
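Not affiliated with GTWY, but the general shape of step-level visibility is easy to sketch: wrap each step so its name and outcome get recorded against a shared context, and the "trace" becomes a list of steps instead of a stack. All names here are illustrative:

```ts
type Ctx = Record<string, unknown>;
interface StepRecord { name: string; startedAt: number; error?: string }

// Wraps an async step so every execution leaves a record on the shared
// context, even after the work hops across queues and loses its stack.
async function runStep<T>(
  ctx: Ctx & { trace: StepRecord[] },
  name: string,
  fn: (ctx: Ctx) => Promise<T>,
): Promise<T> {
  const record: StepRecord = { name, startedAt: Date.now() };
  ctx.trace.push(record);
  try {
    return await fn(ctx);
  } catch (err) {
    record.error = String(err); // causality survives in the trace, not the stack
    throw err;
  }
}
```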
Async stack traces are a nightmare. You lose the causality chain completely.
We ran into a similar issue with shared context. We tried to sync the context between an x86 server and an ARM edge node, but each machine recomputed derived floating-point values slightly differently, so the 'context' itself drifted apart between them.
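One mitigation (an assumption on my part, not necessarily what GTWY does) is to compute each derived float once and ship its exact IEEE-754 bit pattern, so the other architecture never recomputes the value:

```ts
// Encode/decode a float64 as its raw IEEE-754 bytes (hex). The same bits
// go in and come out on x86 and ARM alike; drift only appears when each
// node recomputes the value itself.
function encodeF64(value: number): string {
  const buf = new ArrayBuffer(8);
  new DataView(buf).setFloat64(0, value, false); // fixed big-endian layout
  return Array.from(new Uint8Array(buf))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

function decodeF64(hex: string): number {
  const bytes = hex.match(/../g)!.map((h) => parseInt(h, 16));
  return new DataView(new Uint8Array(bytes).buffer).getFloat64(0, false);
}

// Round-trips bit-exactly:
// decodeF64(encodeF64(0.1 + 0.2)) === 0.1 + 0.2  // true
```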
Step-level visibility is great, but did you have to implement any strict serialization for that shared context to keep it consistent?