OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computers

(os-world.github.io)

77 points | by kristianpaul 16 days ago

39 comments