Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering

(arxiv.org)

1 point | by popey 10 hours ago

1 comment