ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases

(arxiv.org)

2 points | by BalinKing 8 hours ago

No comments yet.