Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models

(arxiv.org)

69 points | by mfiguiere 4 days ago

9 comments