Tree Search Distillation for Language Models Using PPO

(ayushtambde.com)

71 points | by at2005 13 hours ago

7 comments