Show HN: Llama 3.2 Interpretability with Sparse Autoencoders

(github.com)

575 points | by PaulPauls 7 days ago

97 comments