TransMLA: Multi-head latent attention is all you need

(arxiv.org)

123 points | by ocean_moist 5 days ago

37 comments