Nvidia releases 8B model with learned 8x KV cache compression

(huggingface.co)

7 points | by alecco 18 hours ago

3 comments