Show HN: Speeding up LLM inference 2x (possibly)

(asciinema.org)

419 points | by kolinko 15 days ago

122 comments