FairyFuse: Multiplication-Free LLM Inference on CPUs via Fused Ternary Kernels

(arxiv.org)

24 points | by PaulHoule 2 days ago

1 comment