vLLM large scale serving: DeepSeek 2.2k tok/s/H200 with wide-EP

(blog.vllm.ai)

146 points | by robertnishihara 3 days ago

60 comments