10 points | by i386 9 hours ago
3 comments
You lost me on "spare GPU". I don't have any capable GPUs, let alone spare ones :)
This is very promising, definitely looks more user friendly than exo. Can't wait to try it out.
> MoE models via expert sharding with zero cross-node inference traffic
This makes the whole project questionable