We’ve just released Neuphonic TTS Air, a lightweight open-source speech foundation model under Apache 2.0.
The main idea: frontier-quality text-to-speech, but small enough to run in realtime on CPU. No GPUs, no cloud APIs, no rate limits.
Why we built this:
- Most speech models today live behind paid APIs → privacy tradeoffs, recurring costs, and external dependencies.
- With Air, you get full control, privacy, and zero marginal cost.
- It enables new use cases where running speech models on-device matters (edge compute, accessibility tools, offline apps).
Repo: https://github.com/neuphonic/neutts-air
Would love feedback from HN on performance, applications, and contributions.
So basically kokoro, but VC backed? Both models use espeak, so it seems like it’s the same general approach.
Demo sounds great.
Really cool. All my experiments with local speech have been spotty. Keen to try this out.
Can it run on a GPU? Would it be faster?
How does this compare to Piper?
Appears to use a proprietary codec as well.
Piper is a VAE model, which is quite robotic. This is a speech language model, which sounds quite realistic.
You can listen to the model in this video: https://www.youtube.com/watch?v=YAB3hCtu5wE
The codec is open source: https://huggingface.co/neuphonic/neucodec
Then may I suggest that this should be edited?
> Audio Codec: NeuCodec - our proprietary neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook
( https://huggingface.co/neuphonic/neutts-air#model-details )
> The codec is open source: https://huggingface.co/neuphonic/neucodec
This says it was trained on proprietary data.