VoxCPM2 is an open-source text-to-speech system that skips discrete audio tokenization and instead generates continuous speech representations with a diffusion-based autoregressive architecture. Built on the MiniCPM model family, it produces natural, expressive, context-aware speech, inferring tone, emotion, and pacing directly from the input text. It is trained on large multilingual datasets and supports dozens of languages and dialects while maintaining high audio fidelity. VoxCPM2 can clone a voice from a short reference sample, capturing not only the speaker’s timbre but also rhythm, accent, and emotional delivery. It also offers voice design: users can generate entirely new voices from natural language descriptions, with no reference audio required.
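To make the idea of tokenizer-free, diffusion-based autoregressive generation more concrete, here is a minimal toy sketch in plain Python. It is not VoxCPM2's code or API. Every name in it (`toy_denoiser`, `sample_frame`, `generate`) is hypothetical, and the "denoiser" is a fixed formula standing in for a learned network. It only illustrates the general pattern: each continuous frame starts as noise, is refined over a few denoising steps, and then conditions the next frame.

```python
import random


def toy_denoiser(noisy, context, text_cond):
    """Hypothetical stand-in for a learned network that predicts a clean frame.

    A real model would also use `noisy`; this toy ignores it and returns a
    fixed blend of the previous frame and the text conditioning vector.
    """
    return [0.5 * c + 0.5 * t for c, t in zip(context, text_cond)]


def sample_frame(context, text_cond, steps, rng):
    """Produce one continuous latent frame by iterative denoising."""
    # Start from Gaussian noise instead of choosing a discrete token.
    x = [rng.gauss(0.0, 1.0) for _ in context]
    for i in range(steps):
        target = toy_denoiser(x, context, text_cond)
        # Move a fraction of the remaining distance toward the prediction;
        # on the last step the fraction is 1.0, so x reaches the prediction.
        frac = 1.0 / (steps - i)
        x = [xi + frac * (ti - xi) for xi, ti in zip(x, target)]
    return x


def generate(text_cond, num_frames, steps=8, seed=0):
    """Autoregressive loop: each new frame is conditioned on the previous one."""
    rng = random.Random(seed)
    prev = [0.0] * len(text_cond)
    frames = []
    for _ in range(num_frames):
        frame = sample_frame(prev, text_cond, steps, rng)
        frames.append(frame)
        prev = frame  # feed the continuous frame back in, no quantization
    return frames


if __name__ == "__main__":
    for frame in generate([1.0, -2.0], num_frames=3):
        print([round(v, 3) for v in frame])
```

The point of the sketch is the control flow: an outer loop over frames (autoregression) and an inner loop over denoising steps (diffusion), with real-valued vectors passed between them rather than token IDs.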

Features

  • Tokenizer-free speech generation using diffusion-based autoregressive modeling
  • Multilingual support across dozens of languages without explicit tagging
  • High-quality voice cloning from short reference audio samples
  • Voice design from natural language descriptions without audio input
  • Real-time streaming synthesis with low latency
  • Studio-quality audio output with built-in super-resolution

Categories

Text to Speech

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software

Registered

10 hours ago