Official MiniMax Model Context Protocol (MCP) server
Open-source framework for intelligent speech interaction
Framework for building real-time voice and multimodal AI agents
Open-source multi-speaker long-form text-to-speech model
A nearly live implementation of OpenAI's Whisper
MARS5 speech model (TTS) from CAMB.AI
Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
Audio foundation model excelling in audio understanding
A simple native web interface that uses ChatTTS to synthesize text
Controllable and fast Text-to-Speech for over 7000 languages
Qwen3-ASR is an open-source series of ASR models
Real-time voice interactive digital human
EPUB to audiobook converter, optimized for Audiobookshelf
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Persian NLP Toolkit
AI-powered tool for generating, optimizing, and translating subtitles
Synchronized Translation for Videos
Clone a voice in 5 seconds to generate arbitrary speech in real-time
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
SOTA discrete acoustic codec models with 40/75 tokens per second
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
A TTS model capable of generating ultra-realistic dialogue
Multi-lingual large voice generation model, providing inference
Underthesea - Vietnamese NLP Toolkit
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX