Framework for building real-time voice and multimodal AI agents
One-stop AI digital human system with video and voice synthesis tools
Tokenizer-Free TTS for Multilingual Speech Generation
Open source AI VTuber platform with voice chat and Live2D avatars
Industrial-level controllable zero-shot text-to-speech system
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Controllable & emotion-expressive zero-shot TTS
High-Quality Voice Cloning TTS for 600+ Languages
A high-quality rapid TTS voice cloning model
Qwen3-TTS is an open-source series of TTS models
Curated collection of Amazing Python scripts
Multi-lingual large voice generation model, providing inference
In-app assistant SDK for building multimodal conversational UX on websites
Long-form streaming TTS system for multi-speaker dialogue generation
Large Audio Language Model built for natural interactions
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
Open speech-to-speech models and pipelines by Hugging Face
TTS with Kokoro and ONNX Runtime
Production-ready toolkit to run AI locally
Real-time, voice-interactive digital human
A robust, efficient, low-latency speech-to-text library
Open Source Speech Language Model
Component library and custom registry built on top of shadcn/ui
AI tool for automatic batch short video creation and editing
A lightweight text-to-speech model with zero-shot voice cloning