Framework for building real-time voice and multimodal AI agents
One-stop AI digital human system with video and voice synthesis tools
Open source AI VTuber platform with voice chat and Live2D avatars
Tokenizer-Free TTS for Multilingual Speech Generation
Industrial-level controllable zero-shot text-to-speech system
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Controllable & emotion-expressive zero-shot TTS
A high-quality rapid TTS voice cloning model
High-Quality Voice Cloning TTS for 600+ Languages
Qwen3-TTS is an open-source series of TTS models
Curated collection of Amazing Python scripts
In-App assistant SDK to build multimodal conversational UX for websites
Large Audio Language Model built for natural interactions
Multi-lingual large voice generation model, providing inference
Long-form streaming TTS system for multi-speaker dialogue generation
Open speech-to-speech models and pipelines by Hugging Face
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
TTS with kokoro and onnx runtime
Production ready toolkit to run AI locally
Real-time, voice-interactive digital human
A robust, efficient, low-latency speech-to-text library
Open Source Speech Language Model
AI tool for automatic batch short video creation and editing
Component library and custom registry built on top of shadcn/ui
A lightweight text-to-speech model with zero-shot voice cloning