Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
A lightweight text-to-speech model with zero-shot voice cloning
On-device Speech-to-Intent engine powered by deep learning
World's first open-source, agentic video production system
Open source machine learning framework to automate text conversations
Foundational model for human-like, expressive TTS
Free, high-quality text-to-speech API endpoint to replace OpenAI
C++ inference library for multiple SVC/TTS models
Bailing is a voice dialogue robot similar to GPT-4o
Open-source abilities for OpenHome agents
AI framework for automated short video creation and editing tools
The most powerful local music generation model
From Images to High-Fidelity 3D Assets
Open source personal AI Assistant for Linux, Windows and Mac
Open-source model for program synthesis
Use Microsoft Edge's online text-to-speech service from Python
Fast multimodal LLM for real-time voice interaction and AI apps
Multi-modal large language model designed for audio understanding
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Automatic Speech Recognition with Word-level Timestamps
Towards Human-Sounding Speech
Chat with it via text and voice
Open-source framework for conversational voice AI agents
Wan2.1: Open and Advanced Large-Scale Video Generative Model
A Model Context Protocol Server for Home Assistant