Generate audiobooks from e-books, with voice cloning and support for 1107+ languages
A lightweight text-to-speech model with zero-shot voice cloning
Foundational model for human-like, expressive TTS
AI tool for automatic batch short video creation and editing
On-device Speech-to-Intent engine powered by deep learning
Open source machine learning framework to automate text conversations
World's first open-source, agentic video production system
Automagically synchronize subtitles with video
Free, high-quality text-to-speech API endpoint to replace OpenAI
C++ inference library for multiple SVC/TTS models
Bailing is a voice dialogue robot similar to GPT-4o
Open-source abilities for OpenHome agents
PersonaPlex code
AI framework for automated short video creation and editing tools
Open source personal AI Assistant for Linux, Windows and Mac
Free open source speech synthesizer for Russian and other languages
Use Microsoft Edge's online text-to-speech service from Python
Open-source model for program synthesis
The most powerful local music generation model
From Images to High-Fidelity 3D Assets
Fast multimodal LLM for real-time voice interaction and AI apps
Berkeley Quantum Synthesis Toolkit
Multi-modal large language model designed for audio understanding
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Automatic Speech Recognition with Word-level Timestamps