Python library and CLI tool to interface with Google Translate
Industrial-level controllable zero-shot text-to-speech system
PersonaPlex code
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
TTS with kokoro and onnx runtime
A lightweight text-to-speech model with zero-shot voice cloning
An Open Source text-to-speech system built by inverting Whisper
End-to-end speech processing toolkit
High-Quality Voice Cloning TTS for 600+ Languages
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Free, high-quality text-to-speech API endpoint to replace OpenAI
Framework for building real-time multimodal voice AI agent apps
Capable of understanding text, audio, vision, video
Converts text to speech in real time
Translates a video from one language to another and embeds dubbing
A simple, high-quality voice conversion tool focused on ease of use
Voice Recognition to Text Tool
A fast TTS architecture with conditional flow matching
Toolkit for conversational AI
Fast multimodal LLM for real-time voice interaction and AI apps
Controllable & emotion-expressive zero-shot TTS
Offline inference engine for art, real-time voice conversations
Towards Human-Sounding Speech
Faster Whisper transcription with CTranslate2
State-of-the-art TTS model under 25MB