Open speech-to-speech models and pipelines by Hugging Face
Speech-to-text, text-to-speech, and speaker recognition
Multilingual speech recognition and audio understanding model
Speech recognition module for Python
Open-source industrial-grade ASR models
Captcha solver extension for humans
A free, open source, and extensible speech-to-text application
Speech recognition for your site
Cross-platform AI language practice app
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Replace OpenAI GPT with another LLM in your app
Speech to Text to Speech, sends text as OSC messages
Omnilingual ASR: open-source multilingual speech recognition
AzioSpeech Recognition and Translation
Real-time voice interactive digital human
Open source AI VTuber platform with voice chat and Live2D avatars
Build your own AI friend
AI-powered tool for generating, optimizing, and translating subtitles
The media player for language learning, with dual subtitles
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Framework for building neural networks
A web UI for easy subtitle generation using the Whisper model
Python Audio Analysis Library: Feature Extraction, Classification
Build voice-based LLM agents. Modular + open source