Open speech-to-speech models and pipelines by Hugging Face
Speech-to-text, text-to-speech, and speaker recognition
Multilingual speech recognition and audio understanding model
Speech recognition module for Python
Open-source industrial-grade ASR models
Captcha solver extension for humans
A free, open source, and extensible speech-to-text application
Speech recognition for your site
Cross-platform AI language practice app
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Replace OpenAI GPT with another LLM in your app
Speech to Text to Speech, sends text as OSC messages
Translate a video from one language to another and embed dubbing
Open source AI VTuber platform with voice chat and Live2D avatars
Real-time voice interactive digital human
Build your own AI friend
The media player for language learning, with dual subtitles
AI-powered tool for generating, optimizing, and translating subtitles
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
Framework for building neural networks
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
A Web UI for easy subtitle generation using the Whisper model
Python Audio Analysis Library: Feature Extraction, Classification
Build voice-based LLM agents. Modular + open source
Models for the spaCy Natural Language Processing (NLP) library