ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Training data (data labeling, annotation, workflow) for all data types
An opinionated CLI to transcribe audio files with Whisper on-device
Omnilingual ASR: Open-Source Multilingual Speech Recognition
Generate audiobooks from e-books
Multi-modal large language model designed for audio understanding
Easy-to-use Speech Toolkit including Self-Supervised Learning model
NLP Cloud serves high performance pre-trained or custom models for NER
Han Language Processing
Management of Yandex Station and other smart home devices
A Web UI for easy subtitle generation using the Whisper model
Large Audio Language Model built for natural interactions
Chat with it via text and voice
Virtual AI anchor that combines state-of-the-art technology
Official Python inference and LoRA trainer package
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Powerful Android AI agent with tools, automation, and Linux shell
Flowly is 100x faster than OpenClaw
Python Audio Analysis Library: Feature Extraction, Classification
FAIR Sequence Modeling Toolkit 2
Towards Studio-Grade Character Animation via In-Context Learning of 3D
High-quality multi-lingual text-to-speech library by MyShell.ai
Framework for building AI-powered interactive digital humans and agent
Pre-trained Deep Learning models and demos
Trained models & code to predict toxic comments