Build and run agents you can see, understand and trust
A generative speech model for daily dialogue
Framework for building real-time multimodal voice AI agent apps
Flowly is 100x faster than OpenClaw
AI Slack bot for reading, summarizing, and chatting with content
MARS5 speech model (TTS) from CAMB.AI
Run a full local LLM stack with one command using Docker
Generate audiobooks from e-books
PyTorch3D is FAIR's library of reusable components for deep learning
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A specialized Claude Code workspace for creating long-form
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Framework for building AI-powered interactive digital humans and agents
SoTA open-source TTS
Machine learning on FPGAs using HLS
SDG is a specialized framework
Open Source Deep Research Alternative to Reason and Search
Management of Yandex Station and other smart home devices
Voice Recognition to Text Tool
Fully Local Manus AI. No APIs, No $200 monthly bills
Multilingual speech recognition and audio understanding model
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
A nearly-live implementation of OpenAI's Whisper
A fast TTS architecture with conditional flow matching