From Images to High-Fidelity 3D Assets
Fast multimodal LLM for real-time voice interaction and AI apps
Open-source personal AI assistant for Linux, Windows, and Mac
The most powerful local music generation model
Open-source model for program synthesis
Use Microsoft Edge's online text-to-speech service from Python
Open-source AI wearable platform for recording and summarizing speech
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
AI framework for automated short video creation and editing tools
Multi-modal large language model designed for audio understanding
High-quality multi-lingual text-to-speech library by MyShell.ai
Chat with it via text and voice
Automatic Speech Recognition with Word-level Timestamps
Wan2.1: Open and Advanced Large-Scale Video Generative Model
One-click deployment (including offline integration package)
Large language model for livestream sales hosts
Towards Human-Sounding Speech
Software that uses AI to perform real-time voice conversion
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Virtual AI anchor that combines state-of-the-art technology
Interface for OuteTTS models
Robust Speech Recognition via Large-Scale Weak Supervision
Offline inference engine for art, real-time voice conversations
ChemCrow
Official MiniMax Model Context Protocol (MCP) server