A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Easy-to-use Speech Toolkit including Self-Supervised Learning model
A Python library for audio
Machine learning on FPGAs using HLS
Open Source Deep Research Alternative to Reason and Search
Voice Recognition to Text Tool
Generate audiobooks from e-books
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Generate high-definition story short videos with one click using AI
DeepCode: Open Agentic Coding
A Web UI for easy subtitle generation using the Whisper model
Synchronized Translation for Videos
Multilingual speech recognition and audio understanding model
An open source python library for automated feature engineering
Framework for building AI-powered interactive digital humans and agents
High-Fidelity and Controllable Generation of Textured 3D Assets
A nearly-live implementation of OpenAI's Whisper
Fully Local Manus AI. No APIs, No $200 monthly bills
A natural language interface for computers
Convert AI papers to GUI
Capable of understanding text, audio, vision, and video
Chat & pretrained large audio language model proposed by Alibaba Cloud
ComfyUI wrapper nodes for HunyuanVideo
LLM Council works together to answer your hardest questions
Aider is AI pair programming in your terminal