Repo of Qwen2-Audio chat & pretrained large audio language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Python Audio Analysis Library: Feature Extraction, Classification
AudioMuse-AI is an open-source Dockerized environment
Fast multimodal LLM for real-time voice interaction and AI apps
Get your documents ready for gen AI
Large Multimodal Models for Video Understanding and Editing
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Private chat with local GPT with documents, images, video, etc.
Toolkit for audio, music, and speech generation
SPPAS - the automatic annotation and analyses of speech
An extremely simple tool for separating vocals and background music
Task of transcribing piano recordings into MIDI files
General Speech Restoration
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Recommends music based upon your current taste