Repo for the Qwen2-Audio chat & pretrained large audio language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Python Audio Analysis Library: Feature Extraction, Classification
A library for audio and music analysis, feature extraction
AudioMuse-AI is an open-source Dockerized environment
Audio plugin for audio-to-MIDI transcription using deep learning
Fast multimodal LLM for real-time voice interaction and AI apps
Get your documents ready for gen AI
Cross-platform, customizable ML solutions
A suite of advanced multi-modal LLMs
Large Multimodal Models for Video Understanding and Editing
A GPT-4o-level MLLM for Vision, Speech and Multimodal Live Streaming
Private chat with local GPT with documents, images, video, etc.
Toolkit for audio, music, and speech generation
SPPAS - the automatic annotation and analyses of speech
Local AI file organization with categorization and rename suggestions
An extremely simple tool for separating vocals and background music
Visual AI Workflow Builder
A library for audio and music analysis, feature extraction
Common Resource Grep
Task of transcribing piano recordings into MIDI files
General Speech Restoration
IPTV/NVR/CCTV/Video cloud https://fastocloud.com