Repo of Qwen2-Audio chat & pretrained large audio language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Python Audio Analysis Library: Feature Extraction, Classification
AudioMuse-AI is an open-source Dockerized environment
Audio Plugin for Audio to MIDI transcription using deep learning
A library for audio and music analysis, feature extraction
Fast multimodal LLM for real-time voice interaction and AI apps
Cross-platform, customizable ML solutions
Get your documents ready for gen AI
A suite of advanced multi-modal LLMs
Large Multimodal Models for Video Understanding and Editing
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Private chat with local GPT with documents, images, video, etc.
Toolkit for audio, music, and speech generation
SPPAS - the automatic annotation and analysis of speech
Local AI file organization with categorization and rename suggestions
An AI assistant for everyone, powered by the Qwen series models
An extremely simple tool for separating vocals and background music
Visual AI Workflow Builder
AI macOS app for real-time coding interview coaching assistance
A library for audio and music analysis, feature extraction
Common Resource Grep
Task of transcribing piano recordings into MIDI files