Search Results for "audio source separation"

Showing 3543 open source projects for "audio source separation"

  • 1
    Ultimate Vocal Remover (UVR5)

    GUI for a Vocal Remover that uses Deep Neural Networks

    This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).
    Downloads: 788 This Week
  • 2
    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...
    Downloads: 16 This Week
  • 3
    OpenVINO AI Plugins for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity

    A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC, no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.
    Downloads: 163 This Week
  • 4
    Librosa

    Python library for audio and music analysis

    Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 13 This Week
  • 5
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...
    Downloads: 1 This Week
  • 6
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice-only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 0 This Week
  • 7
    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks, and achieves strong performance across multiple benchmarks...
    Downloads: 0 This Week
  • 8
    audioFlux

    A library for audio and music analysis, feature extraction

    ...It can be provided to deep learning networks for training and is used to study various tasks in the audio field such as classification, separation, Music Information Retrieval (MIR), ASR, etc.
    Downloads: 7 This Week
  • 9
    Audio Priority Bar

    A native macOS menu bar app for managing audio device priorities

    Audio Priority Bar is a lightweight macOS utility that gives users precise control over how audio output is prioritized across different apps and devices, filling a gap in the system audio stack that Apple doesn’t natively expose. Once installed, it places an always-accessible control in the menu bar that lets you assign priority levels to individual audio sources so that more important sounds (like alerts, calls, or music) can override or duck less important ones (like background noise or...
    Downloads: 4 This Week
  • 10
    Fun Audio Chat

    Large Audio Language Model built for natural interactions

    Fun Audio Chat is an interactive voice-first conversational AI platform designed to let users engage in natural spoken dialogue with large language models in real time, turning speech into context-aware responses while maintaining a smooth back-and-forth experience. It combines speech recognition, audio processing, and AI generation so users can speak simply and receive spoken replies, enabling applications such as virtual assistants, voice bots, and hands-free chat interfaces. The system...
    Downloads: 1 This Week
  • 11
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It can be installed with one click. It creates a virtual environment using Miniconda that runs completely separate from the Windows system (fully portable). Supports real-time transcription and translation, as well as batch mode.
    Downloads: 36 This Week
  • 12
    Whisper-WebUI

    A Web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.
    Downloads: 23 This Week
  • 13
    OBS Studio

    Open source software for live streaming and recording

    OBS Studio, also known as Open Broadcaster Software, is a free and open source software program for live streaming and video recording. Features of the software include device/source capture, recording, encoding and broadcasting. Stream on Windows, Mac or Linux. This software is commonly used by video game streamers on the popular streaming platform Twitch.
    Downloads: 269 This Week
  • 14
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...
    Downloads: 8 This Week
  • 15
    LosslessCut

    The swiss army knife of lossless video/audio editing

    LosslessCut aims to be the ultimate cross platform FFmpeg GUI for extremely fast and lossless operations on video, audio, subtitle and other related media files. The main feature is lossless trimming and cutting of video and audio files, which is great for saving space by rough-cutting your large video files taken from a video camera, GoPro, drone, etc. It lets you quickly extract the good parts from your videos and discard many gigabytes of data without doing a slow re-encode and thereby...
    Downloads: 211 This Week
  • 16
    BlackHole

    BlackHole is a modern macOS audio loopback driver

    ...The driver integrates directly with macOS Core Audio and appears in Audio MIDI Setup and supported audio applications. Designed with performance and stability in mind, BlackHole works on both Intel and Apple Silicon Macs without requiring kernel extensions or system security modifications. As an open-source project, it offers transparency, customization options, and active community-driven development.
    Downloads: 100 This Week
  • 17
    eqMac

    macOS System-wide audio equalizer & volume mixer

    System audio equalizer for macOS. Professional-grade parametric EQ and volume mixer. If you feel like your audio device (headphones or speaker) does not have enough bass (low-frequency) punch, or vice versa, you can adjust that using eqMac. macOS does not have a direct way to access the system audio stream, so we use the eqMac audio driver to divert the system audio to the driver's input stream. Then eqMac captures that input audio stream, processes it, and sends it directly to the output...
    Downloads: 83 This Week
  • 18
    Cider App

    A new cross-platform Apple Music experience based on Electron and Vue

    An open-source, community-oriented Apple Music client for Windows, Linux, macOS, and more. Whether it be Discord, Last.fm, or even equalizers, we've got you covered. Discord & Last.fm Integration: quickly share and show others what you're listening to, right out of the box. Audio Enhancements: Audio Spatialization, Adrenaline Processor™, and Equalizers are all available and actively engineered by our Audio Engineer, Maikiwi.
    Downloads: 105 This Week
  • 19
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...
    Downloads: 79 This Week
  • 20
    TheCodingMachine React Native

    A React Native template for building solid applications

    Simple, lightweight, and scalable. Explore the optimal React Native boilerplate for your project, featuring a straightforward architecture founded on the principle of Separation of Concerns. Join our vibrant community and watch it flourish. This boilerplate offers a robust foundation for developing cross-platform mobile applications using React Native. It emphasizes a clear separation between the user interface and business logic, promoting maintainability and scalability. Developers can...
    Downloads: 6 This Week
  • 21
    Snapcast

    Synchronous multiroom audio player

    Snapcast is a multiroom client-server audio player, where all clients are time synchronized with the server to play perfectly synced audio. It's not a standalone player, but an extension that turns your existing audio player into a Sonos-like multiroom solution. Audio is captured by the server and routed to the connected clients. Several players can feed audio to the server in parallel and clients can be grouped to play the same audio stream. One of the most generic ways to use Snapcast is...
    Downloads: 75 This Week
  • 22
    Strawberry Music Player

    Strawberry is a cross-platform music player and music collection organizer. It is aimed at music collectors and audiophiles. With Strawberry you can play and manage your digital music collection, or stream your favorite radio stations. Strawberry is free software released under GPL. The...
    Downloads: 152 This Week
  • 23
    Navidrome

    Your Personal Streaming Service

    Navidrome is an open-source, web-based personal music server that lets you stream and manage your entire music collection from any browser or compatible mobile app, effectively turning your own files into a cloud-accessible music service. It supports large libraries and handles a wide variety of audio formats while maintaining very low resource usage, so it runs well even on small servers, Raspberry Pi devices, and other constrained hardware.
    Downloads: 115 This Week
  • 24
    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes.
    Downloads: 155 This Week
  • 25
    Spotube

    Open source Spotify client that doesn't require Premium

    An open source, cross-platform Spotify client that uses Spotify's data API and YouTube, Piped video, or JioSaavn as an audio source, eliminating the need for Spotify Premium. It is still recommended to support creators by engaging with their YouTube channels/Spotify tracks (or preferably by buying their merch/concert tickets/physical media).
    Downloads: 75 This Week