An open source implementation of NotebookLM with more flexibility
Interface for OuteTTS models
DoWhy is a Python library for causal inference
Robust Speech Recognition via Large-Scale Weak Supervision
Automatically translates the text of a video based on a subtitle file
Multi-user UI tool for managing and running Stable Diffusion workflows
Trying to be a robust, user-friendly and hackable music player
Label Studio is a multi-type data labeling and annotation tool
The music player of today
AI tool converting video/audio into structured documents instantly
Open source AI model for generating full songs from lyrics prompts
Unified web UI for training and running open models locally
Offline text-to-speech synthesis for Python
Generate blog articles from video or audio
Unofficial Python API and agentic skill for Google NotebookLM
A Systematic Framework for Interactive World Modeling
MARS5 speech model (TTS) from CAMB.AI
A general fine-tuning kit geared toward image/video/audio diffusion
The most powerful and modular diffusion model GUI, API, and backend
EPUB to audiobook converter, optimized for Audiobookshelf
Helps scientists define testable, modular, self-documenting dataflow
Controllable & emotion-expressive zero-shot TTS
The official Python SDK for the ElevenLabs API
Converts text to speech in real time
ChatGPT interface with better UI