Open Source Python Multimedia Software - Page 5

Python Multimedia Software

Browse free open source Python Multimedia Software and projects below. Use the toggles on the left to filter open source Python Multimedia Software by OS, license, language, programming language, and project status.

1

Kornia

Open Source Differentiable Computer Vision Library

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend, both for efficiency and to take advantage of reverse-mode auto-differentiation to define and compute the gradients of complex functions. Inspired by existing packages, the library is composed of a subset of packages containing operators that can be inserted within neural networks to train models to perform image transformations, epipolar geometry, depth estimation, and low-level image processing such as filtering and edge detection, all operating directly on tensors. With Kornia we fill the gap between classical and deep computer vision by implementing standard and advanced vision algorithms for AI. Our libraries and initiatives always follow the community's needs.

Downloads: 9 This Week

Last Update: 2025-11-08
See Project
2

Mkchromecast

Cast macOS and Linux Audio/Video to your Google Cast and Sonos Devices

This is a program to cast audio and video from your macOS or Linux desktop to your Google Cast devices or Sonos speakers. It is written in Python and streams via node.js, ffmpeg, or avconv. Mkchromecast can use lossy and lossless audio formats provided that ffmpeg, avconv (Linux), or parec (Linux) is installed. It also supports multi-room group playback and 24-bit/96 kHz high-resolution audio. Linux users can also configure ALSA to capture audio.

Downloads: 9 This Week

Last Update: 2024-06-25
See Project
3

PersonaPlex

PersonaPlex code

PersonaPlex is an open-source real-time conversational speech AI model that goes beyond traditional text chat by providing full-duplex speech-to-speech interaction, meaning it can listen and talk at the same time instead of waiting for you to finish speaking before responding. This architectural approach eliminates awkward pauses and makes conversations feel much more human-like, with natural behaviors such as overlapping speech, interruptions, and fluent turn-taking, traits that traditional AI assistants typically lack. PersonaPlex also supports persona and voice control, allowing developers to define the role and speaking style of the agent using text prompts and voice conditioning, making it suitable for applications like customized voice assistants, interactive character agents, or domain-specific conversational tools. Internally, it processes continuous audio streams in a hybrid input format so that speech understanding and generation occur jointly.

Downloads: 9 This Week

Last Update: 2026-03-02
See Project
4

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Downloads: 9 This Week

Last Update: 2024-11-16
See Project
5

Spotify Music Downloader

Spotify Music Downloader

Download music from Spotify and other music sources.

1 Review

Downloads: 9 This Week

Last Update: 2022-04-27
See Project
6

Tilf

Tilf (Tiny Elf) is a free, simple yet powerful pixel art editor

Tilf (Tiny Elf) is a lightweight, cross-platform pixel art editor developed in Python with PySide6, designed for simplicity, speed, and freedom from account systems or installation overhead. It focuses on enabling artists to create sprites, icons, and small 2D assets quickly, without requiring setup, dependencies, or internet connectivity. Tilf provides a familiar drawing environment with essential tools—such as pencil, eraser, fill, eyedropper, rectangle, and ellipse—along with zoom, grid display, real-time preview, and undo/redo capabilities. It supports importing and exporting images in PNG, JPG, and BMP formats, including transparency options. With its single-executable builds for Windows, macOS, and Linux, Tilf can be run instantly and is ideal for both hobbyist pixel artists and developers needing a quick sketching tool for sprite work. The project emphasizes accessibility and minimalism over complexity, making it approachable even for users with no technical background.

Downloads: 9 This Week

Last Update: 2025-10-09
See Project
7

ipyvizzu

Build animated charts in Jupyter Notebook and similar environments

Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax. ipyvizzu is an animated charting tool for Jupyter, Google Colab, Databricks, Kaggle, and Deepnote notebooks, among other platforms. ipyvizzu enables data scientists and analysts to use animation for storytelling with data using Python. It's built on the open-source JavaScript/C++ charting library Vizzu. There is a new extension of ipyvizzu, ipyvizzu-story, with which animated charts can be presented right from the notebooks. Since ipyvizzu-story's syntax is a bit different from ipyvizzu's, we suggest you start from the ipyvizzu-story repo if you're interested in using animated charts to present your findings live or to share your presentation as an HTML file.

Downloads: 9 This Week

Last Update: 2025-02-26
See Project
8

word_cloud

A little word cloud generator in Python

A little word cloud generator in Python. The code is tested against Python 2.7, 3.4, 3.5, 3.6 and 3.7. If you are using conda, you can install from the conda-forge channel. wordcloud depends on numpy and pillow. To save the word cloud to a file, matplotlib can also be installed. If there are no wheels available for your version of Python, installing the package requires having a C compiler set up. Before installing a compiler, report an issue describing the version of Python and operating system being used. The wordcloud_cli tool can be used to generate word clouds directly from the command line. If you're dealing with PDF files, then pdftotext, included by default with many Linux distributions, comes in handy. Use wordcloud_cli --help to see all available options. The wordcloud library is MIT licensed, but contains DroidSansMono.ttf, a TrueType font by Google, that is Apache licensed.

Downloads: 9 This Week

Last Update: 2026-01-22
See Project
9

Impressive

Impressive is a program that displays PDF presentation slides with style. Smooth alpha-blended slide transitions are provided for the sake of eye candy, but in addition to this, Impressive offers some unique tools that are very useful for presentations.

13 Reviews

Downloads: 47 This Week

Last Update: 2023-11-04
See Project
10

StreamTuner2 ♪♬#

Internet radio directory browser

Streamtuner2 is an internet radio station and video browser. It simply lists stations in categories from different directories and launches your preferred media apps for playback. It's built in Python now, but retains UI similarity with the original StreamTuner 0.99.

6 Reviews

Downloads: 67 This Week

Last Update: 2022-02-22
See Project
11

Crunch PNG

Insane(ly slow but wicked good) PNG image optimization

Crunch is a tool for lossy PNG image file optimization. It combines selective bit depth, color type, and color palette reduction with zopfli DEFLATE compression algorithm encoding using the pngquant and zopflipng PNG optimization tools. This approach leads to a significant file size gain relative to lossless approaches at the expense of a relatively modest decrease in image quality. Continuous benchmark testing is available in our GitHub Actions CI. Please see the benchmarks directory of this repository for details about the benchmarking approach and instructions on how to execute benchmarks locally on the reference images distributed in this repository or with your own image files.

Downloads: 8 This Week

Last Update: 2024-08-22
See Project
12

Emoji for Python

emoji terminal output for Python

Emoji for Python. This project was inspired by kyokomi. The entire set of Emoji codes as defined by the Unicode Consortium is supported, in addition to a bunch of aliases. By default, only the official list is enabled, but calling emoji.emojize(language='alias') enables both the full list and aliases. The default language is English (language='en'); Spanish ('es'), Portuguese ('pt'), Italian ('it'), French ('fr'), and German ('de') are also supported. The utils/get-codes-from-unicode-consortium.py script may help when updating unicode_codes.py but is not guaranteed to work. Generally speaking, it scrapes a table on the Unicode Consortium's website with BeautifulSoup and prints the contents to stdout in a more useful format.

Downloads: 8 This Week

Last Update: 2025-09-21
See Project
13

PML

The easiest way to use deep metric learning in your application

This library contains 9 modules, each of which can be used independently within your existing codebase, or combined together for a complete train/test workflow. To compute the loss in your training loop, pass in the embeddings computed by your model, and the corresponding labels. The embeddings should have size (N, embedding_size), and the labels should have size (N), where N is the batch size. The TripletMarginLoss computes all possible triplets within the batch, based on the labels you pass into it. Anchor-positive pairs are formed by embeddings that share the same label, and anchor-negative pairs are formed by embeddings that have different labels. Loss functions can be customized using distances, reducers, and regularizers. In the diagram below, a miner finds the indices of hard pairs within a batch. These are used to index into the distance matrix, computed by the distance object. For this diagram, the loss function is pair-based, so it computes a loss per pair.
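The triplet formation described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names (`euclidean`, `all_triplets`, `triplet_margin_loss`), the Euclidean distance, the margin value, and the plain mean over all triplets are assumptions made for the example.

```python
def euclidean(a, b):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def all_triplets(labels):
    """Enumerate every (anchor, positive, negative) index triplet in a batch.

    The positive shares the anchor's label; the negative has a different one.
    """
    n = len(labels)
    triplets = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            for neg in range(n):
                if labels[neg] != labels[a]:
                    triplets.append((a, p, neg))
    return triplets


def triplet_margin_loss(embeddings, labels, margin=0.05):
    """Mean of max(0, d(a, p) - d(a, n) + margin) over all triplets.

    Hypothetical reduction: a plain mean, which may differ from the
    reducers the library applies.
    """
    losses = []
    for a, p, neg in all_triplets(labels):
        d_ap = euclidean(embeddings[a], embeddings[p])
        d_an = euclidean(embeddings[a], embeddings[neg])
        losses.append(max(0.0, d_ap - d_an + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Batch of N = 4 embeddings with embedding_size = 2, and N labels.
embeddings = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0], [3.0, 5.0]]
labels = [0, 0, 1, 1]
triplets = all_triplets(labels)
loss = triplet_margin_loss(embeddings, labels)
```

Here the two classes are well separated, so every anchor-negative distance exceeds the anchor-positive distance by more than the margin and the loss is zero.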

Downloads: 8 This Week

Last Update: 2025-08-17
See Project
14

PyVista

3D plotting and mesh analysis through a streamlined interface

3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK). PyVista is a helper module for VTK that takes a different approach to interfacing with VTK, through NumPy and direct array access. This package provides a Pythonic, well-documented interface exposing VTK’s powerful visualization backend to facilitate rapid prototyping, analysis, and visual integration of spatially referenced datasets. This module can be used for scientific plotting for presentations and research papers as well as a supporting module for other mesh-dependent Python modules. Easily integrate with NumPy, create a variety of geometries, and plot them. You could use any geometry to create your glyphs, or even plot the points directly. Direct access to mesh analysis and transformation routines. Intuitive plotting routines with matplotlib-like syntax.

Downloads: 8 This Week

Last Update: 2026-04-06
See Project
15

Robust Video Matting (RVM)

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance. Our method is much lighter than previous approaches and can process 4K at 76 FPS and HD at 104 FPS on an Nvidia GTX 1080Ti GPU. Unlike most existing methods that perform video matting frame-by-frame as independent images, our method uses a recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and matting quality. Furthermore, we propose a novel training strategy that enforces our network on both matting and segmentation objectives. This significantly improves our model's robustness. Our method does not require any auxiliary inputs such as a trimap or a pre-captured background image, so it can be widely applied to existing human matting applications. RVM is specifically designed for robust human video matting.

Downloads: 8 This Week

Last Update: 2023-03-25
See Project
16

asciinema

Open source terminal session recorder

asciinema is a free and open source terminal session recorder. It lets you easily record and play back terminal sessions in the terminal or in a web browser. Forget old screen recording methods and resulting blurry videos. asciinema lets you record your terminal sessions the right way, which is right where you work, in the terminal. Recording is as easy as running one command, and since it’s purely text-based you can copy and paste any content you want, simply pause the recording! You can also easily share your recordings on the web, embed an asciicast player in your blog post, project documentation page or in your conference talk slides. See plenty of example sessions recorded with asciinema here: https://asciinema.org/

Downloads: 8 This Week

Last Update: 2026-03-01
See Project
17

castero

TUI podcast client for the terminal

castero is a TUI podcast client for the terminal.

Downloads: 8 This Week

Last Update: 2024-09-18
See Project
18

Curlew Multimedia Converter

Easy to use Multimedia Converter for Linux

8 Reviews

Downloads: 54 This Week

Last Update: 2018-05-26
See Project
19

Open Asset Import Library

Importer library to import assets from different common 3D file formats such as Collada, Blend, Obj, X, 3DS, LWO, MD5, MD2, MD3, MDL, MS3D and a lot of other formats. The data is stored in its own in-memory data format, which can be easily processed. www.open3mod.com/ is a 3D model viewer and exporter based on Assimp that is also open source.

24 Reviews

Downloads: 32 This Week

Last Update: 2014-06-21
See Project
20

Audiomentations

A Python library for audio data augmentation

A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio and multichannel audio. Can be integrated in training pipelines in e.g. Tensorflow/Keras or Pytorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products. Mix in another sound, e.g. a background noise. Useful if your original sound is clean and you want to simulate an environment where background noise is present. A folder of (background noise) sounds to be mixed in must be specified. These sounds should ideally be at least as long as the input sounds to be transformed. Otherwise, the background sound will be repeated, which may sound unnatural. Note that the gain of the added noise is relative to the amount of signal in the input. This implies that if the input is completely silent, no noise will be added.
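The background-noise behaviour described above (repeating a short noise clip, and scaling the noise relative to the input signal) can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function names `rms` and `mix_background`, the `snr_db` parameter, and the RMS-based gain rule are hypothetical choices for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_background(signal, noise, snr_db=10.0):
    """Mix `noise` into `signal` at an assumed signal-to-noise ratio in dB.

    The noise gain is derived from the signal level, so a completely
    silent input gets no noise added.
    """
    signal_rms = rms(signal)
    noise_rms = rms(noise)
    if signal_rms == 0.0 or noise_rms == 0.0:
        return list(signal)

    # Repeat the noise clip until it covers the whole signal, then trim.
    repeats = -(-len(signal) // len(noise))  # ceiling division
    tiled = (noise * repeats)[: len(signal)]

    # Scale the noise so that its RMS sits snr_db below the signal RMS.
    target_noise_rms = signal_rms / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(signal, tiled)]


clean = [1.0, -1.0, 1.0, -1.0]
hum = [0.5, -0.5]  # shorter than the signal, so it is repeated
noisy = mix_background(clean, hum, snr_db=20.0)
silent = mix_background([0.0, 0.0, 0.0], hum)
```

Repeating a short clip is what can make the result sound unnatural, which is why longer background recordings are preferable.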

Downloads: 7 This Week

Last Update: 2025-09-13
See Project
21

ChatterBot

Machine learning, conversational dialog engine for creating chat bots

ChatterBot is a Python library that makes it easy to generate automated responses to a user’s input. ChatterBot uses a selection of machine learning algorithms to produce different types of responses. This makes it easy for developers to create chat bots and automate conversations with users. For more details about the ideas and concepts behind ChatterBot see the process flow diagram. The language-independent design of ChatterBot allows it to be trained to speak any language. Additionally, the machine-learning nature of ChatterBot allows an agent instance to improve its own knowledge of possible responses as it interacts with humans and other sources of informative data. An untrained instance of ChatterBot starts off with no knowledge of how to communicate. Each time a user enters a statement, the library saves the text that they entered and the text that the statement was in response to. As ChatterBot receives more input, the number of responses that it can reply with increases.
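The learning loop described above, where each statement is saved together with the statement it answered, can be sketched as a toy class. This is not ChatterBot's API or its matching logic: `TinyResponder`, `hear`, and `reply` are hypothetical names, and exact-string lookup stands in for whatever response selection the library actually uses.

```python
from collections import defaultdict


class TinyResponder:
    """Toy responder that learns (statement -> response) pairs from a dialog."""

    def __init__(self):
        # Maps a statement to every response that was seen following it.
        self.responses = defaultdict(list)
        self.previous = None

    def hear(self, text):
        """Record `text` as a response to the previously heard statement."""
        if self.previous is not None:
            self.responses[self.previous].append(text)
        self.previous = text

    def reply(self, text):
        """Return the most recently learned response, or None if unknown."""
        known = self.responses.get(text)
        return known[-1] if known else None


bot = TinyResponder()
untrained_reply = bot.reply("Hello")  # nothing learned yet

for line in ["Hello", "Hi there", "How are you?", "Fine, thanks"]:
    bot.hear(line)

trained_reply = bot.reply("Hello")
```

The more dialog the responder hears, the more statements it has a stored response for, which mirrors the growth in knowledge the description attributes to an instance as it receives input.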

Downloads: 7 This Week

Last Update: 2026-03-24
See Project
22

GIMP ML

AI for GNU Image Manipulation Program

This repository introduces GIMP3-ML, a set of Python plugins for the widely popular GNU Image Manipulation Program (GIMP). It brings recent advances in computer vision to the conventional image editing pipeline. Applications from deep learning such as monocular depth estimation, semantic segmentation, mask generative adversarial networks, image super-resolution, de-noising and coloring have been incorporated into GIMP through Python-based plugins. Additionally, operations on images such as edge detection and color clustering have also been added. GIMP-ML relies on standard Python packages such as numpy, scikit-image, pillow, pytorch, open-cv, and scipy. In addition, GIMP-ML aims to bring the benefits of deep learning networks used for computer vision tasks to routine image processing workflows.

Downloads: 7 This Week

Last Update: 2022-08-19
See Project
23

Mirrorcast

Open Source Alternative to Chromecast, Mirror Desktop and Play media r

The idea is to replicate what Chromecast can do with regard to screen mirroring and streaming media to a remote display. Google Chrome's screen mirroring feature works well when used with a receiver such as Chromecast, but this is a proprietary solution, and audio does not work for desktop mirroring on some operating systems. At the moment, there is only a client for Debian/Ubuntu operating systems and a server/receiver application for Raspberry Pi. Mirrorcast aims to be a low-latency screen mirroring solution with high-quality video and audio at 25-30 fps; the latter is why we will not use something like VNC. Mirrorcast uses about the same amount of system resources as Google Chrome's cast feature. The delay is less than 1 second on most networks. To achieve this we will use existing FOSS software such as ffmpeg, mpv, and omxplayer.

Downloads: 7 This Week

Last Update: 2023-08-04
See Project
24

The FreeMoCap Project

Free Motion Capture for Everyone

FreeMoCap is an open-source markerless motion capture system that enables users to record human movement using ordinary cameras and convert the footage into usable 3D motion data. The project’s goal is to democratize motion capture by removing the need for expensive suits or proprietary studio hardware, instead relying on computer vision and pose estimation pipelines. It processes synchronized video feeds to reconstruct skeletal motion, which can then be exported for animation, biomechanics research, or creative projects. FreeMoCap includes tools for calibration, recording, processing, and visualization, allowing users to move from raw footage to structured motion data within a single ecosystem. Because it is open and extensible, researchers and developers can adapt the pipeline for specialized motion analysis or integrate it into animation workflows.

Downloads: 7 This Week

Last Update: 2026-02-19
See Project
25

CamDesk

The Desktop Webcam Widget

CamDesk is a free, open source desktop webcam widget that was created as a home surveillance application, although others have used it for demonstrations, even with CamStudio and QuickTime Player for screen casting.

7 Reviews

Downloads: 50 This Week

Last Update: 2016-03-08
See Project