Open Source Python Machine Learning Software - Page 9

Python Machine Learning Software

Browse free open source Python Machine Learning Software and projects below.

1

snorkel

A system for quickly generating training data with weak supervision

The Snorkel team is now focusing their efforts on Snorkel Flow, an end-to-end AI application development platform based on the core ideas behind Snorkel. The Snorkel project started at Stanford in 2016 with a simple technical bet: that it would increasingly be the training data, not the models, algorithms, or infrastructure, that decided whether a machine learning project succeeded or failed. Given this premise, we set out to explore the radical idea that you could bring mathematical and systems structure to the messy and often entirely manual process of training data creation and management, starting by empowering users to programmatically label, build, and manage training data. Snorkel Flow incorporates many of the concepts of the Snorkel project with a range of newer techniques around weak supervision modeling, data augmentation, multi-task learning, data slicing and structuring.
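
As a rough illustration of what labeling data programmatically can look like, the sketch below combines a few hand-written labeling functions by majority vote. It is plain Python and does not use the Snorkel API; the function names, keyword rules and sample texts are all hypothetical.

```python
from collections import Counter

# Label values; ABSTAIN means a labeling function has no opinion.
ABSTAIN, HAM, SPAM = -1, 0, 1


# Hypothetical labeling functions: each encodes one noisy heuristic.
def lf_contains_link(text):
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_message(text):
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_message]


def majority_label(text, lfs=LABELING_FUNCTIONS):
    """Combine labeling-function votes; abstain on no votes or on a tie."""
    votes = [lf(text) for lf in lfs]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


texts = [
    "Claim your prize at http://example.com now",
    "see you tomorrow",
    "The quarterly report is attached for your review",
]
labels = [majority_label(t) for t in texts]
```

Majority voting is only the simplest way to combine such votes; it is used here to keep the sketch short.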

Downloads: 3 This Week

Last Update: 2024-08-02
2

supabase-py

Python Client for Supabase. Query Postgres from Flask, Django

Python client for Supabase. Query Postgres from Flask, Django, or FastAPI. Supports Python user authentication, security policies, edge functions, file storage, and realtime data streaming.

Downloads: 3 This Week

Last Update: 2026-03-20
3

SMILI

Scientific Visualisation Made Easy

The Simple Medical Imaging Library Interface (SMILI), pronounced 'smilie', is an open-source, light-weight and easy-to-use medical imaging viewer and library for all major operating systems. The main sMILX application has features for viewing n-D images, vector images, DICOMs and models/surfaces, and for anonymizing and shape analysis, with easy drag and drop functions. It also features a number of standard processing algorithms for smoothing, thresholding, masking, etc. of images and models, available through graphical user interfaces and/or the command line. See our YouTube channel for tutorial videos via the homepage. The applications are all built out of a uniform user-interface framework that provides a very high level (Qt) interface to powerful image processing and scientific visualisation algorithms from the Insight Toolkit (ITK) and Visualisation Toolkit (VTK). The framework allows one to build stand-alone medical imaging applications quickly and easily.

Downloads: 75 This Week

Last Update: 2026-03-16
4

AI-Tutorials/Implementations Notebooks

Codes/Notebooks for AI Projects

AI-Tutorials/Implementations Notebooks repository is a comprehensive collection of artificial intelligence tutorials and implementation examples intended for developers, students, and researchers who want to learn by building practical AI projects. The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. The codebase acts as a hands-on learning resource, allowing users to experiment with new frameworks, architectures, and machine learning workflows through guided examples.

Downloads: 2 This Week

Last Update: 8 hours ago
5

AWS Neuron

Powering Amazon custom machine learning chips

AWS Neuron is a software development kit (SDK) for running machine learning inference using AWS Inferentia chips. It consists of a compiler, run-time, and profiling tools that enable developers to run high-performance, low-latency inference using AWS Inferentia-based Amazon EC2 Inf1 instances. Using Neuron, developers can easily train their machine learning models on any popular framework such as TensorFlow, PyTorch, and MXNet, and run them optimally on Amazon EC2 Inf1 instances. You can continue to use the same ML frameworks you use today and migrate your software onto Inf1 instances with minimal code changes and without tie-in to vendor-specific solutions. Neuron is pre-integrated into popular machine learning frameworks like TensorFlow, MXNet and PyTorch to provide a seamless training-to-inference workflow. It includes a compiler, runtime driver, as well as debug and profiling utilities with a TensorBoard plugin for visualization.

Downloads: 2 This Week

Last Update: 2026-04-09
6

AWS Step Functions Data Science SDK

For building machine learning (ML) workflows and pipelines on AWS

The AWS Step Functions Data Science SDK is an open-source library that allows data scientists to easily create workflows that process and publish machine learning models using Amazon SageMaker and AWS Step Functions. You can create machine learning workflows in Python that orchestrate AWS infrastructure at scale, without having to provision and integrate the AWS services separately. The quickest way to see how the AWS Step Functions Data Science SDK works is to review the related example notebooks. These notebooks provide code and descriptions for creating and running workflows in AWS Step Functions using the AWS Step Functions Data Science SDK. In Amazon SageMaker, example Jupyter notebooks are available in the example notebooks portion of a notebook instance. To run the AWS Step Functions Data Science SDK example notebooks locally, download the sample notebooks and open them in a working Jupyter instance.

Downloads: 2 This Week

Last Update: 2022-07-07
7

Amazing-Python-Scripts

Curated collection of Amazing Python scripts

Amazing-Python-Scripts is a collaborative repository that collects a wide variety of Python scripts designed to demonstrate practical programming techniques and automation tasks. The project includes scripts ranging from beginner-level utilities to more advanced applications involving machine learning, data processing, and system automation. Its goal is to provide developers with useful coding examples that can solve everyday problems, automate repetitive tasks, or serve as learning exercises. The repository encourages community contributions, allowing developers to add their own scripts and improve existing ones through pull requests. Examples include scripts for sentiment analysis, data scraping, web automation, log analysis, and interactive applications such as games or voice-controlled tools. The project also provides contribution guidelines and documentation so that developers can easily collaborate and expand the collection of scripts.

Downloads: 2 This Week

Last Update: 2026-03-11
8

Auto-PyTorch

Automatic architecture search and hyperparameter optimization

While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL). Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting). The newest features in Auto-PyTorch for tabular data are described in the paper "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL". Details about Auto-PyTorch for multi-horizon time series forecasting tasks can be found in the paper "Efficient Automated Deep Learning for Time Series Forecasting".

Downloads: 2 This Week

Last Update: 2022-08-23
9

Autodistill

Images to inference with no labeling

Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. You can use Autodistill on your own hardware, or use the Roboflow hosted version of Autodistill to label images in the cloud.

Downloads: 2 This Week

Last Update: 2024-08-14
10

Avalanche

End-to-End Library for Continual Learning based on PyTorch

Avalanche is an end-to-end Continual Learning library based on PyTorch, born within ContinualAI with the unique goal of providing a shared and collaborative open-source (MIT licensed) codebase for fast prototyping, training and reproducible evaluation of continual learning algorithms. Avalanche can help Continual Learning researchers in several ways. Its benchmarks module maintains a uniform API for data handling, mostly generating a stream of data from one or more datasets, and contains all the major CL benchmarks (similar to what has been done for torchvision). Its training module provides all the necessary utilities concerning model training. This includes simple and efficient ways of implementing new continual learning strategies as well as a set of pre-implemented CL baselines and state-of-the-art algorithms you will be able to use for comparison. Avalanche is the first experiment of an end-to-end library for reproducible continual learning research and development where you can find benchmarks, algorithms, etc.

Downloads: 2 This Week

Last Update: 2024-10-29
11

BEVFormer

Implementation of BEVFormer, a camera-only framework

3D visual perception tasks, including 3D detection and map segmentation based on multi-camera images, are essential for autonomous driving systems. In this work, we present a new framework termed BEVFormer, which learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks. In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries. To aggregate spatial information, we design spatial cross-attention in which each BEV query extracts the spatial features from the regions of interest across camera views. For temporal information, we propose temporal self-attention to recurrently fuse the history BEV information. Our approach achieves a new state-of-the-art 56.9% in terms of the NDS metric on the nuScenes test set, which is 9.0 points higher than the previous best methods and on par with the performance of LiDAR-based baselines.

Downloads: 2 This Week

Last Update: 2022-09-23
12

Bayesian machine learning notebooks

Notebooks about Bayesian methods for machine learning

Notebooks about Bayesian methods for machine learning.

Downloads: 2 This Week

Last Update: 2024-08-14
13

Best-of Machine Learning with Python

A ranked list of awesome machine learning Python libraries

This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!

Downloads: 2 This Week

Last Update: 2025-10-30
14

Catalyst

Accelerated deep learning R&D

Catalyst is a PyTorch framework for accelerated Deep Learning research and development. It allows you to write compact but full-featured Deep Learning pipelines with just a few lines of code. With Catalyst you get a full set of features including a training loop with metrics, model checkpointing and more, all without the boilerplate. Catalyst is focused on reproducibility, rapid experimentation, and codebase reuse so you can break the cycle of writing another regular train loop and make something totally new. Catalyst is compatible with Python 3.6+ and PyTorch 1.1+, and has been tested on Ubuntu 16.04/18.04/20.04, macOS 10.15, Windows 10 and Windows Subsystem for Linux. It's part of the PyTorch Ecosystem, as well as the Catalyst Ecosystem, which includes Alchemy (experiments logging & visualization) and Reaction (convenient deep learning models serving).

Downloads: 2 This Week

Last Update: 2022-07-24
15

CodeSearchNet

Datasets, tools, and benchmarks for representation learning of code

CodeSearchNet is a large-scale dataset and research benchmark designed to advance the development of systems that retrieve source code using natural language queries. The project was created through collaboration between GitHub and Microsoft Research and aims to support research on semantic code search and program understanding. The dataset contains millions of pairs of source code functions and corresponding documentation comments extracted from open-source repositories. These pairs allow machine learning models to learn relationships between natural language descriptions and programming code. The dataset currently covers several widely used programming languages, including Python, JavaScript, Ruby, Go, Java, and PHP. In addition to the dataset itself, the repository includes baseline models, evaluation tools, and instructions for building code retrieval systems that can map user queries to relevant code snippets.

Downloads: 2 This Week

Last Update: 2026-03-12
16

Generative Models

Collection of generative models, e.g. GAN, VAE in PyTorch

This project is a comprehensive open-source collection of implementations of various generative machine learning models designed to help researchers and developers experiment with deep generative techniques. The repository contains practical implementations of well-known architectures such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Restricted Boltzmann Machines, and Helmholtz Machines, implemented primarily using modern deep learning frameworks like PyTorch and TensorFlow. These models are widely used in artificial intelligence to generate new data that resembles the training data, such as images, text, or other structured outputs. The repository serves as an educational and experimental environment where users can study how generative models work internally and replicate results from academic research papers.

Downloads: 2 This Week

Last Update: 2026-03-10
17

Google Research: Language

Shared repository for open-sourced projects from the Google AI Lang

Google Research: Language is a shared repository maintained by Google Research that contains open-source projects developed by the Google AI Language team. The repository hosts multiple subprojects related to natural language processing, machine learning, and large-scale language understanding systems. Many of the projects included in the repository correspond to research papers released by Google researchers and provide implementations of new NLP algorithms or experimental frameworks. These implementations often explore advanced techniques such as language modeling, semantic understanding, information retrieval, and multilingual text processing. The repository functions as a collaborative hub where different research initiatives can publish their code, enabling the broader community to reproduce experiments and build upon published work.

Downloads: 2 This Week

Last Update: 2026-03-12
18

HDBSCAN

A high performance implementation of HDBSCAN clustering

HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection. In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select. HDBSCAN is ideal for exploratory data analysis; it's a fast and robust algorithm that you can trust to return meaningful clusters (if there are any).
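
One building block of the HDBSCAN algorithm is the mutual reachability distance, which pushes points in sparse regions further apart so that differences in density show up when clusters are extracted. The sketch below computes it in plain Python for a handful of hypothetical 2-D points. It is not the library's implementation, and the convention used for the core distance (k-th nearest neighbour, excluding the point itself) is an assumption made for the example.

```python
import math


def core_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbour (itself excluded)."""
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]


def mutual_reachability(points, k):
    """Matrix of max(core(a), core(b), d(a, b)) for every pair of points."""
    n = len(points)
    core = [core_distance(points, i, k) for i in range(n)]
    return [
        [
            0.0 if i == j
            else max(core[i], core[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical data: three nearby points and one far-away outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
mr = mutual_reachability(points, k=2)
```

Pairs that involve the outlier keep their large distances, while distances inside the dense group are raised only as far as the local core distances.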

Downloads: 2 This Week

Last Update: 2026-03-27
19

Image Super-Resolution (ISR)

Super-scale your images and run experiments with Residual Dense

The goal of this project is to upscale and improve the quality of low-resolution images. This project contains Keras implementations of different Residual Dense Networks for Single Image Super-Resolution (ISR) as well as scripts to train these networks using content and adversarial loss components. Docker scripts and Google Colab notebooks are available to carry out training and prediction. Also, we provide scripts to facilitate training on the cloud with AWS and Nvidia-docker with only a few commands. When training your own model, start with only PSNR loss (50+ epochs, depending on the dataset) and only then introduce GAN and feature loss. This can be controlled by the loss weights argument. The weights used to produce these images are available directly when creating the model object. ISR is compatible with Python 3.6 and is distributed under the Apache 2.0 license.
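
PSNR (peak signal-to-noise ratio) is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between a reference image and its reconstruction. The sketch below computes it in plain Python for flattened pixel lists. It illustrates the metric only and is not taken from the ISR code base; the pixel values and the 8-bit peak of 255 are assumptions for the example.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values: a reference patch and an upscaled estimate.
reference = [52, 55, 61, 59]
estimate = [54, 55, 60, 57]
quality_db = psnr(reference, estimate)
```

A higher value means the estimate is closer to the reference, so a loss based on PSNR rewards pixel-level fidelity.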

Downloads: 2 This Week

Last Update: 2022-03-31
20

Jittor

Jittor is a high-performance deep learning framework

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators. The whole framework and meta-operators are compiled just in time. A powerful op compiler and tuner are integrated into Jittor, which allows it to generate high-performance code specialized for your model. Jittor also contains a wealth of high-performance model libraries, including image recognition, detection, segmentation, generation, differentiable rendering, geometric learning, reinforcement learning, etc. The front-end language is Python. Module design and dynamic graph execution are used in the front-end, which is the most popular design for deep learning framework interfaces. The back-end is implemented in high-performance languages, such as CUDA and C++. Jittor's ops are similar to NumPy's. Let's try some operations. We create Var a and b via operation jt.float32, and add them. Printing those variables shows they have the same shape and dtype.

Downloads: 2 This Week

Last Update: 2025-07-28
21

LightFM

A Python implementation of LightFM, a hybrid recommendation algorithm

LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementation of BPR and WARP ranking losses. It's easy to use, fast (via multithreaded model estimation), and produces high-quality results. It also makes it possible to incorporate both item and user metadata into the traditional matrix factorization algorithms. It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalize to new items (via item features) and to new users (via user features).
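
The idea of representing a user or item as the sum of its feature embeddings can be sketched in a few lines of plain Python. This is an illustration only and not LightFM's code: the feature names and embedding values are hypothetical, and bias terms and training are left out.

```python
def embed(feature_ids, feature_embeddings):
    """Sum the latent vectors of all features attached to a user or item."""
    dim = len(next(iter(feature_embeddings.values())))
    total = [0.0] * dim
    for feature in feature_ids:
        for d, value in enumerate(feature_embeddings[feature]):
            total[d] += value
    return total


def score(user_features, item_features, user_emb, item_emb):
    """Dot product of the summed user and item representations."""
    u = embed(user_features, user_emb)
    i = embed(item_features, item_emb)
    return sum(a * b for a, b in zip(u, i))


# Hypothetical learned embeddings, keyed by feature name.
user_emb = {"user:alice": [0.5, 0.0], "age:25-34": [0.5, 1.0]}
item_emb = {"item:42": [1.0, 0.0], "genre:jazz": [0.0, 2.0], "genre:rock": [0.0, -1.0]}

alice = ["user:alice", "age:25-34"]
new_user = ["age:25-34"]        # no interaction history, metadata only
new_jazz_item = ["genre:jazz"]  # never seen in training, metadata only

known_score = score(alice, ["item:42", "genre:rock"], user_emb, item_emb)
cold_item_score = score(alice, new_jazz_item, user_emb, item_emb)
cold_user_score = score(new_user, new_jazz_item, user_emb, item_emb)
```

Because the new user and the new item are described only by metadata features, they can still be scored without any interaction history of their own.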

Downloads: 2 This Week

Last Update: 2024-08-03
22

'Lightweight' GAN

Implementation of 'lightweight' GAN, proposed in ICLR 2021

Implementation of 'lightweight' GAN proposed in ICLR 2021, in PyTorch. The main contribution of the paper is a skip-layer excitation in the generator, paired with autoencoding self-supervised learning in the discriminator. Quoting the one-line summary: "converge on single gpu with few hours' training, on 1024 resolution sub-hundred images". Augmentation is essential for Lightweight GAN to work effectively in a low data setting. You can test and see how your images will be augmented before they pass into a neural network (if you use augmentation). The general recommendation is to use augmentations suited to your data, and as many as possible, then after some training time disable the ones most destructive to the images. You can turn on automatic mixed precision with one flag, --amp. You should expect it to be 33% faster and save up to 40% memory. Aim is an open-source experiment tracker that logs your training runs and provides a UI to compare them.

Downloads: 2 This Week

Last Update: 2025-01-12
23

MachineLearningStocks

Using Python and scikit-learn to make stock predictions

MachineLearningStocks is a Python-based template project that demonstrates how machine learning can be applied to predicting stock market performance. The project provides a structured workflow that collects financial data, processes features, trains predictive models, and evaluates trading strategies. Using libraries such as pandas and scikit-learn, the repository shows how historical financial indicators can be transformed into machine learning features. The model attempts to predict whether specific stocks will outperform a benchmark index such as the S&P 500. The repository includes scripts for parsing financial statistics, building training datasets, and performing backtesting to evaluate model performance over historical periods. Because it is structured as a template project, developers are encouraged to extend or modify the pipeline to test different algorithms, features, or investment strategies.
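
The prediction target described above, whether a stock beats a benchmark index over some period, can be sketched as a simple labeling step. This is plain Python rather than the repository's own code; the tickers, prices and the `margin` parameter are hypothetical.

```python
def pct_change(start, end):
    """Fractional price change over a period."""
    return (end - start) / start


def outperforms(stock_start, stock_end, index_start, index_end, margin=0.0):
    """True if the stock's return beats the index's return by more than margin."""
    stock_return = pct_change(stock_start, stock_end)
    index_return = pct_change(index_start, index_end)
    return stock_return - index_return > margin


# Hypothetical rows: (ticker, stock start, stock end, index start, index end).
rows = [
    ("AAA", 100.0, 125.0, 4000.0, 4400.0),  # stock +25%, index +10%
    ("BBB", 50.0, 52.0, 4000.0, 4400.0),    # stock +4%, index +10%
]

# Binary training labels: 1 = outperformed the index, 0 = did not.
labels = {ticker: int(outperforms(s0, s1, i0, i1)) for ticker, s0, s1, i0, i1 in rows}
```

Labels of this kind can then be paired with feature rows and passed to a classifier.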

Downloads: 2 This Week

Last Update: 2026-03-12
24

Mlxtend

A library of extension and helper modules for Python's data analysis

Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks.

Downloads: 2 This Week

Last Update: 2025-12-13
25

MuseGAN

An AI for Music Generation

MuseGAN is a deep learning research project designed to generate symbolic music using generative adversarial networks. The system focuses specifically on generating multi-track polyphonic music, meaning that it can simultaneously produce multiple instrument parts such as drums, bass, piano, guitar, and strings. Instead of generating raw audio, the model operates on piano-roll representations of music, which encode notes as time-pitch matrices for each instrument track. This representation allows the neural network to capture rhythmic patterns, harmonic relationships, and structural dependencies across instruments. The architecture is based on convolutional GAN models that learn temporal musical structure and inter-track relationships from training data. The project was trained using the Lakh Pianoroll Dataset, a large collection of multitrack musical sequences derived from MIDI files.
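
A piano-roll representation can be sketched as one binary time-pitch matrix per instrument track. The example below builds such matrices in plain Python. It is an illustration only, not MuseGAN's data pipeline: the note tuples, track contents and the eight-step time grid are hypothetical, and the 128 pitches follow the MIDI note range.

```python
def notes_to_pianoroll(notes, n_steps, n_pitches=128):
    """Build a binary time-pitch matrix from (start, duration, pitch) notes."""
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for start, duration, pitch in notes:
        # Mark every time step during which the note sounds.
        for t in range(start, min(start + duration, n_steps)):
            roll[t][pitch] = 1
    return roll


# Hypothetical two-track excerpt; each note is (start step, duration, MIDI pitch).
tracks = {
    "bass": [(0, 4, 36), (4, 4, 43)],
    "piano": [(0, 2, 60), (0, 2, 64), (2, 2, 67)],
}

# One matrix per track; all tracks share the same time axis.
pianorolls = {
    name: notes_to_pianoroll(notes, n_steps=8) for name, notes in tracks.items()
}
```

Several active pitches in one row encode polyphony within a track, and the shared time axis keeps the tracks aligned with each other.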

Downloads: 2 This Week

Last Update: 2026-03-12