The fastest way to build data pipelines
A high-performance implementation of HDBSCAN clustering
Python Client for Supabase. Query Postgres from Flask, Django
Extensible, parallel implementations of t-SNE
A framework for real-life data science
A modular, primitive-first, Python-first PyTorch library
Detecting silent model failure. NannyML estimates performance
Training data (data labeling, annotation, workflow) for all data types
Uncover insights, surface problems, monitor, and fine-tune your LLM
Master the essential skills needed to recognize and solve problems
Label Studio is a multi-type data labeling and annotation tool
The RF and reverse engineering framework for everyone
Python Optimal Transport
A Python Automated Machine Learning tool that optimizes ML
Data science on data without acquiring a copy
Functional Machine Learning
Python package for AutoML on Tabular Data with Feature Engineering
Machine learning tutorials (mainly in Python 3)
MLOps simplified. From ML Pipeline ⇨ Data Product without the hassle
Helps scientists define testable, modular, self-documenting dataflow
Topic Modelling for Humans
A curated list of data mining papers about fraud detection
Effortless data labeling with AI support from Segment Anything
AutoGluon: AutoML for Image, Text, and Tabular Data