Showing 747 open source projects for "python data analysis"

  • 1
    VisPy

    Main repository for VisPy

    VisPy is an open-source, high-performance interactive visualization library in Python, designed for creating scientific visualizations and interactive plots. It leverages the power of modern Graphics Processing Units (GPUs) through OpenGL to render large datasets efficiently. VisPy supports a wide range of visualization types, including 2D plots, 3D visualizations, volume rendering, and more, making it suitable for scientific research, data analysis, and educational purposes.
    Downloads: 0 This Week
  • 2
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends the pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. The report includes high-correlation warnings based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik) and, for text data, the most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic) and blocks (ASCII, Cyrillic). ...
    Downloads: 1 This Week
  • 3
    PyBroker

    Algorithmic Trading in Python with Machine Learning

    Are you looking to enhance your trading strategies with the power of Python and machine learning? Then you need to check out PyBroker! This Python framework is designed for developing algorithmic trading strategies, with a focus on strategies that use machine learning. With PyBroker, you can easily create and fine-tune trading rules, build powerful models, and gain valuable insights into your strategy’s performance.
    Downloads: 8 This Week
  • 4
    JS Analyzer

    Burp Suite extension for JavaScript static analysis

    ...It also includes UI features such as live search, result filtering, and the ability to export findings in JSON format for further processing. The underlying engine can be used independently in Python, enabling integration into custom workflows or automated pipelines outside Burp Suite.
    Downloads: 0 This Week
  • 5
    Datasette

    An open source multi-tool for exploring and publishing data

    Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size, analyze and explore it, and publish it as an interactive website and accompanying API. Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as...
    Downloads: 6 This Week
  • 6
    Optopsy

    A nimble options backtesting library for Python

    Optopsy is a nimble, Python-based backtesting and statistics library focused on evaluating options trading strategies such as calls, puts, straddles, spreads, and more, using pandas-driven analysis. The csv_data() function is a convenience function that uses pandas' read_csv() under the hood to do the import. Other parameters can help with loading the CSV data; consult the code or future documentation to see how to use them.
    Downloads: 5 This Week
  • 7
    Kronos

    A Foundation Model for the Language of Financial Markets

    Kronos is a specialized open-source foundation model designed for analyzing and predicting financial market data using time-series representations of candlestick patterns. It is built as a decoder-only Transformer model trained specifically on K-line data, which captures open, high, low, close, and volume information across multiple global exchanges. The system introduces a novel tokenization approach that converts continuous financial data into discrete tokens, enabling the model to process...
    Downloads: 12 This Week
  • 8
    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 2 This Week
  • 9
    FiftyOne

    The open-source tool for building high-quality datasets

    ...FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more! Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way.
    Downloads: 6 This Week
  • 10
    atpbar

    Progress bars for threading and multiprocessing tasks on terminal

    atpbar displays progress bars for threading and multiprocessing tasks on the terminal and in Jupyter Notebook. It can display multiple progress bars growing simultaneously to show the progress of loop iterations in threading or multiprocessing tasks. atpbar can be used with Mantichora. atpbar started its development in 2015 as part of Alphatwirl. atpbar prevented physicists from terminating their running analysis codes, which...
    Downloads: 0 This Week
  • 11
    folium

    Python data, Leaflet.js maps

    folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on an interactive Leaflet map via folium. It enables both the binding of data to a map for choropleth visualizations and the passing of rich vector/raster/HTML visualizations as markers on the map. ...
    Downloads: 11 This Week
  • 12
    Qbot

    AI-powered Quantitative Investment Research Platform

    ...For evaluation and analysis, Qbot integrates reporting and visualization (tearsheets, metrics) so you can compare performance across runs and inspect trade-level behavior. It supports multiple strategy runtimes and backtesting engines, and is organized for extensibility (strategies live in a dedicated folder).
    Downloads: 41 This Week
  • 13
    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 18 This Week
  • 14
    QuantDinger

    AI-driven, local-first quantitative trading platform for research

    QuantDinger is a local-first, open-source quantitative trading platform designed to bring AI-assisted analysis, strategy development, backtesting, and live execution into a self-hosted workspace where data and API credentials remain under your control. Unlike cloud-locked quant services, it lets users run the entire trading workflow on their own infrastructure using Docker, with a PostgreSQL database backend, a Python backend API, and a web frontend UI that supports visualization and strategy management. ...
    Downloads: 1 This Week
  • 15
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 15 This Week
  • 16
    data-diff

    Efficiently diff rows across two different databases

    We're excited to announce the launch of a new open-source product, data-diff, that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at modern data volume and complexity, systems go out of sync all the time. Until now, there has not been any tooling to ensure that the data is copied correctly. Replicating data at scale, across...
    Downloads: 7 This Week
  • 17
    Bytewax

    Python Stream Processing

    Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.
    Downloads: 7 This Week
  • 18
    Union Pandera

    Light-weight, flexible, expressive statistical data testing library

    ...Validate the functions that produce your data by automatically generating test cases for them. Integrate seamlessly with the Python ecosystem. Overcome the initial hurdle of defining a schema by inferring one from clean data, then refine it over time. Identify the critical points in your data pipeline, and validate data going in and out of them. Build confidence in the quality of your data by defining schemas for complex data objects.
    Downloads: 11 This Week
  • 19
    Mage.ai

    Build, run, and manage data pipelines for integrating data

    Open-source data pipeline tool for transforming and integrating data. The modern replacement for Airflow. Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow?
    Downloads: 6 This Week
  • 20
    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python: reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are extremely interactive, designed for collaboration, and fit for the modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing the notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. ...
    Downloads: 4 This Week
  • 21
    HyperTools

    A Python toolbox for gaining geometric insights

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Features include functions for plotting high-dimensional datasets in 2D/3D, static and animated plots, a simple API for customizing plot styles, and a set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more.
    Downloads: 8 This Week
  • 22
    F1 Race Replay

    An interactive Formula 1 race visualisation and data analysis tool

    F1 Race Replay is an interactive replay viewer that lets users watch and analyze recorded Formula 1 race sessions with precise control over camera angles, timing, and telemetry overlay, offering a rich experience beyond standard broadcast replays. It ingests official timing and positional data, then renders vehicle movements through track maps and 3D visualizations so fans, analysts, and engineers can review strategy, overtakes, tire degradation effects, and pit stop impacts in detail. Users...
    Downloads: 0 This Week
  • 23
    Dask

    Parallel computing with task scheduling

    Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 5 This Week
  • 24
    Grafana

    Leading open-source visualization and observability platform

    Grafana OSS is the leading open-source platform for visualization and observability. It enables teams to query, visualize, alert on, and explore telemetry data from multiple sources in a single interface. With support for 100+ data source plugins—including Prometheus, Loki, Elasticsearch, InfluxDB, SQL/NoSQL databases, and OpenTelemetry—Grafana helps teams correlate metrics, logs, and traces across applications and infrastructure. Users can build interactive dashboards with rich...
    Downloads: 28 This Week
  • 25
    JILL.py

    A cross-platform installer for the Julia programming language

    The enhanced Python fork of JILL, Julia Installer for Linux (and every other platform), Light.
    Downloads: 7 This Week
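
The marimo entry above says that running one cell automatically re-runs the cells affected by it. A minimal sketch of that idea follows, using only the Python standard library. It is a toy model and not marimo's implementation: the ToyNotebook class, its add_cell() and update() methods, and the explicit inputs argument are all hypothetical names introduced for illustration.

```python
from collections import defaultdict, deque


class ToyNotebook:
    """Toy model of reactive cells (hypothetical; not marimo's API)."""

    def __init__(self):
        self.cells = {}                      # name -> (function, input cell names)
        self.dependents = defaultdict(list)  # name -> cells that read from it
        self.values = {}                     # name -> last computed value

    def add_cell(self, name, func, inputs=()):
        """Register a new cell. Its inputs must already exist."""
        if name in self.cells:
            raise ValueError(f"cell {name!r} already exists")
        for dep in inputs:
            if dep not in self.cells:
                raise KeyError(f"unknown input cell {dep!r}")
        self.cells[name] = (func, tuple(inputs))
        for dep in inputs:
            self.dependents[dep].append(name)
        return self.run(name)

    def update(self, name, func):
        """Replace a cell's code and re-run everything that depends on it."""
        _, inputs = self.cells[name]
        self.cells[name] = (func, inputs)
        return self.run(name)

    def run(self, name):
        """Run one cell plus all affected cells; return the names that ran."""
        # Collect the cell and all of its direct and indirect dependents.
        affected = {name}
        queue = deque([name])
        while queue:
            for child in self.dependents[queue.popleft()]:
                if child not in affected:
                    affected.add(child)
                    queue.append(child)
        # Cells are only added after their inputs, so insertion order
        # of self.cells is a valid execution order.
        ran = []
        for cell, (func, inputs) in self.cells.items():
            if cell in affected:
                self.values[cell] = func(*(self.values[i] for i in inputs))
                ran.append(cell)
        return ran


nb = ToyNotebook()
nb.add_cell("x", lambda: 2)
nb.add_cell("y", lambda x: x + 1, inputs=["x"])
nb.add_cell("z", lambda: 10)
nb.add_cell("total", lambda y, z: y * z, inputs=["y", "z"])

# Editing "x" re-runs "y" and "total"; "z" is untouched.
reran = nb.update("x", lambda: 5)
print(reran, nb.values["total"])
```

In this sketch the dependencies are declared by hand through inputs; a real reactive notebook would have to discover them some other way, which this example does not attempt.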