Showing 1466 open source projects for "python data analysis"

  • 1
    CausalInference.jl

    Causal inference, graphical models and structure learning in Julia

    Julia package for causal inference and analysis, graphical models and structure learning. This package contains code for the PC algorithm and the extended FCI algorithm, the score-based greedy equivalence search (GES) algorithm, the Bayesian Causal Zig-Zag sampler and a function suite for adjustment set search.
    Downloads: 6 This Week
  • 2
    IoTDB

    Apache IoTDB

    Apache IoTDB (Database for Internet of Things) is an IoT-native database with high performance for data management and analysis, deployable on the edge and in the cloud. Thanks to its lightweight architecture, high performance and rich feature set, together with its deep integration with Apache Hadoop, Spark and Flink, Apache IoTDB can meet the requirements of massive data storage, high-speed data ingestion and complex data analysis in industrial IoT. ...
    Downloads: 11 This Week
  • 3
    G6

    A Graph Visualization Framework in JavaScript

    ...Based on the ability to customize, it provides a set of elegant graph visualization solutions and helps developers build applications for graph visualization, graph analysis, and graph editing. G6 is a complete graph visualization engine that focuses on relational data. According to practical business scenarios, we found the top solutions. Well-designed, simple, flexible, and extensible interfaces will satisfy your infinite originality. A social network is an important scenario in graph visualization. ...
    Downloads: 4 This Week
  • 4
    hctsa

    Highly comparative time-series analysis

    hctsa is a Matlab software package for running highly comparative time-series analysis. It extracts thousands of time-series features from a collection of univariate time series and includes a range of tools for visualizing and analyzing the resulting time-series feature matrix.
    Downloads: 3 This Week
  • 5
    Julia VS Code

    Julia extension for Visual Studio Code

    This VS Code extension provides support for the Julia programming language. We build on Julia’s unique combination of ease-of-use and performance. Beginners and experts can build better software more quickly, and get to a result faster. With a completely live environment, Julia for VS Code aims to take the frustration and guesswork out of programming and put the fun back in. A hybrid “canvas programming” style combines the exploratory power of a notebook with the productivity and static...
    Downloads: 4 This Week
  • 6
    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 9 This Week
  • 7
    Union Pandera

    Light-weight, flexible, expressive statistical data testing library

    ...Validate the functions that produce your data by automatically generating test cases for them. Integrate seamlessly with the Python ecosystem. Overcome the initial hurdle of defining a schema by inferring one from clean data, then refine it over time. Identify the critical points in your data pipeline, and validate data going in and out of them. Build confidence in the quality of your data by defining schemas for complex data objects.
    Downloads: 8 This Week
  • 8
    marimo

    A reactive notebook for Python

    marimo is an open-source reactive notebook for Python: reproducible, git-friendly, executable as a script, and shareable as an app. marimo notebooks are extremely interactive, designed for collaboration, and fit for the modern Pythonista. Run one cell and marimo reacts by automatically running affected cells, eliminating the error-prone chore of managing notebook state. marimo's reactive UI elements, like data frame GUIs and plots, make working with data feel refreshingly fast, futuristic, and intuitive. ...
    Downloads: 3 This Week
  • 9
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 15 This Week
  • 10
    Bytewax

    Python Stream Processing

    Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.
    Downloads: 4 This Week
  • 11
    PythonCall & JuliaCall

    Python and Julia in harmony

    Bringing Python® and Julia together in seamless harmony. Call Python code from Julia and Julia code from Python via a symmetric interface. Simple syntax, so the Python code looks like Python and the Julia code looks like Julia. Intuitive and flexible conversions between Julia and Python: anything can be converted, you are in control. Fast non-copying conversion of numeric arrays in either direction: modify Python arrays (e.g. bytes, array.array, numpy.ndarray) from Julia or Julia arrays...
    Downloads: 3 This Week
  • 12
    Mage.ai

    Build, run, and manage data pipelines for integrating data

    Open-source data pipeline tool for transforming and integrating data. The modern replacement for Airflow. Effortlessly integrate and synchronize data from 3rd party sources. Build real-time and batch pipelines to transform data using Python, SQL, and R. Run, monitor, and orchestrate thousands of pipelines without losing sleep. Have you met anyone who said they loved developing in Airflow?
    Downloads: 3 This Week
  • 13
    data-diff

    Efficiently diff rows across two different databases

    We're excited to announce the launch of a new open-source product, data-diff, that makes comparing datasets across databases fast at any scale. data-diff automates data quality checks for data replication and migration. In modern data platforms, data is constantly moving between systems, and at modern data volumes and complexity, systems go out of sync all the time. Until now, there has not been any tooling to ensure that the data is correctly copied. Replicating data at scale, across...
    Downloads: 4 This Week
  • 14
    Dask

    Parallel computing with task scheduling

    Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 4 This Week
  • 15
    IntervalRootFinding.jl

    Find all roots of a function in a guaranteed way with Julia

    This package provides guaranteed methods for finding roots of functions, i.e. solutions to the equation f(x) == 0 for a function f. To do so, it uses methods from interval analysis, using interval arithmetic from the IntervalArithmetic.jl package by the same authors. The basic function is roots. You provide a standard Julia function and an interval, and the roots function returns a list of intervals containing all roots of the function located in the starting interval.
    Downloads: 5 This Week
  • 16
    fluentbit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX

    Fluent Bit is a super-fast, lightweight, and highly scalable logging and metrics processor and forwarder. It is the preferred choice for cloud and containerized environments. A robust, lightweight, and portable architecture for high throughput with low CPU and memory usage from any data source to any destination. Proven across distributed cloud and container environments. Highly available with I/O handlers to store data for disaster recovery. Granular management of data parsing and routing....
    Downloads: 12 This Week
  • 17
    DataChain

    AI-data warehouse to enrich, transform and analyze unstructured data

    DataChain enables multimodal API calls and local AI inferences to run in parallel over many samples as chained operations. The resulting datasets can be saved, versioned, and sent directly to PyTorch and TensorFlow for training. DataChain can persist features of Python objects returned by AI models, and enables vectorized analytical operations over them. Typical use cases are data curation, LLM analytics and validation, image segmentation, pose detection, and GenAI alignment. DataChain is especially helpful when batch operations can be optimized, for instance when synchronous API calls can be parallelized or when an LLM API offers batch processing.
    Downloads: 6 This Week
  • 18
    Semantic Type Detection

    Metadata/data identification Java library

    ...Usable in Streaming, Bulk, or Record mode. Broad country/language support, including US, Canada, Mexico, Brazil, UK, Australia, much of Europe, Japan and China. Support for sharded analysis (i.e. analysis results can be merged). Once a stream is profiled, subsequent samples can be validated and/or new samples can be generated.
    Downloads: 2 This Week
  • 19
    JILL.py

    A cross-platform installer for the Julia programming language

    The enhanced Python fork of JILL, Julia Installer for Linux (and every other platform), Light.
    Downloads: 5 This Week
  • 20
    GemGIS

    Spatial data processing for geomodeling

    GemGIS is a Python-based, open-source geographic information processing library. It is capable of preprocessing spatial data such as vector data (shape files, geojson files, geopackages,…), raster data (tif, png,…), data obtained from online services (WCS, WMS, WFS) or XML/KML files (soon). Preprocessed data can be stored in a dedicated Data Class to be passed to the geomodeling package GemPy in order to accelerate the model-building process. ...
    Downloads: 7 This Week
  • 21
    Matplotlib

    matplotlib: plotting with Python

    Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. Matplotlib ships with several add-on toolkits, including 3D plotting with mplot3d, axes helpers in axes_grid1 and axis helpers in axisartist. A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, HoloViews, ggplot, ...), and a...
    Downloads: 22 This Week
  • 22
    Mercury

    Convert Python notebook to web app and share with non-technical users

    Turn Python notebooks into web applications with the open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute the notebook with new parameters. The core of Mercury is open source under AGPLv3. We provide Mercury Pro with additional features, dedicated support and a friendly commercial license. Mercury is a perfect tool to convert a Python notebook into an interactive web application and share it with non-programmers. You define interactive widgets for...
    Downloads: 6 This Week
  • 23
    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps. Dash apps are very lightweight, requiring only a limited number of lines of Python or...
    Downloads: 6 This Week
  • 24
    Plots

    Powerful convenience for Julia visualizations and data analysis

    Data visualization has a complicated history. Plotting software makes trade-offs between features and simplicity, speed and beauty, and a static and dynamic interface. Some packages make a display and never change it, while others make updates in real-time. Plots is a visualization interface and toolset. It sits above other backends, like GR, PythonPlot, PGFPlotsX, or Plotly, connecting commands with implementation. If one backend does not support your desired features or make the right...
    Downloads: 4 This Week
  • 25
    Tokenize.jl

    Tokenization for Julia source code

    Tokenize is a Julia package that serves a similar purpose, and offers a similar API, to the tokenize module in Python, but for Julia. It takes a string or buffer containing Julia code, performs lexical analysis and returns a stream of tokens.
    Downloads: 1 This Week
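
The Tokenize.jl entry above compares the package to Python's tokenize module. For readers unfamiliar with that module, here is a minimal sketch of the Python side only, using the standard library. It does not show Tokenize.jl's own API, and the sample source string is made up for illustration.

```python
import io
import tokenize

# Hypothetical one-line program used only as sample input.
source = "x = 1 + 2\n"

# generate_tokens expects a readline-style callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Pair each token's type name with its text.
pairs = [(tokenize.tok_name[tok.type], tok.string) for tok in tokens]

for name, text in pairs:
    print(f"{name:10} {text!r}")
```

Each token also carries start and end positions (tok.start, tok.end), which is what makes a token stream useful for tools such as formatters and syntax highlighters.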