Showing 16 open source projects for "python data analysis"

View related business solutions
  • The CRM you will want to use every day Icon
    The CRM you will want to use every day

    With CRM, Sales, and Marketing Automation in one, Act! gives you everything you need for happier clients, more revenue, and less stress.

    Act! Premium is perfect for small and midsize businesses looking to market better, sell more, and create customers for life. With unparalleled flexibility and freedom of choice, Act! Premium accommodates the unique ways you do business. Whether it’s customizations to fit your specific business or industry processes or your preferences for deployment and access, the possibilities with Act! Premium are limitless.
    Learn More
  • IT Asset Management (ITAM) Software Icon
    IT Asset Management (ITAM) Software

    Supercharge Your IT Assets, the Easy Way

    Drowning in misplaced IT assets, compliance headaches, and shadow IT? Navigate to clarity with an intuitive IT Asset Management solution. Experience crisp visibility, effortless control, and unshakable security – all while freeing up your budget with optimized software licenses. The best part? It’s easy.
    Learn More
  • 1
    Best-of Python

    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome python libraries for web development. Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    SageMaker Spark Container

    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    Pathway

    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 4
    Bytewax

    Bytewax

    Python Stream Processing

    Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.
    Downloads: 8 This Week
    Last Update:
    See Project
  • Endpoint Protection Software for Businesses | HYPERSECURE Icon
    Endpoint Protection Software for Businesses | HYPERSECURE

    DriveLock protects systems, data, end devices from data loss and misuse.

    The HYPERSECURE endpoint protection platform is a comprehensive suite of products and services enhanced by European third-party solutions. It ensures our customers’ IT security, regulatory compliance, and digital sovereignty.
    Learn More
  • 5
    Lithops

    Lithops

    A multi-cloud framework for big data analytics

    Lithops is an open-source serverless computing framework that enables transparent execution of Python functions across multiple cloud providers and on-prem infrastructure. It abstracts cloud providers like IBM Cloud, AWS, Azure, and Google Cloud into a unified interface and turns your Python functions into scalable, event-driven workloads. Lithops is ideal for data processing, ML inference, and embarrassingly parallel workloads, giving you the power of FaaS (Function-as-a-Service) without vendor lock-in. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Pyper

    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    fluentbit

    fluentbit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX

    Fluent Bit is a super-fast, lightweight, and highly scalable logging and metrics processor and forwarder. It is the preferred choice for cloud and containerized environments. A robust, lightweight, and portable architecture for high throughput with low CPU and memory usage from any data source to any destination. Proven across distributed cloud and container environments. Highly available with I/O handlers to store data for disaster recovery. Granular management of data parsing and routing....
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Siddhi Core Libraries

    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    Fully open source, cloud-native, scalable, micro streaming, and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written using Streaming SQL queries via graphical and source editors, to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish output to various endpoints in real time. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    Fondant

    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components....
    Downloads: 0 This Week
    Last Update:
    See Project
  • Shoplogix Smart Factory Platform Icon
    Shoplogix Smart Factory Platform

    For manufacturers looking for a powerful Manufacturing Execution solution

    Real-time Visibility into Your Shop Floor's Performance. The Shoplogix smart factory platform enables manufacturers to increase overall equipment effectiveness, reduce operational costs, sustain growth and improve profitability by allowing them to visualize, integrate and act on production and machine performance in real-time. Manufacturers that trust us to drive efficiency in their factories. Real-time visual data and analytics provide valuable insights to make better informed decisions. Uncover hidden shop floor potential and drive rapid time to value. Develop a continuously improving culture through training, education and data-driven decisions. Compete in the i4.0 world by making the Shoplogix Smart Factory Platform the cornerstone of your digital transformation. Connect to any equipment or device to automate data collection and exchange it with other manufacturing technologies. Automatically monitor, report and analyze machine states to track real-time production.
    Learn More
  • 10
    CocoIndex

    CocoIndex

    ETL framework to index data for AI, such as RAG

    CocoIndex is an open-source framework designed for building powerful, local-first semantic search systems. It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    text-dedup

    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Padasip

    Padasip

    Python Adaptive Signal Processing

    Padasip (Python Adaptive Signal Processing) is a Python library tailored for adaptive filtering and online learning applications, particularly in signal processing and time series forecasting. It includes a variety of adaptive filter algorithms such as LMS, RLS, and their variants, offering real-time adaptation to changing environments. The library is lightweight, well-documented, and ideal for research, prototyping, or teaching purposes. Padasip supports both supervised and unsupervised...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Amadeus

    Amadeus

    Harmonious distributed data analysis in Rust

    Amadeus is a high-performance, distributed data processing framework written in Rust, designed to offer an ergonomic and safe alternative to tools like Apache Spark. It provides both streaming and batch capabilities, allowing users to work with real-time and historical data at scale. Thanks to Rust’s memory safety and zero-cost abstractions, Amadeus delivers performance gains while reducing the complexity and bugs common in large-scale data pipelines. It emphasizes developer productivity...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 14
    Wally

    Wally

    Distributed Stream Processing

    Wally is a fast-stream-processing framework. Wally makes it easy to react to data in real-time. By eliminating infrastructure complexity, going from prototype to production has never been simpler. When we set out to build Wally, we had several high-level goals in mind. Create a dependable and resilient distributed computing framework. Take care of the complexities of distributed computing "plumbing," allowing developers to focus on their business logic. Provide high-performance & low-latency...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    Cosmos DB Spark

    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    Azure Cosmos DB Spark is the official connector for Azure CosmosDB and Apache Spark. The connector allows you to easily read to and write from Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch-processing, stream-processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Sed.py is a python module to provide a easy way to do text stream processing. Just like the name of module, it likes to do the work that sed can do. But not in sed's way, it's in Python's way. To use this module, the knowledge of regexp is necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB