Showing 21 open source projects for "python data analysis"

View related business solutions
  • SIEM | API Security | Log Management Software Icon
    SIEM | API Security | Log Management Software

    AI-Powered Security and IT Operations Without Compromise.

    Built on the Graylog Platform, Graylog Security is the industry’s best-of-breed threat detection, investigation, and response (TDIR) solution. It simplifies analysts’ day-to-day cybersecurity activities with an unmatched workflow and user experience while simultaneously providing short- and long-term budget flexibility in the form of low total cost of ownership (TCO) that CISOs covet. With Graylog Security, security analysts can:
    Learn More
  • Caller ID Reputation provides the most comprehensive view of your caller ID scores across all carriers Icon
    Caller ID Reputation provides the most comprehensive view of your caller ID scores across all carriers

    Instantly identify flagged caller IDs and decrease flags by up to 95% your first month.

    Keep your agents on the phone with increased connection rates by monitoring your phone number reputation across all major carriers and call blocking apps.
    Learn More
  • 1
    Best-of Python

    Best-of Python

    A ranked list of awesome Python open-source libraries

    ...If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! Ranked list of awesome python libraries for web development. Correctly generate plurals, ordinals, indefinite articles; convert numbers. Libraries for loading, collecting, and extracting data from a variety of data sources and formats. Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    SageMaker Spark Container

    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    Pathway

    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 4
    Bytewax

    Bytewax

    Python Stream Processing

    Bytewax is a Python framework that simplifies event and stream processing. Because Bytewax couples the stream and event processing capabilities of Flink, Spark, and Kafka Streams with the friendly and familiar interface of Python, you can re-use the Python libraries you already know and love. Connect data sources, run stateful transformations, and write to various downstream systems with built-in connectors or existing Python libraries.
    Downloads: 8 This Week
    Last Update:
    See Project
  • The most trusted software in construction Icon
    The most trusted software in construction

    HCSS is the gold standard software solution for winning, planning, and managing construction projects by connecting the office to the field.

    HCSS provides easy-to-use software built for construction companies that want to win more work, work smarter, and boost profits. For nearly 40 years, we've helped heavy civil contractors, infrastructure builders, and utility companies improve operations, from estimating and project management to field tracking, equipment maintenance, and safety. Tools like HeavyBid, HeavyJob, and HCSS Safety are built for the field and designed to work together, giving your team real-time visibility, tighter cost control, and better job outcomes. With 45+ accounting integrations and customizable APIs, HCSS fits seamlessly into your tech stack. We regularly update our software based on feedback from real crews, ensuring it fits the way your team works. Backed by award-winning 24/7/365 support and a proven implementation process, HCSS helps reduce risk, cut inefficiencies, and deliver fast ROI. If you're ready to grow your business and gain a competitive edge, HCSS is the partner that gets you there.
    Learn More
  • 5
    Lithops

    Lithops

    A multi-cloud framework for big data analytics

    Lithops is an open-source serverless computing framework that enables transparent execution of Python functions across multiple cloud providers and on-prem infrastructure. It abstracts cloud providers like IBM Cloud, AWS, Azure, and Google Cloud into a unified interface and turns your Python functions into scalable, event-driven workloads. Lithops is ideal for data processing, ML inference, and embarrassingly parallel workloads, giving you the power of FaaS (Function-as-a-Service) without vendor lock-in. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Pyper

    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 7
    fluentbit

    fluentbit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX

    Fluent Bit is a super-fast, lightweight, and highly scalable logging and metrics processor and forwarder. It is the preferred choice for cloud and containerized environments. A robust, lightweight, and portable architecture for high throughput with low CPU and memory usage from any data source to any destination. Proven across distributed cloud and container environments. Highly available with I/O handlers to store data for disaster recovery. Granular management of data parsing and routing....
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Siddhi Core Libraries

    Siddhi Core Libraries

    Stream Processing and Complex Event Processing Engine

    Fully open source, cloud-native, scalable, micro streaming, and complex event processing system capable of building event-driven applications for use cases such as real-time analytics, data integration, notification management, and adaptive decision-making. Event processing logic can be written using Streaming SQL queries via graphical and source editors, to capture events from diverse data sources, process and analyze them, integrate with multiple services and data stores, and publish output to various endpoints in real time. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    Fondant

    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components....
    Downloads: 0 This Week
    Last Update:
    See Project
  • Shoplogix Smart Factory Platform Icon
    Shoplogix Smart Factory Platform

    For manufacturers looking for a powerful Manufacturing Execution solution

    Real-time Visibility into Your Shop Floor's Performance. The Shoplogix smart factory platform enables manufacturers to increase overall equipment effectiveness, reduce operational costs, sustain growth and improve profitability by allowing them to visualize, integrate and act on production and machine performance in real-time. Manufacturers that trust us to drive efficiency in their factories. Real-time visual data and analytics provide valuable insights to make better informed decisions. Uncover hidden shop floor potential and drive rapid time to value. Develop a continuously improving culture through training, education and data-driven decisions. Compete in the i4.0 world by making the Shoplogix Smart Factory Platform the cornerstone of your digital transformation. Connect to any equipment or device to automate data collection and exchange it with other manufacturing technologies. Automatically monitor, report and analyze machine states to track real-time production.
    Learn More
  • 10
    CocoIndex

    CocoIndex

    ETL framework to index data for AI, such as RAG

    CocoIndex is an open-source framework designed for building powerful, local-first semantic search systems. It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    PULSAR

    PULSAR

    Distributed pub-sub messaging system

    ...scale for over 5 years, with millions of messages per second across millions of topics. Expand capacity seamlessly to hundreds of nodes. Low publish latency (< 5ms) at scale with strong durability guarantees. Configurable replication between data centers across multiple geographic regions. Built from the ground up as a multi-tenant system. Supports isolation, authentication, authorization and quotas. Persistent message storage based on Apache BookKeeper. IO-level isolation between write and read operations. Flexible messaging models with high-level APIs for Java, Go, Python, C++, Node.js, WebSocket and C#.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    An innovative Open Source CEP (Complex Event Processing) engine. It implements the event stream processing as a library embeddable in C++ and Perl. You can think of the Complex Event Processing engine as an in-memory database driven by triggers, or a data-flow machine, or a spreadsheet on steroids (and without the GUI part).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    text-dedup

    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Padasip

    Padasip

    Python Adaptive Signal Processing

    Padasip (Python Adaptive Signal Processing) is a Python library tailored for adaptive filtering and online learning applications, particularly in signal processing and time series forecasting. It includes a variety of adaptive filter algorithms such as LMS, RLS, and their variants, offering real-time adaptation to changing environments. The library is lightweight, well-documented, and ideal for research, prototyping, or teaching purposes. Padasip supports both supervised and unsupervised...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Amadeus

    Amadeus

    Harmonious distributed data analysis in Rust

    Amadeus is a high-performance, distributed data processing framework written in Rust, designed to offer an ergonomic and safe alternative to tools like Apache Spark. It provides both streaming and batch capabilities, allowing users to work with real-time and historical data at scale. Thanks to Rust’s memory safety and zero-cost abstractions, Amadeus delivers performance gains while reducing the complexity and bugs common in large-scale data pipelines. It emphasizes developer productivity...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 16
    Wally

    Wally

    Distributed Stream Processing

    Wally is a fast-stream-processing framework. Wally makes it easy to react to data in real-time. By eliminating infrastructure complexity, going from prototype to production has never been simpler. When we set out to build Wally, we had several high-level goals in mind. Create a dependable and resilient distributed computing framework. Take care of the complexities of distributed computing "plumbing," allowing developers to focus on their business logic. Provide high-performance & low-latency...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 17
    Cosmos DB Spark

    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    Azure Cosmos DB Spark is the official connector for Azure CosmosDB and Apache Spark. The connector allows you to easily read to and write from Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also allows you to easily create a lambda architecture for batch-processing, stream-processing, and a serving layer while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    TeleScope

    TeleScope

    XML Data Stream Broker/Replicator

    TeleScope is the efficient intensive-load XML data stream broker, replicator and simple event processing platform (SEP) written in C for the Fedora 17-18, Slackware 13-14, Red Hat Enterprise Linux 6 (RHEL-6) Linux distributions. The platform is intended to be operated upon the single number/word values and is not meant to be deployed for full-text XML stream analysis. TeleScope has internal query language with a set of standard logical operators that allows to construct relatively complex query expressions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    An experimental CEP (Complex Event Processing) engine. It implements the event stream processing as a library embeddable in C++ and Perl. Since then it has been renamed to Triceps, so please look at the new location https://sourceforge.net/projects/t
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Sed.py is a python module to provide a easy way to do text stream processing. Just like the name of module, it likes to do the work that sed can do. But not in sed's way, it's in Python's way. To use this module, the knowledge of regexp is necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    StreamMine is a distributed event processing (streaming) infrastructure. You can create low-latency, fault-tolerant stream processing functionality with any stream-oriented operators that can be implemented in Python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB