Stream Processing Tools

View 56 business solutions
  • Safety Compliance Made Easy Icon
    Safety Compliance Made Easy

    SiteDocs is a digital safety management software used to support work site compliance.

    Ideally designed for business that deals with Construction, Oil & Gas, Mining, Manufacturing, Mechanical, Electrical, Plumbing, Heating, and Excavating, SiteDocs is a perfect solution for any size business looking to modernize the way Safety Compliance is organized.
    Learn More
  • OpenMetal is an automated bare metal and on-demand private cloud provider. Icon
    OpenMetal is an automated bare metal and on-demand private cloud provider.

    Large Scale. Cloud Native. Fixed Costs.

    OpenMetal is an automated bare metal and on-demand private cloud provider. Our mission is to empower your team with cost effective private infrastructure that outperforms traditional public cloud.
    Learn More
  • 1
    FLOGO

    FLOGO

    Simplify building efficient & modern serverless functions and apps

    Project Flogo is an ultra-light, Go-based open source ecosystem for building event-driven apps. Event-driven, you say? Yup, the notion of triggers and actions are leveraged to process incoming events. An action, a common interface, exposes key capabilities such as application integration, stream processing, etc. All capabilities within the Flogo Ecosystem have a few things in common, they all process events (in a manner suitable for the specific purpose) and they all implement the action interface exposed by Flogo Core. Integration Flows Application Integration process engine with conditional branching and a visual development environment. A simple pipeline-based stream processing action with event joining capabilities across multiple triggers & aggregation over time windows. Microgateway pattern for conditional, content-based routing, JWT validation, rate limiting, circuit breaking and other common patterns.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Fondant

    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components. It’s built for use in research and production, empowering data scientists to streamline dataset curation and preprocessing workflows efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    A Middleware for Distrubted Data Stream Processing
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    HStreamDB

    HStreamDB

    HStreamDB is an open-source, cloud-native streaming database

    HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications. By subscribing to streams in HStreamDB, any update of the data stream will be pushed to your apps in real-time, and this promotes your apps to be more responsive. You can also replace message brokers with HStreamDB and everything you do with message brokers can be done better with HStreamDB. HStreamDB provides built-in support for event time-based stream processing. You can use your familiar SQL to perform basic filtering and transformation operations, statistics and aggregation based on multiple kinds of time windows and even joining between multiple streams. With connectors provided, you can easily integrate HStreamDB with other external systems, such as MQTT Broker, MySQL, Redis and ElasticSearch. More connectors will be added.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Globalscape Enhanced File Transfer (EFT) is a best-in-class managed file transfer (MFT) solution Icon
    Globalscape Enhanced File Transfer (EFT) is a best-in-class managed file transfer (MFT) solution

    For Windows-Centric Organizations Looking for Secure File Transfer solutions

    Globalscape’s Enhanced File Transfer (EFT) platform is a comprehensive, user-friendly managed file transfer (MFT) software. Thousands of Windows-Centric Organizations trust Globalscape EFT for their mission-critical file transfers.
    Learn More
  • 5

    LogsGrep

    A grep-like utility for log files.

    LogsGrep is a unique, grep-like utility designed specifically to target log files containing multi-line entries. The primary target is Java log files (Log4J, common, ...), where it is very common to have multiline log entries (for example log entries with a stacktrace). It follows Unix philosophy, does only its primary job and expects its input to be generated by other more advanced tools (tail, cat, type, find...); There is no goal to be compatible with Unix grep. LogsGrep is written in the Java programming langue having performance and low resource usage in mind (no strings, no object creation, stream-processing).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    MXQuery is a low-footprint implementation of XQuery 1.0, XQuery Update 1.0, XQuery Fulltext 1.0 and XQuery Scripting 1.0 as well as a subset of XQuery 1.1 (windowing, try/catch). It provides extensions to do data stream processing/CEP and SOAP/REST
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    PULSAR

    PULSAR

    Distributed pub-sub messaging system

    Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache Software Foundation project. Easy to deploy, lightweight compute process, developer-friendly APIs, no need to run your own stream processing engine. Run in production at Yahoo! scale for over 5 years, with millions of messages per second across millions of topics. Expand capacity seamlessly to hundreds of nodes. Low publish latency (< 5ms) at scale with strong durability guarantees. Configurable replication between data centers across multiple geographic regions. Built from the ground up as a multi-tenant system. Supports isolation, authentication, authorization and quotas. Persistent message storage based on Apache BookKeeper. IO-level isolation between write and read operations. Flexible messaging models with high-level APIs for Java, Go, Python, C++, Node.js, WebSocket and C#.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Sed.py is a python module to provide a easy way to do text stream processing. Just like the name of module, it likes to do the work that sed can do. But not in sed's way, it's in Python's way. To use this module, the knowledge of regexp is necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Pyper

    Pyper

    Concurrent Python made simple

    Pyper is a Python-native orchestration and scheduling framework designed for modern data workflows, machine learning pipelines, and any task that benefits from a lightweight DAG-based execution engine. Unlike heavier platforms like Airflow, Pyper aims to remain lean, modular, and developer-friendly, embracing Pythonic conventions and minimizing boilerplate. It focuses on local development ergonomics and seamless transition to production environments, making it ideal for small teams and individuals needing a programmable and flexible orchestration solution without the overhead of enterprise systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Power through agendas and documents, make more informed decisions and conduct board meetings faster. Icon
    Power through agendas and documents, make more informed decisions and conduct board meetings faster.

    For team managers searching for a solution to manage their meetings

    iBabs not only captures the entire decision-making process – it takes all the paperwork out of meetings. iBabs empowers everyone who has ever organized or attended, a meeting. With a seemingly simple app that offers complete control and a comprehensive overview of all those fiddly details. With about 3000 organizations and over 300,000 users, iBabs gives you peace of mind. So you can quickly organize effective meetings, and good decisions can be made with confidence. iBabs didn’t just happen overnight. We started analyzing and simplifying board meeting processes many years ago. We understand all the work that goes into meetings, and how to streamline everything so it all flows smoothly. On any device, confidentially, securely and automatically. Make good decisions with confidence.
    Learn More
  • 10
    A production stable Java utility library with convenience methods for string- and stream processing, file handling, XML, XSLTs and XPath, checksums, console formatting, and more. The project is developed by the State and University Library of Denmark
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    SPar: Stream Parallelism in Multi-Cores

    SPar: Stream Parallelism in Multi-Cores

    An Embedded C++ Domain-Specific Language

    SPar is an internal C++ Domain-Specific Language (DSL) suitable to model and implement classical stream parallel patterns. The DSL uses standard C++ attributes to introduce annotations tagging the notable components of stream parallel applications: stream sources and stream processing stages. Latest version can be downloaded from the SVN using the following command: svn checkout svn://svn.code.sf.net/p/spar-dsl-compiler/svn/ spar
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    SageMaker Spark Container

    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data processing workloads on Amazon SageMaker using the Apache Spark framework. The container images in this repository are used to build the pre-built container images that are used when running Spark jobs on Amazon SageMaker using the SageMaker Python SDK. The pre-built images are available in the Amazon Elastic Container Registry (Amazon ECR), and this repository serves as a reference for those wishing to build their own customized Spark containers for use in Amazon SageMaker.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    StreamMine is a distributed event processing (streaming) infrastructure. You can create low-latency, fault-tolerant stream processing functionality with any stream-oriented operators that can be implemented in Python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    StreamScale is a Java based Data Stream Processing System.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Strings Edit

    Strings Edit

    String editing and formatting library for Ada

    Strings edit is a library that provides I/O facilities for integers, floating-point numbers, Roman numerals, and strings. Both input and output subroutines support string pointers for consequent stream processing. The output can be aligned in a fixed size field with padding. Numeric input can be checked against expected values range to be either saturated or to raise an exception. For floating-point output either relative or absolute output precision can be specified. UTF-8 encoded strings are supported, including wildcard pattern matching, sets and maps of code points, upper/lowercase, and other Unicode categorizations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    TeleScope

    TeleScope

    XML Data Stream Broker/Replicator

    TeleScope is the efficient intensive-load XML data stream broker, replicator and simple event processing platform (SEP) written in C for the Fedora 17-18, Slackware 13-14, Red Hat Enterprise Linux 6 (RHEL-6) Linux distributions. The platform is intended to be operated upon the single number/word values and is not meant to be deployed for full-text XML stream analysis. TeleScope has internal query language with a set of standard logical operators that allows to construct relatively complex query expressions. The platform features the pub-sub architecture and serves a set of simultaneously connected XML stream subscribers. The broker features Continuous Query engine over the XML stream. TeleScope provides the remote cli interface to login (in cisco fashion via telnet) and change/reset the query transaction on the current stream on the fly in real time. It also gives data query and subscribers statistics via a separate status port.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Video Stream Processing On CBE Project
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Watermill

    Watermill

    Building event-driven applications the easy way in Go

    Go library for building event-driven applications. Our goal was to create a tool that is easy to understand, even by junior developers. It doesn't matter if you want to do Event-driven architecture, CQRS, Event Sourcing or just stream MySQL Binlog to Kafka. Watermill was designed to process hundreds of thousands of messages per second. Every component is built in a way that allows you to configure it for your needs. You can also implement your own middleware for the router. Watermill is using proven technologies and has a strong unit and integration tests coverage for critical areas. Watermill is a Go library for working efficiently with message streams. It is intended for building event driven applications, enabling event sourcing, RPC over messages, sagas and basically whatever else comes to your mind. You can use conventional pub/sub implementations like Kafka or RabbitMQ, but also HTTP or MySQL binlog if that fits your use case.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    activeinsight
    ActiveInsight provides real-time detection and reaction to events and patterns. It is a platform that enables the detection of meaningful events within multiple, high frequency, event streams.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    collapse

    collapse

    Advanced and Fast Data Transformation in R

    collapse is a high-performance R package designed for fast and efficient data transformation, aggregation, reshaping, and statistical computation. Built to offer a more performant alternative to dplyr and data.table, it is particularly well-suited for large datasets and econometric applications. It operates on base R data structures like data frames and vectors and uses highly optimized C++ code under the hood to deliver significant speed improvements. collapse also includes tools for grouped operations, weighted statistics, and time series manipulation, making it a compact yet powerful utility for data scientists and researchers working in R.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    horizon

    horizon

    Horizon is a realtime, open-source backend for JavaScript apps

    Horizon is an open-source developer platform for building sophisticated realtime apps. It provides a complete backend that makes it dramatically simpler to build, deploy, manage, and scale engaging JavaScript web and mobile apps. Horizon is extensible, integrates with the Node.js stack, and allows building modern, arbitrarily complex applications. While technologies like RethinkDB and WebSocket make it possible to build engaging realtime apps, empirically there is still too much friction for most developers. Building realtime apps now requires understanding and manually orchestrating multiple systems across the software stack, understanding distributed stream processing, and learning how to deploy and scale realtime systems. The learning curve is quite steep, and most of the initial work involves boilerplate code that is far removed from the primary task of building a realtime app.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    ksqlDB

    ksqlDB

    The database purpose-built for stream processing applications

    Build applications that respond immediately to events. Craft materialized views over streams. Receive real-time push updates, or pull current state on demand. Seamlessly leverage your existing Apache Kafka® infrastructure to deploy stream-processing workloads and bring powerful new capabilities to your applications. Use a familiar, lightweight syntax to pack a powerful punch. Capture, process, and serve queries using only SQL. No other languages or services are required. ksqlDB enables you to build event streaming applications leveraging your familiarity with relational databases. Three categories are foundational to building an application: collections, stream processing, and queries. Streams are immutable, append-only sequences of events. They're useful for representing a series of historical facts. Tables are mutable collections of events. They let you represent the latest version of each value per key.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    text-dedup

    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB