Open Source Linux Stream Processing Tools - Page 2

Stream Processing Tools for Linux

  • 1
    An experimental CEP (Complex Event Processing) engine that implements event stream processing as a library embeddable in C++ and Perl. It has since been renamed to Triceps, so please look at the new location https://sourceforge.net/projects/t
    Downloads: 0 This Week
  • 2
    Cosmos DB Spark

    Apache Spark Connector for Azure Cosmos DB

    Azure Cosmos DB Spark is the official connector for Azure Cosmos DB and Apache Spark. The connector lets you read from and write to Azure Cosmos DB via Apache Spark DataFrames in Python and Scala. It also lets you build a lambda architecture for batch processing, stream processing, and a serving layer, while being globally replicated and minimizing the latency involved in working with big data.
    Downloads: 0 This Week
  • 3
    DSPatch

    The Refreshingly Simple C++ Dataflow Framework

    Website: http://flowbasedprogramming.com DSPatch, pronounced "dispatch", is a powerful C++ dataflow framework. DSPatch is not limited to any particular domain or data type: from reactive programming to stream processing, its generic, object-oriented API allows you to create virtually any dataflow system imaginable. See also: DSPatcher (https://github.com/MarcusTomlinson/DSPatcher), a cross-platform graphical tool for building DSPatch circuits, and DSPatchables (https://github.com/MarcusTomlinson/DSPatchables), a DSPatch component repository.
    Downloads: 0 This Week
  • 4
    Dataflow Java SDK

    Google Cloud Dataflow provides a simple, powerful model

    The Dataflow Java SDK is the open-source Java library that powers Apache Beam pipelines for Google Cloud Dataflow, a serverless and scalable platform for processing large datasets in both batch and stream modes. This SDK allows developers to write Beam-based pipelines in Java and execute them on Dataflow, taking advantage of features like autoscaling, dynamic work rebalancing, and fault-tolerant distributed processing. While it has been largely superseded by the unified Beam SDKs, it remains relevant for legacy systems and offers insight into the underlying mechanisms that power scalable data workflows on Google Cloud.
    Downloads: 0 This Week
  • 5
    Fondant

    Production-ready data processing made easy and shareable

    Fondant is a modular, pipeline-based framework designed to simplify the preparation of large-scale datasets for training machine learning models, especially foundation models. It offers an end-to-end system for ingesting raw data, applying transformations, filtering, and formatting outputs—all while remaining scalable and traceable. Fondant is designed with reproducibility in mind and supports containerized steps using Docker, making it easy to share and reuse data processing components. It’s built for use in research and production, empowering data scientists to streamline dataset curation and preprocessing workflows efficiently.
    Downloads: 0 This Week
  • 6
    A Middleware for Distributed Data Stream Processing
    Downloads: 0 This Week
  • 7
    HStreamDB

    HStreamDB is an open-source, cloud-native streaming database

    HStreamDB is an open-source, cloud-native streaming database for IoT and beyond, intended to modernize your data stack for real-time applications. When you subscribe to streams in HStreamDB, every update to the data stream is pushed to your apps in real time, which makes them more responsive. HStreamDB can also take the place of a message broker. It provides built-in support for event-time-based stream processing: you can use familiar SQL to perform basic filtering and transformation, statistics and aggregation over multiple kinds of time windows, and even joins between multiple streams. With the connectors provided, you can integrate HStreamDB with external systems such as MQTT brokers, MySQL, Redis and Elasticsearch. More connectors will be added.
    Downloads: 0 This Week
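To make the idea of aggregating over event-time windows more concrete, here is a minimal Python sketch of a tumbling-window count. It is not HStreamDB code (HStreamDB expresses this in SQL); the function name `tumbling_window_counts`, the `(timestamp, key)` event shape, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (event_time_in_seconds, key).
Event = Tuple[int, str]

def tumbling_window_counts(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping event-time windows."""
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key in events:
        # Each event belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

# Illustrative data only.
EVENTS = [(0, "sensor-a"), (5, "sensor-a"), (12, "sensor-b"),
          (19, "sensor-a"), (20, "sensor-a")]

if __name__ == "__main__":
    for (start, key), n in sorted(tumbling_window_counts(EVENTS, 10).items()):
        print(f"window starting at {start}s, {key}: {n} event(s)")
```

The window is derived from the timestamp carried by the event rather than from arrival time, which is what distinguishes event-time processing from processing-time processing.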
  • 8

    LogsGrep

    A grep-like utility for log files.

    LogsGrep is a grep-like utility designed specifically for log files containing multi-line entries. The primary target is Java log files (Log4J, common, ...), where multi-line log entries are very common (for example, log entries with a stack trace). It follows the Unix philosophy: it does only its primary job and expects its input to be generated by other tools (tail, cat, type, find...). Compatibility with Unix grep is not a goal. LogsGrep is written in the Java programming language with performance and low resource usage in mind (no strings, no object creation, stream processing).
    Downloads: 0 This Week
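The multi-line matching idea described for LogsGrep can be sketched in a few lines of Python. This is not LogsGrep's implementation (which is written in Java); it assumes, purely for illustration, that every entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp, and the names `group_entries` and `grep_entries` are hypothetical.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a new log entry always begins with a timestamp like
# "2024-01-01 10:00:01"; any other line continues the previous entry.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def group_entries(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group physical lines into logical log entries."""
    entry: List[str] = []
    for line in lines:
        if ENTRY_START.match(line) and entry:
            yield entry
            entry = []
        entry.append(line)
    if entry:
        yield entry

def grep_entries(lines: Iterable[str], pattern: str) -> Iterator[List[str]]:
    """Yield whole entries in which any line matches the pattern."""
    rx = re.compile(pattern)
    for entry in group_entries(lines):
        if any(rx.search(line) for line in entry):
            yield entry

# Illustrative data only.
SAMPLE = [
    "2024-01-01 10:00:00 INFO started",
    "2024-01-01 10:00:01 ERROR request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Main.run(Main.java:42)",
    "2024-01-01 10:00:02 INFO done",
]

if __name__ == "__main__":
    for entry in grep_entries(SAMPLE, r"IllegalStateException"):
        print("\n".join(entry))
```

A plain line-oriented grep would print only the exception line; grouping first returns the ERROR line and the stack trace that belong with it.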
  • 9
    MXQuery is a low-footprint implementation of XQuery 1.0, XQuery Update 1.0, XQuery Fulltext 1.0 and XQuery Scripting 1.0 as well as a subset of XQuery 1.1 (windowing, try/catch). It provides extensions to do data stream processing/CEP and SOAP/REST
    Downloads: 0 This Week
  • 10
    Sed.py is a Python module that provides an easy way to do text stream processing. As the name suggests, it aims to do the work that sed can do, but in Python's way rather than sed's. Using this module requires knowledge of regular expressions.
    Downloads: 0 This Week
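As a rough illustration of what sed-style processing "in Python's way" can look like, the sketch below chains generators over a stream of lines using the standard `re` module. It does not use or reproduce the Sed.py API; the function names `substitute` and `delete_matching` are hypothetical.

```python
import re
from typing import Iterable, Iterator

def substitute(
    lines: Iterable[str], pattern: str, repl: str, replace_all: bool = False
) -> Iterator[str]:
    """Rough analogue of sed's s/pattern/repl/ (or s/pattern/repl/g)."""
    rx = re.compile(pattern)
    # count=1 replaces only the first match per line, like sed without "g";
    # count=0 tells re.sub to replace every match.
    count = 0 if replace_all else 1
    for line in lines:
        yield rx.sub(repl, line, count=count)

def delete_matching(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Rough analogue of sed's /pattern/d: drop lines that match."""
    rx = re.compile(pattern)
    return (line for line in lines if not rx.search(line))

# Illustrative data only.
CONFIG = ["# comment", "foo=1", "bar=2"]

if __name__ == "__main__":
    pipeline = substitute(delete_matching(CONFIG, r"^#"), r"=(\d+)", r": \1")
    for line in pipeline:
        print(line)
```

Because each stage is a generator, lines flow through the pipeline one at a time, so the input never has to be held in memory as a whole.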
  • 11
    A production stable Java utility library with convenience methods for string- and stream processing, file handling, XML, XSLTs and XPath, checksums, console formatting, and more. The project is developed by the State and University Library of Denmark
    Downloads: 0 This Week
  • 12
    SPar: Stream Parallelism in Multi-Cores

    An Embedded C++ Domain-Specific Language

    SPar is an internal C++ Domain-Specific Language (DSL) suitable to model and implement classical stream parallel patterns. The DSL uses standard C++ attributes to introduce annotations tagging the notable components of stream parallel applications: stream sources and stream processing stages. Latest version can be downloaded from the SVN using the following command: svn checkout svn://svn.code.sf.net/p/spar-dsl-compiler/svn/ spar
    Downloads: 0 This Week
  • 13
    Strings Edit

    String editing and formatting library for Ada

    Strings Edit is a library that provides I/O facilities for integers, floating-point numbers, Roman numerals, and strings. Both input and output subroutines support string pointers for consecutive stream processing. The output can be aligned in a fixed-size field with padding. Numeric input can be checked against an expected range of values, with out-of-range input either saturated or causing an exception to be raised. For floating-point output, either relative or absolute precision can be specified. UTF-8 encoded strings are supported, including wildcard pattern matching, sets and maps of code points, upper/lowercase conversion, and other Unicode categorizations.
    Downloads: 0 This Week
  • 14
    TeleScope

    XML Data Stream Broker/Replicator

    TeleScope is an efficient, high-load XML data stream broker, replicator and simple event processing (SEP) platform written in C for the Fedora 17-18, Slackware 13-14, and Red Hat Enterprise Linux 6 (RHEL-6) distributions. The platform is intended to operate on single number/word values and is not meant for full-text XML stream analysis. TeleScope has an internal query language with a set of standard logical operators that allows relatively complex query expressions to be constructed. The platform uses a pub-sub architecture and serves a set of simultaneously connected XML stream subscribers. The broker features a Continuous Query engine over the XML stream. TeleScope provides a remote CLI (Cisco-style login via telnet) for changing or resetting the query transaction on the current stream on the fly, in real time. It also reports data query and subscriber statistics via a separate status port.
    Downloads: 0 This Week
  • 15
    An innovative open-source CEP (Complex Event Processing) engine. It implements event stream processing as a library embeddable in C++ and Perl. You can think of the Complex Event Processing engine as an in-memory database driven by triggers, or a data-flow machine, or a spreadsheet on steroids (and without the GUI part).
    Downloads: 0 This Week
  • 16
    activeinsight
    ActiveInsight provides real-time detection and reaction to events and patterns. It is a platform that enables the detection of meaningful events within multiple, high frequency, event streams.
    Downloads: 0 This Week
  • 17
    collapse

    Advanced and Fast Data Transformation in R

    collapse is a high-performance R package designed for fast and efficient data transformation, aggregation, reshaping, and statistical computation. Built to offer a more performant alternative to dplyr and data.table, it is particularly well-suited for large datasets and econometric applications. It operates on base R data structures like data frames and vectors and uses highly optimized C++ code under the hood to deliver significant speed improvements. collapse also includes tools for grouped operations, weighted statistics, and time series manipulation, making it a compact yet powerful utility for data scientists and researchers working in R.
    Downloads: 0 This Week
  • 18
    ksqlDB

    The database purpose-built for stream processing applications

    Build applications that respond immediately to events. Craft materialized views over streams. Receive real-time push updates, or pull current state on demand. Seamlessly leverage your existing Apache Kafka® infrastructure to deploy stream-processing workloads and bring powerful new capabilities to your applications. Use a familiar, lightweight syntax to pack a powerful punch. Capture, process, and serve queries using only SQL. No other languages or services are required. ksqlDB enables you to build event streaming applications leveraging your familiarity with relational databases. Three categories are foundational to building an application: collections, stream processing, and queries. Streams are immutable, append-only sequences of events. They're useful for representing a series of historical facts. Tables are mutable collections of events. They let you represent the latest version of each value per key.
    Downloads: 0 This Week
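The stream/table distinction in the ksqlDB description can be illustrated with a small Python sketch. This is a conceptual model only, not ksqlDB's API (ksqlDB itself is driven by SQL); the `materialize` function and the sample events are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (key, value).
Event = Tuple[str, int]

def materialize(stream: List[Event]) -> Dict[str, int]:
    """Fold an append-only stream into a table of the latest value per key."""
    table: Dict[str, int] = {}
    for key, value in stream:
        # A later event for the same key overwrites the earlier value.
        table[key] = value
    return table

# The stream keeps every historical fact, in order (illustrative data only).
STREAM: List[Event] = [("alice", 10), ("bob", 5), ("alice", 7)]

if __name__ == "__main__":
    print("stream:", STREAM)
    print("table: ", materialize(STREAM))
```

The stream retains all three events, while the table holds one row per key with its most recent value, which is the relationship a materialized view over a stream captures.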
  • 19
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 0 This Week
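For readers unfamiliar with MinHash, the following Python sketch shows the general technique of estimating Jaccard similarity from fixed-size signatures. It is a generic textbook-style illustration, not text-dedup's implementation or API; the function names, the word-trigram shingling, and the use of salted BLAKE2b as the hash family are all assumptions.

```python
import hashlib
from typing import List, Set

def shingles(text: str, n: int = 3) -> Set[str]:
    """Split text into overlapping word n-grams (assumed shingling scheme)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def jaccard(a: Set[str], b: Set[str]) -> float:
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def _hash(item: str, seed: int) -> int:
    # One salted 64-bit hash per seed stands in for one random permutation.
    digest = hashlib.blake2b(
        item.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "big")

def minhash_signature(items: Set[str], num_perm: int = 64) -> List[int]:
    """Keep the minimum hash value of the set under each of num_perm hashes."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_perm)]

def estimate_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

# Illustrative data only.
DOC_A = "the quick brown fox jumps"
DOC_B = "the quick brown fox leaps"

if __name__ == "__main__":
    a, b = shingles(DOC_A), shingles(DOC_B)
    print("exact Jaccard:   ", jaccard(a, b))
    print("MinHash estimate:", estimate_similarity(minhash_signature(a),
                                                   minhash_signature(b)))
```

Signatures have a fixed length regardless of document size, so comparing two documents costs a constant number of integer comparisons. The estimate is probabilistic and tightens as `num_perm` grows; pairs above a chosen similarity threshold would be treated as near-duplicates.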