Showing 56 open source projects for "data processing"

  • 1
    go-streams

    A lightweight stream processing library for Go

    A lightweight stream processing library for Go. go-streams provides a simple and concise DSL to build data pipelines. In computing, a pipeline, also known as a data pipeline, is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion.
    Downloads: 3 This Week
  • 2
    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Open source framework for processing, monitoring, and alerting on time series data. Kapacitor is a real-time data processing engine for monitoring and alerting, specifically designed to work with time-series data from InfluxDB.
    Downloads: 1 This Week
  • 3
    Numaflow

    Kubernetes-native platform to run massively parallel data/streaming

    Numaflow is a Kubernetes-native tool for running massively parallel stream processing. A Numaflow Pipeline is implemented as a Kubernetes custom resource and consists of one or more source, data processing, and sink vertices. Numaflow installs in a few minutes and is easier and cheaper to use for simple data processing applications than a full-featured stream processing platform.
    Downloads: 3 This Week
  • 4
    pdfcpu

    A PDF processor written in Go

    pdfcpu is a PDF processing library written in Go that supports encryption. It provides both an API and a CLI. All versions up to PDF 1.7 (ISO-32000) are supported. The project is an effort to build a comprehensive PDF processing library in Go from the ground up. Over time pdfcpu aims to support the standard range of PDF processing features and also any interesting use cases that may present themselves along the way. The main focus lies on strong support for batch processing and scripting via a...
    Downloads: 15 This Week
  • 5
    Pachyderm

    Data-Centric Pipelines and Data Versioning

    ...Pachyderm provides a powerful solution to optimize data processing, MLOps, and ML Lifecycles.
    Downloads: 0 This Week
  • 6
    Miller

    Miller is like awk, sed, cut, join, and sort for name-indexed data

    Miller is like awk, sed, cut, join, and sort for data formats such as CSV, TSV, JSON, JSON Lines, and positionally-indexed text. With Miller, you use named fields without needing to count positional indices. On the fly, you can add new fields that are functions of existing fields, drop fields, sort, aggregate statistically, pretty-print, and more. Miller operates on key-value-pair data while the...
    Downloads: 18 This Week
  • 7
    Benthos

    Fancy stream processing made operationally mundane

    Benthos is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads. It comes with a powerful mapping language, is easy to deploy and monitor, and ready to drop into your pipeline either as a static binary, docker image, or serverless function, making it cloud native as heck. Delivery guarantees can be a dodgy subject. Benthos processes and...
    Downloads: 24 This Week
  • 8
    InfluxDB

    The open source time series database

    ...Time series is currently the fastest growing database category there is, and InfluxDB is here to ensure businesses can keep up. InfluxDB provides infrastructure and application monitoring, IoT monitoring and analytics and more. It has APIs for storing and querying data, processing it in the background for ETL or monitoring and alerting purposes. This data can also be visualized, explored and more to help businesses seize opportunities and make the best decisions. InfluxDB is easy to start and easy to scale. Learn more about it on https://www.influxdata.com/
    Downloads: 24 This Week
  • 9
    SigLens

    Log management 100x more efficient than Splunk

    SigLens is an open-source log management and observability tool for processing and visualizing log and time-series data.
    Downloads: 3 This Week
  • 10
    Watermill

    Building event-driven applications the easy way in Go

    Go library for building event-driven applications. Our goal was to create a tool that is easy to understand, even by junior developers. It doesn't matter if you want to do Event-driven architecture, CQRS, Event Sourcing or just stream MySQL Binlog to Kafka. Watermill was designed to process hundreds of thousands of messages per second. Every component is built in a way that allows you to configure it for your needs. You can also implement your own middleware for the router. Watermill is...
    Downloads: 1 This Week
  • 11
    ojg

    Optimized JSON for Go

    Optimized JSON for Go is a high-performance parser with a variety of additional JSON tools. OjG is optimized for processing huge data sets where data does not necessarily conform to a fixed structure.
    Downloads: 2 This Week
  • 12
    Bitalosdb

    Bitalosdb is a high-performance KV storage engine

    BitalosDB is a distributed, high-performance key-value database designed for cloud-native applications. It is optimized for scalability, supporting large workloads while maintaining low latency and high availability.
    Downloads: 3 This Week
  • 13
    Nuclio

    High-Performance Serverless event and data processing platform

    Nuclio is an open source and managed serverless platform used to minimize development and maintenance overhead and automate the deployment of data-science-based applications. Real-time performance running up to 400,000 function invocations per second. Portable across low-power devices, laptops, edge, on-prem and multi-cloud deployments. The first serverless platform supporting GPUs for optimized utilization and sharing. Automated deployment to production in a few clicks from Jupyter notebook. Deploy one of...
    Downloads: 6 This Week
  • 14
    paperless-gpt

    Use LLMs and LLM Vision (OCR) to handle paperless-ngx

    paperless-gpt is an AI-powered extension for document management systems that enhances the capabilities of paperless-ngx by integrating large language models and vision-based OCR to automate document processing and organization. It is designed to transform scanned or uploaded documents into structured, searchable, and intelligently categorized data without requiring manual tagging or sorting. The system uses OCR combined with LLM reasoning to extract text, classify documents, and generate metadata such as tags, titles, and categories automatically. ...
    Downloads: 5 This Week
  • 15
    Hacks

    A collection of hacks and one-off scripts

    Hacks is a collection of experimental scripts, utilities, and one-off tools created to solve specific problems in security research, data processing, and automation. Rather than being a single cohesive application, it serves as a repository of practical command-line tools that can be used independently or combined into workflows. The scripts cover a wide range of tasks, including URL manipulation, parameter replacement, data extraction, and reconnaissance automation. ...
    Downloads: 0 This Week
  • 16
    memphis

    Next-Generation Event Processing Platform

    Memphis enables building modern queue-based applications that require large volumes of streamed and enriched data, modern protocols, zero ops, up to 9x faster development, up to 46x lower costs, and significantly lower dev time for data-oriented developers and data engineers. Queues and brokers are a mission-critical component in the modern application architecture and should be as highly available and stable as possible. Provide great performance while maintaining efficient resource...
    Downloads: 3 This Week
  • 17
    Neuroglancer

    WebGL-based viewer for volumetric data

    ...The viewer is built with a multi-threaded architecture, separating rendering and data processing to ensure smooth performance even with massive datasets. Extensively used in neuroscience research, Neuroglancer supports integration with tools.
    Downloads: 3 This Week
  • 18
    encoding

    Go package containing implementations of efficient encoding

    Go package containing implementations of encoders and decoders for various data formats. At Segment, we do a lot of marshaling and unmarshaling of data when sending, queuing, or storing messages. The resources we need to provision on the infrastructure are directly related to the type and amount of data that we are processing. At the scale we operate at, the tools we choose to build programs can have a large impact on the efficiency of our systems.
    Downloads: 2 This Week
  • 19
    ASNmap

    CLI tool for mapping organization network ranges using ASN data

    ...Output can be generated in multiple formats including plain text, JSON, and CSV, enabling flexible data processing and analysis. asnmap also supports reading input from standard input and piping its results directly into other command line tools for chained workflows.
    Downloads: 11 This Week
  • 20
    Grafana Agent

    Vendor-neutral programmable observability pipelines

    Grafana Agent is an OpenTelemetry Collector distribution with a configuration inspired by Terraform. It is designed to be flexible, performant, and compatible with multiple ecosystems such as Prometheus and OpenTelemetry. Grafana Agent is based on components. Components are wired together to form programmable observability pipelines for telemetry collection, processing, and delivery.
    Downloads: 59 This Week
  • 21
    Bacalhau

    Community-driven, simple, yet powerful framework

    Bacalhau is a decentralized compute platform for running jobs on data stored across distributed networks, like IPFS or Filecoin, without moving the data to centralized cloud environments. It allows developers to run containerized workloads close to where the data lives, reducing latency, cost, and privacy risks. Bacalhau supports various runtime environments and is designed to make decentralized data processing as accessible as traditional cloud computing. ...
    Downloads: 2 This Week
  • 22
    protoactor-go

    Proto Actor - Ultra fast distributed actors for Go, C# and Java/Kotlin

    Built on cloud-native technologies. Taking advantage of proven stability and performance. Asynchronous and Distributed by design. High-level abstractions like Actors and Virtual Grains. Capable of millions of messages per second cross-process communication. Write systems that self-heal using supervisor hierarchies. The Actor Model provides a higher level of abstraction for writing concurrent and distributed systems. It relieves the developer of having to deal with explicit locking and...
    Downloads: 3 This Week
  • 23
    getty

    Asynchronous network I/O library

    Getty is an asynchronous network I/O library developed in Golang. It operates on TCP, UDP, and WebSocket network protocols, providing a consistent interface, EventListener. Within Getty, each connection (session) involves two separate goroutines. One handles the reading of TCP streams, UDP packets, or WebSocket packages, while the other manages the logic processing and writes responses into the network write buffer. If your logic processing might take a considerable amount of time, it's...
    Downloads: 2 This Week
  • 24
    Colly

    Elegant Scraper and Crawler Framework for Golang

    Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving. Clean API. Fast (>1k requests/sec on a single core). Manages request delays and maximum concurrency per domain. Automatic cookie and session handling. Sync/async/parallel scraping. Distributed scraping. Caching, automatic encoding of non-unicode responses. ...
    Downloads: 4 This Week
  • 25
    Argo Workflows

    Workflow engine for Kubernetes

    ...Model multi-step workflows as a sequence of tasks or capture the dependencies between tasks using a directed acyclic graph (DAG). Easily run compute intensive jobs for machine learning or data processing in a fraction of the time using Argo Workflows on Kubernetes. Run CI/CD pipelines natively on Kubernetes without configuring complex software development products. Argo Workflows is the most popular workflow execution engine for Kubernetes. It can run 1000s of workflows a day, each with 1000s of concurrent tasks. Our users say it is lighter-weight, faster, more powerful, and easier to use. ...
    Downloads: 2 This Week