Search Results for "data processing" - Page 3

Showing 2018 open source projects for "data processing"

  • 1
    lxml

    The lxml XML toolkit for Python

    A Python library for efficient XML and HTML processing, known for speed and compatibility. The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API that is mostly compatible with, but superior to, the well-known ElementTree API. The latest release works with all CPython versions from 3.6 to 3.12. See the introduction for more information about the...
    Downloads: 19 This Week
  • 2
    GridDB

    GridDB is a next-generation open source database

    ...Multi-model architecture capable of supporting various data stores with time-series data-oriented and pluggable data stores for efficient real-time processing and management of huge amounts of time-series data at high frequency. Various architectural innovations, such as in-memory orientation with "memory as the main unit and disk as the secondary unit" and event-driven design with minimal overhead, have been incorporated to achieve processing capabilities that can handle petabyte-scale applications.
    Downloads: 0 This Week
  • 3
    iLovePDF Api

    iLovePDF Rest Api - PHP Library

    Develop and automate PDF processing tasks such as compressing, merging, splitting, rotating, unlocking, and repairing PDFs, converting Office to PDF, PDF to JPG, and images to PDF, adding page numbers, and stamping a watermark. Each tool has several settings to get your desired results. A strong infrastructure offers dedicated processing power. You might know us from ilovepdf.com where we process millions of PDFs daily. We offer a simple and concise API Reference and Guide as well as...
    Downloads: 13 This Week
  • 4
    CSV Lint

    CSV Lint plug-in for Notepad++ for syntax highlighting

    CSV Lint is a Notepad++ plug-in for syntax highlighting, CSV validation, automatic column and datatype detection, and fixed-width datasets. It can change datetime formats and decimal separators, sort data, count unique values, and convert to XML, JSON, SQL, etc. It is a plugin for data cleaning and working with messy data files. Use CSV Lint for metadata discovery, technical data validation, and reformatting of tabular data files. It is not meant to replace programs like Excel or SPSS; rather, it is a quality control tool to examine, verify, or polish up a dataset before further processing.
    Downloads: 36 This Week
  • 5
    Point Cloud Library

    A standalone, large scale, open project for 2D/3D image processing

    The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing. PCL is released under the terms of the BSD license, and thus free for commercial and research use. Whether you’ve just discovered PCL or you’re a long time veteran, this page contains links to a set of resources that will help consolidate your knowledge on PCL and 3D processing. An additional Wiki resource for developers is available too. To simplify both usage and...
    Downloads: 12 This Week
  • 6
    cobalt

    Video and media downloader: Best way to save what you love

    Cobalt is an open-source media downloader and tool designed to provide a high-performance and privacy-focused alternative for interacting with online media content, particularly focused on downloading and processing media from various platforms. It emphasizes speed, reliability, and a clean user experience, allowing users to retrieve media without unnecessary tracking, ads, or intrusive elements commonly found in web-based tools. The project is built with performance in mind, leveraging efficient backend processing to handle requests quickly and consistently. ...
    Downloads: 12 This Week
  • 7
    eXist-db

    eXist Native XML Database and Application Platform

    eXist-db is an open-source, native XML database and application platform that provides a powerful environment for storing, querying, and managing XML documents. It is designed for complex data management needs, offering XQuery, XSLT, and RESTful web services for interacting with structured data.
    Downloads: 7 This Week
  • 8
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (Cloud stores, HDFS or any Hadoop FileSystem compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low latency minute-level analytics. Hudi provides efficient upserts, by mapping a given hoodie key (record key + partition path) consistently to a file id, via an indexing mechanism. ...
    Downloads: 0 This Week
  • 9
    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    Dolma provides data and tools for generating and inspecting the large-scale pre-training datasets used for the OLMo language models.
    Downloads: 0 This Week
  • 10
    KCloud‑Platform‑IoT

    KCloud-Platform-IoT

    KCloud-Platform-IoT is a comprehensive open-source IoT management platform built with Spring Cloud and Vue.js. It supports device registration, data collection, rule-based processing, and dashboard visualization. Designed for scalability and modularity, the platform is ideal for managing large IoT fleets in industrial or smart city environments.
    Downloads: 3 This Week
  • 11
    InfluxDB

    The open source time series database

    ...Time series is currently the fastest growing database category there is, and InfluxDB is here to ensure businesses can keep up. InfluxDB provides infrastructure and application monitoring, IoT monitoring and analytics and more. It has APIs for storing and querying data, processing it in the background for ETL or monitoring and alerting purposes. This data can also be visualized, explored and more to help businesses seize opportunities and make the best decisions. InfluxDB is easy to start and easy to scale. Learn more about it on https://www.influxdata.com/
    Downloads: 23 This Week
  • 12
    Lesan

    New way to create web server and NoSQL data model

    Lesan is a framework that combines a web server with an object document mapper (ODM) for building services on top of a NoSQL data model.
    Downloads: 2 This Week
  • 13
    GemGIS

    Spatial data processing for geomodeling

    GemGIS is a Python-based, open-source geographic information processing library. It is capable of preprocessing spatial data such as vector data (shape files, geojson files, geopackages,…), raster data (tif, png,…), data obtained from online services (WCS, WMS, WFS) or XML/KML files (soon). Preprocessed data can be stored in a dedicated Data Class to be passed to the geomodeling package GemPy in order to accelerate the model-building process. ...
    Downloads: 2 This Week
  • 14
    WiFi DensePose

    Turn WiFi signals into real-time human pose estimation and detection

    WiFi DensePose is a production-oriented implementation of a WiFi-based human pose estimation system that enables real-time full-body tracking using wireless signals rather than cameras. The project demonstrates how commodity mesh routers and signal processing techniques can be leveraged to infer dense human pose information, even through obstacles such as walls. It is designed to showcase the emerging field of RF-based sensing, where machine learning models interpret wireless channel data to reconstruct human movement and posture. The repository includes components for data processing, model inference, and real-time visualization, making it suitable for research and experimental deployments. ...
    Downloads: 76 This Week
  • 15
    HStreamDB

    HStreamDB is an open-source, cloud-native streaming database

    HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications. When your apps subscribe to streams in HStreamDB, every update to a data stream is pushed to them in real time, which makes them more responsive. You can also replace message brokers with HStreamDB, and everything you do with message brokers can be done better with HStreamDB. HStreamDB provides built-in support for event time-based stream processing. ...
    Downloads: 0 This Week
  • 16
    Open3D

    A modern library for 3D data processing

    Open3D is an open-source library that supports rapid development of software that deals with 3D data. The Open3D frontend exposes a set of carefully selected data structures and algorithms in both C++ and Python. The backend is highly optimized and is set up for parallelization. Open3D was developed from a clean slate with a small and carefully considered set of dependencies. It can be set up on different platforms and compiled from source with minimal effort. The code is clean, consistently...
    Downloads: 10 This Week
  • 17
    html-loader

    HTML Loader

    Exports HTML as a string. HTML is minimized when the compiler demands. For the sources option, the true value enables the processing of all default elements and attributes, and the false value disables the processing of all attributes. An object value lets you specify which tags and attributes to process and how, filter them, filter URLs, and process sources starting with /. Filter can also be used to extend the supported elements...
    Downloads: 3 This Week
  • 18
    Acl

    A powerful server and network library, including coroutine

    The Acl (Advanced C/C++ Library) project is a powerful multi-platform network communication library and service framework, supporting Linux, Win32, Solaris, FreeBSD, macOS, Android, and iOS. Many applications written with Acl run on Linux, Windows, iPhone, and Android devices and serve billions of users. The Acl project includes several important modules, including network communication, server framework, application protocols, multiple coders, etc. The common protocols such as...
    Downloads: 14 This Week
  • 19
    Dolphin Scheduler

    A distributed and extensible workflow scheduler platform

    Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs out of the box. Its architecture is decentralized, with multiple masters and multiple workers; it supports high availability (HA) natively and includes overload handling. All process definition operations are visualized, so key information is visible at a glance, and deployment takes one click. ...
    Downloads: 0 This Week
  • 20
    SigLens

    100x More Efficient Log Management than Splunk

    SigLens is an open-source observability tool for ingesting, searching, and analyzing logs, positioned as a more efficient alternative to Splunk.
    Downloads: 2 This Week
  • 21
    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ...ROOT provides a very efficient storage system for data models that has been demonstrated to scale at the Large Hadron Collider experiments: exabytes of scientific data are written in the columnar ROOT format. ROOT comes with histogramming capabilities in an arbitrary number of dimensions, curve fitting, statistical modeling, and minimization, allowing the easy setup of a data analysis system that can query and process the data interactively or in batch mode. It also offers a general parallel processing framework, RDataFrame, that can considerably speed up an analysis.
    Downloads: 3 This Week
  • 22
    Waves Platform Node

    Host connected to the Waves blockchain network

    ...Waves is an open source blockchain platform that offers a full blockchain ecosystem for building decentralised applications. Nodes are its critical components, performing several important functions such as processing and validating transactions, and generating and storing blocks. Nodes store full blockchain data, pass this data to other nodes, and check the validity of newly added blocks. Validation ensures that the blocks are all in the correct format, all hashes are computed correctly, that the new block contains the hash of the previous one, and that every transaction is validated and signed by the right parties.
    Downloads: 10 This Week
  • 23
    tidytext

    Text mining using tidy tools

    tidytext brings tidy data principles to text mining by converting text into a tidy data frame format. It provides tools for tokenization, sentiment analysis, n‑gram creation, and term‑document matrices, enabling interoperability with dplyr, ggplot2, and other tidyverse workflows.
    Downloads: 0 This Week
  • 24
    Instill Core

    Instill Core is a full-stack AI infrastructure tool for data

    Instill Core is an open-source, full-stack AI infrastructure platform designed to orchestrate data pipelines, machine learning models, and unstructured data processing into a unified, production-ready system. It provides an end-to-end solution that enables developers to build, deploy, and manage AI-powered applications without needing to manually stitch together multiple tools across the data and model lifecycle. The platform focuses heavily on handling unstructured data such as documents, images, audio, and video, transforming them into AI-ready formats through integrated ETL pipelines and processing workflows. ...
    Downloads: 0 This Week
  • 25
    python-small-examples

    Focus on creating classic Python small examples and cases

    python-small-examples is an open-source educational repository that contains hundreds of concise Python programming examples designed to illustrate practical coding techniques. The project focuses on teaching programming concepts through small, focused scripts that demonstrate common tasks in data processing, visualization, and general programming. Each example highlights a specific function or programming pattern so that learners can quickly understand how to apply Python features in real-world scenarios. The repository includes examples covering topics such as file processing, JSON manipulation, data visualization, and library usage. ...
    Downloads: 4 This Week