Search Results for "data processing" - Page 4

Showing 1649 open source projects for "data processing"

View related business solutions
  • Dominate AI Search Results Icon
    Dominate AI Search Results

    Generative Al is shaping brand discovery. AthenaHQ ensures your brand leads the conversation.

    AthenaHQ is a cutting-edge platform for Generative Engine Optimization (GEO), designed to help brands optimize their visibility and performance across AI-driven search platforms like ChatGPT, Google AI, and more.
    Learn More
  • Taking the Paper Out of Work Icon
    Taking the Paper Out of Work

    For organizations that need powerful ECM and document automation software

    The Square 9 AI-powered intelligent document processing platform takes the paper out of work and makes it easier to get things done with digital workflows.
    Learn More
  • 1
    Kingfisher

    Kingfisher

    Lightweight, pure-Swift library for downloading images from the web

    Kingfisher is a powerful, pure-Swift library for downloading and caching images from the web. It provides you a chance to use a pure-Swift way to work with remote images in your next app. Asynchronous image downloading and caching. Loading image from either URLSession-based networking or local provided data. Useful image processors and filters provided. Multiple-layer hybrid cache for both memory and disk. Fine control on cache behavior. Customizable expiration date and size limit....
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    tidytext

    tidytext

    Text mining using tidy tools

    tidytext brings tidy data principles to text mining by converting text into a tidy data frame format. It provides tools for tokenization, sentiment analysis, n‑gram creation, and term‑document matrices, enabling interoperability with dplyr, ggplot2, and other tidyverse workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Waves Platform Node

    Waves Platform Node

    Host connected to the Waves blockchain network

    ...Waves is an open source blockchain platform that offers a full blockchain ecosystem for building decentralised applications. Nodes are its critical components, performing several important functions such as processing and validating transactions, and generating and storing blocks. Nodes store full blockchain data, pass this data to other nodes, and check the validity of newly added blocks. Validation ensures that the blocks are all in the correct format, all hashes are computed correctly, that the new block contains the hash of the previous one, and that every transaction is validated and signed by the right parties.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 4
    deepdoctection

    deepdoctection

    A Repo For Document AI

    ...For more specific text processing tasks use one of the many other great NLP libraries.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Stigg | SaaS Monetization and Entitlements API Icon
    Stigg | SaaS Monetization and Entitlements API

    For developers in need of a tool to launch pricing plans faster and build better buying experiences

    A monetization platform is a standalone middleware that sits between your application and your business applications, as part of the modern enterprise billing stack. Stigg unifies all the APIs and abstractions billing and platform engineers had to build and maintain in-house otherwise. Acting as your centralized source of truth, with a highly scalable and flexible entitlements management, rolling out any pricing and packaging change is now a self-service, risk-free, exercise.
    Learn More
  • 5
    python-small-examples

    python-small-examples

    Focus on creating classic Python small examples and cases

    python-small-examples is an open-source educational repository that contains hundreds of concise Python programming examples designed to illustrate practical coding techniques. The project focuses on teaching programming concepts through small, focused scripts that demonstrate common tasks in data processing, visualization, and general programming. Each example highlights a specific function or programming pattern so that learners can quickly understand how to apply Python features in real-world scenarios. The repository includes examples covering topics such as file processing, JSON manipulation, data visualization, and library usage. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    Benthos

    Benthos

    Fancy stream processing made operationally mundane

    Benthos is a high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads. It comes with a powerful mapping language, is easy to deploy and monitor, and ready to drop into your pipeline either as a static binary, docker image, or serverless function, making it cloud native as heck. Delivery guarantees can be a dodgy subject. Benthos processes and...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 7
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 8
    DSP.jl

    DSP.jl

    Filter design, periodograms, window functions

    DSP.jl provides a number of common digital signal processing routines in Julia.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Classical Language Toolkit (CLTK)

    Classical Language Toolkit (CLTK)

    The Classical Language Toolkit

    The Classical Language Toolkit (CLTK) is a Python library offering natural language processing support for classical languages, including Latin, Greek, and others.
    Downloads: 4 This Week
    Last Update:
    See Project
  • All-in-One Mental Health EHR Icon
    All-in-One Mental Health EHR

    Simplify your systems. Strengthen your cash flow. Start fresh with Ensora Health

    Ensora Health’s Mental Health EHR is designed for mental health professionals, therapists, and practice managers looking for a secure, user-friendly solution to streamline administrative tasks and improve efficiency in their practice management
    Learn More
  • 10
    Performance Co-Pilot (PCP)

    Performance Co-Pilot (PCP)

    Performance Co-Pilot

    Performance Co-Pilot (PCP) provides a framework and services to support system-level performance monitoring and management. It presents a unifying abstraction for all of the performance data in a system, and many tools for interrogating, retrieving and processing that data. PCP is a feature-rich, mature, extensible, cross-platform toolkit supporting both live and retrospective analysis. The distributed PCP architecture makes it especially useful for those seeking centralized monitoring of distributed processing.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 11
    CGAL

    CGAL

    The Computational Geometry Algorithms Library

    ...These algorithms are useful in a wide range of applications, including computer aided design, robotics, molecular biology, medical imaging, geographic information systems and more. CGAL features a great range of data structures and algorithms, including Voronoi diagrams, cell complexes and polyhedra, triangulations, arrangements of curves, surface and volume mesh generation, spatial searching, alpha shapes, geometry processing, and many more. The use of these result in beautiful, visually complex and accurate representations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    ClickHouse

    ClickHouse

    A fast open-source OLAP database management system

    ...ClickHouse also has exceptional hardware efficiency and a host of other features, including a feature-rich SQL database, vectorized query execution, real-time query processing and data ingestion, and more.
    Downloads: 22 This Week
    Last Update:
    See Project
  • 13
    OmniTools

    OmniTools

    Self-hosted collection of powerful web-based tools for everyday tasks

    ...It’s designed to replace the random assortment of “free online tools” people use for quick tasks, while avoiding ads, tracking, and the need to upload sensitive files to unknown servers. A key design choice is that file processing happens entirely on the client side, meaning your data stays in your browser instead of being sent to the backend. The tool catalog spans both technical and non-technical needs, including image, video, audio, PDF, text, date/time, math, and data format utilities like JSON/CSV/XML helpers. It’s also packaged for straightforward self-hosting, with a lightweight Docker image and simple run commands, so it can be deployed quickly on a homelab or internal network.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 14
    collapse

    collapse

    Advanced and Fast Data Transformation in R

    collapse is a high-performance R package designed for fast and efficient data transformation, aggregation, reshaping, and statistical computation. Built to offer a more performant alternative to dplyr and data.table, it is particularly well-suited for large datasets and econometric applications. It operates on base R data structures like data frames and vectors and uses highly optimized C++ code under the hood to deliver significant speed improvements. collapse also includes tools for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Sparrow

    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    ...The architecture is modular, allowing developers to build customizable processing pipelines that integrate with external tools and data extraction frameworks. Sparrow also includes workflow orchestration tools that allow multiple extraction tasks to be combined into automated pipelines for large-scale document processing.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Edgee

    Edgee

    AI gateway with token compression for Claude Code, Codex, and more

    Edgee is an edge-native execution platform designed to run AI-driven logic and data processing directly at the network edge, reducing latency and improving responsiveness for modern applications. It enables developers to deploy functions and workflows closer to users, allowing real-time processing without relying heavily on centralized cloud infrastructure. The platform is built to support event-driven architectures, where actions are triggered by incoming requests, user behavior, or external signals. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 17
    DataProfiler

    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. The DataProfiler is a Python library designed to make data analysis, monitoring, and sensitive data detection easy. Loading Data with a single command, the library automatically formats & loads files into a DataFrame. Profiling the Data, the library identifies the schema, statistics, entities (PII / NPI), and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    mediasoup

    mediasoup

    Cutting Edge WebRTC Video Conferencing

    mediasoup is a Node.js library that provides a cutting-edge WebRTC server capable of handling real-time communications with efficient media routing and processing.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 19
    Superlinked

    Superlinked

    Superlinked is a Python framework for AI Engineers

    Superlinked is a Python framework designed for AI engineers to build high-performance search and recommendation applications that combine structured and unstructured data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    LightAutoML

    LightAutoML

    Fast and customizable framework for automatic ML model creation

    LightAutoML is an automated machine learning (AutoML) framework optimized for efficient model training and hyperparameter tuning, focusing on both tabular and text data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 102 This Week
    Last Update:
    See Project
  • 22
    LiteParse

    LiteParse

    A fast, helpful, and open-source document parser

    ...It also includes mechanisms for validation and error handling, ensuring that outputs conform to expected schemas and reducing the need for manual postprocessing. The library is particularly useful for tasks such as data extraction, document processing, and building pipelines that require structured outputs from natural language input.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    Arnis

    Arnis

    Generate any location from the real world in Minecraft

    ...The tool handles large-scale geospatial processing and transforms raw mapping data into a format compatible with Minecraft world generation. Users can generate entire regions, including detailed urban layouts and natural terrain, making it useful for education, visualization, or creative world-building projects.
    Downloads: 161 This Week
    Last Update:
    See Project
  • 24
    MindNLP

    MindNLP

    Easy-to-use and high-performance NLP and LLM framework

    MindNLP is a natural language processing library built on the MindSpore framework, providing tools and models for various NLP tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    ArrayFire

    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB