Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX
OpenGL Mathematics (GLM)
Python ETL framework for stream processing, real-time analytics, LLM
A curated list of data mining papers about fraud detection
Device management, data collection, processing and visualization
Data-Centric Pipelines and Data Versioning
Stream Processing and Complex Event Processing Engine
Data Science Guide With Videos And Materials
A simple interface for working with TeX documents
Training data (data labeling, annotation, workflow) for all data types
Blazing-fast Data-Wrangling toolkit
Official HDF5® Library Repository
Docker image used to run data processing workloads
A GPU-accelerated library containing highly optimized building blocks
Production-ready data processing made easy and shareable
Unified programming model for Batch and Streaming
Addax is a versatile open-source ETL tool
Flink CDC is a streaming data integration tool
Miller is like awk, sed, cut, join, and sort for name-indexed data
Open source libraries and APIs to build custom preprocessing pipelines
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
Local-first AI chat analysis tool for insights from conversation data
GridDB is a next-generation open source database
The lxml XML toolkit for Python
iLovePDF REST API - PHP Library