EEGLAB is an open-source signal processing environment
Concurrent and multi-stage data ingestion and data processing
Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX
A curated list of data mining papers about fraud detection
Unified programming model for Batch and Streaming
OpenGL Mathematics (GLM)
Data-Centric Pipelines and Data Versioning
A simple interface for working with TeX documents
Data Science Guide With Videos And Materials
Deep Research framework, combining language models with tools
Blazing-fast Data-Wrangling toolkit
A ranked list of awesome Python open-source libraries
Efficient library for processing 3D data
Training data (data labeling, annotation, workflow) for all data types
Official HDF5® Library Repository
Docker image used to run data processing workloads
A GPU-accelerated library containing highly optimized building blocks
Data and tools for generating and inspecting OLMo pre-training data
Production-ready data processing made easy and shareable
Addax is a versatile open-source ETL tool
Miller is like awk, sed, cut, join, and sort for name-indexed data
Instill Core is a full-stack AI infrastructure tool for data
A distributed and extensible workflow scheduler platform
Flink CDC is a streaming data integration tool
Analyzing, storing and visualizing big data, scientifically