Open source libraries and APIs to build custom preprocessing pipelines
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Instill Core is a full-stack AI infrastructure tool for data
Superlinked is a Python framework for AI Engineers
Parse files for optimal RAG
Claude Code skill for generating production-quality SVG+PNG technical
Extract schema, statistics and entities from datasets
A fast, helpful, and open-source document parser
Autonomous LLM agent for end-to-end data science workflows
Vector database for scalable similarity search and AI applications
The open source mesh processing system
Central interface to connect your LLMs with external data
A modular graph-based Retrieval-Augmented Generation (RAG) system
Context database designed specifically for AI Agents
CrateDB is a distributed and scalable SQL database
Python module for parsing semi-structured text into Python tables
AI-data warehouse to enrich, transform and analyze unstructured data
A system for agentic LLM-powered data processing and ETL
Lightweight library for scraping websites with LLMs
Your Second Brain supercharged by Generative AI
Fluentd: Unified Logging Layer (project under CNCF)
No-code LLM Platform to launch APIs and ETL Pipelines
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
Fast and efficient unstructured data extraction
Web framework designed for speed, security, and SEO