Open source libraries and APIs to build custom preprocessing pipelines
Instill Core is a full-stack AI infrastructure tool for data
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Superlinked is a Python framework for AI Engineers
Extract schema, statistics and entities from datasets
Parse files for optimal RAG
Claude Code skill for generating production-quality SVG+PNG technical
A fast, helpful, and open-source document parser
Autonomous LLM agent for end-to-end data science workflows
Vector database for scalable similarity search and AI applications
Central interface to connect your LLMs with external data
A modular graph-based Retrieval-Augmented Generation (RAG) system
A system for agentic LLM-powered data processing and ETL
Context database designed specifically for AI Agents
AI-data warehouse to enrich, transform and analyze unstructured data
Your Second Brain supercharged by Generative AI
Fast and efficient unstructured data extraction
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
A persistent, network-resilient, full-text search library
A ClickHouse fork that supports high-performance vector search
An extensible framework for Personal Data Management
Training data (data labeling, annotation, workflow) for all data types
Airweave lets agents search any app
Deterministic LLM Outputs for AI Applications and AI Agents
Open-source choice to scale, assess and maintain natural language data