Open source libraries and APIs to build custom preprocessing pipelines
Instill Core is a full-stack AI infrastructure tool for data
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Extract schema, statistics and entities from datasets
Parse files for optimal RAG
Superlinked is a Python framework for AI Engineers
Central interface to connect your LLMs with external data
Vector database for scalable similarity search and AI applications
A fast, helpful, and open-source document parser
Context database designed specifically for AI Agents
Autonomous LLM agent for end-to-end data science workflows
A system for agentic LLM-powered data processing and ETL
A modular graph-based Retrieval-Augmented Generation (RAG) system
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
AI-data warehouse to enrich, transform and analyze unstructured data
Claude Code skill for generating production-quality SVG+PNG technical
Fast and efficient unstructured data extraction
An extensible framework for Personal Data Management
Training data (data labeling, annotation, workflow) for all data types
No-code LLM Platform to launch APIs and ETL Pipelines
Airweave lets agents search any app
A persistent, network-resilient, full-text search library
Deterministic LLM Outputs for AI Applications and AI Agents
A @ClickHouse fork that supports high-performance vector search
Your Second Brain supercharged by Generative AI