File Parser optimised for LLM Ingestion with no loss
Edit PDF files with Nano Banana
A system for agentic LLM-powered data processing and ETL
Code repository for PDFStitcher, a utility to stitch together PDFs
Multi-tool for semantic search
Automate the management and configuration of infrastructures at scale
ID-based RAG FastAPI: Integration with LangChain and PostgreSQL
Chinese version of Google's open source project style guide
Python bindings for MuPDF's rendering library
Unified framework for building enterprise RAG pipelines
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
Accurate × Fast × Comprehensive
DeepCode: Open Agentic Coding
Generate audiobooks from EPUBs, PDFs and text with captions
Public repository for Agent Skills
Document Index for Vectorless, Reasoning-based RAG
Library for OCR-related tasks powered by Deep Learning
Interact with your documents using the power of GPT
Fully featured framework for fast, easy and documented API development
BISHENG is an open LLM DevOps platform for next-generation apps
LongBench v2 and LongBench (ACL '25 & '24)
Research-oriented chatbot framework
CLI tool to extract (meta)data from PDF and manipulate PDF files
OCR model for complex documents with layout-aware structured outputs
A Python SOAP client