RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) is a retrieval architecture for retrieval-augmented generation (RAG) systems that organizes documents into a hierarchy to support more effective context retrieval. Traditional RAG systems typically retrieve small text chunks independently, which can limit a model’s ability to use broader document context. RAPTOR addresses this limitation by recursively embedding, clustering, and summarizing text chunks to build a tree. Leaf nodes hold the original chunks, and each higher level holds summaries at a greater level of abstraction, so retrieval can operate on both fine-grained details and high-level concepts. During inference, the system navigates this tree to retrieve the nodes that best match the user’s query while preserving broader context. The approach is aimed at improving question-answering performance on complex tasks that require reasoning across long documents or multiple sources.
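
The recursive embed, cluster, and summarize loop described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the bag-of-words `embed`, the greedy threshold-based `cluster`, and the truncating `summarize` are hypothetical stand-ins chosen so the example runs with the standard library only. A real pipeline would presumably use a learned embedding model, a proper clustering algorithm, and a language model for summaries.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): L2-normalized bag-of-words counts."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {word: v / norm for word, v in counts.items()}

def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(v * b.get(word, 0.0) for word, v in a.items())

def cluster(nodes, threshold=0.1):
    """Greedy stand-in clustering: join the first cluster whose seed node
    is similar enough, otherwise start a new cluster."""
    clusters = []
    for node in nodes:
        for group in clusters:
            if cosine(node["vec"], group[0]["vec"]) >= threshold:
                group.append(node)
                break
        else:
            clusters.append([node])
    return clusters

def summarize(texts):
    """Placeholder summarizer (assumption): keeps the first 8 words of each
    text. A real system would call a language model here."""
    return " ".join(" ".join(t.split()[:8]) for t in texts)

def make_node(text, children, level):
    return {"text": text, "vec": embed(text), "children": children, "level": level}

def build_tree(chunks, max_levels=3):
    """Return the tree as a list of layers, from leaf chunks up to summaries."""
    layer = [make_node(c, [], 0) for c in chunks]
    layers = [layer]
    for level in range(1, max_levels + 1):
        if len(layer) <= 1:
            break
        groups = cluster(layer)
        if len(groups) == len(layer):
            # Nothing merged at this level, so collapse into a single root.
            groups = [layer]
        layer = [make_node(summarize([n["text"] for n in g]), g, level) for g in groups]
        layers.append(layer)
    return layers

chunks = [
    "Cats purr when they are content.",
    "Cats sleep for most of the day.",
    "Rust enforces memory safety at compile time.",
    "Rust has no garbage collector.",
]
layers = build_tree(chunks)
for depth, nodes in enumerate(layers):
    print(depth, [n["text"] for n in nodes])
```

With these four chunks the sketch produces three layers: the four leaves, two topic-level summaries, and a single root. Returning the layers as a list keeps every level of abstraction available for retrieval later.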

Features

  • Hierarchical document representation using recursive summarization
  • Tree-structured retrieval enabling multi-level information access
  • Integration with retrieval-augmented language model pipelines
  • Embedding-based clustering and abstractive summarization of document segments
  • Improved long-document reasoning and contextual retrieval
  • Research implementation for advanced RAG architectures
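
As an illustration of tree-structured, multi-level retrieval, the following Python sketch walks a small hand-built tree from the summary level down to a leaf, keeping the best-matching node at each level. It is a sketch under stated assumptions rather than the project's API: `embed`, `cosine`, `node`, `retrieve`, and the `beam` parameter are hypothetical names, and the bag-of-words similarity stands in for a real embedding model.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding (assumption): L2-normalized bag-of-words counts."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {word: v / norm for word, v in counts.items()}

def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(v * b.get(word, 0.0) for word, v in a.items())

def node(text, children=()):
    """Hypothetical tree node: text, its embedding, and child nodes."""
    return {"text": text, "vec": embed(text), "children": list(children)}

def retrieve(roots, query, beam=1):
    """Top-down traversal: at each level keep the `beam` nodes most similar
    to the query, then continue the search among their children."""
    q = embed(query)
    frontier, selected = list(roots), []
    while frontier:
        ranked = sorted(frontier, key=lambda n: cosine(q, n["vec"]), reverse=True)
        best = ranked[:beam]
        selected.extend(best)
        frontier = [child for n in best for child in n["children"]]
    return [n["text"] for n in selected]

# Hand-built two-level tree: summaries on top, original chunks as leaves.
roots = [
    node("Cats purr and sleep a lot", [
        node("Cats purr when they are content."),
        node("Cats sleep for most of the day."),
    ]),
    node("Rust is memory safe without a garbage collector", [
        node("Rust enforces memory safety at compile time."),
        node("Rust has no garbage collector."),
    ]),
]

for text in retrieve(roots, "Does Rust use a garbage collector?"):
    print(text)
```

The result contains both the matching summary and the matching leaf, so the context passed to a language model carries the high-level view together with the specific detail. Raising `beam` widens the search at each level at the cost of a longer context.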

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM)

Registered

2026-03-06