A high-quality tool for converting PDF to Markdown and JSON
Get your documents ready for gen AI
Multilingual Document Layout Parsing in a Single Vision-Language Model
On-premises, OCR-free unstructured data extraction
Contexts Optical Compression
A Repo For Document AI
Open source semantic search and text analytics for large document sets
OCR software, free and offline
Canvas-based WYSIWYG rich text editor with advanced layout tools
The SILE Typesetter — Simon’s Improved Layout Engine
Library for OCR-related tasks powered by Deep Learning
OCR model for complex documents with layout-aware structured outputs
Map location picker component for Android
Assist in organizing your piles of documents
Accurate × Fast × Comprehensive
Enhances Tesseract OCR output using LLMs (local or API)
R Markdown Résumés and CVs
OCR expert VLM powered by Hunyuan's native multimodal architecture
Open-Source Python3 tool for recognizing layouts, tables, and math
Extract and convert data from any document, images, PDFs, Word docs
CLI tool to extract (meta)data from PDF and manipulate PDF files
PDF Parser for AI-ready data. Automate PDF accessibility
Collabora Online is a collaborative online office suite
Semantic search and workflows for medical/scientific papers
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning