Page 3 | document free download

Showing 349 open source projects for "document"

1

Zeep

A Python SOAP client

...Support for WS-Addressing headers. Support for WSSE (UserNameToken / X.509 signing). Support for asyncio via httpx. Experimental support for XOP messages. Zeep inspects the WSDL document and generates the corresponding code to use the services and types in the document. This provides an easy-to-use programmatic interface to a SOAP server. Parsing the XML documents is done by using the lxml library. This is the most performant and compliant Python XML library currently available. This results in major speed benefits when processing large SOAP responses. ...

Downloads: 2 This Week

Last Update: 2025-09-15
2

Sphinx

Main repository for the Sphinx documentation builder

...HTML (including Windows HTML Help), LaTeX (for printable PDF versions), ePub, Texinfo, manual pages, plain text. Semantic markup and automatic links for functions, classes, citations, glossary terms and similar pieces of information. Easy definition of a document tree, with automatic links to siblings, parents and children. General index as well as a language-specific module index. Automatic highlighting using the Pygments highlighter. Automatic testing of code snippets, the inclusion of docstrings from Python modules (API docs), and more.

Downloads: 26 This Week

Last Update: 2025-12-31
3

PaperAI

Semantic search and workflows for medical/scientific papers

PaperAI is an open-source framework for searching and analyzing scientific papers, particularly useful for researchers looking to extract insights from large-scale document collections.

Downloads: 6 This Week

Last Update: 2025-07-01
4

huggingface_hub

The official Python client for the Huggingface Hub

The huggingface_hub library allows you to interact with the Hugging Face Hub, a platform democratizing open-source Machine Learning for creators and collaborators. Discover pre-trained models and datasets for your projects or play with the thousands of machine-learning apps hosted on the Hub. You can also create and share your own models, datasets, and demos with the community. The huggingface_hub library provides a simple way to do all these things with Python.

Downloads: 10 This Week

Last Update: 3 days ago
5

OWL

Optimized Workforce Learning for General Multi-Agent Assistance

...Unlike single-agent systems, it treats task completion as a collaborative workforce where agents take on specialized roles (planning, execution, analysis) and coordinate via a modular multi-agent architecture that supports flexible teamwork across domains. OWL delivers state-of-the-art performance on benchmarks like GAIA and emphasizes real-time decision-making, web automation, rich search integration, document parsing, and multi-tool workflows, making it suitable for tasks ranging from information retrieval to interactive automation.

Downloads: 0 This Week

Last Update: 2 days ago
6

fireworks-tech-graph

Claude Code skill for generating production-quality SVG+PNG technical

...The project emphasizes scalability and adaptability, allowing it to handle large datasets and evolving knowledge bases. By structuring information into graph form, it enables more meaningful navigation and discovery compared to traditional document-based systems.

Downloads: 29 This Week

Last Update: 6 days ago
7

EasyOCR

Ready-to-use OCR with 80+ supported languages

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic. EasyOCR is a Python module for extracting text from images. It is a general OCR that can read both natural scene text and dense text in documents. We currently support 80+ languages and are expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first-generation models. EasyOCR will choose the latest model by default, but you can also specify which model to use. Model weights for the chosen language will be automatically downloaded, or you can download them manually from the model hub. ...

Downloads: 39 This Week

Last Update: 2024-09-24
8

All-in-RAG

Big Model Application Development Practice 1

...The repository provides a structured learning path that covers both theoretical foundations and practical implementation steps for RAG systems. It explains the full development pipeline required to create knowledge-aware AI assistants, including data preparation, document indexing, vector embedding generation, and retrieval strategies. The project also explores advanced topics such as hybrid retrieval methods, query optimization, and evaluation techniques for improving system accuracy. Alongside theoretical explanations, the repository includes hands-on exercises and example projects that demonstrate how to build production-ready RAG systems. ...

Downloads: 0 This Week

Last Update: 2026-03-17
9

Auto-Deep-Research

Your Fully-Automated Personal AI Assistant

...Users provide a research topic or multifaceted goal, and the system autonomously breaks the objective down into subtasks like literature collection, critical summarization, cross-comparison, citation extraction, metric evaluation, and structured writing. Auto-Deep-Research integrates retrieval from academic and web sources, processes document corpora for relevance and key insights, and organizes outputs into coherent chapters or sections according to research standards. It also embeds validation loops, where intermediate drafts are self-checked for consistency, coverage, and alignment with sound reasoning practices, reducing reliance on raw generation alone.

Downloads: 0 This Week

Last Update: 2026-02-03
10

GLM-4.6V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

...Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and can output or act via tools seamlessly, bridging perception and execution. Its architecture supports a very large context window (on the order of 128K tokens during training), which lets it handle complex multimodal inputs like long documents, multi-page reports, or video transcripts, while maintaining coherence across extended content. ...

Downloads: 0 This Week

Last Update: 2026-04-06
11

WeasyPrint

The awesome document factory

WeasyPrint is a smart solution helping people to create PDF documents. You can generate gorgeous statistical reports, invoices, tickets, and anything you want as long as you have some web design skills! Design your documents just as you design your websites! WeasyPrint follows the widely used HTML and CSS specifications from the W3C. You can use your usual web tools, languages and frameworks, but for print. Creating high-quality digital documents requires features that you love to use as...

Downloads: 26 This Week

Last Update: 2026-02-06
12

Papis

Powerful and highly extensible command-line based document

Papis is a powerful and highly extensible CLI document and bibliography manager. With Papis, you can search your library for books and papers, add documents and notes, import and export to and from other formats, and much much more. Papis uses a human-readable and easily hackable .yaml file to store each entry's bibliographical data. It strives to be easy to use while providing a wide range of features.

Downloads: 0 This Week

Last Update: 2026-02-08
13

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...

Downloads: 9 This Week

Last Update: 2026-02-03
14

BEIR

A Heterogeneous Benchmark for Information Retrieval

BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.

Downloads: 1 This Week

Last Update: 2025-06-04
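
Document ranking, as mentioned in the BEIR entry above, is commonly scored with nDCG@k. The following is a minimal sketch of that metric, not BEIR's own code: the function names are hypothetical, and the ideal ordering is taken from the retrieved list itself, which is a simplifying assumption.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalized by the DCG of the best possible ordering.

    Assumption: the ideal ordering is built only from the relevance
    labels of the retrieved items passed in.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels in the order a retriever returned them.
print(round(ndcg_at_k([1, 0, 1], k=3), 3))
```

A perfectly ordered list scores 1.0, and a list with no relevant results scores 0.0.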
15

LLM-Aided OCR Project

Enhances Tesseract OCR output using LLMs (local or API)

...The project is particularly useful for digitizing historical documents, research papers, and scanned materials where traditional OCR often struggles. It also includes tools for processing batches of images or documents, enabling automated document digitization workflows.

Downloads: 0 This Week

Last Update: 2026-03-22
16

Hallucination Leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations

...The project provides a standardized benchmark that evaluates different models using a dedicated hallucination detection system known as the Hallucination Evaluation Model. Each model is tested on document summarization tasks to measure how often generated responses introduce information that is not supported by the original source material. The results are published as a leaderboard that allows researchers and developers to compare model reliability and factual consistency. By focusing on hallucination rates rather than traditional metrics such as accuracy or fluency, the benchmark highlights an important aspect of AI system safety and trustworthiness. ...

Downloads: 0 This Week

Last Update: 2026-03-20
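
The hallucination rate described in the entry above can be read as a simple proportion: the share of summaries flagged as containing unsupported information. A minimal sketch, assuming a hypothetical list of per-summary boolean flags (in the real project such judgments come from its Hallucination Evaluation Model; the function and data here are illustrative only):

```python
from typing import Sequence


def hallucination_rate(flags: Sequence[bool]) -> float:
    """Return the fraction of summaries flagged as hallucinated."""
    if not flags:
        raise ValueError("need at least one summary")
    return sum(flags) / len(flags)


# Hypothetical flags for four summaries; True means unsupported content was found.
example_flags = [False, True, False, False]
print(f"hallucination rate: {hallucination_rate(example_flags):.1%}")
```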
17

Krixik

Documentation for the Krixik Python client

Small/specialized AI models are an oft-necessary complement—or alternative—to "big AI" offerings. However, infrastructure for small AI tends to be underwhelming, so building with specialized AI can be difficult, time-consuming, and even expensive. Iterating with different models, and particularly with different combinations of these models, can thus be rendered unfeasible.

Downloads: 0 This Week

Last Update: 2024-11-05
18

Zerox OCR

PDF to Markdown with vision models

A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation, after all, with weird layouts, tables, charts, etc., so vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.

Downloads: 2 This Week

Last Update: 2024-12-18
19

GLM-4.5V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

...It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding, and long-document interpretation. GLM-4.5V emerged from a training framework that leverages scalable reinforcement learning (with curriculum sampling) to boost performance across tasks ranging from STEM problem solving to long-context reasoning, giving it broad applicability beyond narrow benchmarks. When it was released, it achieved state-of-the-art results on a large collection of public multimodal benchmarks for open-source models.

Downloads: 1 This Week

Last Update: 2026-04-06
20

DB-GPT

Revolutionizing Database Interactions with Private LLM Technology

DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.

Downloads: 5 This Week

Last Update: 2026-03-27
21

borb

borb is a library for reading, creating and manipulating PDF files

borb is a pure Python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, strings, booleans, etc.). This is currently a one-man project, so the focus will always be on supporting common use cases over rare ones.

Downloads: 4 This Week

Last Update: 2026-03-16
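
The "JSON-like" representation in the borb entry above can be pictured as plain nested Python containers. A minimal sketch, assuming a made-up and heavily simplified structure: the keys borrow common PDF dictionary names (Type, Pages, Kids, Count, MediaBox), but this is not borb's actual object model or API.

```python
def count_primitives(node) -> int:
    """Recursively count leaf values in nested dicts and lists."""
    if isinstance(node, dict):
        return sum(count_primitives(value) for value in node.values())
    if isinstance(node, list):
        return sum(count_primitives(item) for item in node)
    return 1  # numbers, strings, booleans, etc.


# Hypothetical, simplified document tree for illustration only.
doc = {
    "Type": "Catalog",
    "Pages": {
        "Type": "Pages",
        "Count": 1,
        "Kids": [
            {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
        ],
    },
}

print(count_primitives(doc))  # prints 8
```

Because everything is a dict, list, or primitive, generic traversal code like this works without any PDF-specific types.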
22

gensim

Topic Modelling for Humans

Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.

Downloads: 2 This Week

Last Update: 2025-10-16
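
Similarity retrieval over a corpus, as described in the gensim entry above, is often illustrated with TF-IDF vectors and cosine similarity. The sketch below uses only the standard library and is not gensim's API; the function names, the tiny corpus, and the unsmoothed IDF formula are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Build a sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical three-document corpus, already tokenized.
corpus = [
    "pdf document layout".split(),
    "pdf document parsing".split(),
    "topic model training".split(),
]
vecs = tfidf_vectors(corpus)

# Rank the other documents by similarity to document 0.
ranked = sorted(range(1, len(vecs)),
                key=lambda i: cosine(vecs[0], vecs[i]),
                reverse=True)
print(ranked)  # prints [1, 2]
```

Document 1 shares two terms with the query document and ranks first; document 2 shares none and gets a similarity of zero.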
23

Raglite

RAGLite is a Python toolkit for Retrieval-Augmented Generation

Raglite is a lightweight framework for building Retrieval-Augmented Generation (RAG) pipelines with minimal configuration. It connects large language models to vector databases for context-aware responses, enabling developers to prototype and deploy RAG systems quickly. Raglite focuses on simplicity and modularity for fast experimentation.

Downloads: 1 This Week

Last Update: 2025-06-11
24

macOS Security Compliance

macOS Security Compliance Project

The macOS Security Compliance Project is an open source effort to provide a programmatic approach to generating security guidance. The configuration settings in this document were derived from National Institute of Standards and Technology (NIST) Special Publication (SP) 800-53, Security and Privacy Controls for Information Systems and Organizations, Revision 5. This is a joint project of federal operational IT Security staff from the National Institute of Standards and Technology (NIST), National Aeronautics and Space Administration (NASA), Defense Information Systems Agency (DISA), and Los Alamos National Laboratory (LANL).

Downloads: 1 This Week

Last Update: 2025-12-18
25

JupyterLab LaTeX

JupyterLab extension for live editing of LaTeX documents

An extension for JupyterLab which allows for live-editing of LaTeX documents. To use, right-click on an open .tex document within JupyterLab, and select Show LaTeX Preview. This extension includes both a notebook server extension (which interfaces with the LaTeX compiler) and a lab extension (which provides the UI for the LaTeX preview). The Python package named jupyterlab_latex provides both of them as a prebuilt extension.

Downloads: 1 This Week

Last Update: 2025-12-17