83 projects for "apache pdf" with 1 filter applied:

  • 1
    Apache OpenOffice

    The free and Open Source productivity suite

    ...OpenOffice is also able to export files in PDF format. OpenOffice has supported extensions, in a similar manner to Mozilla Firefox, making it easy to add new functionality to an existing OpenOffice installation.
    Downloads: 224,127 This Week
  • 2
    Asciidoc Editor based on JavaFX 20

    Asciidoc Editor and Toolchain written with JavaFX 19

    Asciidoc FX is a WYSIWYG editor for the Asciidoc markup language. You can build PDF, Epub, and HTML books, documents, and slides. Supported Operating Systems and Builds shows the list of available builds with links for reference. If you are looking for the very latest version, visit the link in the note above to be guaranteed of downloading the latest and greatest version of AsciidocFX. AsciidocFX converts documents via the AsciidoctorJ library.
    Downloads: 2 This Week
  • 3
    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language model

    MarkPDFdown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists. By producing Markdown rather than raw text,...
    Downloads: 0 This Week
  • 4
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 13 This Week
  • 5
    Resume-Matcher

    Improve your resumes with Resume Matcher

    Resume-Matcher is a command-line application that compares resumes against job descriptions using natural language processing. It provides a compatibility score based on keyword relevance and highlights areas where the resume aligns—or doesn't—with the target role. Designed for job seekers and HR professionals, it helps improve resume tailoring and streamlines candidate screening.
    Downloads: 0 This Week
  • 6
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. ...
    Downloads: 0 This Week
  • 7
    PRDownloader

    A file downloader library for Android with pause and resume support

    A file downloader library for Android with pause and resume support. PRDownloader can be used to download any type of file, such as images, videos, PDFs, and APKs. The library supports pausing and resuming a download in progress, handles large files, and offers a simple interface for making download requests. You can check the status of a download using its download ID. PRDownloader gives callbacks for everything like onProgress, onCancel, onStart,...
    Downloads: 0 This Week
  • 8
    QPDF

    PDF transformation/manipulation program + library

    QPDF is a C++ library and set of programs that inspect and manipulate the structure of PDF files. It can encrypt and linearize files, expose the internals of a PDF file, and do many other operations useful to end users and PDF developers.
    Downloads: 1,026 This Week
  • 9
    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and...
    Downloads: 1 This Week
  • 10
    Controllable-RAG-Agent

    This repository provides an advanced RAG

    Controllable-RAG-Agent is an advanced Retrieval-Augmented Generation (RAG) system designed specifically for complex, multi-step question answering over your own documents. Instead of relying solely on simple semantic search, it builds a deterministic control graph that acts as the “brain” of the agent, orchestrating planning, retrieval, reasoning, and verification across many steps. The pipeline ingests PDFs, splits them into chapters, cleans and preprocesses text, then constructs vector...
    Downloads: 0 This Week
  • 11

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 3 This Week
  • 12
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Downloads: 188 This Week
  • 13
    C# ECG Toolkit

    ECG Toolkit support for: SCP-ECG, DICOM, HL7 aECG, ISHNE & MUSE-XML

    C# ECG Toolkit is an open source software toolkit to convert, view and print electrocardiograms. The toolkit is developed using C# .NET Framework 2.0 and later (code also supports netstandard2.0). Support for ECG formats: SCP-ECG, DICOM, HL7 aECG, ISHNE, MUSE-XML and OmronECG.
    Downloads: 15 This Week
  • 14
    EspoCRM - Open Source CRM

    Moving in the right direction together!

    EspoCRM software is fully customizable. We strive to create a solution that fits different business and industry needs, without having to rely on a “one size fits all” approach or make you spend a fortune on customization. Demo: https://www.espocrm.com/demo/ Installation: https://docs.espocrm.com/administration/installation/ Customer relationship management (CRM) software is developing every day due to the ever-changing global business environment and rapid advances in...
    Downloads: 169 This Week
  • 15
    Knowledge + Chat

    Knowledge is a tool for saving, searching, accessing, and chatting

    Knowledge is an open-source desktop application designed for organizing, exploring, and interacting with information gathered from websites, documents, and other digital sources. The platform allows users to collect knowledge from multiple formats including web pages, PDF files, videos, and other documents and organize them into structured projects and subprojects. These sources can then be visualized and explored using several views such as graph, grid, table, and calendar interfaces,...
    Downloads: 0 This Week
  • 16
    File System Crawler for Elasticsearch

    Elasticsearch File System Crawler (FS Crawler)

    This crawler helps index binary documents such as PDF, Open Office, and MS Office files. It crawls a local file system (or a mounted drive), indexing new files, updating existing ones, and removing old ones. It can also crawl remote file systems over SSH/FTP, and it provides a REST interface that lets you “upload” your binary documents to Elasticsearch.
    Downloads: 0 This Week
  • 17
    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing a high-performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data...
    Downloads: 0 This Week
  • 18
    PdfJumbler
    A simple tool to rearrange/merge/delete pages from PDF files. The modular backend system uses either JPedal or JPod to display PDFs and iText or Apache PDFBox to save them. Development of this project has moved to GitHub. Please check https://github.com/mgropp/pdfjumbler for current releases!
    Downloads: 8 This Week
  • 19
    Free Academic Timetable Software

    Free web timetabling software for education and training providers.

    A user-friendly web-based timetabling software designed for all types of education and training providers to schedule classes, facilities, trainers and split classes into groups. It was designed by an academic professional with over 5 years' experience in education timetabling systems and 14 years' experience in the education and training sector. It is easy to set up and use. Comes with manuals, training videos and is free to use and install, but not for...
    Downloads: 5 This Week
  • 20
    PDFReporter

    Generating documents and reports, offline enabled and reliable.

    The library is a fork of the popular open source Jasper Reports and supports the common features provided by Jasper Reports, but offline and for mobile apps. The PDFReporter library supports iOS, Java, and Android. For your document and report design, you use the PDFReporter Studio, where you can visualize your data. If you want to use the library commercially, please visit our official webpage.
    Downloads: 3 This Week
  • 21
    OpenEXI

    EXI implementations in Java and C#

    Open source .NET (C#) / Java implementation of the W3C Efficient XML Interchange (EXI) format specification. As a corollary to XML, EXI is an alternative, very efficient format that has all of the mechanics of XML, but is much more compact and is faster to exchange. - README (about Nagasena EXI implementation) https://www.dropbox.com/s/adh83u9z1x1czv6/README.txt?dl=0 - Nagasena EXI grammar interchange format (PDF) https://www.dropbox.com/s/etrpuchaddplq2s/EXIGram.pdf?dl=0 -...
    Downloads: 4 This Week
  • 22

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
  • 23
    S.E.E.R. 2 is a full featured SCADA / Historian / Aggregate Analysis System developed to work as a 'front end' for mod_openopc. Written in pure PHP (HTML 4.01 Transitional), and driven by a web-based user interface for universal deployment.
    Downloads: 0 This Week
  • 24

    Generic REST Fetcher

    Execute a cascading sequence of REST calls to a common end-point

    REST APIs often require multiple calls to achieve a business objective. For example, get a list of objects, then get data for each object. This application allows you to transform data received from one REST call to determine other methods to call. All this is configured via an XML file. A simple example is fetching data for all respondents to a Survey Monkey survey. You need to get the set of respondent ids, then call the API passing in the list of respondent ids. See the README.md file on...
    Downloads: 0 This Week
  • 25
    Le chansonnier

    Your electronic guitar songbook

    "Le chansonnier" is a program that lets you create, manage, display, and publish a guitar songbook. It can produce nice PDF documents containing songs and chords along with chord diagrams.
    Downloads: 114 This Week