DocWire SDK 2025.05.22

This release introduces significant enhancements to PDF processing, including image extraction and OCR integration, alongside major internal refactorings that modernize the core data flow and parser architecture. It also includes several fixes related to thread safety, library linking, and testing infrastructure, particularly improving test discovery and execution on Windows.

From PDFs, images now take flight,
Through refined chains, data flows bright.
With steadier tests and safer threads,
DocWire advances, new paths it treads.
🖼️🔗⚙️

Features
  • PDF Processing: Added extraction of images from PDF files. Extracted images can now be processed by subsequent chain elements, including content type detection and OCR.
  • Writers: Updated HTML and plain text writers to support image tags, including data URLs for embedded images and text derived from OCR.
  • Testing: Implemented automatic tests for the new PDF image extraction and OCR capabilities.
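
The data URLs mentioned for the writers follow the general `data:<mime type>;base64,<payload>` form. The snippet below is a minimal sketch of how an extracted image could be embedded in an HTML image tag. It does not use the DocWire API: `base64_encode`, `escape_attribute`, and `image_tag` are hypothetical helpers written for this illustration, and placing the OCR text in the `alt` attribute is an assumption, not necessarily what the HTML writer emits.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: Base64-encode raw bytes (standard alphabet, '=' padding).
std::string base64_encode(const std::vector<std::uint8_t>& bytes)
{
    static constexpr std::string_view alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);
    const std::size_t size = bytes.size();
    for (std::size_t i = 0; i < size; i += 3)
    {
        // Pack up to three bytes into a 24-bit group; missing bytes count as zero.
        std::uint32_t group = static_cast<std::uint32_t>(bytes[i]) << 16;
        if (i + 1 < size) group |= static_cast<std::uint32_t>(bytes[i + 1]) << 8;
        if (i + 2 < size) group |= static_cast<std::uint32_t>(bytes[i + 2]);
        out += alphabet[(group >> 18) & 63];
        out += alphabet[(group >> 12) & 63];
        // Positions that carry no input bits are written as padding.
        out += (i + 1 < size) ? alphabet[(group >> 6) & 63] : '=';
        out += (i + 2 < size) ? alphabet[group & 63] : '=';
    }
    return out;
}

// Hypothetical helper: escape characters that are unsafe inside a quoted HTML attribute.
std::string escape_attribute(std::string_view text)
{
    std::string out;
    for (const char c : text)
    {
        switch (c)
        {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default: out += c;
        }
    }
    return out;
}

// Build an <img> tag whose source is a data URL and whose alt text
// carries the OCR result (an assumption made for this sketch).
std::string image_tag(const std::vector<std::uint8_t>& image,
                      std::string_view mime_type,
                      std::string_view ocr_text)
{
    std::string tag = "<img src=\"data:";
    tag += mime_type;
    tag += ";base64,";
    tag += base64_encode(image);
    tag += "\" alt=\"";
    tag += escape_attribute(ocr_text);
    tag += "\">";
    return tag;
}
```

Embedding the image bytes directly keeps the exported HTML self-contained, at the cost of roughly a one-third size increase from Base64 encoding.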

Improvements
  • Core Architecture: Significantly refactored the data processing mechanism within chain elements. This modernizes the core data flow, makes the outcome of each processing step explicit (continue, skip, or stop), and allows more flexible tag emission, including parsers sending data back for reprocessing.
  • Parser Architecture: Refactored parsers to directly implement ChainElement and utilize enhanced data_source checks, eliminating the Parser base class.
  • Testing & CI: Enhanced CI by adding explicit runs of automatic test discovery to catch issues that ctest might silently ignore.
  • Testing & CI: Improved error reporting in CI by separating the execution of API automatic tests from example runs.
  • Code Organization: Moved HTMLWriter and HTMLExporter classes from docwire_core to the docwire_html library.
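
The processing progression described under Core Architecture (continue, skip, stop) and the ability to send data back for reprocessing can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not the DocWire implementation: `continuation`, `element_context`, `chain_element`, `run_chain`, and the two example elements are hypothetical names, and the exact semantics chosen here (skip drops the current message, stop ends the whole run) are assumptions.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical and exist only for this illustration.
enum class continuation { proceed, skip, stop };

using message = std::string;

struct element_context
{
    std::function<void(message)> emit;      // forward a new message to the next element
    std::function<void(message)> resubmit;  // send a message back to the start of the chain
};

using chain_element = std::function<continuation(const message&, element_context&)>;

// Run one input through the chain and collect whatever reaches the end.
std::vector<message> run_chain(const std::vector<chain_element>& chain, message input)
{
    std::vector<message> output;
    // Each pending entry is a message plus the index of the element that handles it next.
    std::deque<std::pair<message, std::size_t>> pending;
    pending.emplace_back(std::move(input), 0);
    while (!pending.empty())
    {
        message msg = std::move(pending.front().first);
        const std::size_t index = pending.front().second;
        pending.pop_front();
        if (index == chain.size())
        {
            output.push_back(std::move(msg));
            continue;
        }
        element_context ctx{
            [&pending, index](message m) { pending.emplace_back(std::move(m), index + 1); },
            [&pending](message m) { pending.emplace_back(std::move(m), 0); }};
        switch (chain[index](msg, ctx))
        {
            case continuation::proceed:  // pass the message on unchanged
                pending.emplace_back(std::move(msg), index + 1);
                break;
            case continuation::skip:     // drop this message, keep processing others
                break;
            case continuation::stop:     // end the whole run
                return output;
        }
    }
    return output;
}

// Example element: unpacks a "pdf:" container into text plus an image
// that is sent back to the start of the chain for reprocessing.
continuation unpack_pdf(const message& msg, element_context& ctx)
{
    if (msg.rfind("pdf:", 0) != 0) return continuation::proceed;
    ctx.emit("text:hello");
    ctx.resubmit("image:scan");
    return continuation::skip;
}

// Example element: replaces an "image:" message with placeholder OCR text.
continuation fake_ocr(const message& msg, element_context& ctx)
{
    if (msg.rfind("image:", 0) != 0) return continuation::proceed;
    ctx.emit("text:ocr(" + msg.substr(6) + ")");
    return continuation::skip;
}
```

In this model, `run_chain({unpack_pdf, fake_ocr}, "pdf:doc")` yields the two text messages, with the image having travelled through the chain a second time. Returning an explicit value from each element keeps the control flow visible at the call site instead of hiding it in side effects.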

Fixes
  • Core: Ensured thread-safe initialization of parser MIME type vectors, including a specific fix for PSTParser.
  • Testing: Resolved test discovery issues on Windows by implementing a custom main() function for automatic tests instead of linking gtest_main.
  • Build: Addressed linking issues with the docwire_html library.
  • Build: Fixed mailio library linking to ensure compatibility with version 0.25.1 following a vcpkg upgrade.
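
The release notes do not say how the MIME type vectors were made thread-safe. One common C++ approach, sketched below as an assumption, is a function-local static: since C++11 the language guarantees that such a variable is initialized exactly once, even when several threads reach it concurrently. `example_parser`, `supported_mime_types`, and the listed MIME types are hypothetical and do not come from the DocWire sources.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical parser class used only to illustrate the pattern.
class example_parser
{
public:
    // The function-local static is initialized on first use; the language
    // guarantees that concurrent first calls do not race on the initialization.
    static const std::vector<std::string>& supported_mime_types()
    {
        static const std::vector<std::string> types = build_mime_types();
        return types;
    }

    // Number of times the vector was built; stays at 1 after the first call.
    static int build_count() { return builds.load(); }

private:
    static std::vector<std::string> build_mime_types()
    {
        ++builds;
        // Illustrative values only.
        return {"application/x-example", "application/x-example-archive"};
    }

    static inline std::atomic<int> builds{0};
};
```

Because the vector is `const` after construction, readers need no further locking. A mutable namespace-scope vector filled lazily would need explicit synchronization, for example with `std::call_once`.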
Source: README.md, updated 2025-05-23