Audience

Developers, researchers, and engineers who need a tool to accurately parse and understand complex documents, layouts, and visual-text content at scale.

About GLM-OCR

GLM-OCR is an open-source multimodal optical character recognition (OCR) model for document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family: a visual encoder pre-trained on large-scale image–text data and a lightweight cross-modal connector feed a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. Training uses a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and the model is reported to achieve state-of-the-art results on major document understanding benchmarks.
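
The flow described above (layout detection, then parallel region recognition, then structured output) can be pictured with a minimal Python sketch. This is not the GLM-OCR API: `Region`, `detect_layout`, `recognize_region`, and `parse_page` are hypothetical names, and both stages are stubbed with fixed data so the example runs on its own. A thread pool stands in for whatever parallelism the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    kind: str    # "text", "table", or "formula"
    bbox: tuple  # (x0, y0, x1, y1) in page pixels
    order: int   # reading order assigned by layout detection


def detect_layout(page):
    """Hypothetical stand-in for the layout detection stage (fixed output)."""
    return [
        Region("text", (0, 0, 100, 20), 0),
        Region("table", (0, 30, 100, 80), 1),
        Region("formula", (0, 90, 100, 110), 2),
    ]


def recognize_region(page, region):
    """Hypothetical stand-in for the per-region recognition call."""
    return {
        "order": region.order,
        "kind": region.kind,
        "content": f"<{region.kind} from {page}>",
    }


def parse_page(page, max_workers=4):
    regions = detect_layout(page)
    # Regions are independent of each other, so they can be recognized concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda r: recognize_region(page, r), regions))
    # pool.map already preserves input order; the sort makes reading order explicit.
    return sorted(results, key=lambda item: item["order"])


if __name__ == "__main__":
    for block in parse_page("page-1.png"):
        print(block)
```

In a real deployment the two stubbed functions would be replaced by calls to the actual layout and recognition models; the structure of the result (one typed block per region, in reading order) is the part this sketch is meant to illustrate.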

Pricing

Starting Price:
Free
Free Version:
Free Version available.

Integrations

API:
Yes, GLM-OCR offers API access
No integrations listed.

Company Information

Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR

Product Details

Platforms Supported:
Cloud
Training:
Documentation
Support:
Online

GLM-OCR Product Features

OCR

Convert to PDF
Zone Selection Tool
ID Scanning
Multi-Language
Indexing
Metadata Extraction
Image Pre-processing
Text Editor
Multiple Output Formats
Batch Processing