Showing 30 open source projects for "tesseract ocr"

  • 1
    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source optical character recognition (OCR) engine and command-line program. OCR is a technology that recognizes text characters within a digital image. The latest version of Tesseract puts a greater focus on line recognition, but it still supports the legacy Tesseract OCR engine, which recognizes character patterns.
    Downloads: 3,117 This Week
  • 2
    Tesseract.js

    A pure JavaScript multilingual OCR

    Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. It gets words in almost any language out of images.
    Downloads: 18 This Week
  • 3
    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    LLM-Aided OCR is an open-source system designed to improve optical character recognition accuracy by combining traditional OCR tools with large language models. The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines. The system first extracts raw text using OCR engines and then applies language models to analyze and correct recognition errors based on...
    Downloads: 0 This Week
  • 4
    Extractous

    Fast and efficient unstructured data extraction

    ...For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. It also supports OCR for images and scanned documents through Tesseract, making it useful for document ingestion pipelines that include image-based or scanned inputs.
    Downloads: 0 This Week
  • 5
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 20 This Week
  • 6
    Provides optical character recognition (OCR) solutions for the Vietnamese language.
    Downloads: 187 This Week
  • 7
    OculiX

    Visual Automation IDE — automate anything you see on screen

    OculiX is the evolution of SikuliX, actively maintained with the full agreement of its original creator RaiMan. Automate any desktop application using image recognition (OpenCV) and OCR (Tesseract + PaddleOCR). No access to source code or DOM is required: if you can see it, you can automate it.

    Key features:
    - Guided step-by-step recorder with live code preview
    - Image recognition via OpenCV 4.10
    - Dual OCR: Tesseract (built-in) + PaddleOCR (neural, high precision)
    - Local and remote automation via integrated VNC
    - SSH tunnels via embedded JSch
    - Cross-platform: Windows, macOS (Apple Silicon M1-M4), Linux
    - Scripting: Jython, JRuby, Java, PowerShell, AppleScript
    - Java 17 recommended (Java 8+ supported)
    - Full CI/CD with automated builds for all platforms

    Used worldwide for test automation, RPA, and visual regression testing. ...
    Downloads: 118 This Week
  • 8
    A GUI to ease the process of producing a multipage PDF from a scan. gscan2pdf should work on almost any Linux/BSD machine.
    Downloads: 156 This Week
  • 9
    Multiuser HylaFAX PHP/MySQL Web interface for viewing faxes online, downloading & emailing in PDF format, and categorizing & archiving all sent and received faxes.
    Downloads: 34 This Week
  • 10
    gImageReader

    A graphical frontend to tesseract-ocr

    gImageReader is a simple Gtk/Qt front-end to tesseract. Features include:
    - Import PDF documents and images from disk, scanning devices, clipboard and screenshots
    - Process multiple images and documents in one go
    - Manual or automatic recognition area definition
    - Recognize to plain text or to hOCR documents
    - Recognized text displayed directly next to the image
    - Post-process the recognized text, including spellchecking
    - Generate PDF documents from hOCR documents

    **Note**:...
    Downloads: 76 This Week
  • 11
    Linux-Intelligent-Ocr-Solution

    Easy-OCR solution and Tesseract trainer for GNU/Linux

    Linux-Intelligent-OCR-Solution (Lios) is free and open source software for converting print into text using either a scanner or a camera. It can also produce text from scanned images from other sources, such as a PDF, an image, a folder containing images, or a screenshot. The program is fully accessible to visually impaired users. A Tesseract Trainer GUI is also shipped with this package.
    Downloads: 5 This Week
  • 12

    cuneiformplus

    Fork of OCR software cuneiform

    Fork of the OCR software cuneiform. For the original software, see https://launchpad.net/cuneiform-linux by Cognitive Technologies and Jussi Pakkanen. Other open source OCR software includes Tesseract by Ray Smith (using the Leptonica image library), GOCR, and OCRAD.
    Downloads: 3 This Week
  • 13
    SwiftOCR

    Fast and simple OCR library written in Swift

    SwiftOCR is a fast and simple OCR library written in Swift. It uses a neural network for image recognition. As of now, SwiftOCR is optimized for recognizing short, one-line alphanumeric codes (e.g. DI4C9CM). We currently support iOS and OS X. If you want to recognize normal text like a poem or a news article, go with Tesseract, but if you want to recognize short, alphanumeric codes (e.g. gift cards), I would advise you to choose SwiftOCR because that's where it excels. ...
    Downloads: 0 This Week
  • 14
    pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (and no editable text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for OCR of scanned books or journals. It can recognize the page layout even for multicolumn text.
    Downloads: 310 This Week
  • 15
    A Java JNA wrapper for the Tesseract OCR API
    Downloads: 66 This Week
  • 16

    cbrTekStraktor

    an application to automatically extract text from comic books.

    cbrTekStraktor is an application to automatically extract text from the text bubbles or speech balloons present in comic book reader files (CBR). Its prime goal is to perform analysis on the texts of comic books. However, cbrTekStraktor can also be used for scanlation or similar purposes. The application also lets you manually define text areas in CBR files, and it includes a simple graphical editor for further processing the extracted text. The text extraction is...
    Downloads: 13 This Week
  • 17
    This paper presents the development, deployment, and implementation of Optical Character Recognition (OCR) to translate images of typewritten or handwritten characters into an electronically editable format while preserving font properties. OCR does this by applying a pattern-matching algorithm. The recognized characters are stored in an editable format. OCR thus lets the computer read printed documents while discarding noise. Keywords: Optical Character Recognition, Image convert to character,...
    Downloads: 0 This Week
  • 18
    Toxin OCR

    Android OCR app using the Tesseract engine

    Toxin Finder is an Android app that uses Google's Tesseract OCR engine to read a captured image of a product's ingredient list and return a list of harmful ingredients.
    Downloads: 0 This Week
  • 19
    yagf

    YAGF is a tesseract and cuneiform wrapper and helper

    YAGF is a graphical front-end for the cuneiform and tesseract OCR tools. With YAGF you can open already scanned image files or obtain new images via XSane (scanning results are automatically passed to YAGF). Once you have a scanned image you can prepare it for recognition, select particular image areas for recognition, set the recognition language and so on. Recognized text is displayed in an editor window where it can be corrected, saved to disk or copied to the clipboard.
    Downloads: 7 This Week
  • 20
    Sanskrit / Hindi - Tesseract OCR

    Devanagari fonts traineddata for Tesseract OCR

    Read https://sourceforge.net/projects/tesseracthindi/files/OCRHindi_using_VietOCR_and_Tesseract.pdf/download for how to use the VietOCR GUI for OCR of Hindi and Sanskrit texts using tesseract-ocr.

    Please see https://github.com/Shreeshrii/ imagessan and imageshin for newer box/tiff pairs, traineddata files, OCR evaluation statistics and ground truth files with images for Sanskrit and Hindi.

    The following is OLD information, saved only for archival purposes. ...
    Downloads: 2 This Week
  • 21
    Turn your scanner into a free document reader for invoices (e.g. for e-banking) with the help of tesseract-ocr, available for many Unix (and also Windows) platforms.
    Downloads: 1 This Week
  • 22
    lector

    An interface to tesseract ocr
    Downloads: 0 This Week
  • 23
    Tesseract-gui
    Tesseract-GUI is not a front-end for tesseract-ocr. It is just a graphical way to use it, with simple image manipulation through ImageMagick.
    Downloads: 7 This Week
  • 24

    qt-box-editor

    QT4 editor of tesseract-ocr box files

    QT Box Editor is a tool for adjusting tesseract-ocr box files. The aim of this project is to provide an easy and efficient way to edit them regardless of file size.
    Downloads: 0 This Week
  • 25
    tesseract-ocr alternative download

    Alternative download for tesseract-ocr project
    Downloads: 1,611 This Week