Awesome multilingual OCR toolkits based on PaddlePaddle
Robust Speech Recognition via Large-Scale Weak Supervision
Speech recognition module for Python
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Library for OCR-related tasks powered by Deep Learning
Toolkit for conversational AI
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Contexts Optical Compression
OCR software, free and offline
The behavior guidance framework for customer-facing LLM agents
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
A full spaCy pipeline and models for scientific/biomedical documents
A ranked list of awesome machine learning Python libraries
Han Language Processing
Open source annotation tool for machine learning practitioners
NLP Cloud serves high-performance pre-trained or custom models for NER
Underthesea - Vietnamese NLP Toolkit
kaldi-asr/kaldi is the official location of the Kaldi project
Open-source industrial-grade ASR models
Training data (data labeling, annotation, workflow) for all data types
Audio foundation model excelling in audio understanding
Persian NLP Toolkit
Automatic Speech Recognition with Word-level Timestamps
Accurate × Fast × Comprehensive
OCRmyPDF adds an OCR text layer to scanned PDF files