Document (PDF, Word, PPTX, ...) extraction and parsing API
A playground to generate images from any text prompt using Stable Diffusion
Hypernetworks that adapt LLMs for specific benchmark tasks
Practical productivity tools for Claude Code and Codex-CLI
Readest is a modern, feature-rich ebook reader
Text and image to video generation: CogVideoX and CogVideo
Offline OCR command-line program for Windows that recognizes text in images
Awesome multilingual OCR toolkits based on PaddlePaddle
Qwen3-TTS is an open-source series of TTS models
Deep Research framework, combining language models with tools
A single Gradio + React WebUI with extensions for ACE-Step
Text mining using tidy tools
Chat with it via text and voice
Generate audiobooks from EPUBs, PDFs and text with captions
A TTS that fits in your CPU (and pocket)
Reading book source
Code for openai.fm, a demo for the OpenAI Speech API
A robust, efficient, low-latency speech-to-text library
Screenshots, word marking, OCR, AI, translation software
Canvas-based WYSIWYG rich text editor with advanced layout tools
The media player for language learning, with dual subtitles
World's first open-source, agentic video production system
Framework for building real-time voice and multimodal AI agents
Web presentation editor replicating many PowerPoint features online
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX