Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
AI tool that converts GitHub repositories into interactive diagrams
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multilingual Document Layout Parsing in a Single Vision-Language Model
Installable / Portable Python Distribution for Everyone.
The book "Performance Analysis and Tuning on Modern CPUs"
OCR expert VLM powered by Hunyuan's native multimodal architecture
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Let agents classify your bank transactions
Fast, powerful, git-native ticket tracking in a single bash script
Inference script for Oasis 500M
ICLR2024 Spotlight: curation/training code, metadata, distribution
Flexible Photo Recrafting While Preserving Your Identity
A command-line utility for taking automated screenshots of websites
Basic Website Studio (Tkinter Edition)
A software construction tool
maXbox is a script tool engine, compiler and source lib all in one exe
Visual Automation IDE — automate anything you see on screen
VoxShare is a simple Python-based push-to-talk multicast voice chat
Guiding Instruction-based Image Editing via Multimodal Large Language
Spyder IDE plugin providing separate chat pane for AI Assistance
HF/VHF spectroscopy code for the rx888mk2 direct-sampling receiver
Windows 11 only — includes 16 sections with text and visual reports.
Convert files like docx, xlsx, pptx, html, and more to Markdown
Professional financial tools: 16 sections with text and visual reports