Qwen-Image is a powerful image generation foundation model
Foundation model for image generation
General-purpose image editing model that delivers high-fidelity
Implementation of Make-A-Video, a new SOTA text-to-video generator
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Focus on prompting and generating
Comprehensive Markdown plugin built for Django
A Powerful Native Multimodal Model for Image Generation
OCRmyPDF adds an OCR text layer to scanned PDF files
Official inference repo for FLUX.2 models
Implementation of Imagen, Google's Text-to-Image Neural Network
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Official MiniMax Model Context Protocol (MCP) server
This repo contains the code for a 1D tokenizer and generator
Label Studio is a multi-type data labeling and annotation tool
CLIP: predict the most relevant text snippet given an image
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
Official inference repo for FLUX.1 models
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Stable Diffusion web UI
InvokeAI is a leading creative engine for Stable Diffusion models
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Collection of Gemma 3 variants that are trained for performance
Easily compute CLIP embeddings and build a CLIP retrieval system