Qwen-Image is a powerful image generation foundation model
Autoregressive Model Beats Diffusion
Chat & pretrained large vision language model
AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim
Capable of understanding text, audio, vision, video
A Pioneering Open-Source Alternative to GPT-4o
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Chinese and English multimodal conversational language model
Tensor search for humans
Multilingual sentence & image embeddings with BERT
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System
Phi-3.5 for Mac: Locally-run Vision and Language Models
Large-language-model & vision-language-model based on Linear Attention
Open source libraries and APIs to build custom preprocessing pipelines
Open source demo platform where you can easily showcase your AI models
LISA: Reasoning Segmentation via Large Language Model
Skywork-R1V is an advanced multimodal AI model series
Refer and Ground Anything Anywhere at Any Granularity
Gemma open-weight LLM library, from Google DeepMind
Guiding Instruction-based Image Editing via Multimodal Large Language Models
An open-source framework for training large multimodal models