A SOTA open-source image editing model
A Production-ready Reinforcement Learning AI Agent Library
Stable Diffusion with Core ML on Apple Silicon
GPT-4V-level open-source multimodal model based on Llama3-8B
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Foundation model for image generation
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Pokee Deep Research Model Open Source Repo
Implementation of the Surya Foundation Model for Heliophysics
Project Lyra: Open Generative 3D World Models
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Capable of understanding text, audio, vision, video
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
A trainable PyTorch reproduction of AlphaFold 3
Multimodal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing
The official PyTorch implementation of Google's Gemma models
General-purpose image editing model that delivers high-fidelity
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
A state-of-the-art open visual language model
Multimodal embedding and reranking models built on Qwen3-VL
New family of code large language models (LLMs)
Unified Multimodal Understanding and Generation Models