Image generation model with single-stream diffusion transformer
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Strong, Economical, and Efficient Mixture-of-Experts Language Model
Towards Real-World Vision-Language Understanding
An AI-powered security review GitHub Action using Claude
My personal Claude Code configuration
Models for object and human mesh reconstruction
Easy Docker setup for Stable Diffusion with user-friendly UI
Instructions on how to use the Realtime API on Microcontrollers
Analyze computation-communication overlap in V3/R1
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
PyTorch implementation of JiT
The official PyTorch implementation of Google's Gemma models
The ChatGPT Retrieval Plugin lets you easily find personal documents
Python example app from the OpenAI API quickstart tutorial
DeepSeek LLM: Let there be answers
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
800,000 step-level correctness labels on LLM solutions to MATH problem
Large-scale xAI model for local inference with SGLang, Grok-2.5
Text-to-image model optimized for artistic quality and safe generation
Powerful 14B-base multimodal model — flexible base for fine-tuning