Generating Immersive, Explorable, and Interactive 3D Worlds
A Unified Framework for Text-to-3D and Image-to-3D Generation
Autoregressive Model Beats Diffusion
Offline inference engine for art, real-time voice conversations
Multimodal-Driven Architecture for Customized Video Generation
Collection of Gemma 3 variants that are trained for performance
Chat & pretrained large vision language model
Towards Real-World Vision-Language Understanding
Contexts Optical Compression
Easily compute CLIP embeddings and build a CLIP retrieval system
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Flexible Photo Recrafting While Preserving Your Identity
AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
Capable of understanding text, audio, vision, video
Stable Diffusion built into Blender
Stable Diffusion web UI
A Pioneering Open-Source Alternative to GPT-4o
Implementation of Phenaki Video, which uses MaskGIT
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Fast Stable Diffusion on CPU and AI PC
Easy-to-use and powerful NLP library with Awesome model zoo
AI video generator optimized for low VRAM and older GPUs
ImageBind: One Embedding Space to Bind Them All
Diffusion Transformer with Fine-Grained Chinese Understanding