CLIP: Predict the most relevant text snippet given an image
New family of code large language models (LLMs)
Wan2.2: Open and Advanced Large-Scale Video Generative Model
4M: Massively Multimodal Masked Modeling
Python inference and LoRA trainer package for the LTX-2 audio–video
ChatGPT interface with a better UI
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
An experimental version of the DeepSeek model
PyTorch code and models for the DINOv2 self-supervised learning
A Powerful Native Multimodal Model for Image Generation
Block Diffusion for Ultra-Fast Speculative Decoding
LLM-based Reinforcement Learning audio edit model
Pretrained time-series foundation model developed by Google Research
ICLR 2024 Spotlight: curation/training code, metadata, distribution
The ChatGPT Retrieval Plugin lets you easily find personal documents
Open-source, high-performance Mixture-of-Experts large language model
Official code for Style Aligned Image Generation via Shared Attention
Fine-tuning ChatGLM-6B with PEFT
A minimal PyTorch re-implementation of the OpenAI GPT
Reference implementation of the Transformer architecture optimized
Large-scale autoregressive pixel model for image generation by OpenAI
A library for Multilingual Unsupervised or Supervised word Embeddings