CLIP: predict the most relevant text snippet given an image
Reference PyTorch implementation and models for DINOv3
Implementation of "MobileCLIP" (CVPR 2024)
PyTorch code and models for the DINOv2 self-supervised learning method
PyTorch implementation of MAE
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Reproduces results of "Fixing the train-test resolution discrepancy"