Build multimodal AI applications with a cloud-native stack
Parallax is a distributed model serving framework
AI agent designed to interact with ROS1- and ROS2-based robotics systems
Open-source evaluation toolkit of large multi-modality models (LMMs)
Outcome-driven agent development framework that evolves
End-to-end pipeline converting generative videos
Collection of Gemma 3 variants that are trained for performance
Provider-agnostic, open-source evaluation infrastructure
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
LLM-based agent for general purpose software engineering tasks
Large Multimodal Models for Video Understanding and Editing
Implementation of "MobileCLIP" (CVPR 2024)
An open-source end-to-end VLM-based GUI Agent
Superfast AI decision making and processing of multi-modal data
Usable Implementation of "Bootstrap Your Own Latent" self-supervised
Determined, deep learning training platform
Deep learning optimization library making distributed training easy
Data loaders and abstractions for text and NLP
Reading book source
AI agent that streamlines the entire process of data analysis
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
14-stage Fusion Pipeline for LLM token compression
Open multimodal web agent built by Ai2
Machine learning on FPGAs using HLS
Advanced NLP with spaCy: A free online course