GPT-4o for Windows, macOS and Linux
A framework to enable multimodal models to operate a computer
3D reconstruction software
Open Source Differentiable Computer Vision Library
Curated list of classic, high-quality computer science books
Agent S: an open agentic framework that uses computers like a human
Medical imaging toolkit for deep learning
Effortless data labeling with AI support from Segment Anything
Control Any Computer Using LLMs
Datasets, transforms and models specific to Computer Vision
Python SDK for the Computer Use model Lux, developed by OpenAGI
The repository provides code for running inference with SAM 2
Agent Zero AI framework
Training data (data labeling, annotation, workflow) for all data types
The open-source tool for building high-quality datasets
A natural language interface for computers
Fast image augmentation library and an easy-to-use wrapper
We write your reusable computer vision tools
The Cradle framework is a first attempt at General Computer Control
Hub of ready-to-use datasets for ML models
Phi-3.5 for Mac: Locally-run Vision and Language Models
AI tool for automating desktop tasks via natural language input
Automate browser-based workflows with LLMs and Computer Vision
Making large AI models cheaper, faster and more accessible
Implementation of Vision Transformer, a simple way to achieve SOTA