ClawTeam: Agent Swarm Intelligence (One Command → Full Automation)
Capable of understanding text, audio, vision, video
The open-source tool for building high-quality datasets
Biomni: a general-purpose biomedical AI agent
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A research prototype of a human-centered web agent
Korea Investment & Securities Open API GitHub
Open Source Generative Process Automation
Models for the spaCy Natural Language Processing (NLP) library
SpikingJelly is an open-source deep learning framework
Synthetic Data Generation for tabular, relational and time series data
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Collection of Kaggle Solutions and Ideas
A simple yet powerful agent framework that delivers with models
Large Multimodal Models for Video Understanding and Editing
ChatGPT extension for scientific research work
AI-powered video clipping and highlight generation
Open source platform for managing, testing, and deploying AI apps
Unsupervised Learning for Image Registration
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA"
Virtual AI anchor that combines state-of-the-art technology
Computer vision projects | Fun AI projects related to computer vision
Obsei is a low-code, AI-powered automation tool
AI-powered PC monitoring that explains what it sees rather than just showing numbers and spikes
CoTracker is a model for tracking any point (pixel) in a video