Voice Recognition to Text Tool
Official repo for Sa2VA: Marrying SAM2 with LLaVA
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Python package built to ease deep learning on graphs
Provides convenient access to the Anthropic REST API from any Python 3
Pluggable SOTA multi-object tracking modules for segmentation
Superduper: Integrate AI models and machine learning workflows
Python library and CLI tool to interface with Google Translate
Code for running inference with the SAM 3D Body Model 3DB
Machine learning metrics for distributed, scalable PyTorch applications
Qwen2.5-VL is the multimodal large language model series
Gracefully face hCaptcha challenges with multimodal LLMs
RF-DETR is a real-time object detection and segmentation
JAX-based neural network library
Create HTML profiling reports from pandas DataFrame objects
Visual intelligence for your home.
A Python module to repair invalid JSON from LLMs
Motion-controllable Video Generation via Latent Trajectory Guidance
Simple and easily configurable grid world environments
Code release for Cut and Learn for Unsupervised Object Detection
Official implementation of Watermark Anything with Localized Messages
Driving with Graph Visual Question Answering
Open source demo platform where you can easily showcase your AI models
Collections of robotics environments
Code for Mesh R-CNN, ICCV 2019