Open multimodal web agent built by Ai2
Open-source MCP server that gives your coding agent
A sound cloning tool with a web interface, using your voice
Tools such as a web browser, computer access, and a code runner for LLMs
Modular AI image and video generation web UI with extensible tools
Qwen3-Coder is the code version of Qwen3
A simple native web interface that uses ChatTTS to synthesize text
Context-aware desktop AI assistant that understands screen content
AI tool converting video/audio into structured documents instantly
A fast TTS architecture with conditional flow matching
LinkedIn Automation Tool
Gracefully face hCaptcha challenges with multimodal LLMs
AI tool for real-time monitoring and analysis of Goofish listings
Automate native Android apps with AI using accessibility APIs
Python SDK for the Computer Use model Lux, developed by OpenAGI
Stable Diffusion web UI
Speech-AI-Forge is a project developed around TTS generation models
Fast-stable-diffusion + DreamBooth
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
A library to communicate with ChatGPT, Claude, Copilot, Gemini
Open Source Computer Vision Library
Visual localization made easy with hloc
TensorFlow documentation
Generate photo-realistic textures based on source images
Your gateway to GPT writing