Public repository for Agent Skills
TTS with Kokoro and ONNX Runtime
Code for running inference and finetuning with the SAM 3 model
Robust Speech Recognition via Large-Scale Weak Supervision
Structure-from-Motion and Multi-View Stereo
Audio Plugin for Audio to MIDI transcription using deep learning
An open-source alternative to Claude Cowork, powered by opencode
The media player for language learning, with dual subtitles
157 models, 30 providers, one command to find what runs on your hardware
Implementation of TurboQuant (ICLR 2026)
Official inference repo for FLUX.2 models
NVR with realtime local object detection for IP cameras
Clippy, now with some AI
Singing Voice Synthesis via Shallow Diffusion Mechanism
A latent text-to-image diffusion model
Python inference and LoRA trainer package for the LTX-2 audio–video
AI video generator optimized for low-VRAM and older GPUs
Kimi Code CLI is your next CLI agent
Use Microsoft Edge's online text-to-speech service from Python
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models
An extensible framework for Personal Data Management
Image generation model with single-stream diffusion transformer
All-in-one AI companion! Desktop girlfriend + virtual streamer
High-Resolution Image Synthesis with Latent Diffusion Models