Tiny vision language model
Refine and quantize messy AI pixel art into clean, perfect pixels
Open-sourced unified customization model
Synthetic data generators for tabular and time-series data
State-of-the-art diffusion models for image and audio generation
Multilingual sentence & image embeddings with BERT
Native and Compact Structured Latents for 3D Generation
Efficient Retrieval Augmentation and Generation Framework
Stable Diffusion built into Blender
Stable-diffusion-webui-pixelization
Toolkit for conversational AI
SUPIR upscaling wrapper for ComfyUI
Official inference repo for FLUX.2 models
Towards Human-Sounding Speech
Toolkit for audio, music, and speech generation
A converter for seamless transformation of files, data, and media ...
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Implementation of NÜWA, attention network for text to video synthesis
A latent text-to-image diffusion model
State-of-the-art deep learning based audio codec
A version of the AI art creation software based on Disco Diffusion
Deep Learning papers reading roadmap for anyone who is eager to learn
Local image generation using VQGAN-CLIP or CLIP guided diffusion
Almost state-of-the-art text generation library
The source code of the CVPR 2019 paper "Deep Exemplar-based Colorization"