Unofficial Parallel WaveGAN
Best practice TTS based on BERT and VITS
Simple and powerful voice changer for Linux, written with Python & GTK
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
Implementation of Nougat Neural Optical Understanding
Chinese voice dialogue robot/smart speaker project
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
A web UI for various audio-related neural networks
Official PyTorch Implementation of "Scalable Diffusion Models"
Personal AI assistant for Windows and Linux
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
A walk along memory lane
Implementation of NÜWA, attention network for text to video synthesis
Large dataset of coding contests designed for AI and ML model training
Singing Voice Synthesis via Shallow Diffusion Mechanism
Real-time music generation using Stable Diffusion techniques
A latent text-to-image diffusion model
Point cloud diffusion for 3D model synthesis
Codebase for "Diffusion Models Beat GANs on Image Synthesis"
[WIP] VoiceSmith makes training text to speech models easy
A Deep-Learning-Based Chinese Speech Recognition System
A Python/PyTorch app for easily synthesising human voices
GLIDE: a diffusion-based text-conditional image synthesis model
3D-aware GANs based on NeRF (arXiv)
Generative Adversarial Transformers