PyTorch extensions for fast R&D prototyping and Kaggle farming
Robust Speech Recognition via Large-Scale Weak Supervision
Provides code for running inference with the Segment Anything Model
Data manipulation and transformation for audio signal processing
A simple but complete full-attention transformer
Accurate × Fast × Comprehensive
A SOTA open-source image editing model
Industrial-level controllable zero-shot text-to-speech system
End-to-end speech processing toolkit
LLM training code for MosaicML foundation models
Fast inference engine for Transformer models
Open-source industrial-grade ASR models
Pretrained time-series foundation model developed by Google Research
Unofficial Python package that returns responses from Google Bard
A Conversational Speech Generation Model
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis
Consistency Distilled Diff VAE
Basaran, an open-source alternative to the OpenAI text completion API
Neural machine translation and sequence learning using TensorFlow
Implementation of NÜWA, an attention network for text-to-video synthesis
Text-conditional image generation model based on OpenAI's unCLIP
CPT: A Pre-Trained Unbalanced Transformer
Singing Voice Synthesis via Shallow Diffusion Mechanism
Open-source pre-training implementation of Google's LaMDA in PyTorch
Code release for "Masked-attention Mask Transformer