Global weather forecasting model using graph neural networks and JAX
Qwen3-ASR is an open-source series of ASR models
Industrial-level controllable zero-shot text-to-speech system
Recovering the Visual Space from Any Views
CLIP: predicts the most relevant text snippet given an image
A Production-ready Reinforcement Learning AI Agent Library
Controllable & emotion-expressive zero-shot TTS
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Code for the paper "Hybrid Spectrogram and Waveform Source Separation"
Let us control diffusion models
llama.go is like llama.cpp in pure Golang
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Reference implementation of the Transformer architecture optimized
Reproduces results of "Fixing the train-test resolution discrepancy"
Learning Continuous Signed Distance Functions for Shape Representation
Dual LSTM Encoder for Dialog Response Generation