Generate Any 3D Scene in Seconds
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Qwen3-TTS is an open-source series of TTS models
Video Object and Interaction Deletion
OpenTinker is an RL-as-a-Service infrastructure for foundation models
A Powerful Native Multimodal Model for Image Generation
Ultra-Efficient LLMs on End Device
Advancing Open-source World Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Industrial-level controllable zero-shot text-to-speech system
Generating Immersive, Explorable, and Interactive 3D Worlds
FAIR Sequence Modeling Toolkit 2
A Customizable Image-to-Video Model based on HunyuanVideo
Qwen3-Coder is the code version of Qwen3
ChatGPT interface with better UI
Sharp Monocular Metric Depth in Less Than a Second
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Achieving 3x+ generation speedup on reasoning tasks
Easy Docker setup for Stable Diffusion with user-friendly UI
RGBD video generation model conditioned on camera input
Qwen2.5-VL is the multimodal large language model series
An Efficient Agentic Model for Computer Use
Audio foundation model excelling in audio understanding
Qwen3-ASR is an open-source series of ASR models
Block Diffusion for Ultra-Fast Speculative Decoding