Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Open-Source Financial Large Language Models
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Phi-3.5 for Mac: Locally-run Vision and Language Models
Generate Any 3D Scene in Seconds
Qwen3-TTS is an open-source series of TTS models
A Powerful Native Multimodal Model for Image Generation
Video Object and Interaction Deletion
The Clay Foundation Model - An open source AI model and interface
Qwen3-Coder is the code version of Qwen3
Ultra-Efficient LLMs on End Device
Advancing Open-source World Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Industrial-level controllable zero-shot text-to-speech system
Generating Immersive, Explorable, and Interactive 3D Worlds
OpenTinker is an RL-as-a-Service infrastructure for foundation models
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FAIR Sequence Modeling Toolkit 2
A Customizable Image-to-Video Model based on HunyuanVideo
ChatGPT interface with better UI
Sharp Monocular Metric Depth in Less Than a Second
Qwen2.5-VL is the multimodal large language model series
RGBD video generation model conditioned on camera input
Achieving 3x+ generation speedup on reasoning tasks
Easy Docker setup for Stable Diffusion with user-friendly UI