Port of Facebook's LLaMA model in C/C++
Official inference repo for FLUX.2 models
MiniMax M2.1, a SOTA model for real-world dev & agents.
Awesome multilingual OCR toolkits based on PaddlePaddle
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Pure C inference for the Flux 2 image generation model
Clean and efficient FP8 GEMM kernels with fine-grained scaling
C++ implementation of ChatGLM-6B, ChatGLM2-6B, ChatGLM3, and GLM4(V)
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
FAIR Sequence Modeling Toolkit 2
AlphaFold 3 inference pipeline
VMZ: Model Zoo for Video Modeling
Fast, Sharp & Reliable Agentic Intelligence
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Foundational Models for State-of-the-Art Speech and Text Translation
Open-source large language model family from Tencent Hunyuan
FlashMLA: Efficient Multi-head Latent Attention Kernels
Hackable and optimized Transformers building blocks
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Powerful open source image generation model
Runtime extension of Proximus enabling deployment on AMD Ryzen™ AI
A fast, local neural text to speech system
llama.go is like llama.cpp in pure Golang
Locally run an Instruction-Tuned Chat-Style LLM