Port of Facebook's LLaMA model in C/C++
Run Llama 2 inference in one file of pure C
Run models such as Kimi-K2.5, GLM-5, DeepSeek, gpt-oss, Gemma, and Qwen
Run Llama and other large language models offline on iOS and macOS
Run a 1-billion-parameter LLM on a $10 board with 256 MB of RAM
Local AI file organization with categorization and rename suggestions
Llama 2 Everywhere (L2E)
Locally run an instruction-tuned chat-style LLM