Port of Facebook's LLaMA model in C/C++
Build your own AI friend
Run local LLMs on any device. Open source
C++ implementation of ChatGLM-6B, ChatGLM2-6B, ChatGLM3, and GLM4(V)
OpenVINO™ Toolkit repository
Mooncake is the serving platform for Kimi
Alibaba's high-performance LLM inference engine for diverse apps
Speech-to-text, text-to-speech, and speaker recognition
Unsupervised text tokenizer for Neural Network-based text generation
Distribute and run LLMs with a single file
High-speed Large Language Model Serving for Local Deployment
An easy-to-use, high-performance AI deployment framework
Official inference framework for 1-bit LLMs
TT-NN operator library and TT-Metalium low-level kernel programming
Emscripten: An LLVM-to-WebAssembly Compiler
Bolt is a high-performance deep learning library
Open Source Computer Vision Library
Production-ready toolkit to run AI locally
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Framework for building AI-powered interactive digital humans and agents
Offline speech recognition API for Android, iOS, Raspberry Pi
LLMs as Copilots for Theorem Proving in Lean
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Code for Cicero, an AI agent that plays the game of Diplomacy
Pure C++ implementation of several models for real-time chatting