Lightweight inference library for ONNX files, written in C++
Everything you need to build state-of-the-art foundation models
Simplifies the local serving of AI models from any source
Pure C++ implementation of several models for real-time chatting
Build production-ready agentic workflows with natural language
Bring the notion of Model-as-a-Service to life
Operating LLMs in production
A scalable inference server for models optimized with OpenVINO
Large Language Model Text Generation Inference
Adversarial Robustness Toolbox (ART) - Python Library for ML security
An easy-to-use LLM quantization package with user-friendly APIs
Fast inference engine for Transformer models
DoWhy is a Python library for causal inference
Library for OCR-related tasks powered by Deep Learning
C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
Official inference library for Mistral models
Bayesian inference with probabilistic programming
A set of Docker images for training and serving models in TensorFlow
AICI: Prompts as (Wasm) Programs
Run local LLMs such as Llama, DeepSeek, and Kokoro inside your browser
Uplift modeling and causal inference with machine learning algorithms
PyTorch library of curated Transformer models and their components
Deep learning optimization library: makes distributed training easy
Standardized Serverless ML Inference Platform on Kubernetes
Private Open AI on Kubernetes