The Triton Inference Server provides an optimized cloud and edge inferencing solution
A C++ library for high-performance inference on NVIDIA GPUs
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A general-purpose probabilistic programming system
Bayesian inference with probabilistic programming
Ready-to-use OCR with 80+ supported languages
Uplift modeling and causal inference with machine learning algorithms
Serve, optimize and scale PyTorch models in production
The official Python client for the Huggingface Hub
DoWhy is a Python library for causal inference
Jupyter notebook tutorials for OpenVINO
Unified Model Serving Framework
Gaussian processes in TensorFlow
Single-cell analysis in Python
Easy-to-use deep learning framework with 3 key features
Uncover insights, surface problems, monitor, and fine-tune your LLM
Serving system for machine learning models
Training and deploying machine learning models on Amazon SageMaker
OpenMLDB is an open-source machine learning database
A Pythonic framework to simplify AI service building
A library for accelerating Transformer models on NVIDIA GPUs
Tensor library for machine learning
Python-free Rust inference server
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods