A high-throughput and memory-efficient inference and serving engine
The easiest and laziest way to build multi-agent LLM applications
PyTorch extensions for fast R&D prototyping and Kaggle farming
Gaussian processes in TensorFlow
LLM training code for MosaicML foundation models
Unified Model Serving Framework
Low-latency REST API for serving text embeddings
Tensor search for humans
Images to inference with no labeling
High quality, fast, modular reference implementation of SSD in PyTorch
Framework dedicated to neural data processing
Serve machine learning models within a Docker container
Deploy an ML inference service on a budget in 10 lines of code