Browse free open source Python LLM Inference Tools and projects below.
Deploy an ML inference service on a budget in 10 lines of code
Open platform for training, serving, and evaluating language models
Implementation of model parallel autoregressive transformers on GPUs
Gaussian processes in TensorFlow
Build your chatbot within minutes on your favorite device
Toolbox of models, callbacks, and datasets for AI/ML researchers
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
OpenMMLab Model Deployment Framework
OpenMMLab Video Perception Toolbox
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Lightweight Python library for adding real-time multi-object tracking
Easy-to-use Speech Toolkit including Self-Supervised Learning model
PyTorch extensions for fast R&D prototyping and Kaggle farming
Implementation of "Tree of Thoughts"
A lightweight vision library for performing large object detection
High quality, fast, modular reference implementation of SSD in PyTorch
Serve machine learning models within a Docker container
Toolkit for allowing inference and serving with MXNet in SageMaker
Sequence-to-sequence framework, focused on Neural Machine Translation
Libraries for applying sparsification recipes to neural networks
A toolkit to optimize Keras & TensorFlow ML models for deployment
Probabilistic reasoning and statistical analysis in TensorFlow
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Open-source tool designed to enhance the efficiency of workloads
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere