Ready-to-use OCR with 80+ supported languages
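If this entry refers to the EasyOCR Python package (an assumption based on the wording), a minimal usage sketch might look like the following; the language codes and image filename are placeholders:

```python
# Minimal sketch assuming the easyocr PyPI package; the image path is a placeholder.
import easyocr

reader = easyocr.Reader(['en', 'fr'])        # load detection/recognition models for the chosen languages
results = reader.readtext('document.png')    # returns a list of (bounding_box, text, confidence)
for bbox, text, confidence in results:
    print(text, confidence)
```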
Run Local LLMs on Any Device. Open-source and available for commercial use
A high-throughput and memory-efficient inference and serving engine
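Assuming this describes vLLM, a minimal offline-inference sketch could look like this; the model name and prompt are placeholders:

```python
# Minimal sketch assuming the vllm package; model name and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Explain continuous batching in one sentence."], params)
print(outputs[0].outputs[0].text)
```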
Deep learning optimization library that makes distributed training easy
Standardized Serverless ML Inference Platform on Kubernetes
GPU environment management and cluster orchestration
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A library for accelerating Transformer models on NVIDIA GPUs
Uncover insights, surface problems, monitor, and fine-tune your LLM
Easy-to-use Speech Toolkit including Self-Supervised Learning models
Data manipulation and transformation for audio signal processing
Open-source tool designed to enhance the efficiency of workloads
Powering Amazon's custom machine learning chips
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
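Assuming this refers to a package such as EconML, a toy sketch of estimating heterogeneous treatment effects might look like this; the synthetic data and estimator choice are illustrative only:

```python
# Toy sketch assuming the econml package; data is synthetic and purely illustrative.
import numpy as np
from econml.dml import LinearDML

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))                      # features driving effect heterogeneity
T = rng.binomial(1, 0.5, size=1000)                 # binary treatment
Y = 2.0 * T * X[:, 0] + X[:, 1] + rng.normal(size=1000)

est = LinearDML(discrete_treatment=True)
est.fit(Y, T, X=X)
cate = est.effect(X)                                # per-row treatment effect estimates
print(cate[:5])
```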
LLM training code for MosaicML foundation models
Replace OpenAI GPT with another LLM in your app
Low-latency REST API for serving text-embeddings
The Triton Inference Server provides an optimized cloud and edge inferencing solution
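A minimal HTTP client sketch against a running Triton server might look like this; the model name and tensor names depend on the deployed model's configuration and are placeholders here:

```python
# Minimal sketch using the tritonclient package; model and tensor names are placeholders.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
inp = httpclient.InferInput("INPUT0", [1, 16], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 16).astype(np.float32))
result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```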
PyTorch library of curated Transformer models and their components
Trainable models and neural network optimization tools
Simplifies the local serving of AI models from any source
Integrate, train and manage any AI models and APIs with your database
OpenAI-style API for open large language models
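Because such a server exposes an OpenAI-compatible API, the standard OpenAI client can usually be pointed at it; the base URL, API key, and model name below are placeholders for whatever the local deployment exposes:

```python
# Sketch of calling an OpenAI-compatible endpoint; base_url, api_key, and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```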
An MLOps framework to package, deploy, monitor and manage models
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods