Structured outputs for LLMs
Accelerate local LLM inference and finetuning
A high-throughput and memory-efficient inference and serving engine
A Python package for uncertainty quantification for language models
Gemma open-weight LLM library from Google DeepMind
PandasAI is a Python library that adds generative AI capabilities to pandas
PyTorch library of curated Transformer models and their components
Access large language models from the command-line
Synthetic data curation for post-training and data extraction
A Python module to repair invalid JSON from LLMs
Scalable data preprocessing and curation toolkit for LLMs
Open source libraries and APIs to build custom preprocessing pipelines
LLM abstractions that aren't obstructions
Accessible large language models via k-bit quantization for PyTorch
TokenOps: easy token price estimates for 400+ LLMs
Building applications with LLMs through composability
AirLLM: 70B model inference on a single 4GB GPU
The Security Toolkit for LLM Interactions
Tools for merging pretrained large language models
Replace OpenAI GPT with another LLM in your app
DepGraph: Towards Any Structural Pruning
Schema-Guided Reasoning (SGR) for agentic system design
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Advanced techniques for RAG systems