Operating LLMs in production
Adding guardrails to large language models
Implement CPU from scratch and play with large model deployments
PandasAI is a Python library that integrates generative AI
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Build multimodal language agents for fast prototyping and production
Evaluate your LLM's response with Prometheus and GPT-4
CNCF Sandbox Project
A high-throughput and memory-efficient inference and serving engine
Phi-3.5 for Mac: Locally-run Vision and Language Models
Qwen3-Coder is the code version of Qwen3
Easy token price estimates for 400+ LLMs. TokenOps
Build a large language model from scratch with only basic Python knowledge
Universal LLM Deployment Engine with ML Compilation
Language-model investigation agent with a terminal UI
High-performance inference framework for large language models
A modular graph-based Retrieval-Augmented Generation (RAG) system
Interact with your documents using the power of GPT
Anomaly detection related books, papers, videos, and toolboxes
Simple, Pythonic building blocks to evaluate LLM applications
Chat with your documents using local AI
Chat with your SQL database
An orchestration framework for agentic AI and LLM applications
Gemma open-weight LLM library, from Google DeepMind
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon