A high-performance ML model serving framework offering dynamic batching
270+ Claude Code plugins with 739 agent skills
E2B Desktop Sandbox for LLMs. E2B Sandbox
Recipes to train reward models for RLHF
Official Repo for ICML 2024 paper
Deploy your agentic workflows to production
The terminal client for Ollama
OpenDAN is an open-source Personal AI OS
Data Infrastructure providing an approach to multimodal AI workloads
On the Structural Pruning of Large Language Models
LinkedIn Automation Tool
Get started with building fullstack agents using Gemini 2.5 and LangGraph
Ongoing research training transformer models at scale
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Low-latency REST API for serving text-embeddings
A modular Agentic RAG built with LangGraph
An efficient forwarding service designed for LLMs
Learn to build your Second Brain AI assistant with LLMs
Run PyTorch LLMs locally on servers, desktop and mobile
Extension of Google Research’s PaperBanana
A lightweight framework for building LLM-based agents
Structured data extraction and instruction calling with ML, LLM
Framework to easily create LLM-powered bots over any dataset
Inference Llama 2 in one file of pure C
Qwen3-Coder is the code version of Qwen3