State-of-the-art Parameter-Efficient Fine-Tuning
Replace OpenAI GPT with another LLM in your app
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
GPU environment management and cluster orchestration
Open standard for machine learning interoperability
An Open-Source Programming Framework for Agentic AI
Deep Learning API and server in C++14 with support for Caffe and PyTorch
MII makes low-latency and high-throughput inference possible
Connect home devices into a powerful cluster to accelerate LLM inference
Build Production-ready Agentic Workflow with Natural Language
C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
A Pythonic framework to simplify AI service building
Prem provides a unified environment to develop AI applications
Uplift modeling and causal inference with machine learning algorithms
PyTorch library of curated Transformer models and their components
State-of-the-art diffusion models for image and audio generation
Implementation of model parallel autoregressive transformers on GPUs
Operating LLMs in production
An RWKV management and startup tool; fully automated and only 8 MB
Multilingual Automatic Speech Recognition with word-level timestamps
Images to inference with no labeling
Bolt is a high-performance deep learning library
A GPU-accelerated library containing highly optimized building blocks
Deep learning optimization library: makes distributed training easy
CPU/GPU inference server for Hugging Face transformer models