Non-Pydantic, Non-JSON Schema, efficient AutoPrompting
PyTorch library of curated Transformer models and their components
High-performance inference framework for large language models
ChatGLM3 series: Open Bilingual Chat LLMs
Low-latency REST API for serving text-embeddings
A modular graph-based Retrieval-Augmented Generation (RAG) system
Bridging LLM and Recommender System
Llama Chinese community, real-time aggregation
Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge
Revolutionizing Database Interactions with Private LLM Technology
Everything you need to build state-of-the-art foundation models
SQL-native memory layer enabling persistent context for AI agents
Semantic cache for LLMs. Fully integrated with LangChain
LLM-based data scientist, AI-native data application
An orchestration framework for agentic AI and LLM applications
ChatGLM2-6B: An Open Bilingual Chat LLM
A high-performance ML model serving framework offering dynamic batching
Make your agents learn from experience
Agent toolkit providing semantic retrieval and editing capabilities
Phi-3.5 for Mac: Locally-run Vision and Language Models
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
An LLM-powered knowledge curation system that researches topics
Open-Source Financial Large Language Models
Lemonade helps users run local LLMs with the highest performance