Open Source Large Language Models (LLM) - Page 2

Sort By:

Large Language Models (LLM)

View 362 business solutions

Large Language Models (LLM) Clear Filters

SoftCo: Enterprise Invoice and P2P Automation Software
For companies that process over 20,000 invoices per year

SoftCo Accounts Payable Automation processes all PO and non-PO supplier invoices electronically from capture and matching through to invoice approval and query management. SoftCoAP delivers unparalleled touchless automation by embedding AI across matching, coding, routing, and exception handling to minimize the number of supplier invoices requiring manual intervention. The result is 89% processing savings, supported by a context-aware AI Assistant that helps users understand exceptions, answer questions, and take the right action faster.

Learn More
AestheticsPro Medical Spa Software
Our new software release will dramatically improve your medspa business performance while enhancing the customer experience

AestheticsPro is the most complete Aesthetics Software on the market today. HIPAA Cloud Compliant with electronic charting, integrated POS, targeted marketing and results driven reporting; AestheticsPro delivers the tools you need to manage your medical spa business. It is our mission To Provide an All-in-One Cutting Edge Software to the Aesthetics Industry.

Learn More
1

rtk

CLI proxy that reduces LLM token consumption

rtk is an open-source command-line proxy designed to optimize interactions between AI coding agents and the terminal by reducing unnecessary token consumption. When AI assistants execute shell commands during software development tasks, the resulting terminal output often contains large amounts of repetitive or irrelevant information that can overwhelm the model’s context window. RTK intercepts these command outputs and compresses them into concise summaries before sending them to the language model. This process helps maintain important information while removing redundant data such as boilerplate logs, long directory listings, or repetitive test outputs. By minimizing the amount of noise sent to the AI model, the tool improves reasoning quality and allows longer development sessions within the same context window. The system is implemented as a lightweight Rust binary that runs locally and integrates easily with common AI coding environments.

Downloads: 33 This Week

Last Update: 11 hours ago
See Project
2

super-agent-party

All-in-one AI companion! Desktop girlfriend + virtual streamer

Super Agent Party is an open-source experimental framework designed to demonstrate collaborative multi-agent AI systems interacting within a shared environment. The project explores how multiple specialized AI agents can coordinate to solve complex tasks by communicating with each other and sharing information. Instead of relying on a single monolithic model, the framework organizes agents with different roles or capabilities that cooperate to achieve goals. Each agent may handle different responsibilities such as planning, execution, reasoning, or knowledge retrieval, allowing the system to tackle more complex problems than a single agent might handle alone. The platform is primarily intended as a research and demonstration environment for experimenting with agent collaboration strategies. Developers can use it to study coordination patterns, communication protocols, and task decomposition in multi-agent systems.

Downloads: 32 This Week

Last Update: 5 days ago
See Project
3

Kimi K2

Kimi K2 is the large language model series developed by Moonshot AI

Kimi K2 is Moonshot AI’s advanced open-source large language model built on a scalable Mixture-of-Experts (MoE) architecture that combines a trillion total parameters with a subset of ~32 billion active parameters to deliver powerful and efficient performance on diverse tasks. It was trained on an enormous corpus of over 15.5 trillion tokens to push frontier capabilities in coding, reasoning, and general agentic tasks while addressing training stability through novel optimizer and architecture design strategies. The model family includes variants like a foundational base model that researchers can fine-tune for specific use cases and an instruct-optimized variant primed for general-purpose chat and agent-style interactions, offering flexibility for both experimentation and deployment. With its high-dimensional attention mechanisms and expert routing, Kimi-K2 excels across benchmarks in live coding, math reasoning, and problem solving.

Downloads: 30 This Week

Last Update: 2026-01-27
See Project
4

Qwen3

Qwen3 is the large language model series developed by Qwen team

Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Various quantized versions, tools/pipelines provided for inference using quantized formats (e.g. GGUF, etc.). Coverage for many languages in training and usage, alignment with human preferences in open-ended tasks, etc.

1 Review

Downloads: 29 This Week

Last Update: 2026-01-09
See Project
Outbound sales software
Unified cloud-based platform for dialing, emailing, appointment scheduling, lead management and much more.

Adversus is an outbound dialing solution that helps you streamline your call strategies, automate manual processes, and provide valuable insights to improve your outbound workflows and efficiency.

Learn More
5

MLC LLM

Universal LLM Deployment Engine with ML Compilation

MLC LLM is a machine learning compiler and deployment framework designed to enable efficient execution of large language models across a wide range of hardware platforms. The project focuses on compiling models into optimized runtimes that can run natively on devices such as GPUs, mobile processors, browsers, and edge hardware. By leveraging machine learning compilation techniques, mlc-llm produces high-performance inference engines that maintain consistent APIs across platforms. The system supports deployment on environments including Linux, macOS, Windows, iOS, Android, and web browsers while utilizing different acceleration technologies such as CUDA, Vulkan, Metal, and WebGPU. It also provides OpenAI-compatible APIs that allow developers to integrate locally deployed models into existing AI applications without major code changes.

Downloads: 28 This Week

Last Update: 2026-03-09
See Project
6

TuyaOpen

Next-gen AI+IoT framework for T2/T3/T5AI/ESP32/and more

TuyaOpen is an open-source AI-enabled Internet of Things development framework designed to simplify the creation and deployment of smart connected devices. The platform provides a cross-platform C and C++ software development kit that supports a wide range of hardware platforms including Tuya microcontrollers, ESP32 boards, Raspberry Pi devices, and other embedded systems. It offers a unified development environment where developers can build devices capable of communicating with IoT cloud services while integrating AI capabilities and intelligent automation features. The system includes built-in networking support for communication protocols such as Wi-Fi, Bluetooth, and Ethernet, allowing devices to connect securely to remote services and applications. TuyaOpen also integrates with Tuya’s broader cloud ecosystem, enabling developers to manage device authentication, firmware updates, device activation, and remote monitoring from centralized services.

Downloads: 28 This Week

Last Update: 2026-03-09
See Project
7

BAML

The AI framework that adds the engineering to prompt engineering

BAML is an open-source framework and domain-specific language designed to bring structured engineering practices to prompt development for large language model applications. Instead of treating prompts as unstructured text, BAML introduces a schema-driven approach where prompts are defined as typed functions with explicit inputs and outputs. This design allows developers to treat language model interactions as predictable software components rather than ad-hoc prompt strings. The framework enables developers to define prompt logic in a dedicated language while integrating it into applications written in various programming languages such as Python, TypeScript, Ruby, and Go. BAML also allows developers to specify which models are used for each prompt and how outputs should be validated or structured. By converting prompt engineering into a more formal programming workflow, the framework improves reliability, debugging, and maintainability of AI systems.

Downloads: 27 This Week

Last Update: 2026-03-11
See Project
8

tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming

tt-metal, also referred to in its documentation as TT-Metalium, is Tenstorrent’s low-level software development kit for programming applications on Tenstorrent AI accelerators. The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 27 This Week

Last Update: 3 days ago
See Project
9

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. The framework supports several speech recognition models, including Qwen-based ASR systems and fine-tuned Whisper models trained on domain-specific dialogue.

Downloads: 26 This Week

Last Update: 4 days ago
See Project
10

AI as Workspace

An elegant AI chat client. Full-featured, lightweight

AI as Workspace, short for AI as Workspace, is an open-source AI client application that provides a unified interface for interacting with multiple large language models and AI tools within a single workspace environment. The platform is designed as a lightweight yet powerful desktop or web application that organizes AI interactions through structured workspaces. Instead of managing individual chat sessions separately, users can group conversations, artifacts, and tasks within customizable workspaces that support different projects or contexts. AIaW supports multiple AI providers and models through a flexible interface compatible with common API formats used by services such as OpenAI-style endpoints. The application also includes a plugin system that allows developers to extend the platform with additional capabilities such as automation tools, integrations, or custom AI utilities.

Downloads: 25 This Week

Last Update: 14 hours ago
See Project
11

BrowserGym

A Gym environment for web task automation

BrowserGym is an open framework for web task automation research that exposes browser interaction as a Gym-style environment for training and evaluating agents. It is intended for researchers building web agents rather than for end users looking for a consumer automation product. The project provides a common environment where agents can interact with websites, execute tasks, and be evaluated against standardized benchmarks. One of its main strengths is that it bundles several important benchmarks by default, including MiniWoB, WebArena, VisualWebArena, WorkArena, AssistantBench, WebLINX, and OpenApps. This gives researchers a unified way to compare agent behavior across diverse web environments and task types without stitching together separate evaluation stacks. BrowserGym is also designed to be extensible, and the repository notes that creating new benchmarks mainly involves inheriting its abstract task interface.

Downloads: 24 This Week

Last Update: 2026-03-09
See Project
12

HolmesGPT

CNCF Sandbox Project

HolmesGPT is an open-source AI agent designed to help DevOps and site reliability engineering teams diagnose and resolve production incidents. The system aggregates signals from observability tools such as logs, metrics, alerts, and distributed traces, then analyzes them using large language models to identify potential root causes. Rather than requiring engineers to manually correlate large volumes of monitoring data, HolmesGPT automatically synthesizes evidence and presents explanations in natural language. The project is developed by Robusta and has been accepted as a Cloud Native Computing Foundation Sandbox project, highlighting its relevance to the cloud-native ecosystem. It is designed to operate as an automated troubleshooting assistant that can analyze incidents continuously and support on-call engineers during outages.

Downloads: 24 This Week

Last Update: 4 days ago
See Project
13

Qwen3-Coder

Qwen3-Coder is the code version of Qwen3

Qwen3-Coder is the latest and most powerful agentic code model developed by the Qwen team at Alibaba Cloud. Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480 billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art benchmarks among open models for agentic coding, browser-use, and tool-use, matching performance comparable to leading models like Claude Sonnet. Qwen3-Coder supports an exceptionally long context window of 256,000 tokens, extendable to 1 million tokens using Yarn, enabling repository-scale code understanding and generation. It is capable of handling 358 programming languages, from common to niche, making it versatile for a wide range of development environments. The model integrates a specially designed function call format and supports popular platforms such as Qwen Code and CLINE for agentic coding workflows.

1 Review

Downloads: 24 This Week

Last Update: 2026-03-24
See Project
14

Strix

Open-source AI hackers to find and fix your app’s vulnerabilities

Strix is an open source agent-driven security platform that uses autonomous AI agents to identify, investigate, and validate vulnerabilities in software applications. The system is designed to mimic the behavior of real attackers by executing dynamic testing and verifying findings through proof-of-concept exploitation. Unlike traditional vulnerability scanners that rely heavily on static analysis, Strix agents actively run code, probe systems, and attempt exploitation to confirm whether vulnerabilities are genuinely exploitable. The platform is intended for developers and security teams that need rapid security assessments without the overhead of manual penetration testing engagements. Strix can orchestrate multiple cooperating agents that divide investigation tasks and collaboratively analyze complex applications or infrastructure.

Downloads: 24 This Week

Last Update: 2026-03-23
See Project
15

Alpa

Training and serving large-scale neural networks

Alpa is a system for training and serving large-scale neural networks. Scaling neural networks to hundreds of billions of parameters has enabled dramatic breakthroughs such as GPT-3, but training and serving these large-scale neural networks require complicated distributed system techniques. Alpa aims to automate large-scale distributed training and serving with just a few lines of code.

Downloads: 23 This Week

Last Update: 2023-03-23
See Project
16

llama.cpp Python Bindings

Python bindings for llama.cpp

llama-cpp-python provides Python bindings for llama.cpp, enabling the integration of LLaMA (Large Language Model Meta AI) language models into Python applications. This facilitates the use of LLaMA's capabilities in natural language processing tasks within Python environments.

Downloads: 23 This Week

Last Update: 2026-04-03
See Project
17

Unity MCP

AI-powered bridge connecting LLMs and advanced AI agents

Unity-MCP is an open-source integration that connects artificial intelligence assistants with the Unity game development environment through the Model Context Protocol. The project enables AI tools such as coding assistants and autonomous agents to interact directly with Unity projects, allowing them to analyze scenes, modify assets, and generate code within the development environment. By exposing Unity editor functionality through MCP tools, the plugin allows external AI systems to understand the structure of a game project and manipulate it programmatically. Developers can use natural language prompts to instruct AI assistants to create objects, modify scenes, or generate gameplay scripts automatically. The system supports both editor-level automation and runtime integration, meaning AI models can also be used inside compiled games for dynamic behavior such as interactive characters or debugging tools.

Downloads: 22 This Week

Last Update: 2 days ago
See Project
18

MetaGPT

The Multi-Agent Framework

The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo. Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories / competitive analysis/requirements/data structures / APIs / documents, etc. Internally, MetaGPT includes product managers/architects/project managers/engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.

Downloads: 20 This Week

Last Update: 2025-03-02
See Project
19

Mooncake

Mooncake is the serving platform for Kimi

Mooncake is an open-source infrastructure platform designed to optimize large language model serving by focusing on efficient management and transfer of model data and KV cache. The platform was originally developed as part of the serving infrastructure for the Kimi large language model system. Its architecture centers on a high-performance transfer engine that provides unified data transfer across different storage and networking technologies. This engine enables efficient movement of tensors and model data across heterogeneous environments such as GPU memory, system memory, and distributed storage systems. Mooncake also introduces distributed key-value cache storage that allows inference systems to reuse previously computed attention states, significantly improving throughput in large-scale deployments. The system supports advanced networking technologies such as RDMA and NVMe over Fabric, enabling high-speed communication across clusters.

Downloads: 20 This Week

Last Update: 2026-04-01
See Project
20

AingDesk

AI assistant that supports knowledge bases, model APIs

AingDesk is an open-source desktop and server-based AI assistant platform designed to provide a user-friendly environment for interacting with language models and building AI-powered tools. The software enables users to run local AI models or connect to external model APIs through a unified interface. One of its primary goals is to simplify the process of building knowledge-based assistants by allowing users to create local knowledge bases that the AI can search and analyze. The system supports additional features such as web search, intelligent agent workflows, and multi-model conversations within a single session. AingDesk can be deployed locally on personal machines or installed as a server using containerized environments. Its design emphasizes accessibility, making it suitable for both beginners and experienced developers who want to experiment with AI tools.

Downloads: 19 This Week

Last Update: 2026-03-06
See Project
21

Langflow

Low-code app builder for RAG and multi-agent AI applications

Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.

Downloads: 19 This Week

Last Update: 2026-04-01
See Project
22

PrivateGPT

Interact with your documents using the power of GPT

PrivateGPT is a production-ready, privacy-first AI system that allows querying of uploaded documents using LLMs, operating completely offline in your own environment. It provides contextual generative AI capabilities without sending data externally. Now maintained under Zylon.ai with enterprise deployment options (air gapped, cloud, or on-prem).

Downloads: 19 This Week

Last Update: 2025-07-29
See Project
23

Rogue

AI Agent Evaluator & Red Team Platform

Rogue is an open-source evaluation and red-team framework designed to test the reliability, safety, and policy compliance of AI agents. The platform automatically interacts with an AI agent by generating dynamic scenarios and multi-turn conversations that simulate real-world interactions. Instead of relying solely on static test scripts, Rogue uses an agent-as-a-judge architecture where one agent probes another agent to detect failures or unexpected behaviors. The system allows developers to define specific scenarios, expected outcomes, and business rules so that the framework can verify whether an agent behaves according to required policies. During testing, Rogue records conversations and produces detailed reports that explain whether the agent passed or failed each scenario. These reports include reasoning and evidence, helping developers understand why a particular failure occurred.

Downloads: 19 This Week

Last Update: 2026-03-17
See Project
24

DecryptPrompt

Summarize Prompt & LLM papers, open source data & models

DecryptPrompt is an open-source research repository dedicated to organizing and summarizing academic research related to prompts and large language models. The project collects papers, technical reports, and research materials that explore prompting techniques, model architectures, and reasoning strategies used in modern AI systems. It serves as a structured knowledge base where developers and researchers can quickly find key papers about topics such as chain-of-thought reasoning, prompt optimization, reasoning frameworks, and model training techniques. The repository organizes research into thematic sections that cover different prompting methodologies and reasoning paradigms used in LLM development. Many of the resources focus on understanding how prompts influence model behavior and how prompting strategies can improve reasoning or efficiency.

Downloads: 18 This Week

Last Update: 4 days ago
See Project
25

Superset LLM

Run an army of Claude Code, Codex, etc. on your machine

Superset is a development environment and terminal-based platform designed to orchestrate multiple AI coding agents simultaneously within a single workspace. The tool enables developers to run many autonomous coding agents in parallel without the typical overhead of manually managing multiple terminals, repositories, or branches. Each agent task is isolated in its own Git worktree, ensuring that code changes from different agents do not interfere with each other while allowing developers to track their progress independently. The platform includes built-in monitoring capabilities so users can observe the activity of each agent, receive notifications when tasks are completed, and quickly review changes produced by automated coding workflows. Superset also integrates tools for reviewing code differences, editing generated outputs, and managing the development environment directly from the interface.

Downloads: 18 This Week

Last Update: 12 hours ago
See Project