Open Source Python Large Language Models (LLM) - Page 5

Python Large Language Models (LLM)

Browse free open source Python Large Language Models (LLM) and projects below. Use the toggles on the left to filter open source Python Large Language Models (LLM) by OS, license, language, programming language, and project status.

  • 1
    Guidance

    A guidance language for controlling large language models

    Guidance is an efficient programming paradigm for steering language models. With Guidance, you can control how output is structured and get high-quality output for your use case—while reducing latency and cost vs. conventional prompting or fine-tuning. It allows users to constrain generation (e.g. with regex and CFGs) as well as to interleave control (conditionals, loops, tool use) and generation seamlessly.
    Downloads: 6 This Week
  • 2
    H2O LLM Studio

    Framework and no-code GUI for fine-tuning LLMs

    H2O LLM Studio is a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). Training a model is meant to be straightforward: upload your dataset, create an experiment, and start training. You can then monitor and manage the experiment, compare experiments, or push the model to Hugging Face to share it with the community. You can also drive H2O LLM Studio from the command line interface (CLI) by specifying a configuration file that contains all the experiment parameters; to fine-tune this way, first activate the pipenv environment by running make shell.
    Downloads: 6 This Week
  • 3
    HuixiangDou

    Overcoming Group Chat Scenarios with LLM-based Technical Assistance

    HuixiangDou is an open-source large language model assistant designed specifically for technical question answering in group chat environments. The project addresses a common problem in developer communities where discussion channels become overwhelmed by repeated or irrelevant questions. To solve this issue, HuixiangDou implements a multi-stage pipeline that analyzes incoming messages, filters irrelevant conversations, and selectively generates responses when the assistant determines it can provide useful information. This design allows the system to participate in group discussions without flooding the chat with unnecessary messages. The assistant uses retrieval and ranking methods along with language model reasoning to produce accurate answers for technical topics such as computer vision and machine learning projects. It can be integrated into messaging platforms such as WeChat or other team collaboration tools to assist developer communities.
    Downloads: 6 This Week
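    The multi-stage behaviour described for HuixiangDou (filter the message, retrieve supporting material, reply only when something useful was found) can be illustrated with a small sketch. This is not HuixiangDou's code: the `KNOWLEDGE` table, the `Reply` type, and the keyword heuristics are hypothetical stand-ins for its real classifiers, retrievers, and language model calls.

```python
from dataclasses import dataclass

# Hypothetical knowledge base: topic keyword -> canned answer snippet.
KNOWLEDGE = {
    "install": "Run the documented install steps inside a fresh virtual environment.",
    "cuda": "Check that the CUDA toolkit matches the version your framework build expects.",
}


@dataclass
class Reply:
    should_answer: bool
    text: str = ""


def is_question(message: str) -> bool:
    """Stage 1: cheap filter that drops chit-chat before any retrieval runs."""
    lowered = message.lower()
    return "?" in message or lowered.startswith(("how", "why", "what"))


def retrieve(message: str) -> list[str]:
    """Stage 2: keyword lookup standing in for a real retriever and reranker."""
    lowered = message.lower()
    return [answer for key, answer in KNOWLEDGE.items() if key in lowered]


def respond(message: str) -> Reply:
    """Stage 3: answer only when the earlier stages found something useful."""
    if not is_question(message):
        return Reply(False)
    hits = retrieve(message)
    if not hits:
        return Reply(False)  # stay silent rather than flood the chat
    return Reply(True, hits[0])
```

    The point of the design is the early exits: most group-chat messages never reach the expensive answering stage.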
  • 4
    LLM Foundry

    LLM training code for MosaicML foundation models

    Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Large language models (LLMs) are changing the world, but for those outside well-resourced industry labs, it can be extremely difficult to train and deploy these models. This has led to a flurry of activity centered on open-source LLMs, such as the LLaMA series from Meta, the Pythia series from EleutherAI, the StableLM series from StabilityAI, and the OpenLLaMA model from Berkeley AI Research.
    Downloads: 6 This Week
  • 5
    LLaMA Models

    Utilities intended for use with Llama models

    This repository serves as the central hub for the Llama foundation model family, consolidating model cards, licenses and use policies, and utilities that support inference and fine-tuning across releases. It ties together other stack components (like safety tooling and developer SDKs) and provides canonical references for model variants and their intended usage. The project’s issues and releases reflect an actively used coordination point for the ecosystem, where guidance, utilities, and compatibility notes are published. It complements separate repos that carry code and demos (for example inference kernels or cookbook content) by keeping authoritative metadata and specs here. Model lineages and size variants are documented externally (e.g., Llama 3.x and beyond), with this repo providing the “single source of truth” links and utilities. In practice, teams use llama-models as a reference when selecting variants, aligning licenses, and wiring in helper scripts for deployment.
    Downloads: 6 This Week
  • 6
    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 6 This Week
  • 7
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.
    Downloads: 6 This Week
  • 8
    OM1

    Modular AI runtime for robots

    OM1 is an open-source AI platform designed to build autonomous agents capable of interacting with digital environments and completing complex tasks. The project focuses on creating a modular architecture where language models can coordinate with external tools, APIs, and knowledge sources to accomplish multi-step objectives. Instead of operating as simple conversational systems, OM1 agents can plan actions, retrieve information, and execute tasks across different services. The framework integrates reasoning modules, planning strategies, and tool interfaces that allow agents to operate in dynamic environments. Developers can extend the system by connecting new tools, services, or data sources to the agent architecture. The platform also includes mechanisms for coordinating workflows and managing the state of ongoing tasks.
    Downloads: 6 This Week
  • 9
    OpenDAN

    OpenDAN is an open source Personal AI OS

    OpenDAN is an open-source Personal AI OS that consolidates various AI modules in one place for your personal use. The goal of OpenDAN (Open and Do Anything Now with AI) is to create a Personal AI OS that provides a runtime environment for various AI modules as well as protocols for interoperability between them. With OpenDAN, users can securely collaborate with various AI modules using their private data to create powerful personal AI agents, such as butlers, lawyers, doctors, teachers, assistants, girlfriends, or boyfriends.
    Downloads: 6 This Week
  • 10
    PaperBanana

    Extension of Google Research’s PaperBanana

    PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can describe the desired figure in natural language and allow the system to generate the visual representation automatically. PaperBanana integrates modern multimodal AI models capable of interpreting instructions and producing graphics that follow academic conventions. The framework supports multiple AI providers including OpenAI, Azure OpenAI services, and Google Gemini, allowing users to run the system with different model backends.
    Downloads: 6 This Week
  • 11
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    Pixeltable is an open-source Python data infrastructure framework designed to support the development of multimodal AI applications. The system provides a declarative interface for managing the entire lifecycle of AI data pipelines, including storage, transformation, indexing, retrieval, and orchestration of datasets. Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a table-based abstraction. Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.
    Downloads: 6 This Week
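    The idea of computed columns that keep pipelines incremental can be shown with a toy table. The `ToyTable` class and its method names below are hypothetical and are not the Pixeltable API; the sketch only assumes that a computed column is a function of a row, applied to existing rows when the column is defined and to new rows on insert.

```python
from typing import Any, Callable

Row = dict[str, Any]


class ToyTable:
    """Toy table whose derived columns are recomputed only for rows that need it."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.derived: dict[str, Callable[[Row], Any]] = {}

    def define_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        # Register the column and backfill it for rows that already exist.
        self.derived[name] = fn
        for row in self.rows:
            row[name] = fn(row)

    def insert(self, **values: Any) -> None:
        # A new row runs through every registered column; old rows are untouched.
        row = dict(values)
        for name, fn in self.derived.items():
            row[name] = fn(row)
        self.rows.append(row)


table = ToyTable()
table.insert(text="Pixeltable stores multimodal data")
table.define_column("n_words", lambda row: len(row["text"].split()))
table.insert(text="computed columns update incrementally")
```

    In a real pipeline the lambda would be an embedding, captioning, or model call, which is why avoiding recomputation for unchanged rows matters.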
  • 12
    Prometheus-Eval

    Evaluate your LLM's response with Prometheus and GPT4

    Prometheus-Eval is an open-source framework designed to evaluate the outputs of large language models using specialized evaluator models known as Prometheus. The project provides tools, datasets, and scripts that allow developers and researchers to measure the quality of LLM responses through automated scoring rather than relying solely on human evaluators. It implements an “LLM-as-a-judge” approach in which a dedicated language model analyzes instruction–response pairs and assigns scores or rankings based on predefined evaluation criteria. The repository includes a Python package that provides a straightforward interface for running evaluations and integrating them into model development pipelines. It also provides training data and utilities for fine-tuning evaluator models so they can assess outputs according to custom scoring rubrics such as helpfulness, accuracy, or style.
    Downloads: 6 This Week
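    The "LLM-as-a-judge" pattern described above can be sketched in a few lines. This does not use the prometheus-eval package: the prompt wording, the `RUBRIC` text, and the `fake_judge` stub are assumptions made for illustration, and a real evaluator model would replace the stub.

```python
import re
from typing import Callable, Optional

# Hypothetical rubric; real rubrics are usually more detailed.
RUBRIC = "Score 1-5 for helpfulness: 1 is unhelpful, 5 fully answers the instruction."


def build_judge_prompt(instruction: str, response: str, rubric: str = RUBRIC) -> str:
    """Pack the instruction-response pair and the rubric into one judge prompt."""
    return (
        f"Instruction:\n{instruction}\n\nResponse:\n{response}\n\n"
        f"Rubric:\n{rubric}\n\nGive feedback, then finish with 'Score: <1-5>'."
    )


def parse_score(judge_output: str) -> Optional[int]:
    """Extract the numeric verdict; return None when the judge gave no score."""
    match = re.search(r"Score:\s*([1-5])\b", judge_output)
    return int(match.group(1)) if match else None


def judge(instruction: str, response: str, call_model: Callable[[str], str]) -> Optional[int]:
    """Ask the evaluator model for a verdict and parse it."""
    return parse_score(call_model(build_judge_prompt(instruction, response)))


def fake_judge(prompt: str) -> str:
    # Stand-in for a call to an evaluator model.
    return "The response answers the question directly. Score: 4"
```

    Returning `None` for unparseable output keeps malformed verdicts from silently counting as scores.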
  • 13
    ROSA

    AI agent designed to interact with ROS1- and ROS2-based robotics systems

    ROSA, short for Robot Operating System Agent, is an AI-powered software assistant developed by NASA’s Jet Propulsion Laboratory to simplify interaction with robotic systems that use the Robot Operating System (ROS). The project provides a natural language interface that allows developers and operators to interact with robots by issuing commands or queries in conversational language. Built on top of frameworks such as LangChain and modern large language models, ROSA translates user instructions into actions that can be executed within ROS1 or ROS2 environments. This capability enables users to inspect system status, diagnose issues, and control robot behavior without manually navigating complex command-line tools or configuration files. The system integrates with robotics software stacks and exposes operational tools that allow AI agents to analyze system logs, inspect sensors, or trigger robot tasks.
    Downloads: 6 This Week
  • 14
    STORM

    An LLM-powered knowledge curation system that researches topics

    STORM is an open-source, LLM-powered knowledge curation system developed by Stanford's OVAL lab. Given a topic, it researches the subject through Internet search and writes a long-form, Wikipedia-style report with citations.
    Downloads: 6 This Week
  • 15
    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to identify and extract meaningful data fields from heterogeneous document layouts. The architecture is modular, allowing developers to build customizable processing pipelines that integrate with external tools and data extraction frameworks. Sparrow also includes workflow orchestration tools that allow multiple extraction tasks to be combined into automated pipelines for large-scale document processing.
    Downloads: 6 This Week
  • 16
    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures. This dependency analysis makes it possible to prune large networks such as transformers, convolutional networks, and diffusion models without breaking the computational graph. Torch-Pruning physically removes parameters rather than masking them, which results in smaller and faster models during both training and inference. The toolkit supports a wide variety of architectures used in computer vision and large language models, making it a flexible solution for model compression tasks.
    Downloads: 6 This Week
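    Why structural pruning needs dependency tracking can be seen on two stacked linear layers: removing an output channel of the first layer only works if the matching input column of the second layer is removed too. The sketch below uses plain Python lists rather than PyTorch or the Torch-Pruning API, and `prune_channel` is a hypothetical helper, not DepGraph itself.

```python
Matrix = list[list[float]]


def prune_channel(w1: Matrix, w2: Matrix, channel: int) -> tuple[Matrix, Matrix]:
    """Drop output `channel` of layer 1 and the dependent input column of layer 2."""
    new_w1 = [row for i, row in enumerate(w1) if i != channel]
    new_w2 = [[v for j, v in enumerate(row) if j != channel] for row in w2]
    return new_w1, new_w2


def forward(w1: Matrix, w2: Matrix, x: list[float]) -> list[float]:
    """Apply layer 1 then layer 2 (no bias, no activation, for brevity)."""
    hidden = [sum(w * xi for w, xi in zip(row, x)) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


# Layer 1 is 3 outputs x 2 inputs; its middle channel contributes nothing.
w1 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
# Layer 2 is 2 outputs x 3 inputs and consumes all three hidden channels.
w2 = [[1.0, 5.0, 1.0], [2.0, 5.0, 0.0]]

p1, p2 = prune_channel(w1, w2, channel=1)
```

    Both matrices physically shrink, so the pruned network does less arithmetic; masking would keep the original shapes and cost.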
  • 17
    Vanna 2.0

    Chat with your SQL database

    Vanna is an open-source Python framework that enables natural language interaction with databases by converting user questions into executable SQL queries using large language models. The framework uses a retrieval-augmented generation architecture that learns from database schemas, documentation, and past query examples to generate accurate queries tailored to a specific dataset. Vanna can be integrated into many environments, including notebooks, web applications, messaging platforms, and data dashboards, making it flexible for analytics and data exploration workflows. The system streams query results, visualizations, and summaries directly to user interfaces, allowing non-technical users to interact with complex data systems through conversational queries. It also includes enterprise-grade features such as user-aware security, permission enforcement, and query auditing for production deployments.
    Downloads: 6 This Week
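    The retrieval-augmented step described above (look up relevant schema and past queries, then build the prompt) might look roughly like this. The sketch does not call Vanna: `TRAINING_DATA`, the word-overlap ranking, and the prompt wording are illustrative assumptions, and a real system would use embeddings and send the prompt to a language model.

```python
import re

# Hypothetical stored context: schema definitions and one past question/SQL pair.
TRAINING_DATA = [
    "CREATE TABLE orders (id INT, customer_id INT, total NUMERIC, created_at DATE)",
    "CREATE TABLE customers (id INT, name TEXT, country TEXT)",
    "Q: total revenue per country -- SELECT c.country, SUM(o.total) FROM orders o "
    "JOIN customers c ON c.id = o.customer_id GROUP BY c.country",
    "CREATE TABLE audit_log (id INT, message TEXT)",
]


def tokenize(text: str) -> set[str]:
    """Lowercase word set; identifiers such as customer_id stay whole."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank stored context by word overlap with the question (stand-in for embeddings)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Assemble the prompt a language model would turn into SQL."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nWrite one SQL query that answers: {question}"
```

    Only the top-ranked context reaches the prompt, so unrelated tables such as `audit_log` do not distract the model.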
  • 18
    bitsandbytes

    Accessible large language models via k-bit quantization for PyTorch

    bitsandbytes is an open-source library designed to make training and inference of large neural networks more efficient by dramatically reducing memory usage. Built primarily for the PyTorch ecosystem, the library introduces advanced quantization techniques that allow models to operate using reduced numerical precision while maintaining high accuracy. These optimizations enable large language models and other deep learning architectures to run on hardware with limited memory resources, including consumer-grade GPUs. The project includes specialized optimizers and quantized matrix operations that significantly reduce the memory footprint of training and inference workloads. By lowering the hardware requirements needed to work with large models, bitsandbytes helps make modern AI development more accessible to researchers and engineers. The library has become widely used in machine learning pipelines that rely on parameter-efficient training techniques and low-precision inference.
    Downloads: 6 This Week
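    To make "reduced numerical precision" concrete, here is a minimal sketch of absmax 8-bit quantization, one common scheme of this kind. It is a generic illustration in plain Python, not the bitsandbytes implementation or API; the function names are hypothetical.

```python
def quantize_absmax(values: list[float]) -> tuple[list[int], float]:
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    absmax = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    scale = absmax / 127.0
    return [round(v / scale) for v in values], scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)
```

    Each value is stored in one byte plus a shared scale instead of four bytes, at the cost of a rounding error of at most half a quantization step.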
  • 19
    uqlm

    Uncertainty Quantification for Language Models (UQLM), a Python package

    UQLM is a Python library developed to detect hallucinations and quantify uncertainty in the outputs of large language models. The system implements a variety of uncertainty quantification techniques that assign confidence scores to model responses. These scores help developers determine how likely a generated answer is to contain errors or fabricated information. The library includes both black-box and white-box approaches to uncertainty estimation. Black-box methods evaluate model outputs through multiple generations or comparative analysis, while white-box methods rely on token probabilities produced during inference. UQLM also supports ensemble strategies and model-as-judge approaches for evaluating responses. By combining multiple uncertainty metrics, the system provides more reliable indicators of when language model outputs may be unreliable.
    Downloads: 6 This Week
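    The difference between black-box and white-box scoring can be sketched with two small functions. These are simplified illustrations, not UQLM's scorers: the black-box score assumes exact-match agreement across sampled answers, and the white-box score assumes per-token log-probabilities are available from the model.

```python
import math
from collections import Counter


def consistency_score(samples: list[str]) -> float:
    """Black-box: share of sampled answers that agree with the most common one."""
    normalized = [s.strip().lower() for s in samples]
    top_count = Counter(normalized).most_common(1)[0][1]
    return top_count / len(normalized)


def mean_token_probability(token_logprobs: list[float]) -> float:
    """White-box: geometric mean of token probabilities (length-normalized)."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```

    A low score from either function flags a response as a candidate hallucination; neither proves the answer is wrong.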
  • 20
    vim-ai

    AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim

    vim-ai is an AI-powered assistant plugin for Vim and Neovim that brings language-model features directly into the editor. It allows users to generate code or text, edit selections in place, and carry on interactive chat-style conversations without leaving the terminal editing environment. The plugin is built around OpenAI-compatible APIs, which means it can work not only with OpenAI itself but also with compatible proxies and alternative providers. Its command set covers text completion, editing, chat continuation, image generation, and debugging utilities, making it more versatile than a narrow autocomplete add-on. The repository also highlights support for custom roles, vision features such as image-to-text, and an emerging provider-plugin model for extending compatibility further. A notable design point is that it only sends content the user explicitly selects or includes in prompts, which helps users control what is shared with the external model.
    Downloads: 6 This Week
  • 21
    CodeGeeX2

    CodeGeeX2: A More Powerful Multilingual Code Generation Model

    CodeGeeX2 is the second-generation multilingual code generation model from ZhipuAI, built upon the ChatGLM2-6B architecture and trained on 600B code tokens. Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports over 100 programming languages. With improved inference efficiency, quantization options, and multi-query/flash attention, CodeGeeX2 achieves faster generation speeds and lightweight deployment, requiring as little as 6GB GPU memory at INT4 precision. Its backend powers the CodeGeeX IDE plugins for VS Code, JetBrains, and other editors, offering developers interactive AI assistance with features like infilling and cross-file completion.
    Downloads: 5 This Week
  • 22
    LLM Council

    LLM Council works together to answer your hardest questions

    LLM Council is a creative open-source web application by Andrej Karpathy that lets you consult multiple large language models together to answer questions more reliably than querying a single model. Instead of relying on one provider, this application sends your query simultaneously to several LLMs supported via OpenRouter, collects each model’s independent response, and then orchestrates a multi-stage evaluation where the models critique and rank each other’s outputs anonymously. After this peer-review process, a designated “Chairman” model synthesizes a final consolidated answer drawing on the strengths and insights of all participants. The interface looks like a familiar chat app but under the hood it implements this ensemble and consensus workflow to reduce bias and leverage diverse reasoning styles.
    Downloads: 5 This Week
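    The three stages described above (independent answers, anonymous peer ranking, chairman synthesis) can be sketched with stub models. This is not the LLM Council code: the Borda-style point tally, the `rank` callback signature, and the stub functions are assumptions made so the example runs without network access.

```python
from typing import Callable

Model = Callable[[str], str]
# Hypothetical reviewer hook: (reviewer name, question, answer texts) -> best-first indices.
Ranker = Callable[[str, str, list[str]], list[int]]


def run_council(question: str, members: dict[str, Model], rank: Ranker, chairman: Model) -> str:
    # Stage 1: every member answers independently.
    texts = [model(question) for model in members.values()]

    # Stage 2: each member ranks the answers without seeing who wrote them.
    points = [0] * len(texts)
    for reviewer in members:
        for position, idx in enumerate(rank(reviewer, question, texts)):
            points[idx] += len(texts) - 1 - position  # Borda-style tally
    best_first = sorted(range(len(texts)), key=lambda i: points[i], reverse=True)

    # Stage 3: the chairman sees the ranked answers and writes the final one.
    summary = "\n".join(f"Answer {n + 1}: {texts[i]}" for n, i in enumerate(best_first))
    return chairman(f"Question: {question}\nRanked answers:\n{summary}\nWrite the final answer.")


# Stubs standing in for real model calls.
members = {"a": lambda q: "4", "b": lambda q: "2 + 2 = 4", "c": lambda q: "four"}


def rank_by_length(reviewer: str, question: str, texts: list[str]) -> list[int]:
    return sorted(range(len(texts)), key=lambda i: len(texts[i]), reverse=True)


def chairman(prompt: str) -> str:
    return prompt.split("Answer 1: ")[1].splitlines()[0]
```

    Passing only the answer texts to the ranker is what keeps the review anonymous in this sketch.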
  • 23
    LLaMA-MoE

    Building Mixture-of-Experts from LLaMA with Continual Pre-training

    LLaMA-MoE is an open-source project that builds mixture-of-experts language models from LLaMA through expert partitioning and continual pre-training. The repository is centered on making MoE research more accessible by offering smaller and more affordable models with only about 3.0 to 3.5 billion activated parameters, which helps reduce deployment and experimentation costs. Its architecture works by splitting LLaMA feed-forward networks into sparse experts and adding gating mechanisms so that only selected experts are activated during inference and training. The project is not just a model release, but also a research framework that includes multiple expert construction methods, several gating strategies, and tooling for continual pre-training on filtered SlimPajama-based datasets. It also emphasizes training efficiency through features such as FlashAttention-v2 integration and fast streaming dataset loading, which are important for large-scale experimentation.
    Downloads: 5 This Week
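    Sparse expert activation through gating can be illustrated with top-k routing over scalar "experts". This is a generic sketch, not the LLaMA-MoE implementation: the gate logits would normally come from a learned router, and the lambda experts below are hypothetical stand-ins for partitioned feed-forward networks.

```python
import math
from typing import Callable


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits: list[float], k: int) -> dict[int, float]:
    """Return {expert_index: weight} for the k highest-scoring experts."""
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in chosen])  # renormalize over chosen experts
    return dict(zip(chosen, weights))


def moe_forward(x: float, experts: list[Callable[[float], float]],
                gate_logits: list[float], k: int = 2) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    gate = top_k_gate(gate_logits, k)
    return sum(weight * experts[i](x) for i, weight in gate.items())


experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x ** 2]
```

    With four experts and k = 2, half of the expert computation is skipped for each input, which is where the lower activated-parameter count comes from.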
  • 24
    LOTUS

    AI-Powered Data Processing: Use LOTUS to process all of your datasets

    LOTUS is an open-source framework and query engine designed to enable efficient processing of structured and unstructured datasets using large language models. The system provides a declarative programming model that allows developers to express complex AI data operations using high-level commands rather than manually orchestrating model calls. It offers a Python interface with a Pandas-like API, making it familiar for data scientists and engineers already working with data analysis libraries. The core concept of the framework is the use of semantic operators, which extend traditional relational database operations to support reasoning over text and other unstructured data. These operators allow tasks such as semantic filtering, ranking, clustering, and summarization to be expressed directly within data processing pipelines. The LOTUS engine automatically optimizes how language models are used during execution, which can significantly improve performance and reduce computational cost.
    Downloads: 5 This Week
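    A semantic operator can be thought of as a relational operator whose predicate is a natural-language claim judged by a model. The sketch below shows a semantic filter over plain dictionaries; `semantic_filter` and `keyword_judge` are hypothetical names, not the LOTUS API, and the keyword check stands in for a language model call.

```python
from typing import Callable

Record = dict[str, str]
Judge = Callable[[str], bool]


def semantic_filter(records: list[Record], template: str, judge: Judge) -> list[Record]:
    """Keep records for which the judge accepts the filled-in claim."""
    return [r for r in records if judge(template.format(**r))]


def keyword_judge(claim: str) -> bool:
    # Stand-in for a language model deciding whether the claim holds.
    text = claim.lower()
    return any(word in text for word in ("refund", "broken", "late"))


records = [{"review": "Arrived late and broken"}, {"review": "Works great"}]
kept = semantic_filter(records, "The review '{review}' is a complaint.", keyword_judge)
```

    Because the operator is declarative, an engine is free to batch, cache, or cheapen the underlying model calls without changing the calling code.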
  • 25
    LangCheck

    Simple, Pythonic building blocks to evaluate LLM applications
    Downloads: 5 This Week