Showing 13 open source projects for "learning vector quantization"

View related business solutions
  • Rezku Point of Sale Icon
    Rezku Point of Sale

    Designed for Real-World Restaurant Operations

    Rezku is an all-inclusive ordering platform and management solution for all types of restaurant and bar concepts. You can now get a fully custom branded downloadable smartphone ordering app for your restaurant exclusively from Rezku.
    Learn More
  • Inventory and Order Management Software for Multichannel Sellers Icon
    Inventory and Order Management Software for Multichannel Sellers

    Avoid stockouts, overselling, and losing control as your business grows.

    We are the most powerful inventory and order management platform for Amazon, Walmart, and multichannel product sellers. Centralize orders, product information, and fulfillment operations to run more efficiently, sell more products, and stay compliant with marketplace requirements so you can grow profitably.
    Learn More
  • 1
    bitsandbytes

    bitsandbytes

    Accessible large language models via k-bit quantization for PyTorch

    bitsandbytes is an open-source library designed to make training and inference of large neural networks more efficient by dramatically reducing memory usage. Built primarily for the PyTorch ecosystem, the library introduces advanced quantization techniques that allow models to operate using reduced numerical precision while maintaining high accuracy. These optimizations enable large language models and other deep learning architectures to run on hardware with limited memory resources, including consumer-grade GPUs. The project includes specialized optimizers and quantized matrix operations that significantly reduce the memory footprint of training and inference workloads. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Tencent-Hunyuan-Large

    Tencent-Hunyuan-Large

    Open-source large language model family from Tencent Hunyuan

    Tencent-Hunyuan-Large is the flagship open-source large language model family from Tencent Hunyuan, offering both pre-trained and instruct (fine-tuned) variants. It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support to reduce memory usage...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    SWIFT LLM

    SWIFT LLM

    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

    SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    Ludwig AI

    Ludwig AI

    Low-code framework for building custom LLMs, neural networks

    ...Think building blocks for deep learning.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Skillfully - The future of skills based hiring Icon
    Skillfully - The future of skills based hiring

    Realistic Workplace Simulations that Show Applicant Skills in Action

    Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.
    Learn More
  • 5
    MatMul-Free LM

    MatMul-Free LM

    Implementation for MatMul-free LM

    ...Since matrix multiplication is one of the most computationally expensive components of modern language models, the project explores alternative computational strategies that reduce hardware requirements while maintaining comparable performance. The architecture relies on quantization-aware training and lightweight operations to replace conventional dense matrix multiplications with more efficient alternatives. These optimizations can significantly reduce memory consumption and potentially improve computational efficiency during both training and inference. The repository provides implementations of models at several parameter scales and includes tools for experimenting with the architecture using modern machine learning frameworks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    All-in-RAG

    All-in-RAG

    Big Model Application Development Practice 1

    All-in-RAG is an open-source educational project designed to teach developers how to build applications using retrieval-augmented generation techniques. The repository provides a structured learning path that covers both theoretical foundations and practical implementation steps for RAG systems. It explains the full development pipeline required to create knowledge-aware AI assistants, including data preparation, document indexing, vector embedding generation, and retrieval strategies. The project also explores advanced topics such as hybrid retrieval methods, query optimization, and evaluation techniques for improving system accuracy. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    nano-graphrag

    nano-graphrag

    A simple, easy-to-hack GraphRAG implementation

    nano-graphrag is a lightweight implementation of the GraphRAG approach designed to simplify experimentation with graph-based retrieval-augmented generation systems. GraphRAG expands traditional RAG pipelines by constructing knowledge graphs from documents and using relationships between entities to improve the quality and reasoning of AI responses. The nano-GraphRAG project focuses on reducing complexity by providing a compact and readable codebase that preserves the core functionality of...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    FastDeploy

    FastDeploy

    High-performance Inference and Deployment Toolkit for LLMs and VLMs

    FastDeploy is an open-source inference and deployment toolkit designed to simplify the process of running and serving deep learning models across a wide range of hardware platforms. Developed within the PaddlePaddle ecosystem, the toolkit focuses on providing high-performance deployment capabilities for modern AI models including large language models and vision-language systems. The platform enables developers to deploy trained models quickly using optimized inference pipelines that support...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    marqo

    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • SoftCo: Enterprise Invoice and P2P Automation Software Icon
    SoftCo: Enterprise Invoice and P2P Automation Software

    For companies that process over 20,000 invoices per year

    SoftCo Accounts Payable Automation processes all PO and non-PO supplier invoices electronically from capture and matching through to invoice approval and query management. SoftCoAP delivers unparalleled touchless automation by embedding AI across matching, coding, routing, and exception handling to minimize the number of supplier invoices requiring manual intervention. The result is 89% processing savings, supported by a context-aware AI Assistant that helps users understand exceptions, answer questions, and take the right action faster.
    Learn More
  • 10
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 11
    DB-GPT

    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    Pixeltable

    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    Pixeltable is an open-source Python data infrastructure framework designed to support the development of multimodal AI applications. The system provides a declarative interface for managing the entire lifecycle of AI data pipelines, including storage, transformation, indexing, retrieval, and orchestration of datasets. Unlike traditional architectures that require multiple tools such as databases, vector stores, and workflow orchestrators, Pixeltable unifies these functions within a...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    Qwen3 Embedding

    Qwen3 Embedding

    Designed for text embedding and ranking tasks

    Qwen3-Embedding is a model series from the Qwen family designed specifically for text embedding and ranking tasks. It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long‐context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB