inference free download

Showing 24 open source projects for "inference"

1

Text Embeddings Inference

High-performance inference server for text embedding models

Text Embeddings Inference is a high-performance server designed to serve text embedding models efficiently in production environments. It focuses on delivering fast and scalable embedding generation by leveraging optimized inference techniques and modern hardware acceleration. It is built to support transformer-based embedding models, making it suitable for tasks such as semantic search, clustering, and retrieval-augmented systems.

Downloads: 7 This Week

Last Update: 2026-03-23
See Project
2

Open WebUI

User-friendly AI Interface

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for Retrieval Augmented Generation (RAG), making it a powerful AI deployment solution. Key features include effortless setup via Docker or Kubernetes, seamless integration with OpenAI-compatible APIs, granular permissions and user groups for enhanced security, responsive design across devices, and full Markdown and LaTeX support for enriched interactions. ...

Downloads: 158 This Week

Last Update: 2026-03-27
See Project
3

DeepCamera

Open-Source AI Camera. Empower any camera/CCTV

DeepCamera empowers your traditional surveillance cameras and CCTV/NVR with machine learning technologies. It provides open-source facial recognition-based intrusion detection, fall detection, and parking lot monitoring with the inference engine on your local device. SharpAI-hub is the cloud hosting for AI applications that helps you deploy AI applications with your CCTV camera on your edge device in minutes. SharpAI yolov7_reid is an open-source Python application that leverages AI technologies to detect intruders with traditional surveillance cameras. It uses Yolov7 as a person detector, FastReID for person feature extraction, Milvus, a local vector database, for self-supervised learning to identify unseen persons, and Labelstudio to host images locally and for further uses such as labeling data and training your own classifier. ...

Downloads: 12 This Week

Last Update: 2026-03-20
See Project
4

NemoClaw

NVIDIA plugin for secure installation of OpenClaw

...It installs and configures the NVIDIA OpenShell runtime, which provides a secure environment for running autonomous AI agents. NemoClaw enables users to launch sandboxed agent environments that control network access, file permissions, and inference requests through policy-based security. The platform integrates with AI models such as NVIDIA Nemotron and supports multiple inference backends including cloud APIs, local NIM deployments, and vLLM. Through its command-line interface, developers can deploy, monitor, and manage AI assistants running inside isolated sandboxes. By combining sandbox orchestration, agent management, and AI model integration, NemoClaw provides a secure foundation for building and operating autonomous AI assistants.

Downloads: 5 This Week

Last Update: 11 hours ago
See Project
5

Harbor LLM

Run a full local LLM stack with one command using Docker

...With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.

Downloads: 16 This Week

Last Update: 2 days ago
See Project
6

Text-to-image Playground

A playground to generate images from any text prompt using SD

...The platform demonstrates how large generative models can be integrated into user-friendly tools for creative exploration and rapid prototyping. It also serves as a reference architecture for building full-stack generative AI applications that connect model inference pipelines with web interfaces.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
7

Model Explorer

A modern model graph visualizer and debugger

Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.

Downloads: 3 This Week

Last Update: 2026-02-09
See Project
8

LLM Course

Course to get into Large Language Models (LLMs)

...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.

Downloads: 0 This Week

Last Update: 2026-02-05
See Project
9

Jaaz

Open source multimodal creative AI assistant with infinite canvas tool

...It combines AI agents with visual editing tools, allowing users to generate media through prompts, sketches, or simple instructions. Jaaz supports multiple AI models and can integrate both local and cloud-based inference systems, enabling flexible creative workflows. Jaaz emphasizes privacy and local-first operation, allowing creators to run AI models locally so that their data does not leave their device. It also includes collaborative planning tools such as visual layouts and storyboard organization to support complex creative projects. By combining generative AI with a canvas-based interface, the project aims to provide a creative platform.

Downloads: 14 This Week

Last Update: 2026-03-17
See Project
10

Operit AI

Powerful Android AI agent with tools, automation, and Linux shell

Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...

Downloads: 15 This Week

Last Update: 2026-03-18
See Project
11

Groq Desktop

Local Groq Desktop chat app with MCP support

...Developers can also use groq-desktop-beta as a lightweight interface to test prompts, media inputs, or function-calling capabilities before embedding them into larger projects. The project offers installable builds (including via Homebrew on macOS) and supports easy setup, giving quick access to Groq’s inference services without needing to spin up a full backend.

Downloads: 10 This Week

Last Update: 2025-12-12
See Project
12

Cognita

Open source RAG framework for building scalable modular AI apps

Cognita is an open source framework designed to help developers build, organize, and deploy Retrieval-Augmented Generation (RAG) applications in a structured and production-ready way. It addresses the gap between quick experimentation in notebooks and the complexity of deploying scalable AI systems by introducing a modular and API-driven architecture. Cognita provides reusable components such as parsers, data loaders, embedders, retrievers, and query controllers, allowing teams to customize...

Downloads: 3 This Week

Last Update: 5 days ago
See Project
13

Supertonic

Lightning-fast, on-device TTS, running natively via ONNX

Supertonic is a lightning-fast, on-device text-to-speech system built around ONNX Runtime for maximum speed and portability. It focuses on running entirely locally, eliminating the need for cloud APIs and providing low latency and strong privacy guarantees, even on constrained devices like Raspberry Pi boards and e-readers. The core model is highly compact at around 66 million parameters, yet benchmarks show it can generate speech up to 167× faster than real time on modern consumer hardware...

Downloads: 5 This Week

Last Update: 2026-01-06
See Project
14

ChatAnyLLM

Private AI chat for local models, OpenClaw, and custom endpoints.

ChatAnyLLM is a desktop application providing a unified interface for local inference engines (OpenClaw, Ollama, LM Studio) and cloud providers like OpenRouter. It features an extensible architecture allowing users to manually configure any OpenAI-compatible API endpoint, enabling support for third-party providers such as OpenClaw, Groq or Cerebras. Designed for data sovereignty, the application persists conversation history locally and secures credentials through system-level encryption. ...

1 Review

Downloads: 11 This Week

Last Update: 3 days ago
See Project
15

pipeless

A computer vision framework to create and deploy apps in minutes

...You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.

Downloads: 23 This Week

Last Update: 2024-02-23
See Project
16

gpu_poor

Calculate token/s & GPU memory requirement for any LLM

gpu_poor is an open-source tool designed to help developers determine whether their hardware is capable of running a specific large language model and to estimate the performance they can expect from it. The project focuses on calculating GPU memory requirements and predicted inference speed for different models, hardware configurations, and quantization strategies. By analyzing factors such as model size, context length, batch size, and GPU specifications, the system estimates how much VRAM will be required and how fast tokens can be generated during inference. The tool also provides a detailed breakdown of where GPU memory is allocated, including model weights, KV cache, activations, and other runtime overhead. ...
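As a rough illustration of the kind of estimate described above, the following is a minimal sketch, not gpu_poor's actual code or formula. The function name `estimate_vram_gib` and all of its parameters are hypothetical. It assumes a standard transformer with full multi-head attention (no grouped-query attention), and it ignores activations and runtime overhead, which the tool itself reports separately.

```python
def estimate_vram_gib(n_params, n_layers, hidden_size, context_len,
                      batch_size=1, weight_bits=16, kv_bytes=2):
    """Back-of-the-envelope VRAM estimate (hypothetical helper, assumptions above)."""
    gib = 2 ** 30
    # Model weights: one value per parameter at the chosen quantization width.
    weights = n_params * weight_bits / 8
    # KV cache: a key and a value vector (factor 2) of size hidden_size,
    # stored per layer, per token, per sequence in the batch.
    kv_cache = 2 * n_layers * hidden_size * context_len * batch_size * kv_bytes
    return {
        "weights_gib": weights / gib,
        "kv_cache_gib": kv_cache / gib,
        "total_gib": (weights + kv_cache) / gib,
    }


# Example: a 7B-parameter model with 32 layers, hidden size 4096,
# a 4096-token context, and 16-bit weights and cache.
estimate = estimate_vram_gib(7e9, 32, 4096, 4096)
print(estimate)
```

Lowering `weight_bits` to 4 in this sketch shrinks only the weights term; the KV cache term still grows linearly with context length and batch size.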

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
17

commit-autosuggestions

An AI tool that automatically recommends commit messages

...However, most code changes are made not only by adding code; some parts of the code are also deleted. We plan to gradually add support for languages that are not currently supported. To run this project, you need a Flask-based inference server (GPU) and a client (commit module). If you don't have a GPU, don't worry, you can use it through Google Colab.

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
18

SQLFlow

SQL compiler bridging databases and machine learning workflows

SQLFlow is an open source project designed to bridge the gap between traditional SQL-based data processing and modern machine learning workflows by extending SQL syntax with AI capabilities. It acts as a compiler that translates SQL programs into executable workflows, enabling users to train, evaluate, and deploy machine learning models directly from SQL statements. It integrates with multiple database engines such as MySQL, Hive, and MaxCompute, while also supporting machine learning...

Downloads: 10 This Week

Last Update: 5 hours ago
See Project
19

CrypTen

A framework for Privacy Preserving Machine Learning

...The framework supports both encryption and decryption of tensors and operations such as addition and multiplication over encrypted values. Although not yet production-ready, CrypTen focuses on advancing real-world secure ML applications, such as training and inference over private datasets, without exposing sensitive data.

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
20

TensorSpace.js

Neural network 3D visualization framework

...TensorSpace is a neural network 3D visualization framework designed not only to show the basic model structure but also to present the processes of internal feature abstraction, intermediate data manipulation, and final inference generation. With the TensorSpace API, it is more intuitive to visualize and understand any pre-trained models built by TensorFlow, Keras, TensorFlow.js, etc.

Downloads: 1 This Week

Last Update: 2022-02-18
See Project
21

Show Facebook Computer Vision Tags

Chrome Extension that displays automated image tags from Facebook

...Since Facebook uses a computer-vision model to analyse user-uploaded images and generate alt-text tags for accessibility (e.g., “Image may contain: golf, grass, outdoor and nature”), this extension surfaces those hidden tags directly in the UI—revealing what kind of information Facebook infers about images (objects present, activities being done, environment). The purpose is educational and somewhat cautionary: to help users understand the scope of visual inference and privacy issues. Once installed, the extension overlays those tags on images in the timeline, making visible what is typically hidden metadata. The project is relatively lightweight but has garnered attention due to its privacy transparency angle.

Downloads: 0 This Week

Last Update: 2025-11-14
See Project
22

Aliza Gaming API

An extensible development framework for roleplay games.

AlizaGameAPI is a robust, open-source Java-based framework designed to streamline and enhance the development of 2D and 3D games. It offers a comprehensive set of tools, utilities, and libraries, empowering developers to create immersive and dynamic gaming experiences with ease. Key Features: Modular Architecture, Rich Graphics and UI Components, Comprehensive Game Logic and Character Management, Environment and World-Building Tools, Statistical and Mathematical Utilities, Enhanced...

Downloads: 0 This Week

Last Update: 2024-07-31
See Project
23

ECG Fuzzy Expert System

This project develops a web-based (JSP) fuzzy rule-based expert system for analyzing ECG (electrocardiogram) signals and diagnosing tachyarrhythmias. The project's main blocks are the inference engine, knowledge base, KB editor, explanation, and feature extraction.

2 Reviews

Downloads: 2 This Week

Last Update: 2015-07-14
See Project
24

EulerMoz

EulerMoz is an inference engine supporting logic-based proofs, based on the EulerSharp project.

Downloads: 0 This Week

Last Update: 2013-04-05
See Project