Page 2 | nvidia free download

Showing 100 open source projects for "nvidia"

1

TensorRT LLM

TensorRT LLM provides users with an easy-to-use Python API

TensorRT-LLM is an open-source high-performance inference library specifically designed to optimize and accelerate large language model deployment on NVIDIA GPUs. It provides a Python-based API built on top of PyTorch that allows developers to define, customize, and deploy LLMs efficiently across a variety of hardware configurations, from single GPUs to large multi-node clusters. The library focuses on maximizing throughput and minimizing latency through advanced techniques such as quantization, custom attention kernels, and optimized memory management strategies. ...

Downloads: 8 This Week

Last Update: 2026-03-20
2

Triton

Development repository for the Triton language and compiler

...Triton enables users to write optimized kernels for machine learning workloads while maintaining readability and control over performance-critical aspects like memory access patterns and parallel execution. The project leverages LLVM and MLIR to compile code into efficient GPU instructions, supporting both NVIDIA and AMD hardware. It is widely used in research and production environments where custom tensor operations are required, offering both high performance and developer-friendly syntax.

Downloads: 4 This Week

Last Update: 2026-03-20
3

autoresearch-win-rtx

AI agents running research on single-GPU nanochat training

autoresearch-win-rtx is a Windows-based implementation of the autoresearch framework designed to run autonomous AI research loops on consumer NVIDIA RTX GPUs. It adapts the original autoresearch concept to a Windows environment, enabling users to perform iterative machine learning optimization without requiring specialized Linux or data center setups. The system revolves around a small set of core files, including a training script that is continuously modified by an AI agent, along with supporting utilities for data preparation and evaluation. ...

Downloads: 2 This Week

Last Update: 2026-03-30
4

CodeGeeX

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

...CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.

Downloads: 9 This Week

Last Update: 3 days ago
5

clone-voice

A sound cloning tool with a web interface, using your voice

...The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. It does not require an NVIDIA GPU to run basic tasks, although GPU acceleration can be used when available, making it accessible on modest machines. The tool supports around sixteen languages, including Chinese, English, Japanese, Korean, French, German, Italian, and others, and can capture reference voices directly from a microphone or from uploaded audio.

Downloads: 12 This Week

Last Update: 2025-11-28
6

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions that operate on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...

Downloads: 52 This Week

Last Update: 2026-02-20
7

JAX Toolbox

Public CI, Docker images for popular JAX libraries

...By offering curated environments and tested configurations, it reduces compatibility issues and accelerates development workflows for both research and production. The repository also includes performance-optimized examples that demonstrate best practices for leveraging NVIDIA hardware effectively. Its integration with container-based workflows makes it suitable for reproducible experiments and scalable deployments across different environments.

Downloads: 1 This Week

Last Update: 7 days ago
8

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple languages and voicepacks and allows phoneme-based generation for more accurate pronunciation and prosody. ...

Downloads: 4 This Week

Last Update: 2025-12-13
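
As a rough illustration of what "OpenAI-compatible" means in practice, the sketch below builds (but does not send) the same speech request for two base URLs. The local host and port, the model name "kokoro", and the voice "af_bella" are assumptions made for illustration, not values confirmed by this listing; the helper build_speech_request is hypothetical. A real OpenAI call would also need an Authorization header, omitted here.

```python
import json
from urllib import request


def build_speech_request(base_url, text, model, voice):
    """Build a POST request for an OpenAI-style /audio/speech endpoint.

    Hypothetical helper: it only constructs the request object and
    never opens a network connection.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same client code, different base URL.
openai_req = build_speech_request(
    "https://api.openai.com/v1", "Hello", "tts-1", "alloy"
)
# Assumed local deployment: host, port, model, and voice are illustrative.
local_req = build_speech_request(
    "http://localhost:8880/v1", "Hello", "kokoro", "af_bella"
)

print(openai_req.full_url)
print(local_req.full_url)
```

Only the base URL and the model and voice names differ between the two requests, which is the sense in which existing client code can be redirected with minimal changes.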
9

exo

Run your own AI cluster at home with everyday devices

Maintained by exo labs. Forget expensive NVIDIA GPUs: unify your existing devices (iPhone, iPad, Android, Mac, Linux, or pretty much any device) into one powerful GPU. The default models now include 8B, 70B, and 405B parameter models that run on your own devices.

Downloads: 15 This Week

Last Update: 2026-03-27
10

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to enable efficient training of humanoid robots in simulation while enabling policies to transfer effectively to real-world hardware without additional training. The framework emphasizes the concept of zero-shot sim-to-real transfer, meaning that behaviors learned in simulation can be deployed directly on physical robots with minimal adjustment. ...

Downloads: 0 This Week

Last Update: 2026-03-15
11

FlashAttention

Fast and memory-efficient exact attention

...It achieves this by using IO-aware algorithms that minimize memory reads and writes, reducing the quadratic memory overhead typically associated with attention operations. The project provides implementations of FlashAttention, FlashAttention-2, and newer iterations optimized for modern GPU architectures such as NVIDIA Hopper and AMD accelerators. By improving both forward and backward pass efficiency, it enables training and inference of large language models with longer sequence lengths and higher throughput. The library integrates with PyTorch and supports various attention configurations, including causal masking, multi-query attention, and rotary embeddings.

Downloads: 49 This Week

Last Update: 2026-03-18
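
The idea of computing exact attention without materializing the full score vector can be sketched in plain Python. This is a minimal illustration of the online-softmax rescaling trick for a single query vector, not the library's CUDA kernels or its API; the function names and the block size are hypothetical.

```python
import math


def naive_attention(q, keys, values):
    """Reference: softmax(q . K^T / sqrt(d)) . V for one query vector."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total
        for j in range(len(values[0]))
    ]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, but keys/values are visited one block at a time.

    Only a running max, a running normalizer, and a running output
    accumulator are kept, so the full score vector is never stored.
    """
    d = len(q)
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new max (0.0 on the first block).
        scale = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * scale + sum(weights)
        acc = [
            a * scale + sum(w * v[j] for w, v in zip(weights, v_blk))
            for j, a in enumerate(acc)
        ]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [1.0, 0.5]
keys = [[0.1, 0.2], [0.9, -0.3], [0.4, 0.4], [-0.7, 1.2], [2.0, 0.1]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0], [-1.0, 3.0]]

expected = naive_attention(q, keys, values)
tiled = tiled_attention(q, keys, values, block_size=2)
print(max(abs(a - b) for a, b in zip(expected, tiled)))
```

The two functions agree up to floating-point rounding, which is what "exact" means here: the blockwise version changes the memory access pattern, not the result.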
12

TAME LLM

Traditional Mandarin LLMs for Taiwan

...These models are designed to support applications such as conversational AI, knowledge retrieval, and domain-specific reasoning in fields like manufacturing, law, healthcare, and electronics. The training pipeline leverages high-performance computing infrastructure and frameworks such as NVIDIA NeMo and Megatron to enable large-scale model training. Taiwan-LLM aims to improve language understanding and generation for Traditional Mandarin users by incorporating region-specific datasets and evaluation benchmarks.

Downloads: 0 This Week

Last Update: 2026-03-09
13

WanGP

AI video generator optimized for low VRAM and older GPUs

...It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and certain AMD GPUs. Wan2GP provides a full web-based interface that simplifies interaction with complex generative pipelines, making it easier to configure prompts, models, and rendering settings. It also integrates a wide range of utilities such as prompt enhancement, mask editing, motion design, and extraction tools for pose, depth, and flow data to support advanced video workflows.

Downloads: 39 This Week

Last Update: 5 days ago
14

OpenHardwareMonitor

Free open source tool for real-time PC hardware sensor monitoring

...It provides real-time insights into key system metrics such as temperatures, fan speeds, voltages, load percentages, and clock speeds by reading directly from sensors embedded in CPUs, GPUs, motherboards, and storage devices. The tool supports a wide range of sensor hardware found on modern systems, including Intel and AMD processors, NVIDIA and AMD graphics cards, SMART temperature sensors on storage drives, and many motherboard monitoring chips. Users can view monitored data directly within the application, as customizable desktop gadgets, or through a system tray interface, making it easy to keep an eye on system health at a glance. OpenHardwareMonitor needs no traditional installation (it can be run portably) and works on systems configured with the appropriate runtime (e.g., .NET or Mono).

Downloads: 63 This Week

Last Update: 2026-03-30
15

CUDA Python

Performance meets Productivity

CUDA Python is a unified Python interface for accessing and working with the NVIDIA CUDA platform, enabling developers to build GPU-accelerated applications entirely in Python. It acts as a metapackage composed of multiple submodules that provide both high-level and low-level access to CUDA functionality, including runtime APIs, driver APIs, and JIT compilation tools. The project is designed to simplify GPU programming by offering Pythonic abstractions while still exposing the full power of CUDA for advanced users. ...

Downloads: 3 This Week

Last Update: 6 days ago
16

WhisperLive

A nearly-live implementation of OpenAI's Whisper

...It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. ...

Downloads: 14 This Week

Last Update: 2026-03-17
17

Triton Inference Server

The Triton Inference Server provides an optimized cloud

...Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia. Triton delivers optimized performance for many query types, including real-time, batched, ensembles, and audio/video streaming. It provides a Backend API that allows adding custom backends and pre/post-processing operations, supports model pipelines using Ensembling or Business Logic Scripting (BLS), and offers HTTP/REST and gRPC inference protocols based on the community-developed KServe protocol. ...

Downloads: 12 This Week

Last Update: 4 days ago
18

NeMo Retriever Library

Document content and metadata extraction microservice

...It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. Additionally, it can generate embeddings for extracted content and integrate with vector databases like Milvus, making it well-suited for retrieval-augmented generation pipelines.

Downloads: 2 This Week

Last Update: 2026-03-18
19

AWS Deep Learning Containers

A set of Docker images for training and serving models in TensorFlow

AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, NVIDIA CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). The AWS DLCs are used in Amazon SageMaker as the default vehicles for your SageMaker jobs such as training, inference, and transforms. They have also been tested for machine learning workloads on Amazon EC2, Amazon ECS, and Amazon EKS. ...

Downloads: 8 This Week

Last Update: 10 hours ago
20

Style-Bert-VITS2

Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

...It includes a full GUI editor to script dialogue, set different styles per line, edit dictionaries, and save/load projects, plus a separate web UI and Colab notebooks for training and experimentation. For those who only need synthesis, the project is published as a Python library (pip install style-bert-vits2) and can run on CPU without an NVIDIA GPU, though training still requires GPU hardware.

Downloads: 11 This Week

Last Update: 2025-11-28
21

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

...InvokeAI offers an industry-leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows, and Macintosh. Linux users can use either an NVIDIA-based card (with CUDA support) or an AMD card (using the ROCm driver). We do not recommend the GTX 1650 or 1660 series video cards. They are unable to run in half-precision mode and do not have sufficient VRAM to render 512x512 images.

1 Review

Downloads: 20 This Week

Last Update: 2026-03-22
22

SimpleLLM

950-line, minimal, extensible LLM inference engine built from scratch

...It provides the core components of an LLM runtime—such as tokenization, batching, and asynchronous execution—without the abstraction overhead of more complex engines, making it easier for developers and researchers to understand and modify. Designed to run efficiently on high-end GPUs like NVIDIA H100 with support for models such as OpenAI/gpt-oss-120b, Simple-LLM implements continuous batching and event-driven inference loops to maximize hardware utilization and throughput. Its straightforward code structure allows anyone experimenting with custom kernels, new batching strategies, or inference optimizations to trace execution from input to output with minimal cognitive overhead.

Downloads: 0 This Week

Last Update: 2026-01-28
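
To make the term "continuous batching" concrete, here is a toy scheduler simulation in plain Python. It is a sketch of the general technique under simplifying assumptions (every active request emits one token per step, and each request needs a known, positive number of tokens), not code from this project; both function names are hypothetical.

```python
from collections import deque


def continuous_batching(lengths, max_batch):
    """Simulate decode steps where freed slots are refilled immediately.

    lengths[i] is the number of tokens request i must generate.
    Returns (total_steps, {request_id: step_at_which_it_finished}).
    """
    waiting = deque(enumerate(lengths))
    active = {}  # request id -> tokens still to generate
    finished_at = {}
    step = 0
    while waiting or active:
        # Admit waiting requests into any free batch slots before the step.
        while waiting and len(active) < max_batch:
            rid, remaining = waiting.popleft()
            active[rid] = remaining
        step += 1
        for rid in list(active):
            active[rid] -= 1
            if active[rid] <= 0:
                del active[rid]
                finished_at[rid] = step
    return step, finished_at


def static_batching(lengths, max_batch):
    """Baseline: each fixed batch runs until its longest request finishes."""
    return sum(
        max(lengths[i:i + max_batch]) for i in range(0, len(lengths), max_batch)
    )


lengths = [2, 8, 3, 4]
steps, finished_at = continuous_batching(lengths, max_batch=2)
print(steps, finished_at)
print(static_batching(lengths, max_batch=2))
```

With these inputs the continuous scheduler needs 9 steps against 12 for static batching, because the slot freed by the short first request is reused while the long second request is still decoding.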
23

Simple StyleGan2 for Pytorch

Simplest working implementation of Stylegan2

Simple PyTorch implementation of StyleGAN2 that can be completely trained from the command line, no coding needed. You will need a machine with a GPU and CUDA installed. You can also specify the location where intermediate results and model checkpoints should be stored. You can increase the network capacity (which defaults to 16) to improve generation results, at the cost of more memory. By default, if the training gets cut off, it will automatically resume from the last checkpointed file. ...

Downloads: 2 This Week

Last Update: 2025-01-12
24

Megatron-LM

Ongoing research training transformer models at scale

Megatron-LM is a GPU-optimized deep learning framework from NVIDIA designed to train extremely large transformer-based language models efficiently at scale. The repository provides both a reference training implementation and Megatron Core, a composable library of high-performance building blocks for custom large-model pipelines. It supports advanced parallelism strategies including tensor, pipeline, data, expert, and context parallelism, enabling training across massive multi-GPU and multi-node clusters. ...

Downloads: 0 This Week

Last Update: 2026-03-16
25

OuteTTS

Interface for OuteTTS models

...The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, vLLM, and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. For best quality, the model is designed to work with a reference speaker clip and will inherit emotion, style, and accent from that reference.

Downloads: 1 This Week

Last Update: 2025-11-28