cpu free download - SourceForge

Showing 91 open source projects for "cpu"


Filters: Software Development, C++

1

CPU Features

A cross-platform C99 library to get CPU features at runtime

...Implemented in portable C99, it is thread-safe, has no memory allocations, and raises no exceptions, making it suitable even for use in low-level system libraries. The design emphasizes portability, extensibility, and compatibility with sandboxed or restricted environments where direct CPU access may be limited.

Downloads: 0 This Week

Last Update: 2025-10-10
2

HLSL++

Math library using HLSL syntax with multiplatform SIMD support

...It also extends beyond standard HLSL capabilities by introducing additional features such as quaternion support, advanced matrix operations, and extended vector types like float8. The library is particularly valuable for game developers who need consistency between CPU and GPU computations, reducing errors and improving maintainability.

Downloads: 4 This Week

Last Update: 2026-04-08
3

Tracy Profiler

Frame profiler

A real-time, nanosecond-resolution, remote-telemetry, hybrid frame and sampling profiler for games and other applications. Tracy supports profiling of CPU code (with direct support for C, C++, Lua, and Python integration; third-party bindings exist for many other languages, such as Rust, Zig, C#, OCaml, and Odin), GPU work (all major graphics APIs: OpenGL, Vulkan, Direct3D 11/12, OpenCL), memory allocations, locks, and context switches. It can also automatically attribute screenshots to captured frames, and much more.

Downloads: 5 This Week

Last Update: 2025-12-11
4

DALI

A GPU-accelerated library containing highly optimized building blocks

...Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentations. These data processing pipelines, which are currently executed on the CPU, have become a bottleneck, limiting the performance and scalability of training and inference. DALI addresses the problem of the CPU bottleneck by offloading data preprocessing to the GPU. Additionally, DALI relies on its own execution engine, built to maximize the throughput of the input pipeline.

Downloads: 1 This Week

Last Update: 2026-02-19
5

ChrysaLisp

Parallel OS, with GUI, Terminal, OO Assembler, Class libraries

...It has a virtual CPU instruction set and a powerful object and class system for the assembler and high-level languages. It has function-level dynamic binding and loading and a command terminal with a familiar interface for pipe-style command line applications. A Common Lisp-like interpreter is also provided.

Downloads: 0 This Week

Last Update: 2026-04-11
6

mold

A Modern Linker

...In compiled languages like C, C++, and Rust, the linking phase can become a significant bottleneck, especially in large codebases, and mold addresses this by leveraging highly optimized algorithms and extensive parallelism. It is capable of utilizing all available CPU cores efficiently, resulting in significantly faster linking compared to other popular linkers such as GNU ld, gold, and LLVM lld. The tool is designed to integrate seamlessly into existing build systems, requiring minimal configuration changes to adopt. Mold supports a wide range of architectures, including x86-64, ARM, RISC-V, and PowerPC, making it suitable for diverse development environments.

Downloads: 6 This Week

Last Update: 6 days ago
7

Google Highway

Performance-portable, length-agnostic SIMD with runtime dispatch

...This portability is achieved through dynamic or static dispatch mechanisms that select the best available instruction set at runtime or compile time. The library is designed for developers who need to maximize CPU performance in domains such as image processing, compression, cryptography, and scientific computing.

Downloads: 0 This Week

Last Update: 2026-04-07
8

Halide

A language for fast, portable data-parallel computation

...It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. This representation can then be compiled to an object file, or JIT-compiled and run in the same process. ...

Downloads: 6 This Week

Last Update: 2025-09-17
9

frugally-deep

A lightweight header-only library for using Keras (TensorFlow) models

...Avoids temporarily allocating (potentially large chunks of) additional RAM during convolutions (by not materializing the im2col input matrix). Utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. Quite fast on one CPU core, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline.

Downloads: 3 This Week

Last Update: 2025-05-16
10

MNN

MNN is a blazing fast, lightweight deep learning framework

...On the Android platform, the core .so library is about 400KB, the OpenCL .so is about 400KB, and the Vulkan .so is about 400KB. Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

Downloads: 13 This Week

Last Update: 2026-04-07
11

Intel PCM

Intel® Performance Counter Monitor (Intel® PCM)

Intel® Performance Counter Monitor (Intel® PCM) is an application programming interface (API) and a set of tools based on the API to monitor performance and energy metrics of Intel® Core™, Xeon®, Atom™ and Xeon Phi™ processors. PCM works on Linux, Windows, Mac OS X, FreeBSD, DragonFlyBSD and ChromeOS operating systems.

Downloads: 17 This Week

Last Update: 2026-04-09
12

ncnn

High-performance neural network inference framework for mobile

...It puts artificial intelligence at your fingertips with no third-party dependencies, and it runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including classical CNNs (VGG, AlexNet, GoogleNet, Inception), face detection (MTCNN, RetinaFace), segmentation (FCN, PSPNet, UNet, YOLACT), and more. ncnn is currently used in a number of Tencent applications, namely QQ, Qzone, WeChat, and Pitu.

Downloads: 87 This Week

Last Update: 2026-01-13
13

DirectX-Graphics-Samples

Samples that demonstrate how to build graphics-intensive applications

This repo contains the DirectX 12 Graphics samples that demonstrate how to build graphics-intensive applications for Windows 10. In the Samples directory, you will find samples that attempt to break off specific features and specific usage scenarios into bite-sized chunks. For example, the ExecuteIndirect sample will show you just enough about execute indirect to get started with that feature without diving too deep into multiengine whereas the nBodyGravity sample will delve into multiengine...

Downloads: 35 This Week

Last Update: 2026-01-22
14

MegEngine

Easy-to-use deep learning framework with 3 key features

...Gain the lowest memory usage when running inference on a model by leveraging our unique pushdown memory planner. NOTE: MegEngine now supports Python installation on Linux 64-bit, Windows 64-bit, macOS 10.14+ (CPU only), and Android 7+ (CPU only) platforms with Python 3.5 to 3.8. On Windows 10 you can either install the Linux distribution through Windows Subsystem for Linux (WSL) or install the Windows distribution directly. Many other platforms are supported for inference.

Downloads: 2 This Week

Last Update: 2024-04-30
15

VulkanSceneGraph

Vulkan & C++17 based Scene Graph Project

VulkanSceneGraph (VSG) is a modern, cross-platform, high-performance scene graph library built upon the Vulkan graphics/compute API. The software is written in C++17 and follows the CppCoreGuidelines and FOSS Best Practices. The source code is published under the MIT License, with the exception of vulkan.h, used for Vulkan extensions, which is under Apache License 2.0. This repository contains C++ headers and source and CMake build scripts to build the libvsg library. Additional support...

Downloads: 6 This Week

Last Update: 2025-12-30
16

Shumai

Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

...The library supports matrix operations, gradient computation, and tensor conversions with intuitive APIs and near-native speed, thanks to Bun’s low-overhead FFI bindings. It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.

Downloads: 0 This Week

Last Update: 2 days ago
17

CEmu emulator

Third-party TI-84 Plus CE / TI-83 Premium CE emulator

Developer-oriented emulator of the eZ80-based TI-84 Plus CE / TI-83 Premium CE calculators. CEmu is a third-party TI-84 Plus CE / TI-83 Premium CE calculator emulator, focused on developer features. The core is programmed in C and the GUI in C++ with Qt, for performance and portability reasons. CEmu works natively on Windows, macOS, and Linux! Easy setup - get running by doing a one-time-only connection of your calculator! Accurate and fast emulation. Customizable speed/throttling. Resizable...

Downloads: 15 This Week

Last Update: 2026-01-18
18

Pixie

Instant Kubernetes-Native Application Observability

...Pixie uses eBPF to automatically collect telemetry data such as full-body requests, resource and network metrics, application profiles, and more. Pixie collects, stores and queries all telemetry data locally in the cluster. Pixie uses less than 5% of cluster CPU and in most cases less than 2%. PxL, Pixie’s flexible Pythonic query language, can be used across Pixie’s UI, CLI, and client APIs.

Downloads: 4 This Week

Last Update: 2025-01-24
19

TensorRT

C++ library for high performance inference on NVIDIA GPUs

...It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. ...

Downloads: 19 This Week

Last Update: 2026-03-25
20

SIMD

C++ wrappers for SIMD intrinsics

...SIMD instructions allow a single operation to be applied to multiple data elements simultaneously, significantly accelerating numerical and data-parallel computations. However, differences across CPU architectures and compilers make direct usage complex, which xsimd addresses by offering a unified API that maps efficiently to underlying hardware capabilities. The library supports a wide range of instruction sets, including SSE, AVX, NEON, and WebAssembly SIMD, ensuring portability across platforms. It provides vectorized implementations of common mathematical operations, allowing developers to operate on batches of values using familiar syntax. xsimd is widely adopted in performance-critical applications.

Downloads: 5 This Week

Last Update: 2026-04-12
21

Perfetto

Production-grade client-side tracing, profiling, and analysis

...It’s designed around a low-overhead producer/consumer model: instrumented components (“producers”) write binary events into shared memory buffers and a collector (“service”) reliably streams them to storage. The data model spans kernel and userspace, so you can stitch together CPU scheduling, app lifecycles, binder/IPC hops, GPU work, power and thermal signals, file I/O, heap samples, and more into a single coherent timeline. Perfetto’s ecosystem includes a web-based UI that can load multi-GB traces directly in the browser and an offline “trace processor” that exposes the trace as a queryable SQL-like table schema for deep analysis and automation. ...

Downloads: 9 This Week

Last Update: 2026-03-02
22

bitnet.cpp

Official inference framework for 1-bit LLMs

bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...

Downloads: 5 This Week

Last Update: 2026-03-10
23

UIforETW

User interface for recording and managing ETW traces

UIforETW is a Windows performance tracing companion that wraps the Event Tracing for Windows (ETW) toolchain in an approachable GUI. It standardizes trace collection profiles, launches WPR/xperf with the right providers, and organizes the resulting .etl files for repeatable investigations. The tool streamlines the entire loop—record, annotate, open in WPA/XperfView—so engineers can focus on finding scheduling stalls, I/O bottlenecks, GC pauses, or GPU hitches instead of memorizing...

Downloads: 3 This Week

Last Update: 2025-10-10
24

ArrayFire

ArrayFire, a general purpose GPU library

...Together we can fulfill The ArrayFire Mission under an excellent Code of Conduct that promotes a respectful and friendly building experience. Rigorous benchmarks and tests ensuring top performance and numerical accuracy. Cross-platform compatibility with support for CUDA, OpenCL, and native CPU on Windows, Mac, and Linux. Built-in visualization functions through Forge.

Downloads: 1 This Week

Last Update: 2025-09-05
25

Faiss

Library for efficient similarity search and clustering of dense vectors

Faiss is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning. Faiss is written in C++ with complete wrappers for Python/numpy. Some of the most useful algorithms are implemented on the GPU. It is developed by Facebook AI Research. Faiss contains several methods for similarity search. It...

Downloads: 3 This Week

Last Update: 2026-03-06