Open Source Artificial Intelligence Software - Page 4

Artificial Intelligence Software

  • 1
    LabelImg

    Graphical image annotation tool for labeling object bounding boxes

    LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in PASCAL VOC format, the format used by ImageNet. It also supports the YOLO and CreateML formats. On Linux, Ubuntu, and Mac it requires at least Python 2.6 and has been tested with PyQt 4.8; however, Python 3 or above and PyQt5 are strongly recommended. Using virtualenv can avoid many of the Qt / Python version issues. To annotate, build and launch the tool using the project's instructions, then: click 'Change default saved annotation folder' in Menu/File; click 'Open Dir'; click 'Create RectBox'; click and release the left mouse button to select a region to annotate. You can use the right mouse button to drag the rect box to copy or move it. The annotation will be saved to the folder you specify. You can refer to the hotkeys to speed up your workflow.
    Downloads: 93 This Week
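The LabelImg entry above mentions both PASCAL VOC and YOLO annotation formats. As a rough illustration of how the two relate, the sketch below converts one VOC-style pixel box into a YOLO-style text line. It assumes the common conventions (VOC boxes as `xmin, ymin, xmax, ymax` in pixels; YOLO lines as `class x_center y_center width height` normalized to the image size). The function name `voc_to_yolo` is hypothetical and is not part of LabelImg.

```python
def voc_to_yolo(box, img_w, img_h, class_id):
    """Convert a VOC-style pixel box to a YOLO-style annotation line.

    box: (xmin, ymin, xmax, ymax) in pixels (assumed VOC convention).
    Returns "class x_center y_center width height", normalized to [0, 1].
    """
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / img_w   # box center, as a fraction of image width
    y_center = (ymin + ymax) / 2 / img_h   # box center, as a fraction of image height
    width = (xmax - xmin) / img_w          # box width, normalized
    height = (ymax - ymin) / img_h         # box height, normalized
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 pixel box inside a 400x800 image, labeled as class 0.
    print(voc_to_yolo((100, 200, 300, 400), img_w=400, img_h=800, class_id=0))
```

Normalizing by the image size is what makes the YOLO line independent of resolution, which is why the image dimensions have to be supplied alongside the box.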
  • 2
    CLISP - an ANSI Common Lisp
    CLISP is a portable ANSI Common Lisp implementation and development environment by Bruno Haible. Interpreter, compiler, debugger, CLOS, MOP, FFI, Unicode, sockets, CLX. UI in English, German, French, Spanish, Dutch, Russian, and Danish.
    Downloads: 478 This Week
  • 3
    Armadillo

    Fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing
    * Easy to use functions and syntax, deliberately similar to Matlab / Octave
    * Uses template meta-programming techniques to increase efficiency
    * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
    * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
    * Downloads: http://arma.sourceforge.net/download.html
    * Documentation: http://arma.sourceforge.net/docs.html
    * Bug reports: http://arma.sourceforge.net/faq.html
    * Git repo: https://gitlab.com/conradsnicta/armadillo-code
    Downloads: 2,334 This Week
  • 4
    Vosk Speech Recognition Toolkit

    Offline speech recognition API for Android, iOS, Raspberry Pi

    Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, and Polish, with more to come. Vosk models are small (50 MB) but provide continuous large vocabulary transcription, zero-latency response with a streaming API, reconfigurable vocabulary, and speaker identification. Speech recognition bindings are implemented for various programming languages, including Python, Java, Node.js, C#, C++, Rust, and Go. Vosk supplies speech recognition for chatbots, smart home appliances, and virtual assistants. It can also create subtitles for movies and transcriptions of lectures and interviews. Vosk scales from small devices like a Raspberry Pi or Android smartphones to big clusters.
    Downloads: 87 This Week
  • 5
    pgvector

    Open-source vector similarity search for Postgres

    pgvector is an open-source PostgreSQL extension that equips PostgreSQL databases with vector data storage, indexing, and similarity search capabilities—ideal for embeddings-based applications like semantic search and recommendations. You can add an index to use approximate nearest neighbor search, which trades some recall for speed. Unlike typical indexes, you will see different results for queries after adding an approximate index. An HNSW index creates a multilayer graph. It has better query performance than IVFFlat (in terms of speed-recall tradeoff), but has slower build times and uses more memory. Also, an index can be created without any data in the table since there isn’t a training step like IVFFlat.
    Downloads: 87 This Week
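The pgvector entry above says an approximate index trades some recall for speed and that IVFFlat needs a training step. The toy sketch below shows that trade-off in plain Python. It is not pgvector's implementation: the centroids are hard-coded rather than trained, and the names `build_lists`, `approx_nn`, and `probes` are assumptions made for this example only.

```python
import math


def build_lists(points, centroids):
    """Assign every point to its nearest centroid (an IVF-style inverted list)."""
    lists = {i: [] for i in range(len(centroids))}
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        lists[nearest].append(p)
    return lists


def exact_nn(points, query):
    """Exact nearest neighbor: scan every point."""
    return min(points, key=lambda p: math.dist(p, query))


def approx_nn(lists, centroids, query, probes=1):
    """Approximate nearest neighbor: scan only the `probes` closest lists."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [p for i in order[:probes] for p in lists[i]]
    return min(candidates, key=lambda p: math.dist(p, query))


# Hard-coded centroids stand in for the training step (assumption).
CENTROIDS = [(0.0, 0.0), (10.0, 0.0)]
POINTS = [(1.0, 0.0), (4.9, 0.0), (6.0, 0.0), (9.0, 0.0)]
LISTS = build_lists(POINTS, CENTROIDS)
QUERY = (5.1, 0.0)

if __name__ == "__main__":
    print("exact:         ", exact_nn(POINTS, QUERY))
    print("approx probes=1:", approx_nn(LISTS, CENTROIDS, QUERY, probes=1))
    print("approx probes=2:", approx_nn(LISTS, CENTROIDS, QUERY, probes=2))
```

With one probe the query only looks inside the list of its closest centroid, so it misses the true nearest neighbor that sits just across the cell boundary. Probing both lists recovers it at the cost of scanning more points.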
  • 6
    ClawX

    Desktop app that provides a graphical interface for OpenClaw AI

    ClawX is a cross-platform desktop application that provides a graphical user interface for OpenClaw AI agents, transforming complex command-line orchestration into an accessible visual experience. Built with Electron, React, and TypeScript, the software embeds the OpenClaw runtime directly into the application to deliver a batteries-included setup without requiring separate installations. The platform focuses on usability by offering a guided setup wizard, visual configuration panels, and real-time validation, enabling users to deploy AI agents without terminal expertise. ClawX includes a modern chat interface that supports multiple conversation contexts, Markdown rendering, and persistent message history. It also supports automation through cron-based scheduling and allows users to manage multiple AI channels simultaneously for different workflows.
    Downloads: 86 This Week
  • 7
    DreamTime

    Use artificial intelligence to create images

    The easiest-to-use application to create fake images from photos and videos. Available for Windows, Linux, and Mac. By being open-source you can build a version for your operating system. Source code is available and totally free, without premium versions or cracks. More stable than DeepNude and easy to use thanks to its modern design. Create the body of your dreams, increase or decrease the size of the body parts or leave everything to random. Don't just stay with static photos, you can also create GIFs, MP4 and WEBM videos! Open files or folders from your computer, you can also open files from Instagram and the web. Vitamined with editing tools for any case, you can also make the process fully automatic. Powerful working method that allows you to edit the algorithm step by step and obtain results that only a human could achieve.
    Downloads: 86 This Week
  • 8
    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe coding,” producing cleaner, more modern UIs, better-structured webpages, and visually improved slide layouts. Its tool-use capabilities are substantially enhanced, with notable improvements in browsing, search, and tool-integrated reasoning tasks. Overall, GLM-4.7 shows broad performance upgrades across coding, reasoning, chat, creative writing, and role-play scenarios.
    Downloads: 85 This Week
  • 9
    DeepMosaics

    Automatically remove the mosaics in images and videos, or add mosaics

    Automatically remove the mosaics in images and videos, or add mosaics to them. This project is based on semantic segmentation and image-to-image translation. You can run DeepMosaics either from a pre-built binary package or from source. Run time depends on the computer's performance (the GPU version performs better but requires CUDA to be installed). Different pre-trained models are suitable for different effects.
    Downloads: 84 This Week
  • 10
    Hands-On Large Language Models

    Official code repo for the O'Reilly Book

    Hands-On-Large-Language-Models is the official GitHub code repository accompanying the practical technical book Hands-On Large Language Models by Jay Alammar and Maarten Grootendorst. It provides a comprehensive collection of example notebooks, code labs, and supporting materials that illustrate the core concepts and real-world applications of large language models. The repository is structured into chapters that align with the educational progression of the book, covering everything from foundational topics like tokens, embeddings, and transformer architecture to advanced techniques such as prompt engineering, semantic search, retrieval-augmented generation (RAG), multimodal LLMs, and fine-tuning. Each chapter contains executable Jupyter notebooks designed to be run in environments like Google Colab, making it easy for learners to experiment interactively with models, visualize attention patterns, and implement classification and generation tasks.
    Downloads: 84 This Week
  • 11
    Claude Skills

    Public repository for Agent Skills

    Claude Skills is a public repository that showcases and serves as a collection of skills — modular, reusable packages of instructions, scripts, and resources that Claude and other compatible agents can dynamically discover and load to extend their capabilities on specialized tasks. Rather than relying on handcrafted prompts every time, Skills teach an AI agent procedural knowledge and task-specific workflows so it can apply that expertise reliably, whether the task involves document creation, data analysis, design generation, or technical automation. Each Skill lives in its own directory with a SKILL.md file containing metadata and instructions, and can include supplemental scripts or assets that the agent uses to perform complex operations when relevant.
    Downloads: 83 This Week
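The Claude Skills entry above describes each skill as a directory holding a SKILL.md file with metadata and instructions. The sketch below shows one way a tool might discover such directories and read their metadata. It assumes the metadata is a simple `key: value` header delimited by `---` lines at the top of the file. The skill name `pdf-report` and the helper names are hypothetical, and this is not the loader any agent actually uses.

```python
import tempfile
from pathlib import Path


def read_skill_metadata(skill_md: Path) -> dict:
    """Parse `key: value` lines between leading '---' delimiters (assumed layout)."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    meta = {}
    if not lines or lines[0].strip() != "---":
        return meta  # no metadata header found
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the metadata header
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root: Path) -> dict:
    """Map each skill directory name to the metadata in its SKILL.md."""
    return {p.parent.name: read_skill_metadata(p) for p in sorted(root.glob("*/SKILL.md"))}


def demo() -> dict:
    """Build a throwaway skill directory and discover it."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "pdf-report"  # hypothetical skill
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(
            "---\nname: pdf-report\ndescription: Build a PDF report\n---\n# Instructions\n",
            encoding="utf-8",
        )
        return discover_skills(Path(tmp))


if __name__ == "__main__":
    print(demo())
```

Reading only the short metadata header up front is what lets an agent list many skills cheaply and load the full instructions for a skill only when a task calls for it.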
  • 12
    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. It supports multiple languages and voices, with a curated voice list and configuration via a VOICES file hosted alongside the models. The package is distributed on PyPI, meaning you can integrate it directly into applications or scripts using standard Python tooling. It also recommends pairing with an external G2P package to improve pronunciation quality, especially for more complex languages or names, and is licensed under permissive MIT and Apache-style licenses.
    Downloads: 82 This Week
  • 13
    CMU Sphinx

    Speech Recognition Toolkit

    Maintenance and improvement work has moved to https://cmusphinx.github.io/; please go there for the most recent software and documentation. CMUSphinx is a speaker-independent large vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
    Downloads: 350 This Week
  • 14
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. This capability is grounded in a new data engine that automatically annotated over four million unique concepts, producing a massive open-vocabulary segmentation dataset and enabling the model to achieve 75–80% of human performance on the SA-CO benchmark, which itself spans 270K unique concepts.
    Downloads: 80 This Week
  • 15
    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. Wan2.1’s architecture balances generation quality and inference cost, paving the way for later improvements seen in Wan2.2 such as Mixture-of-Experts and enhanced aesthetics. It was trained on large-scale video and image datasets, providing generalization across diverse scenes and motion patterns.
    Downloads: 79 This Week
  • 16
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
    Downloads: 79 This Week
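The Whisper entry above notes that tasks are expressed as a sequence of special tokens that the decoder predicts. The sketch below assembles such a prefix as plain strings, using the token spellings from the Whisper tokenizer (`<|startoftranscript|>`, a language tag, a task tag, and `<|notimestamps|>`). The function `build_decoder_prompt` is a hypothetical helper for illustration and is not part of the Whisper package.

```python
def build_decoder_prompt(language="en", task="transcribe", timestamps=False):
    """Return the special-token prefix that selects language and task.

    This only builds strings for illustration; a real tokenizer maps each
    special token to an integer id before it reaches the decoder.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Tells the decoder to emit text only, without timestamp tokens.
        tokens.append("<|notimestamps|>")
    return tokens


if __name__ == "__main__":
    print(build_decoder_prompt())                                  # English transcription
    print(build_decoder_prompt("de", "translate", timestamps=True))  # German speech to English text
```

Because the task is chosen by these leading tokens rather than by separate model heads, the same decoder can switch between transcription and translation simply by changing the prefix.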
  • 17
    YOLOv3

    Object detection architectures and models pretrained on the COCO data

    Fast, precise and easy to train, YOLOv5 has a long and successful history of real-time object detection. Treat YOLOv5 as a university where you'll feed your model information for it to learn from and grow into one integrated tool. You can get started with fewer than 6 lines of code using YOLOv5 and its PyTorch implementation. Have a go using our API by uploading your own image and watch as YOLOv5 identifies objects using our pretrained models. Start training your model without being an expert. Students love YOLOv5 for its simplicity, and there are many quickstart examples to get you started within seconds. Export and deploy your YOLOv5 model with just 1 line of code. There are also plenty of quickstart guides and tutorials available to get your model where it needs to be. Create state-of-the-art deep learning models with YOLOv5.
    Downloads: 79 This Week
  • 18
    Project AIRI

    Self-hosted, you-owned Grok Companion

    AIRI is a self-hosted AI companion platform designed to create interactive virtual characters capable of real-time conversation, gameplay interaction, and multimedia presence. The project aims to emulate advanced AI personalities similar to popular autonomous VTuber-style agents, combining voice interaction, animation, and behavioral logic into a unified system. It supports deployment across web, macOS, and Windows environments, making it accessible for hobbyists and developers building digital companions. AIRI integrates real-time voice chat capabilities and can interact with external applications such as games, enabling more immersive and dynamic experiences. The system emphasizes user ownership and local hosting so developers maintain full control over their AI companion instances. Overall, AIRI serves as an extensible framework for building lifelike AI-driven virtual characters and interactive assistants.
    Downloads: 77 This Week
  • 19
    COLMAP

    Structure-from-Motion and Multi-View Stereo

    COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.
    Downloads: 76 This Week
  • 20
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. NeuralNote supports polyphonic transcription, meaning it can detect multiple notes played simultaneously, making it useful for instruments such as piano or guitar. The system relies on neural network models to analyze audio signals and infer pitch, timing, and other musical attributes that can be represented as MIDI data. The resulting MIDI output can be edited, quantized, or exported to other instruments within a music production workflow.
    Downloads: 76 This Week
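The NeuralNote entry above mentions inferring pitch and timing and then quantizing the resulting MIDI notes. The sketch below shows the two standard calculations involved, assuming equal temperament with A4 = 440 Hz = MIDI note 69. It illustrates the arithmetic only and is not NeuralNote's code; `freq_to_midi` and `quantize_time` are hypothetical names.

```python
import math


def freq_to_midi(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz maps to 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


def quantize_time(seconds, grid):
    """Snap a note onset to the nearest multiple of `grid` seconds."""
    return round(seconds / grid) * grid


if __name__ == "__main__":
    print(freq_to_midi(440.0))        # A4
    print(freq_to_midi(261.63))       # approximately middle C
    print(quantize_time(0.26, 0.25))  # onset snapped to a 0.25 s grid
```

Each semitone is a factor of 2^(1/12) in frequency, which is why the note number grows with 12 times the base-2 logarithm of the frequency ratio.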
  • 21
    OpenClaw Installer

    ClawdBot one-click deployment tool

    OpenClaw Installer is an open-source one-click deployment and configuration tool for installing OpenClaw — a personal AI assistant — onto systems with minimal manual setup, giving users a streamlined path to get their own AI assistant running quickly. The project provides shell scripts and configuration menus that detect the host environment, install dependencies, download OpenClaw, configure core settings like AI models and identity channels, and start the server automatically. It supports multiple platforms, including macOS, Linux distributions (Ubuntu, Debian, CentOS), and Windows environments via compatible shells, and simplifies otherwise complex installation steps into a guided, terminal-based experience. The tool also includes options to test API connections, validate channel integrations like Telegram or Discord bots, and launch persistent services that keep OpenClaw running in the background.
    Downloads: 73 This Week
  • 22
    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Support for a variety of frameworks, operating systems and hardware platforms. Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training.
    Downloads: 72 This Week
  • 23
    Text Generation Web UI

    Oobabooga - The definitive Web UI for local AI, with powerful features

    A Gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS). Very efficient text streaming. Parameter presets, 8-bit mode. Layer splitting across GPU(s), CPU, and disk. CPU mode, FlexGen, DeepSpeed ZeRO-3, API with and without streaming. LLaMA model, including 4-bit GPTQ. RWKV model, LoRA (loading and training), soft prompts, and extensions.
    Downloads: 72 This Week
  • 24
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. A standout capability is its multi-track timeline editor and supporting audio tools (like trimming and conversation mixing), which let creators compose multi-voice scenes instead of generating single clips in isolation. It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.
    Downloads: 72 This Week
  • 25
    Qdrant

    Vector Database for the next generation of AI applications

    Qdrant is a vector similarity engine and vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more. It provides an OpenAPI v3 specification for generating a client library in almost any programming language; alternatively, you can use a ready-made client for Python or other programming languages with additional functionality. It implements a custom modification of the HNSW algorithm for approximate nearest neighbor search, so you can search at state-of-the-art speed and apply search filters without compromising on results. It supports an additional payload associated with vectors: it not only stores the payload but also allows filtering results based on payload values. Unlike Elasticsearch post-filtering, Qdrant guarantees all relevant vectors are retrieved.
    Downloads: 71 This Week
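The Qdrant entry above contrasts payload filtering with post-filtering. The toy sketch below shows why filtering after the top-k cut can drop relevant vectors, while applying the filter as part of candidate selection does not. It uses brute-force search over three hand-made points and is not how Qdrant is implemented; the data and function names are assumptions for illustration.

```python
import math

# Hypothetical points: a vector plus a payload, mirroring the description above.
POINTS = [
    {"id": 1, "vector": (0.0, 0.1), "payload": {"city": "Berlin"}},
    {"id": 2, "vector": (0.1, 0.0), "payload": {"city": "Berlin"}},
    {"id": 3, "vector": (0.9, 0.9), "payload": {"city": "London"}},
]


def top_k(candidates, query, k):
    """Brute-force k nearest candidates by Euclidean distance."""
    return sorted(candidates, key=lambda p: math.dist(p["vector"], query))[:k]


def post_filter_search(points, query, k, city):
    """Take the top k first, then drop points whose payload does not match."""
    return [p["id"] for p in top_k(points, query, k) if p["payload"]["city"] == city]


def filtered_search(points, query, k, city):
    """Apply the payload condition first, then take the top k of what remains."""
    matching = [p for p in points if p["payload"]["city"] == city]
    return [p["id"] for p in top_k(matching, query, k)]


if __name__ == "__main__":
    query = (0.0, 0.0)
    print(post_filter_search(POINTS, query, k=2, city="London"))  # London point lost
    print(filtered_search(POINTS, query, k=2, city="London"))     # London point found
```

In the post-filter case the two Berlin points fill the top-k slots before the filter runs, so the only London point never reaches the result even though it matches the condition.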