Open Source Mac Artificial Intelligence Software - Page 4

Artificial Intelligence Software for Mac

  • 1
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. This capability is grounded in a new data engine that automatically annotated over four million unique concepts, producing a massive open-vocabulary segmentation dataset and enabling the model to achieve 75–80% of human performance on the SA-CO benchmark, which itself spans 270K unique concepts.
    Downloads: 80 This Week
  • 2
    Wan2.1

    Wan2.1: Open and Advanced Large-Scale Video Generative Model

    Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports text-to-video and image-to-video generation tasks with flexible resolution options suitable for various GPU hardware configurations. Wan2.1’s architecture balances generation quality and inference cost, paving the way for later improvements seen in Wan2.2 such as Mixture-of-Experts and enhanced aesthetics. It was trained on large-scale video and image datasets, providing generalization across diverse scenes and motion patterns.
    Downloads: 79 This Week
  • 3
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
    Downloads: 79 This Week
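The multitask format described above can be pictured as a short prefix of special tokens that tells the decoder which language and task to handle. The sketch below is illustrative only: the token strings follow the format documented for Whisper, but `build_task_prefix` is a hypothetical helper and not part of the `whisper` package.

```python
def build_task_prefix(language: str, task: str, timestamps: bool = False) -> list[str]:
    """Build the special-token prefix that specifies language and task.

    Hypothetical helper for illustration; the real tokenizer maps these
    strings to integer token ids before decoding.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    prefix = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Signals that the decoder should emit text without timestamp tokens.
        prefix.append("<|notimestamps|>")
    return prefix


# Example: English transcription without timestamps.
print(build_task_prefix("en", "transcribe"))
```

Swapping `"transcribe"` for `"translate"` changes only one token in the prefix, which is how a single model can be steered between tasks.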
  • 4
    Project AIRI

    Self-hosted Grok Companion that you own

    AIRI is a self-hosted AI companion platform designed to create interactive virtual characters capable of real-time conversation, gameplay interaction, and multimedia presence. The project aims to emulate advanced AI personalities similar to popular autonomous VTuber-style agents, combining voice interaction, animation, and behavioral logic into a unified system. It supports deployment across web, macOS, and Windows environments, making it accessible for hobbyists and developers building digital companions. AIRI integrates real-time voice chat capabilities and can interact with external applications such as games, enabling more immersive and dynamic experiences. The system emphasizes user ownership and local hosting so developers maintain full control over their AI companion instances. Overall, AIRI serves as an extensible framework for building lifelike AI-driven virtual characters and interactive assistants.
    Downloads: 77 This Week
  • 5
    COLMAP

    Structure-from-Motion and Multi-View Stereo

    COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.
    Downloads: 76 This Week
  • 6
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. NeuralNote supports polyphonic transcription, meaning it can detect multiple notes played simultaneously, making it useful for instruments such as piano or guitar. The system relies on neural network models to analyze audio signals and infer pitch, timing, and other musical attributes that can be represented as MIDI data. The resulting MIDI output can be edited, quantized, or exported to other instruments within a music production workflow.
    Downloads: 76 This Week
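To make the "pitch and timing represented as MIDI data" step concrete, here is a minimal sketch of the standard equal-temperament mapping from frequency to MIDI note number. It does not reproduce NeuralNote's neural network; the `(freq_hz, start_s, end_s)` detection tuples and both function names are assumptions made for illustration.

```python
import math

A4_HZ = 440.0   # reference tuning frequency
A4_MIDI = 69    # MIDI note number assigned to A4


def frequency_to_midi(freq_hz: float) -> int:
    """Round a frequency in Hz to the nearest MIDI note number."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def to_note_events(detections):
    """Turn hypothetical (freq_hz, start_s, end_s) tuples into note dicts."""
    return [
        {"note": frequency_to_midi(freq), "start": start, "duration": end - start}
        for freq, start, end in detections
    ]


# Two notes sounding at once, as in a polyphonic transcription.
print(to_note_events([(440.0, 0.0, 0.5), (261.63, 0.0, 0.5)]))
```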
  • 7
    OpenClaw Installer

    ClawdBot one-click deployment tool

    OpenClaw Installer is an open-source one-click deployment and configuration tool for installing OpenClaw — a personal AI assistant — onto systems with minimal manual setup, giving users a streamlined path to get their own AI assistant running quickly. The project provides shell scripts and configuration menus that detect the host environment, install dependencies, download OpenClaw, configure core settings like AI models and identity channels, and start the server automatically. It supports multiple platforms, including macOS, Linux distributions (Ubuntu, Debian, CentOS), and Windows environments via compatible shells, and simplifies otherwise complex installation steps into a guided, terminal-based experience. The tool also includes options to test API connections, validate channel integrations like Telegram or Discord bots, and launch persistent services that keep OpenClaw running in the background.
    Downloads: 73 This Week
  • 8
    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. ONNX Runtime training can accelerate model training on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts. It supports a variety of frameworks, operating systems, and hardware platforms, and its built-in optimizations deliver up to 17X faster inferencing and up to 1.4X faster training.
    Downloads: 72 This Week
  • 9
    Text Generation Web UI

    Oobabooga - The definitive Web UI for local AI, with powerful features

    A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS). Very efficient text streaming. Parameter presets, 8-bit mode. Layers splitting across GPU(s), CPU, and disk. CPU mode, FlexGen, DeepSpeed ZeRO-3, API with streaming and without streaming. LLaMA model, including 4-bit GPTQ. RWKV model, LoRA (loading and training), Softprompts, and extensions.
    Downloads: 72 This Week
  • 10
    Voicebox

    The open-source voice synthesis studio powered by Qwen3-TTS

    Voicebox is a local-first voice synthesis studio that aims to bring professional, DAW-like voice generation workflows to a desktop app while keeping models and voice data entirely on your machine. It positions itself as an open-source alternative to cloud voice platforms by emphasizing privacy, offline use, and freedom from subscriptions or usage caps. The tool supports downloading voice models, cloning voices from short audio samples, and generating speech locally, then organizing the results using studio-oriented editing concepts. A standout capability is its multi-track timeline editor and supporting audio tools (like trimming and conversation mixing), which let creators compose multi-voice scenes instead of generating single clips in isolation. It is API-first, meaning you can use it as an app for production work or integrate its speech generation into your own software via an API layer.
    Downloads: 72 This Week
  • 11
    Qdrant

    Vector Database for the next generation of AI applications

    Qdrant is a vector similarity engine and vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more. It provides an OpenAPI v3 specification for generating a client library in almost any programming language; alternatively, you can use a ready-made client for Python or other programming languages with additional functionality. It implements a custom modification of the HNSW algorithm for Approximate Nearest Neighbor Search, so you can search at state-of-the-art speed and apply search filters without compromising on results. It supports additional payload associated with vectors: it not only stores the payload but also allows filtering results based on payload values. Unlike Elasticsearch post-filtering, Qdrant guarantees all relevant vectors are retrieved.
    Downloads: 71 This Week
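The difference between post-filtering and applying a payload filter during the search can be shown with a small brute-force sketch. This is not Qdrant's API or its HNSW implementation; the data layout, the sample points, and the function names `search` and `search_post_filter` are assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(points, query, k, payload_filter=None):
    """Filter candidates first, then rank the survivors and keep the top k."""
    candidates = [
        p for p in points
        if payload_filter is None or payload_filter(p["payload"])
    ]
    ranked = sorted(candidates, key=lambda p: cosine(p["vector"], query), reverse=True)
    return ranked[:k]


def search_post_filter(points, query, k, payload_filter):
    """Take the unfiltered top k, then drop results that fail the filter."""
    return [p for p in search(points, query, k) if payload_filter(p["payload"])]


# Hypothetical sample data: four points with a "city" payload field.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"city": "Berlin"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"city": "Berlin"}},
    {"id": 3, "vector": [0.8, 0.2], "payload": {"city": "London"}},
    {"id": 4, "vector": [0.1, 0.9], "payload": {"city": "London"}},
]
query = [1.0, 0.0]


def is_london(payload):
    return payload["city"] == "London"


print([p["id"] for p in search(points, query, 2, is_london)])
print([p["id"] for p in search_post_filter(points, query, 2, is_london)])
```

In this toy data set the two nearest points overall are both in Berlin, so post-filtering returns nothing, while filtering during the search still returns the two London points.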
  • 12
    CV-CUDA

    CV-CUDA™ is an open-source, GPU accelerated library

    CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.
    Downloads: 69 This Week
  • 13
    Crush

    The glamorous AI CLI coding agent for your favourite terminal 💘

    Crush is a next-generation, terminal-based AI coding assistant developed by Charm, designed to seamlessly integrate with your tools, workflows, and preferred LLMs. It provides developers with an intuitive, session-based experience where multiple contexts can be managed across projects. With flexible model switching, Crush allows you to change providers mid-session while retaining conversation history. It enhances productivity by combining LSP (Language Server Protocol) support with extensible MCP (Model Context Protocol) integrations for richer coding context and external tool connectivity. Built for portability, it offers first-class support across macOS, Linux, Windows (PowerShell and WSL), and BSD systems. Backed by the Charm ecosystem, Crush is a stable, actively maintained evolution of the original OpenCode project.
    Downloads: 69 This Week
  • 14
    Netron

    Visualizer for neural network, deep learning, machine learning models

    Netron is a viewer for neural network, deep learning, and machine learning models. Netron supports ONNX, Keras, TensorFlow Lite, Caffe, Darknet, Core ML, MNN, MXNet, ncnn, PaddlePaddle, Caffe2, Barracuda, Tengine, TNN, RKNN, MindSpore Lite, and UFF. Netron has experimental support for TensorFlow, PyTorch, TorchScript, OpenVINO, Torch, Arm NN, BigDL, Chainer, CNTK, Deeplearning4j, MediaPipe, ML.NET, scikit-learn, and TensorFlow.js. An extensive variety of sample model files is available to download or open in the browser version. Netron runs on macOS, Windows, and Linux, as a Python server, and in the browser.
    Downloads: 68 This Week
  • 15
    DeerFlow

    Deep Research framework, combining language models with tools

    DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”) that collaborate in a structured workflow, allowing tasks like literature reviews, data gathering, data analysis, code execution, and final report generation to be largely automated. It supports asynchronous task coordination, modular tool integration, and orchestrates the data flow between agents — making it suitable for large-scale or multi-stage research pipelines. Users can deploy it locally or on server infrastructure, integrate custom tools, and benefit from its flexible configuration.
    Downloads: 67 This Week
  • 16
    Cherry Studio

    Cherry Studio is a desktop client that supports multiple LLMs

    Cherry Studio is a cross-platform desktop client that integrates multiple large language model providers into a unified interface for creating and using AI assistants, supporting customization and multi-model conversations. Its features include a Selection Assistant with smart content selection enhancement, Deep Research with advanced research capabilities, a Memory System with global context awareness, Document Preprocessing with improved document handling, and an MCP Marketplace for the Model Context Protocol ecosystem.
    Downloads: 66 This Week
  • 17
    Enchanted

    Enchanted is an iOS and macOS app for chatting with language models

    Enchanted is an open-source, cross-platform application built to let users chat with privately hosted large language models from Apple devices, including macOS, iOS, and visionOS. Designed to work seamlessly with servers like Ollama, it provides a privacy-focused alternative to traditional cloud AI UIs by connecting directly to your own LLM endpoints such as Llama, Mistral, Vicuna, and more. The interface resembles familiar commercial chat apps but emphasizes local control, offline capabilities, and multimodal support, making it ideal for users who want rich AI interaction without exposing sensitive prompts or conversations to third-party services. Enchanted enables features like voice prompts, image attachments, and markdown formatting within chats, giving flexibility for both casual and professional use. Built with attention to design and native Apple UI conventions, it aims to deliver consistent performance across devices while preserving powerful AI access.
    Downloads: 66 This Week
  • 18
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 66 This Week
  • 19
    Buster

    Captcha solver extension for humans

    Save time by asking Buster to solve captchas for you. Buster is a Firefox extension that helps you solve difficult captchas by completing reCAPTCHA audio challenges using speech recognition. Challenges are solved by clicking the extension button at the bottom of the reCAPTCHA widget. Challenges are not guaranteed to be solved every time; the limitations of the technology need to be considered. The continued development of Buster is made possible by the support of its backers. If you'd like to join them, please consider contributing with Patreon, PayPal, or Bitcoin. The success rate of the extension can be improved by simulating user interactions with the help of a client app. Follow the instructions in the extension's options to download and install the client app on Windows, Linux, and macOS, or get the app from this repository.
    Downloads: 65 This Week
  • 20
    Hermes Agent

    The agent that grows with you

    Hermes Agent is a fully open-source autonomous AI agent designed to run persistently on your own machine or server, becoming more capable the longer it operates by learning from experience and building reusable procedural skills. Rather than functioning as a stateless chatbot, it maintains long-term memory across sessions and can generate searchable “Skill Documents” that capture how it solved complex tasks so it doesn’t start from scratch each time. The agent interfaces with messaging platforms like Telegram, Discord, Slack, and WhatsApp through a single gateway process, and also offers an interactive terminal user interface with history, autocomplete, and streamable tool output. It supports scheduled automation in natural language, allowing users to set up recurring tasks such as daily briefings or system audits that it runs unattended.
    Downloads: 65 This Week
  • 21
    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers practical, multilingual Optical Character Recognition (OCR) tools that help users train better models and apply them in practice. Built on PaddlePaddle, PaddleOCR is an ultra-lightweight OCR system with multilingual recognition, digit recognition, vertical text recognition, and long text recognition. It features the PPOCR series of high-quality pre-trained models, which includes the ultra-lightweight ppocr_mobile series, the general ppocr_server series, and the ultra-lightweight compressed ppocr_mobile_slim series. PaddleOCR is easy to install and use on Windows, Linux, macOS, and other systems.
    Downloads: 64 This Week
  • 22
    TTS-Vue

    Microsoft speech synthesis tool, built with Electron

    TTS-Vue is a desktop text-to-speech application built with Electron, Vue, ElementPlus, and Vite, focused on using Microsoft’s official Speech API for high-quality neural synthesis. It wraps the Microsoft TTS WebSocket interface in a clean UI so users can paste or load text, choose voices, tweak parameters, and export audio without touching raw API calls. The app supports SSML (Speech Synthesis Markup Language), letting power users specify fine-grained control over pronunciation, pauses, prosody, and emphasis using XML-like markup. It includes batch conversion: users can select multiple .txt files and convert them into audio in one go, making it handy for large text collections or repetitive tasks. For long texts or big files, TTS-Vue automatically slices content into manageable segments, converts them separately, and then stitches them back into a single audio file, avoiding the usual length or timeout issues with TTS APIs.
    Downloads: 63 This Week
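The slice, convert, and stitch approach for long texts can be sketched as follows. TTS-Vue itself is an Electron app and its actual implementation is not shown here; `split_text`, `synthesize_long_text`, the `synthesize` callback, and the `max_chars` limit are all hypothetical names and values used only for illustration.

```python
import re
from typing import Callable


def split_text(text: str, max_chars: int) -> list[str]:
    """Group whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept intact in its own segment.
    """
    sentences = re.findall(r"[^.!?]+[.!?]*\s*", text)
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + len(sentence) > max_chars:
            segments.append(current)
            current = ""
        current += sentence
    if current:
        segments.append(current)
    return segments


def synthesize_long_text(
    text: str, synthesize: Callable[[str], bytes], max_chars: int = 200
) -> bytes:
    """Convert each segment separately and concatenate the resulting audio.

    Plain byte concatenation is an assumption that suits raw audio streams;
    container formats may need a proper audio library to join segments.
    """
    return b"".join(synthesize(segment) for segment in split_text(text, max_chars))


# Stand-in for a real TTS call: returns the upper-cased text as bytes.
def fake_tts(segment: str) -> bytes:
    return segment.upper().encode()


sample = "Hello world. How are you? Fine!"
print(split_text(sample, 15))
print(synthesize_long_text(sample, fake_tts, max_chars=15))
```

Splitting on sentence boundaries rather than at a fixed character offset avoids cutting a word or phrase in half, which would otherwise be audible at the joins.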
  • 23
    tesseract-ocr alternative download

    Alternative download for tesseract-ocr project
    Downloads: 1,668 This Week
  • 24
    IREE

    A retargetable MLIR-based machine learning compiler runtime toolkit

    IREE (Intermediate Representation Execution Environment, pronounced as "eerie") is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the data center and down to satisfy the constraints and special considerations of mobile and edge deployments.
    Downloads: 62 This Week
  • 25
    xiaohongshu-mcp

    MCP for xiaohongshu.com

    xiaohongshu-mcp is a Model Context Protocol (MCP) server that equips AI assistants with first-class tools for working on Xiaohongshu (Little Red Book), focusing on day-to-day creator and operator workflows rather than generic browsing. The project centers on authenticated actions and data access that matter to content operations, such as checking login state, publishing or scheduling content, fetching recommendations and search results, reading post details, and acting on comments. It’s packaged so MCP-capable clients (e.g., Claude Desktop, Cursor) can discover its tools via schemas instead of prompt guesswork, which improves reliability and reduces brittle automation. The repo highlights a growing community and provides links to a hosted landing page, signaling that the server is intended for practical use beyond a proof of concept. By exposing typed resources and procedures, it enables repeatable, auditable automation in social workflows where UI changes are frequent.
    Downloads: 61 This Week