Showing 103 open source projects for "computer vision"

  • 1
    fastai

    Deep learning library

    fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying...
    Downloads: 0 This Week
  • 2
    YOLOv5

    YOLOv5 is the world's most loved vision AI

    Introducing Ultralytics YOLOv8, the latest version of the acclaimed real-time object detection and image segmentation model. YOLOv8 is built on cutting-edge advancements in deep learning and computer vision, offering unparalleled performance in terms of speed and accuracy. Its streamlined design makes it suitable for various applications and easily adaptable to different hardware platforms, from edge devices to cloud APIs. Explore the YOLOv8 Docs, a comprehensive resource designed to help you understand and utilize its features and capabilities. ...
    Downloads: 52 This Week
  • 3
    autoMate

    AI tool for automating desktop tasks via natural language input

    autoMate is an AI-powered local automation tool designed to enable users to control and automate their computers using natural language instructions instead of traditional scripting or rule-based systems. It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans actions, and executes them through simulated input such as mouse clicks and keyboard events. ...
    Downloads: 4 This Week
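The observe-decide-act workflow described for autoMate can be illustrated with a minimal sketch. This is not autoMate's code or API: `run_agent`, `observe`, `decide`, and `act` are hypothetical names, and the "screen" here is a toy dictionary standing in for a real screenshot and a language-model planner.

```python
def run_agent(observe, decide, act, max_steps=10):
    """Repeat observe -> decide -> act until decide returns None or steps run out."""
    history = []
    for _ in range(max_steps):
        screen = observe()                # observe: capture the current state
        action = decide(screen, history)  # decide: plan the next action
        if action is None:                # None means the goal is reached
            break
        act(action)                       # act: apply the simulated input
        history.append(action)
    return history


# Toy environment (assumption): the goal is to type "hi" into a text field.
state = {"text": ""}


def observe():
    return dict(state)


def decide(screen, history):
    target = "hi"
    typed = screen["text"]
    if typed == target:
        return None
    return ("type", target[len(typed)])


def act(action):
    _kind, char = action
    state["text"] += char


history = run_agent(observe, decide, act)
```

In a real tool the `observe` step would capture and parse the screen and `act` would emit mouse or keyboard events; the loop structure is the only part this sketch is meant to show.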
  • 4
    OpenCV

    Open Source Computer Vision Library

    The Open Source Computer Vision Library has >2500 algorithms, extensive documentation, and sample code for real-time computer vision. It works on Windows, Linux, Mac OS X, Android, and iOS, and in your browser through JavaScript.
    Languages: C++, Python, Julia, JavaScript
    Homepage: https://opencv.org
    Q&A forum: https://forum.opencv.org/
    Documentation: https://docs.opencv.org
    Source code: https://github.com/opencv
    Please pay special attention to our tutorials! ...
    Downloads: 3,053 This Week
  • 5
    InternGPT

    Open source demo platform where you can easily showcase your AI models

    InternGPT is an open-source multimodal AI framework designed to extend large language models beyond text interactions into visual reasoning and image manipulation tasks. The system integrates conversational AI with computer vision models so users can interact with images, videos, and visual environments through natural language instructions. Unlike traditional chat systems that rely solely on text prompts, InternGPT allows users to interact with visual content using both language and nonverbal signals such as pointing or highlighting objects within images. ...
    Downloads: 0 This Week
  • 6
    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. ...
    Downloads: 0 This Week
  • 7
    OpenVINO Notebooks

    Jupyter notebook tutorials for OpenVINO

    openvino_notebooks is a collection of interactive Jupyter notebooks designed to demonstrate how to build, optimize, and deploy artificial intelligence applications using the OpenVINO toolkit. The repository provides practical tutorials that guide developers through various AI workflows including computer vision, natural language processing, and generative AI tasks. Each notebook demonstrates how to run pre-trained models, optimize inference performance, and deploy models across hardware such as CPUs, GPUs, and specialized accelerators. The tutorials also illustrate how OpenVINO integrates with models from frameworks like PyTorch, TensorFlow, and ONNX to accelerate inference workloads. ...
    Downloads: 3 This Week
  • 8
    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    ...Torch-Pruning physically removes parameters rather than masking them, which results in smaller and faster models during both training and inference. The toolkit supports a wide variety of architectures used in computer vision and large language models, making it a flexible solution for model compression tasks.
    Downloads: 3 This Week
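The difference between masking parameters and physically removing them can be shown with a small sketch. This is plain Python on a nested-list weight matrix, not the Torch-Pruning API; `mask_rows`, `prune_rows`, and `param_count` are hypothetical helper names chosen for illustration.

```python
def mask_rows(weights, rows):
    """Masking: zero the selected output rows; the matrix keeps its shape."""
    drop = set(rows)
    return [[0.0] * len(r) if i in drop else list(r)
            for i, r in enumerate(weights)]


def prune_rows(weights, rows):
    """Structural pruning: delete the selected rows, so the matrix shrinks."""
    drop = set(rows)
    return [list(r) for i, r in enumerate(weights) if i not in drop]


def param_count(weights):
    """Number of stored parameters, zeros included."""
    return sum(len(r) for r in weights)


weights = [[0.5, -1.0], [0.1, 0.2], [2.0, 3.0]]
masked = mask_rows(weights, [1])   # still 3 rows, one of them all zeros
pruned = prune_rows(weights, [1])  # only 2 rows remain
```

The masked matrix still stores and multiplies the zeroed row, while the pruned matrix is genuinely smaller. In a real network, removing an output row also requires removing the matching input column of the following layer, which is the kind of dependency a structural pruning tool has to track.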
  • 9
    RF-DETR

    RF-DETR is a real-time object detection and segmentation model

    RF-DETR is an open-source computer vision framework that implements a real-time object detection and instance segmentation model based on transformer architectures. Developed by Roboflow, the project builds upon modern vision transformer backbones such as DINOv2 to achieve strong accuracy while maintaining efficient inference speeds suitable for real-time applications.
    Downloads: 1 This Week
  • 10
    CogView4

    CogView4, CogView3-Plus and CogView3 (ECCV 2024)

    ...It emphasizes bilingual usability, making it well-suited for cross-lingual multimodal applications. The model also supports fine-tuning and downstream customization, extending its applicability to creative content generation, human–computer interaction, and research on vision-language alignment.
    Downloads: 2 This Week
  • 11
    AI-Tutorials/Implementations Notebooks

    Codes/Notebooks for AI Projects

    ...The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. The codebase acts as a hands-on learning resource, allowing users to experiment with new frameworks, architectures, and machine learning workflows through guided examples.
    Downloads: 0 This Week
  • 12
    qxresearch-event-1

    Python hands-on tutorial with 50+ Python applications

    ...The repository contains dozens of small programs, many implemented with minimal lines of code, covering topics such as machine learning, graphical user interfaces, computer vision, and API integration. Each example is designed to illustrate a single concept or application in a clear and concise manner so that learners can quickly understand the underlying logic. The project emphasizes practical experimentation, allowing beginners to modify and extend the example programs to explore new ideas. Many of the examples are accompanied by video explanations that guide learners through the code and demonstrate how the programs work in practice.
    Downloads: 0 This Week
  • 13
    imgclsmob Deep learning networks

    Sandbox for training deep learning networks

    imgclsmob is a deep learning research repository focused on implementing and experimenting with convolutional neural networks for computer vision tasks. The project serves as a sandbox for training and evaluating a wide variety of neural network architectures used in image analysis. It includes implementations of models used for tasks such as image classification, object detection, semantic segmentation, and pose estimation. The repository also contains scripts that help train models, evaluate performance, and convert trained networks between different frameworks. ...
    Downloads: 0 This Week
  • 14
    CUDA Containers for Edge AI & Robotics

    Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

    ...These containers simplify the deployment of complex machine learning environments by bundling libraries such as CUDA, TensorRT, and deep learning frameworks into reproducible container images. The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. By using containerized environments, developers can ensure that their applications run consistently across different Jetson platforms and JetPack versions. The repository also includes build tools and package management utilities that help automate the process of assembling machine learning environments.
    Downloads: 0 This Week
  • 15
    Kaggle Solutions

    Collection of Kaggle Solutions and Ideas

    ...The repository also highlights important machine learning concepts such as feature engineering, cross-validation strategies, ensemble modeling, and post-processing methods commonly used in winning solutions. Because the content is organized by competition categories such as computer vision, natural language processing, tabular data, and time-series forecasting, users can explore techniques relevant to specific problem types.
    Downloads: 0 This Week
  • 16
    crème de la crème of AI courses

    This repository is a curated collection of links to various courses

    ...The repository organizes courses by topic, difficulty level, format, and release year, allowing learners to quickly identify relevant material depending on their experience and interests. Topics covered include deep learning, natural language processing, computer vision, large language models, linear algebra, reinforcement learning, and machine learning engineering. Because the repository links to well-known educational content such as university lecture series and professional training materials, it functions as a structured roadmap for individuals who want to develop expertise in artificial intelligence.
    Downloads: 0 This Week
  • 17
    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose...
    Downloads: 0 This Week
  • 18
    Windows-MCP

    MCP server enabling AI agents to control and automate Windows OS

    ...Windows-MCP provides capabilities such as file navigation, application management, UI interaction, and QA testing workflows, making it suitable for building autonomous desktop agents. It focuses on native interaction with Windows UI elements rather than relying on traditional computer vision techniques, which simplifies integration and improves efficiency. It includes a set of tools that simulate user inputs like keyboard and mouse actions while also capturing the current state of windows and interfaces. It is designed to be extensible and adaptable, allowing developers to customize or expand its functionality for different automation or AI use cases.
    Downloads: 7 This Week
  • 19
    BoxMOT

    Pluggable SOTA multi-object tracking modules for segmentation

    BoxMOT is an open-source framework designed to provide modular implementations of state-of-the-art multi-object tracking algorithms for computer vision applications. The project focuses on the tracking-by-detection paradigm, where objects detected by vision models are continuously tracked across frames in a video sequence. It provides a pluggable architecture that allows developers to combine different object detectors with multiple tracking algorithms without modifying the core codebase. ...
    Downloads: 3 This Week
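The tracking-by-detection paradigm mentioned for BoxMOT can be sketched in a few lines. This is not BoxMOT's API or any of its trackers: `GreedyIoUTracker` is a hypothetical, deliberately simple tracker that matches each new detection to the previous frame's box with the highest overlap. The detector is pluggable in the sense that `update` accepts boxes from any source.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Toy tracker: greedy IoU matching against the previous frame only."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> box seen in the previous frame
        self._next_id = 0

    def update(self, detections):
        """Return one track id per detection in the current frame."""
        unmatched = dict(self.tracks)
        updated, ids = {}, []
        for det in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, box in unmatched.items():
                score = iou(det, box)
                if score >= best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:          # no overlap: start a new track
                best_id = self._next_id
                self._next_id += 1
            else:                        # a track is matched at most once
                del unmatched[best_id]
            updated[best_id] = det
            ids.append(best_id)
        self.tracks = updated            # tracks with no match are dropped
        return ids


tracker = GreedyIoUTracker()
ids_frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
ids_frame2 = tracker.update([(51, 51, 61, 61), (1, 1, 11, 11)])
ids_frame3 = tracker.update([(200, 200, 210, 210)])
```

Production trackers add motion models, appearance features, and tolerance for missed detections; the sketch only shows how identities are carried across frames from per-frame detections.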
  • 20
    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    OAGI Python SDK is a Python client library for the Lux computer-use model that turns Lux into a programmable automation layer for operating human-facing software via vision and actions. It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives.
    Downloads: 5 This Week
  • 21
    Open Model Zoo

    Pre-trained Deep Learning models and demos

    Open Model Zoo is a large repository of high-quality pre-trained deep learning models and demonstration applications designed to work with the OpenVINO™ toolkit, offering a comprehensive starting point for a wide range of AI and computer vision workloads. It includes hundreds of models covering object detection, classification, segmentation, pose estimation, speech recognition, text-to-speech, and more, many of which are already converted into formats optimized for inference on CPUs, GPUs, VPUs, and other accelerators supported by OpenVINO. In addition to model files, Open Model Zoo provides demo applications that show realistic usage patterns and help developers quickly prototype and understand inference pipelines in C++, Python, or via the OpenCV Graph API. ...
    Downloads: 1 This Week
  • 22
    StarVector

    StarVector is a foundation model for SVG generation

    StarVector is a multimodal foundation model designed for generating Scalable Vector Graphics (SVG) from images or textual descriptions. The system treats vector graphics creation as a code generation problem, producing SVG code that can render detailed vector images. Its architecture combines computer vision techniques with language modeling capabilities so it can understand visual inputs and textual prompts simultaneously. The model converts raster images or text instructions into structured vector representations, enabling high-quality vectorization and design generation. This approach allows StarVector to create scalable graphics that maintain visual quality regardless of resolution, which is especially useful for design tools and illustration workflows. ...
    Downloads: 0 This Week
  • 23
    DeepSparse

    Sparsity-aware deep learning inference runtime for CPUs

    A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).
    Downloads: 0 This Week
  • 24
    Hiera

    A fast, powerful, and simple hierarchical vision transformer

    Hiera is a hierarchical vision transformer designed to be fast, simple, and strong across image and video recognition tasks. The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation...
    Downloads: 5 This Week
  • 25
    Artificial Intelligence for Beginners

    12 Weeks, 24 Lessons, AI for All

    ...The repository provides a 12-week program composed of 24 lessons that combine theory, code examples, quizzes, and laboratory exercises. It covers a broad range of topics including neural networks, computer vision, natural language processing, and AI ethics. The curriculum is intentionally beginner-friendly while still exposing learners to widely used frameworks such as TensorFlow and PyTorch. It also supports many languages, making the material accessible to a global audience. Overall, the project functions as a complete self-paced learning pathway for students, educators, and developers who want a practical introduction to modern AI concepts.
    Downloads: 6 This Week