Open Source Computer Vision Libraries - Page 5

  • 1
    Colossal-AI

    Making large AI models cheaper, faster and more accessible

    The Transformer architecture has improved the performance of deep learning models in domains such as computer vision and natural language processing. Better performance has come with larger model sizes, which run into the memory wall of current accelerator hardware such as GPUs. Training large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine is rarely practical, so there is strong demand for distributed training. However, distributed training, especially model parallelism, often requires domain expertise in computer systems and architecture, and implementing complex distributed training solutions remains a challenge for AI researchers. Colossal-AI provides a collection of parallel components, with the aim of letting you write distributed deep learning models just as you would write a model on your laptop.
    Downloads: 0 This Week
  • 2
    Downloads: 0 This Week
  • 3
    Assignments for the Computer Vision Course.
    Downloads: 0 This Week
  • 4
    A system for playing chess with a computer player using a real chess board. An experiment in learning the techniques of Computer Vision and having fun in the process.
    Downloads: 0 This Week
  • 5
    Computer Vision Pretrained Models

    A collection of computer vision pre-trained models

    A pre-trained model is a model created by someone else to solve a similar problem. Instead of building a model from scratch, we can use a model trained on another problem as a starting point, although a pre-trained model may not be 100% accurate in your application. For example, if you want to build a self-learning car, you can spend years building a decent image recognition algorithm from scratch, or you can take the Inception model (a pre-trained model) from Google, which was built on ImageNet data, to identify objects in images. Models in the collection include one that generates bounding boxes and segmentation masks for each instance of an object in the image, based on a Feature Pyramid Network (FPN) and a ResNet101 backbone; a TensorFlow implementation of 'YOLO: Real-Time Object Detection', with training and support for real-time running on mobile devices; and MobileNets, which trade off between latency, size and accuracy while comparing favorably with popular models from the literature.
    Downloads: 0 This Week
  • 6
    Solving problems of counting the number of vehicles passing on a road during an interval time, as well as the problems of vehicles classification and estimating the speed of the observed traffic flow from traffic scenes acquired by a camera in real-time.
    Downloads: 0 This Week
  • 7
    ConvNet Burden

    Memory consumption and FLOP count estimates for convnets

    convnet-burden is a MATLAB toolbox / script collection for estimating the computational cost (FLOPs) and memory consumption of various convolutional neural network architectures. It lets users compute approximate burdens (in FLOPs and memory) for standard image classification CNN models (e.g. ResNet, VGG) based on their network definitions. The tool helps researchers compare the computational efficiency of architectures or quantify resource needs. Features include estimation of FLOPs (floating point operations) for CNN architectures, estimation of memory consumption (e.g. feature map sizes, parameter storage), and support for multiple network definitions/architectures.
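    As a rough illustration of the kind of estimate described above, the sketch below computes parameter count, FLOPs, and feature-map memory for a single convolution layer. It is written in Python rather than MATLAB and is not taken from the toolbox; the `ConvLayer` class, the `conv_burden` function, the choice of counting one multiply-accumulate as two FLOPs, and the 4-byte (float32) value size are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one 2D convolution layer (not a toolbox type)."""
    in_channels: int
    out_channels: int
    kernel: int       # square kernel size
    stride: int = 1
    padding: int = 0


def conv_burden(layer: ConvLayer, height: int, width: int,
                bytes_per_value: int = 4) -> dict:
    """Estimate cost of one conv layer on a height x width input.

    Assumptions: one multiply-accumulate counts as 2 FLOPs, every filter
    has a bias term, and values are stored as 4-byte floats by default.
    """
    # Standard output-size formula for a convolution without dilation.
    out_h = (height + 2 * layer.padding - layer.kernel) // layer.stride + 1
    out_w = (width + 2 * layer.padding - layer.kernel) // layer.stride + 1

    weights_per_filter = layer.kernel * layer.kernel * layer.in_channels
    params = weights_per_filter * layer.out_channels + layer.out_channels
    macs = out_h * out_w * layer.out_channels * weights_per_filter

    return {
        "output_shape": (layer.out_channels, out_h, out_w),
        "params": params,
        "flops": 2 * macs,
        "feature_bytes": out_h * out_w * layer.out_channels * bytes_per_value,
        "param_bytes": params * bytes_per_value,
    }


# A 3x3, 64-filter convolution on a 224x224 RGB image (VGG-style first layer).
first_conv = ConvLayer(in_channels=3, out_channels=64, kernel=3, padding=1)
burden = conv_burden(first_conv, height=224, width=224)
print(burden)
```

    Summing such per-layer figures over a network definition gives an approximate whole-model burden; the toolbox's own counting conventions may differ from the ones assumed here.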
    Downloads: 0 This Week
  • 8
    DETR

    End-to-end object detection with transformers

    PyTorch training code and pretrained models for DETR (DEtection TRansformer). We replace the full complex hand-crafted object detection pipeline with a Transformer, and match Faster R-CNN with a ResNet-50, obtaining 42 AP on COCO using half the computation power (FLOPs) and the same number of parameters. Inference takes 50 lines of PyTorch. Unlike traditional computer vision techniques, DETR approaches object detection as a direct set prediction problem. It consists of a set-based global loss, which forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture. Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. Due to this parallel nature, DETR is very fast and efficient.
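    The bipartite matching step mentioned above can be sketched in a few lines. This is a simplified illustration, not the repository's code: it finds the minimum-cost one-to-one assignment by brute force over permutations (a stand-in for the Hungarian algorithm, practical only for tiny sets), and the cost function (negative class probability plus a weighted L1 box distance), the `box_weight` value, and all names are assumptions made for the example.

```python
from itertools import permutations


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(pred, target, box_weight=5.0):
    """Assumed cost: lower when the prediction is confident in the target's
    class and its box is close to the target's box."""
    class_cost = -pred["probs"][target["label"]]
    return class_cost + box_weight * l1_distance(pred["box"], target["box"])


def bipartite_match(preds, targets):
    """Return (pred_index, target_index) pairs minimizing the total cost.

    Brute-force search; requires len(preds) >= len(targets). Each target
    gets exactly one prediction, and no prediction is used twice.
    """
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return sorted((p, t) for t, p in enumerate(best_perm))


# Three predictions (object queries) and two ground-truth objects.
preds = [
    {"probs": [0.1, 0.9], "box": (0.5, 0.5, 0.2, 0.2)},
    {"probs": [0.8, 0.2], "box": (0.2, 0.2, 0.1, 0.1)},
    {"probs": [0.5, 0.5], "box": (0.9, 0.9, 0.1, 0.1)},
]
targets = [
    {"label": 0, "box": (0.2, 0.2, 0.1, 0.1)},
    {"label": 1, "box": (0.5, 0.5, 0.2, 0.2)},
]
pairs = bipartite_match(preds, targets)
print(pairs)  # prediction 2 stays unmatched and would be trained as "no object"
```

    Because every ground-truth object is paired with exactly one prediction, duplicate detections of the same object are penalized during training, which is how a set-based loss can force unique predictions.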
    Downloads: 0 This Week
  • 9
    The Data Fusion Peer is a multitier computer vision internet application. The system provides image processing, motion tracking, and visualization information. The application converts data into 3-dimensional and other digital environments.
    Downloads: 0 This Week
  • 10
    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
  • 11
    Deep Learning with PyTorch

    Latest techniques in deep learning and representation learning

    This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition. The prerequisites include DS-GA 1001 Intro to Data Science or a graduate-level machine learning course. To follow the exercises, you need a laptop with Miniconda (a minimal version of Anaconda) and several Python packages installed. The setup instructions work as is for Mac or Ubuntu Linux users; Windows users need to install and work in the Git BASH terminal. JupyterLab has a built-in selectable dark theme, so you only need to install something if you want to use the classic notebook interface.
    Downloads: 0 This Week
  • 12
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting joint keypoints to providing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model). This enables detailed understanding of human shape, motion, and surface appearance directly from images or videos. The repository includes the DensePose network architecture, training code, pretrained models, and dataset tools for annotation and visualization. DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.
    Downloads: 0 This Week
  • 13
    Detectron

    FAIR's research platform for object detection research

    Detectron is an object detection and instance segmentation research framework that popularized many modern detection models in a single, reproducible codebase. Built on Caffe2 with custom CUDA/C++ operators, it provided reference implementations for models like Faster R-CNN, Mask R-CNN, RetinaNet, and Feature Pyramid Networks. The framework emphasized a clean configuration system, strong baselines, and a “model zoo” so researchers could compare results under consistent settings. It includes training and evaluation pipelines that handle multi-GPU setups, standard datasets, and common augmentations, which helped standardize experimental practice in detection research. Visualization utilities and diagnostic scripts make it straightforward to inspect predictions, proposals, and losses while training. Although the project has since been superseded by Detectron2, the original Detectron remains a historically important, reproducible reference that still informs many production systems.
    Downloads: 0 This Week
  • 14
    Diglo is a Music Information Retrieval System based on Computer Vision and Audio Spectrum Analysis, using algorithmic operations to find emergent patterns in musical performance. It also functions as a low-cost Motion Capture Analysis system.
    Downloads: 0 This Week
  • 15

    DressCode

    A virtual mirror using OpenGL and a SoftKinetic ToF camera.

    We propose a virtual mirror that combines computer vision, computer graphics and physically based simulation. A time-of-flight camera mounted on a vertical monitor provides RGB (red-green-blue) and depth data of the scene. The system uses this data to construct a 3D model of the user. The final rendered scene omits the constructed 3D model and overlays the rendered garment on the user's image on the monitor, giving the user the impression that (s)he is wearing the garment.
    Downloads: 0 This Week
  • 16

    ECCV style files

    Repository for style files for European Conference on Computer Vision
    Downloads: 0 This Week
  • 17
    EStereo is a computer vision C++ library for real-time disparity estimation. It computes dense stereo matching from 2 or 3 images as well as 3D scene reconstruction. The library also comes with a GUI-based application (StereoPlus).
    Downloads: 0 This Week
  • 18
    Edges

    Structured Edge Detection Toolbox

    Structured Edge Detection (Edges) is a MATLAB toolbox implementing the structured forests method for fast and accurate edge detection (up to ~60 fps in many settings). The toolbox also includes the Edge Boxes object proposal method, fast superpixel generation, and utilities for training, evaluation, and integration with vision pipelines. Features include fast edge detection using structured forests (predicting structured edge maps), high performance (frames per second depend on settings), and MATLAB integration with compatibility for external vision pipelines.
    Downloads: 0 This Week
  • 19
    An open source computer vision library for TI TMS320C64x DSP. It is compatible with OpenCV.
    Downloads: 0 This Week
  • 20
    Face Mask Detection

    Face Mask Detection system based on computer vision and deep learning

    A face mask detection system built with OpenCV and Keras/TensorFlow, using deep learning and computer vision concepts to detect face masks in static images as well as in real-time video streams. Amid the ongoing COVID-19 pandemic, efficient face mask detection applications are lacking, yet they are in high demand for transportation, densely populated areas, residential districts, large-scale manufacturers and other enterprises to ensure safety. The absence of large datasets of ‘with_mask’ images has made this task cumbersome and challenging. Our face mask detector doesn't use any morphed masked images dataset, and the model is accurate. Because it uses the MobileNetV2 architecture, it is computationally efficient, which makes it easier to deploy the model to embedded systems (Raspberry Pi, Google Coral, etc.).
    Downloads: 0 This Week
  • 21
    Faster R-CNN

    Object detection framework based on deep convolutional networks

    This repository provides a MATLAB / Caffe re-implementation of the Faster R-CNN object detection framework (originally from Ren et al. 2015). The Faster R-CNN architecture combines a Region Proposal Network (RPN) with a Fast R-CNN style detection network so that the two share convolutional feature maps, which speeds up detection. The repo includes code to train, test, and deploy Faster R-CNN models under the MATLAB / Caffe environment, along with model checkpoints, multiple configuration files for different datasets and architectures, and evaluation scripts for mAP and detection metrics.
    Downloads: 0 This Week
  • 22

    Fish4Knowledge Project

    Analysis of undersea fish videos

    The Fish4Knowledge project investigated information abstraction and storage methods for analyzing undersea video data (from 10E+15 pixels to 10E+12 units of information), machine and human vocabularies for detecting and describing fish, flexible process architectures to process the data and scientific queries, and effective specialised user query interfaces. A combination of computer vision, database storage, workflow and human-computer interaction methods was used to achieve this. The project used live video feeds from 10 underwater cameras as a testbed for investigating more generally applicable methods for capture, storage, analysis and querying of multiple video streams. We collated a public database from 3 years containing video summaries of the observed fish and associated descriptors. Expert web-based interfaces were developed for use by marine researchers, allowing unprecedented access to live and previously stored videos, or previously extracted information.
    Downloads: 0 This Week
  • 23
    C++ Computer vision library
    Downloads: 0 This Week
  • 24

    FlexCVDemo

    FlexCV puts the power of computer vision into the hands of people with

    Until now computer vision has only been accessible to software engineers. FlexCV changes this! Its super-easy user interface allows normal people to learn and use computer vision in the real world. Simply add the parts (Elements) and connect them up.
    Downloads: 0 This Week
  • 25
    GAAS

    Autonomous aviation intelligence software for drones and VTOL

    GAAS (Generalized Autonomy Aviation System) is an open source software platform for autonomous drones and VTOLs. GAAS was built to provide a common infrastructure for computer-vision based drone intelligence. In the long term, GAAS aims to accelerate the coming of autonomous VTOLs. Being a BSD-licensed product, GAAS makes it easy for enterprises, researchers, and drone enthusiasts to modify the code to suit specific use cases. Our long-term vision is to implement GAAS in autonomous passenger-carrying VTOLs (or "flying cars"). The first step of this vision is to make Unmanned Aerial Vehicles truly "unmanned", and thus make drones ubiquitous. We currently support manned and unmanned multi-rotor drones and helicopters. Our next step is to support VTOLs and eVTOLs.
    Downloads: 0 This Week