Showing 453 open source projects for "visual"

View related business solutions
  • Share your screen instantly while on a phone call with CrankWheel for an engaging presentation. Icon
    Share your screen instantly while on a phone call with CrankWheel for an engaging presentation.

    For salespeople and customer service agents who want to compliment their phone calls with visual elements.

    Our 10x simpler screen sharing tool is designed for you if you spend your days on the phone with clients, and need to add a visual presentation to close sales. No more scheduling a follow-up meeting, or teaching them to use a complex tool. Send them a text message or email, and they see your screen in seconds.
    Learn More
  • Supercharge Your Manufacturing with Easy MRP and MES Software Icon
    Supercharge Your Manufacturing with Easy MRP and MES Software

    Designed for SME manufacturers who want to reduce wasteful manual processing, save time and increase profits.

    Flowlens eliminates stock-outs, shortage and overstocks, avoiding costly production delays. Stay in control of inventory levels and keep production running smoothly with real-time visibility and easy-to-use stock management. Import bulk data with ease.
    Learn More
  • 1
    1D Visual Tokenization and Generation

    1D Visual Tokenization and Generation

    This repo contains the code for 1D tokenizer and generator

    The 1D Visual Tokenization and Generation project from ByteDance introduces a novel “one-dimensional” tokenizer designed for images: instead of representing images with large grids of 2D tokens (as in many prior generative/image-modeling systems), it compresses images into as few as 32 discrete tokens (or more, optionally) — thereby achieving a very compact, efficient representation that drastically speeds up generation and reconstruction while retaining strong fidelity.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    AWS Toolkit for Visual Studio Code

    AWS Toolkit for Visual Studio Code

    Local Lambda debug, CodeWhisperer, SAM/CFN syntax, etc.

    The AWS Toolkit extension for Visual Studio Code enables you to interact with Amazon Web Services (AWS). Try the AWS Code Sample Catalog to start coding with the AWS SDK. The AWS Explorer provides access to the AWS services that you can work with when using the Toolkit. To see the AWS Explorer, choose the AWS icon in the Activity bar. The Developer Tools panel is a section for developer-focused tooling curated for working in an IDE.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    DVC Extension for Visual Studio Code

    DVC Extension for Visual Studio Code

    https://github.com/iterative/vscode-dvc

    A Visual Studio Code extension that integrates Data Version Control (DVC) into the development environment, enhancing reproducibility and collaboration for machine learning projects.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Next AI Draw.io

    Next AI Draw.io

    A next.js web application that integrates AI capabilities with draw.io

    Next AI Draw.io is an AI-enhanced diagramming application that integrates generative intelligence into the familiar draw.io-style visual workflow. The project aims to help users create diagrams, flowcharts, and structured visual content more efficiently by leveraging AI-assisted generation and editing capabilities. It combines modern web technologies with diagram automation features to reduce the manual effort typically required in visual design tools. The system is intended for developers, product teams, and technical planners who need rapid diagram creation within collaborative environments. ...
    Downloads: 15 This Week
    Last Update:
    See Project
  • Employees get more done with Rippling Icon
    Employees get more done with Rippling

    Streamline your business with an all-in-one platform for HR, IT, payroll, and spend management.

    Effortlessly manage the entire employee lifecycle, from hiring to benefits administration. Automate HR tasks, ensure compliance, and streamline approvals. Simplify IT with device management, software access, and compliance monitoring, all from one dashboard. Enjoy timely payroll, real-time financial visibility, and dynamic spend policies. Rippling empowers your business to save time, reduce costs, and enhance efficiency, allowing you to focus on growth. Experience the power of unified management with Rippling today.
    Learn More
  • 5
    FaceFusion

    FaceFusion

    Industry leading face manipulation platform

    FaceFusion is an open-source face swapping and facial enhancement toolkit designed for high-quality video and image manipulation workflows. The project enables users to replace faces in images or videos while maintaining temporal consistency and visual realism. It integrates modern deep learning models for face detection, alignment, and blending to produce smoother results than traditional approaches. FaceFusion is built with a modular pipeline that allows users to customize processing steps and optimize performance for different hardware environments. The tool is often used in content creation, visual effects experimentation, and research into generative media. ...
    Downloads: 275 This Week
    Last Update:
    See Project
  • 6
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 7
    UI-TARS Desktop

    UI-TARS Desktop

    A GUI Agent app based on UI-TARS to control your computer using AI

    UI-TARS Desktop is a graphical user interface (GUI) agent application that leverages the UI-TARS vision-language model to enable natural language control of computers. This cross-platform tool supports both Windows and macOS, allowing users to perform tasks through intuitive commands. Key features include screenshot-based visual recognition, precise mouse and keyboard control, and real-time feedback on actions. Provides immediate responses and visual feedback on actions performed. The application facilitates seamless interaction with the computer, enhancing user experience by simplifying complex operations into straightforward language instructions. Leverages advanced AI to bridge the gap between visual elements and language commands. ...
    Downloads: 44 This Week
    Last Update:
    See Project
  • 8
    OpenClaw Office

    OpenClaw Office

    OpenClaw Office is the visual monitoring and management frontend

    ...Users can observe communication flows between agents through visual connections, track token usage and operational costs, and analyze performance through integrated dashboards and charts. The system also includes live chat capabilities, allowing users to monitor conversations and tool calls as they occur.
    Downloads: 19 This Week
    Last Update:
    See Project
  • 9
    n8n

    n8n

    Free and source-available fair-code licensed workflow automation tool

    n8n is an extendable workflow automation tool. With a fair-code distribution model, n8n will always have visible source code, be available to self-host, and allow you to add your own custom functions, logic and apps. n8n's node-based approach makes it highly versatile, enabling you to connect anything to everything. n8n has 200+ different nodes to automate workflows.
    Downloads: 866 This Week
    Last Update:
    See Project
  • Dynamic Work and Complex Project Management Platform | Quickbase Icon
    Dynamic Work and Complex Project Management Platform | Quickbase

    Quickbase is the leading application platform for dynamic work.

    Our no-code platform lets you easily create, connect, and customize enterprise applications that fix visibility and workflow gaps without replacing a single system.
    Learn More
  • 10
    GalTransl

    GalTransl

    Automated translation solution for visual novels

    GalTransl is an automated translation system specifically designed for visual novels, particularly those in the “galgame” genre, leveraging large language models to streamline and enhance the translation process. It integrates support for multiple advanced LLM providers such as GPT-4, Claude, DeepSeek, and other models, enabling high-quality, context-aware translations that go beyond traditional machine translation approaches.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 11
    Kimi K2.5

    Kimi K2.5

    Moonshot's most powerful AI model

    Kimi K2.5 is Moonshot AI’s open-source, native multimodal agentic model built through continual pretraining on approximately 15 trillion mixed vision and text tokens. Based on a 1T-parameter Mixture-of-Experts (MoE) architecture with 32B activated parameters, it integrates advanced language reasoning with strong visual understanding. K2.5 supports both “Thinking” and “Instant” modes, enabling either deep step-by-step reasoning or low-latency responses depending on the task. Designed for agentic workflows, it features an Agent Swarm mechanism that decomposes complex problems into coordinated sub-agents executing in parallel. With a 256K context length and MoonViT vision encoder, the model excels across reasoning, coding, long-context comprehension, image, and video benchmarks. ...
    Downloads: 52 This Week
    Last Update:
    See Project
  • 12
    Dify

    Dify

    One API for plugins and datasets, one interface for prompt engineering

    Dify is an easy-to-use LLMOps platform designed to empower more people to create sustainable, AI-native applications. With visual orchestration for various application types, Dify offers out-of-the-box, ready-to-use applications that can also serve as Backend-as-a-Service APIs. Unify your development process with one API for plugins and datasets integration, and streamline your operations using a single interface for prompt engineering, visual analytics, and continuous improvement. ...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 13
    Video-subtitle-remover (VSR)

    Video-subtitle-remover (VSR)

    AI tool that removes hardcoded subtitles and text from videos locally

    ...Video Subtitle Remover analyzes video frames and detects subtitle regions, then replaces the removed areas using an AI algorithm that fills the space with reconstructed visual content. This process aims to maintain the original resolution and visual continuity of the video after subtitle removal. It allows users to define a specific subtitle region so that only text in that area is removed rather than modifying the entire frame. It can also automatically remove text throughout the whole video when a position is not specified. ...
    Downloads: 95 This Week
    Last Update:
    See Project
  • 14
    Flowise

    Flowise

    Drag & drop UI to build your customized LLM flow

    Open source UI visual tool to build your customized LLM flow using LangchainJS, written in Node Typescript/Javascript. Conversational agent for a chat model which utilizes chat-specific prompts and buffer memory. Open source is the core of Flowise, and it will always be free for commercial and personal usage. Flowise support different environment variables to configure your instance.
    Downloads: 40 This Week
    Last Update:
    See Project
  • 15
    ArtCraft

    ArtCraft

    Crafting engine for artists, designers, and filmmakers

    ...The project positions itself as an intentional “crafting engine” for artists, designers, and filmmakers who want deeper control over generative media pipelines. Rather than relying purely on text prompts, ArtCraft emphasizes visual manipulation, compositional control, and iterative refinement so creators can treat AI output more like a malleable creative medium. The application is built with performance and responsiveness in mind, enabling users to move between different creative canvases and asset workflows within a unified interface. It aims to support complex multimedia generation workflows including image, video, and potentially 3D content creation, making it useful for experimental filmmaking and advanced visual design.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 16
    ClawX

    ClawX

    Desktop app that provides a graphical interface for OpenClaw AI

    ClawX is a cross-platform desktop application that provides a graphical user interface for OpenClaw AI agents, transforming complex command-line orchestration into an accessible visual experience. Built with Electron, React, and TypeScript, the software embeds the OpenClaw runtime directly into the application to deliver a battery-included setup without requiring separate installations. The platform focuses on usability by offering a guided setup wizard, visual configuration panels, and real-time validation, enabling users to deploy AI agents without terminal expertise. ...
    Downloads: 48 This Week
    Last Update:
    See Project
  • 17
    Puck

    Puck

    Open source visual editor for building React drag-and-drop pages

    Puck is an open source visual editor designed for React applications that enables developers to build customizable drag-and-drop page editing experiences. It allows teams to create their own page builders by defining React components that can be arranged and configured through a visual interface. Puck is component-based and configuration-driven, meaning developers specify how components render and which editable fields control their properties.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    Jaaz

    Jaaz

    Open source multimodal creative AI assistant with infinite canvas tool

    Jaaz is an open source multimodal creative assistant designed to help users generate and organize visual media using artificial intelligence. It functions as a creative workspace where images, videos, and visual storyboards can be produced and arranged on an infinite canvas environment. It combines AI agents with visual editing tools, allowing users to generate media through prompts, sketches, or simple instructions. Jaaz supports multiple AI models and can integrate both local and cloud-based inference systems, enabling flexible creative workflows. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    InternGPT

    InternGPT

    Open source demo platform where you can easily showcase your AI models

    InternGPT is an open-source multimodal AI framework designed to extend large language models beyond text interactions into visual reasoning and image manipulation tasks. The system integrates conversational AI with computer vision models so users can interact with images, videos, and visual environments through natural language instructions. Unlike traditional chat systems that rely solely on text prompts, InternGPT allows users to interact with visual content using both language and nonverbal signals such as pointing or highlighting objects within images. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    graphify

    graphify

    AI coding assistant skill (Claude Code, Codex, OpenCode, OpenClaw)

    ...The architecture emphasizes flexibility, enabling users to customize how data is mapped and displayed. It may also include analytical features to explore patterns, clusters, or anomalies within the graph. Overall, graphify serves as a bridge between raw data and visual insight.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 21
    InternLM-XComposer-2.5

    InternLM-XComposer-2.5

    InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System

    ...It incorporates visual understanding modules that allow the model to analyze images and integrate them into coherent narrative outputs. The framework also supports tasks such as image captioning, multimodal reasoning, and layout generation for structured visual documents. By combining language generation with visual composition capabilities, the system enables new forms of content creation that integrate written explanations with automatically generated visual components.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    InternVL

    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. The model supports a wide variety of tasks, including visual perception, image classification, and cross-modal retrieval between images and text. It can also be connected to language models to enable conversational interfaces that understand images, videos, and other visual content. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Playwright MCP

    Playwright MCP

    Playwright MCP server

    An MCP server developed by Microsoft that offers browser automation capabilities using Playwright, enabling LLMs to interact with web pages through structured accessibility snapshots without relying on visual data. ​
    Downloads: 6 This Week
    Last Update:
    See Project
  • 24
    SAM 3

    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. ...
    Downloads: 51 This Week
    Last Update:
    See Project
  • 25
    CC Workflow Studio

    CC Workflow Studio

    Accelerate Claude Code/GitHub Copilot

    CC Workflow Studio is a powerful Visual Studio Code extension that accelerates AI-assisted development by providing a visual workflow editor tailored for AI automation and agent orchestration, particularly with tools like Claude Code, GitHub Copilot, OpenAI Codex, and others. The extension lets developers and creators design complex AI workflows using intuitive drag-and-drop canvases or via conversational AI commands, blending graphical editing with natural language refinement. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB