Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
AI Models
Search Results

Search Results for "visual studio code"

x

Sort By:

Relevance

Clear All Filters

OS

Linux 15
Mac 14
Windows 14
More...
BSD 11
ChromeOS 11

Category

Artificial Intelligence 15

License

OSI-Approved Open Source 11
Creative Commons Attribution License 1

Programming Language

Python 15

Showing 15 open source projects for "visual studio code"

View related business solutions

AI Models Python Clear Filters & Widen Search

Tremendous is the global payouts platform for businesses sending gift cards and money at scale.
Getting started is simple: add a funding method and place your first order in minutes.

Trusted by 20,000+ leading organizations, Tremendous has delivered billions of rewards and enables businesses to reach recipients across 230+ countries and regions. Recipients have 2,500+ payout options to choose from, including gift cards, prepaid cards, cash transfers, and charitable donations.

Learn More
Job Evaluation and Talent Management Software
For human resources departments in search of a tool to manage time, expenses, leave, documents, recruitment, and onboarding

Encompassing Visions (ENCV), industry-leading job evaluation and pay equity software, is the best choice for organizations requiring transparent, comprehensive, and objective Job Evaluation software designed to help them ensure equal pay for work of equal value.

Learn More
1

Moondream

Tiny vision language model

Moondream is a creative code project and visual experimentation repository that explores generative graphics, aesthetic patterns, and interactive art through code. The project typically showcases procedural visualizations, algorithmic designs, and artistic experiments that push the boundaries of what can be expressed with programming languages and rendering frameworks.

Downloads: 0 This Week

Last Update: 2026-01-23
See Project
2

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...

Downloads: 7 This Week

Last Update: 2026-02-03
See Project
3

SAM 3

Code for running inference and finetuning with SAM 3 model

SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an...

Downloads: 53 This Week

Last Update: 4 days ago
See Project
4

Depth Anything 3

Recovering the Visual Space from Any Views

Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography,...

Downloads: 10 This Week

Last Update: 2026-03-21
See Project
Share your screen instantly while on a phone call with CrankWheel for an engaging presentation.
For salespeople and customer service agents who want to compliment their phone calls with visual elements.

Our 10x simpler screen sharing tool is designed for you if you spend your days on the phone with clients, and need to add a visual presentation to close sales. No more scheduling a follow-up meeting, or teaching them to use a complex tool. Send them a text message or email, and they see your screen in seconds.

Learn More
5

ComfyUI-LTXVideo

LTX-Video Support for ComfyUI

ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers non-programmers and rapid-iteration teams to harness the performance of LTX-Video while maintaining the clarity and flexibility of a dataflow graph model. ...

Downloads: 1 This Week

Last Update: 2026-03-06
See Project
6

Wan2.1

Wan2.1: Open and Advanced Large-Scale Video Generative Model

Wan2.1 is a foundational open-source large-scale video generative model developed by the Wan team, providing high-quality video generation from text and images. It employs advanced diffusion-based architectures to produce coherent, temporally consistent videos with realistic motion and visual fidelity. Wan2.1 focuses on efficient video synthesis while maintaining rich semantic and aesthetic detail, enabling applications in content creation, entertainment, and research. The model supports...

1 Review

Downloads: 96 This Week

Last Update: 2026-03-05
See Project
7

LTX-2

Python inference and LoRA trainer package for the LTX-2 audio–video

...The framework targets both interactive graphical applications and media-rich experiences, making it a solid foundation for games, creative tools, or visualization systems that demand both performance and flexibility. While being low-level, it also provides sensible defaults and helper abstractions that reduce boilerplate and help teams maintain clear, maintainable code.

Downloads: 67 This Week

Last Update: 2026-03-30
See Project
8

FramePack

Lets make video diffusion practical

FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking...

Downloads: 14 This Week

Last Update: 2025-10-21
See Project
9

GLM-4.6V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...

Downloads: 0 This Week

Last Update: 2026-04-06
See Project
Build innovative business apps powered by process automation
Connect workflows, teams and systems within one digital business transformation platform

Manage your business as a unified system of interacting processes. Use BPMN 2.0 for low-code process modeling by business people. Follow your strategic goals with process architecture that always corresponds to the structure of an actual business.

Learn More
10

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...

Downloads: 0 This Week

Last Update: 2026-04-08
See Project
11

Oasis

Inference script for Oasis 500M

Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated environments rather than static video generation alone. ...

Downloads: 0 This Week

Last Update: 2026-01-06
See Project
12

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution

MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
13

Style Aligned

Official code for Style Aligned Image Generation via Shared Attention

StyleAligned is a diffusion-model editing technique and codebase that preserves the visual “style” of an original image while applying new semantic edits driven by text. Instead of fully re-generating an image—and risking changes to lighting, texture, or rendering choices—the method aligns internal features across denoising steps so the target edit inherits the source style. This alignment acts like a constraint on the model’s evolution, steering composition, palette, and brushwork even as...

Downloads: 0 This Week

Last Update: 2025-10-10
See Project
14

GLIDE (Text2Im)

GLIDE: a diffusion-based text-conditional image synthesis model

glide-text2im is an open source implementation of OpenAI’s GLIDE model, which generates photorealistic images from natural language text prompts. It demonstrates how diffusion-based generative models can be conditioned on text to produce highly detailed and coherent visual outputs. The repository provides both model code and pretrained checkpoints, making it possible for researchers and developers to experiment with text-to-image synthesis. GLIDE includes advanced techniques such as classifier-free guidance, which improves the quality and alignment of generated images with the input text. The project also offers sampling scripts and utilities for exploring how diffusion models can be applied to multimodal tasks. ...

Downloads: 0 This Week

Last Update: 6 days ago
See Project
15

gpt-oss-20b

OpenAI’s compact 20B open model for fast, agentic, and local use

GPT-OSS-20B is OpenAI’s smaller, open-weight language model optimized for low-latency, agentic tasks, and local deployment. With 21B total parameters and 3.6B active parameters (MoE), it fits within 16GB of memory thanks to native MXFP4 quantization. Designed for high-performance reasoning, it supports Harmony response format, function calling, web browsing, and code execution. Like its larger sibling (gpt-oss-120b), it offers adjustable reasoning depth and full chain-of-thought visibility...

Downloads: 0 This Week

Last Update: 2025-08-05
See Project

Previous
You're on page 1
Next

Related Searches

ocr

tesseract-ocr-w64-setup.exe

scan

portable subtitle downloader

video

Related Categories

Artificial Intelligence

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Privacy Choices Advertise