Showing 212 open source projects for "python for windows"

  • 1
    DINOv3

    Reference PyTorch implementation and models for DINOv3

    DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...
    Downloads: 23 This Week
  • 2
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely...
    Downloads: 90 This Week
  • 3
    SAM 3D Body

    Code for running inference with the SAM 3D Body Model 3DB

    SAM 3D Body is a promptable model for single-image full-body 3D human mesh recovery, designed to estimate detailed human pose and shape from just one RGB image. It reconstructs the full body, including feet and hands, using the Momentum Human Rig (MHR), a parametric mesh representation that decouples skeletal structure from surface shape for more accurate and interpretable results. The model is trained to be robust in diverse, in-the-wild conditions, so it handles varied clothing,...
    Downloads: 5 This Week
  • 4
    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...
    Downloads: 31 This Week
  • 5
    Hunyuan3D 2.0

    High-Resolution 3D Assets Generation with Large Scale Diffusion Models

    The Hunyuan3D-2 model, developed by Tencent, is designed for generating high-resolution 3D assets using large-scale diffusion models. This model offers advanced capabilities for creating detailed 3D models, including texture enhancements, multi-view shape generation, and rapid inference for real-time applications. It is particularly useful for industries requiring high-quality 3D content, such as gaming, film, and virtual reality. Hunyuan3D-2 supports various enhancements and is available...
    Downloads: 30 This Week
  • 6
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...
    Downloads: 80 This Week
  • 7
    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high...
    Downloads: 16 This Week
  • 8
    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image...
    Downloads: 7 This Week
  • 9
    xFormers

    Hackable and optimized Transformers building blocks

    xformers is a modular, performance-oriented library of transformer building blocks, designed to allow researchers and engineers to compose, experiment, and optimize transformer architectures more flexibly than monolithic frameworks. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and...
    Downloads: 2 This Week
  • 10
    CogVideo

    Text and image to video generation: CogVideoX and CogVideo

    CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...
    Downloads: 20 This Week
  • 11
    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography,...
    Downloads: 12 This Week
  • 12
    DeepSeek VL

    Towards Real-World Vision-Language Understanding

    DeepSeek-VL is DeepSeek’s initial vision-language model and the anchor of their multimodal stack. It supports understanding and generation across visual and textual modalities: it can process an image together with a prompt, answer questions about images, and caption, classify, or reason about visuals in context. The model is likely used internally as the visual encoder backbone for agent use cases, to ground perception in downstream tasks (e.g. answering questions about a screenshot). The repository...
    Downloads: 12 This Week
  • 13
    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz,...
    Downloads: 18 This Week
  • 14
    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses locally deployed GPT-style large language models to interact with your data and environment. Because the models run locally, your data stays within your own environment, which is intended to keep it private and avoid leakage to third parties.
    Downloads: 7 This Week
  • 15
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 8 This Week
  • 16
    Z80-μLM

    Z80-μLM is a 2-bit quantized language model

    Z80-μLM is a retro-computing AI project that demonstrates a tiny language model engineered to run on an 8-bit Z80 CPU by aggressively quantizing weights down to 2-bit precision. The repository provides a complete workflow in which you train or fine-tune conversational models in Python, then export them into a format that can be executed on classic Z80 systems. A key deliverable is producing CP/M-compatible .COM binaries, enabling a genuinely vintage “chat with your computer”...
    Downloads: 1 This Week
  • 17
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a DiT-based (diffusion transformer) video generation model from Lightricks that generates video from text and image prompts.
    Downloads: 10 This Week
  • 18
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B),...
    Downloads: 12 This Week
  • 19
    FramePack

    Let's make video diffusion practical

    FramePack is a next-frame prediction neural network structure that generates videos progressively. It compresses the input frame context to a constant length, so the generation workload does not grow with video length.
    Downloads: 9 This Week
  • 20
    AlphaGenome

    Programmatic access to the AlphaGenome model

    The AlphaGenome API provides access to AlphaGenome, Google DeepMind’s unifying model for deciphering the regulatory code within DNA sequences. This repository contains client-side code, examples, and documentation to help you use the AlphaGenome API. AlphaGenome offers multimodal predictions, encompassing diverse functional outputs such as gene expression, splicing patterns, chromatin features, and contact maps. The model analyzes DNA sequences of up to 1 million base pairs in length and can...
    Downloads: 6 This Week
  • 21
    Qwen

    The official repo of Qwen chat & pretrained large language model

    Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a...
    Downloads: 16 This Week
  • 22
    HY-World 1.5

    A Systematic Framework for Interactive World Modeling

    HY-WorldPlay is a Hunyuan AI project focusing on immersive multimodal content generation and interaction within virtual worlds or simulated environments. It aims to empower AI agents with the capability to both understand and generate multimedia content — including text, audio, image, and potentially 3D or game-world elements — enabling lifelike dialogue, environmental interpretations, and responsive world behavior. The platform targets use cases in digital entertainment, game worlds,...
    Downloads: 11 This Week
  • 23
    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 4 This Week
  • 24
    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions...
    Downloads: 8 This Week
  • 25
    DeepSeek VL2

    Mixture-of-Experts Vision-Language Models for Advanced Multimodal

    DeepSeek-VL2 is DeepSeek’s vision-language multimodal model, the successor to their first vision-language models. It combines image and text inputs into a unified embedding and reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to...
    Downloads: 10 This Week