Showing 62 open source projects for "framework python"

View related business solutions
  • Skillfully - The future of skills based hiring Icon
    Skillfully - The future of skills based hiring

    Realistic Workplace Simulations that Show Applicant Skills in Action

    Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.
    Learn More
  • Simplify your short-term rental business with our Channel Manager Icon
    Simplify your short-term rental business with our Channel Manager

    Boost your vacation rental revenue

    Manage your vacation rental like a pro with features that streamline, sync and simplify - all from one intuitive software.
    Learn More
  • 1
    NVIDIA Earth2Studio

    NVIDIA Earth2Studio

    Open-source deep-learning framework

    NVIDIA Earth2Studio is an open-source Python package and framework designed to accelerate the development and deployment of AI-driven weather and climate science workflows. It provides a unified API that lets researchers, data scientists, and engineers build complex forecasting and analysis pipelines by combining modular prognostic and diagnostic AI models with a diverse range of real-world data sources such as global forecast systems, reanalysis datasets, and satellite feeds. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    DINOv3

    DINOv3

    Reference PyTorch implementation and models for DINOv3

    DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 3
    LTX-2

    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is a powerful, open-source toolkit developed by Lightricks that provides a modular, high-performance base for building real-time graphics and visual effects applications. It is architected to give developers low-level control over rendering pipelines, GPU resource management, shader orchestration, and cross-platform abstractions so they can craft visually compelling experiences without starting from scratch. Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries,...
    Downloads: 40 This Week
    Last Update:
    See Project
  • 4
    TADA

    TADA

    Open Source Speech Language Model

    TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...
    Downloads: 0 This Week
    Last Update:
    See Project
  • The sales CRM that makes your life easy, so all you have to do is sell. Icon
    The sales CRM that makes your life easy, so all you have to do is sell.

    The simpler way to sell

    Welcome to the simpler way to sell. Pipedrive is CRM software that makes your life easy, for less legwork and more sales. Let us track your sales conversations, eliminate admin tasks, get you more leads and uncover how you win, because your day belongs to you. Join more than 100,000 sales teams around the world that use the CRM rated #1 by SoftwareReviews in 2019. Start your free 14-day trial and get full access – no credit card needed.
    Try it free (No Credit Card Required)
  • 5
    Wan2.2

    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 166 This Week
    Last Update:
    See Project
  • 6
    fairseq2

    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 7
    HY-World 1.5

    HY-World 1.5

    A Systematic Framework for Interactive World Modeling

    HY-WorldPlay is a Hunyuan AI project focusing on immersive multimodal content generation and interaction within virtual worlds or simulated environments. It aims to empower AI agents with the capability to both understand and generate multimedia content — including text, audio, image, and potentially 3D or game-world elements — enabling lifelike dialogue, environmental interpretations, and responsive world behavior. The platform targets use cases in digital entertainment, game worlds,...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    Hunyuan3D-1

    Hunyuan3D-1

    A Unified Framework for Text-to-3D and Image-to-3D Generation

    Hunyuan3D-1 is an earlier version in the same 3D generation line (the unified framework for text-to-3D and image-to-3D tasks) by Tencent Hunyuan. It provides a framework combining shape generation and texture synthesis, enabling users to create 3D assets from images or text conditions. While less advanced than version 2.1, it laid the foundations for the later PBR, higher resolution, and open-source enhancements. (Note: less detailed public documentation was found for Hunyuan3D-1 compared to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    CogVideo

    CogVideo

    Text and image to video generation: CogVideoX and CogVideo

    CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...
    Downloads: 21 This Week
    Last Update:
    See Project
  • Endpoint Protection Software for Businesses | HYPERSECURE Icon
    Endpoint Protection Software for Businesses | HYPERSECURE

    DriveLock protects systems, data, end devices from data loss and misuse.

    The HYPERSECURE endpoint protection platform is a comprehensive suite of products and services enhanced by European third-party solutions. It ensures our customers’ IT security, regulatory compliance, and digital sovereignty.
    Learn More
  • 10
    LTX-Video

    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 11
    Google DeepMind GraphCast and GenCast

    Google DeepMind GraphCast and GenCast

    Global weather forecasting model using graph neural networks and JAX

    GraphCast, developed by Google DeepMind, is a research-grade weather forecasting framework that employs graph neural networks (GNNs) to generate medium-range global weather predictions. The repository provides complete example code for running and training both GraphCast and GenCast, two models introduced in DeepMind’s research papers. GraphCast is designed to perform high-resolution atmospheric simulations using the ERA5 dataset from ECMWF, while GenCast extends the approach with...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    FinGPT

    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 13
    Sapiens

    Sapiens

    High-resolution models for human tasks

    Sapiens is a research framework from Meta AI focused on embodied intelligence and human-like multimodal learning, aiming to train agents that can perceive, reason, and act in complex environments. It integrates sensory inputs such as vision, audio, and proprioception into a unified learning architecture that allows agents to understand and adapt to their surroundings dynamically. The project emphasizes long-horizon reasoning and cross-modal grounding—connecting language, perception, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    ComfyUI-LTXVideo

    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    HunyuanVideo-I2V

    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 16
    Fara-7B

    Fara-7B

    An Efficient Agentic Model for Computer Use

    Fara-7B is a Microsoft initiative aimed at bringing rigor, transparency, and structured evaluation to AI systems through automated and customizable assessment frameworks. It provides stakeholders with a way to benchmark and evaluate models across dimensions such as fairness, robustness, security, privacy, and ethical considerations. Rather than relying on ad-hoc or manual review processes, FARA enables organizations to profile AI behavior using standardized tests, metrics, and reporting...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    ChatGPT Clone

    ChatGPT Clone

    ChatGPT interface with better UI

    ChatGPT Clone demonstrates a ChatGPT-style conversational interface wired to large-language-model backends, packaged so developers can self-host and extend. The goal is to replicate the core chat UX—message history, streaming tokens, code blocks, and system prompts—while letting you plug in different provider APIs or local models. It showcases a clean separation between the web client and the message orchestration layer so you can experiment with prompts, roles, and memory strategies. The...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 19
    DFlash

    DFlash

    Block Diffusion for Ultra-Fast Speculative Decoding

    DFlash is an open-source framework for ultra-fast speculative decoding using a lightweight block diffusion model to draft text in parallel with a target large language model, dramatically improving inference speed without sacrificing generation quality. It acts as a “drafter” that proposes likely continuations which the main model then verifies, enabling significant throughput gains compared to traditional autoregressive decoding methods that generate token by token. This approach has been...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    Mesh R-CNN

    Mesh R-CNN

    code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    HunyuanWorld-Voyager

    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 22
    HunyuanWorld 1.0

    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    InstantCharacter

    InstantCharacter

    Personalize Any Characters with a Scalable Diffusion Transformer

    InstantCharacter is a tuning-free diffusion transformer framework created by Tencent Hunyuan / InstantX team, which enables generating images of a specific character (subject) from a single reference image, preserving identity and character features. Uses adapters, so full fine-tuning of the base model is not required. Demo scripts and pipeline API (via infer_demo.py, pipeline.py) included. It works by adapting a base image generation model with a lightweight adapter so that you can produce...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    GLM-TTS

    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    Step-Audio

    Step-Audio

    Open-source framework for intelligent speech interaction

    Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB