Showing 78 open source projects for "translation-pack"

  • 1
    ComfyUI-3D-Pack

    An extensive node suite that enables ComfyUI to process 3D inputs

    ComfyUI-3D-Pack is an extension package for the ComfyUI visual AI workflow environment that enables users to generate and manipulate 3D assets using advanced machine learning techniques. ComfyUI itself is a node-based interface for designing and executing generative AI pipelines, and this extension expands its capabilities by introducing nodes specifically designed for working with three-dimensional data.
    Downloads: 15 This Week
  • 2
    Agent Starter Pack

    Ship AI Agents to Google Cloud in minutes, not months

    Agent Starter Pack is a production-focused framework that provides pre-built templates and infrastructure for rapidly developing and deploying generative AI agents on Google Cloud. It is designed to eliminate the complexity of moving from prototype to production by bundling essential components such as deployment pipelines, monitoring, security, and evaluation tools into a single package.
    Downloads: 0 This Week
  • 3
    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets the generated dub track stay in sync with the original video structure. ...
    Downloads: 31 This Week
  • 4
    GalTransl

    Automated translation solution for visual novels

    GalTransl is an automated translation system specifically designed for visual novels, particularly those in the “galgame” genre, leveraging large language models to streamline and enhance the translation process. It integrates support for multiple advanced LLM providers such as GPT-4, Claude, DeepSeek, and other models, enabling high-quality, context-aware translations that go beyond traditional machine translation approaches.
    Downloads: 5 This Week
  • 5
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It installs with one click and creates a virtual environment using Miniconda that runs completely separate from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.
    Downloads: 36 This Week
  • 6
    HY-MT

    Hunyuan Translation Model Version 1.5

    HY-MT (Hunyuan Translation) is a high-quality multilingual machine translation model suite developed to support mutual translation across dozens of languages with strong performance even at smaller model scales. It ships with both a 1.8B-parameter model and a larger 7B model, the latter optimized not only for direct translation but also for formatted and contextualized output, allowing better handling of terminology and mixed-language content.
    Downloads: 0 This Week
  • 7
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech translation, and TTS, as well as their streaming or simultaneous counterparts, all handled by the same underlying system. ...
    Downloads: 1 This Week
  • 8
    OpenMLSys-ZH

    Machine Learning Systems: Design and Implementation

    ...It helps bridge language barriers in open machine learning systems by providing side-by-side translation or localized explanations. The repository includes scripts or tooling to keep translation synchronized with upstream changes, versioning, and possibly translation metadata (contributors, timestamp). Users can browse or clone the translated documentation to follow along with the original content, deploy examples, or understand system internals in their preferred language.
    Downloads: 0 This Week
  • 9
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. ...
    Downloads: 80 This Week
  • 10
    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. ...
    Downloads: 9 This Week
  • 11
    Whisper-WebUI

    A Web UI for easy subtitle generation using the Whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. ...
    Downloads: 23 This Week
  • 12
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. ...
    Downloads: 18 This Week
  • 13
    Ultravox

    Fast multimodal LLM for real-time voice interaction and AI apps

    ...Ultravox is optimized for low latency, achieving fast response times suitable for interactive voice agents and real-time applications. It supports use cases such as conversational AI agents, speech-to-speech translation, and analysis of spoken audio content. Ultravox also includes tooling and configuration systems for training, evaluation, and dataset integration.
    Downloads: 7 This Week
  • 14
    ChatGPT Academic

    ChatGPT extension for scientific research work

    A ChatGPT extension for scientific research work, with an experience optimized for polishing academic papers. It supports custom shortcut buttons, custom function plug-ins, Markdown table display (including tables output by GPT), dual display of TeX formulas, and complete code display. Newer features include project-tree analysis for local Python/C++/Go projects, self-translation of the project's own source code, batch summarization of PDF and Word documents, and full-text translation of PDF papers. All buttons are generated dynamically by reading functional.py, so you can add custom functions at will instead of relying on the clipboard. If the output contains a formula, it is displayed in both TeX form and rendered form, which is convenient for copying and reading.
    Downloads: 0 This Week
  • 15
    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while still benefiting from the robust data preparation practices developed in the speech community. ...
    Downloads: 5 This Week
  • 16
    shuyuan

    Reading book source

    ...It likely supports different input formats (text, HTML, PDF), and may integrate optional translation or text normalization tools.
    Downloads: 0 This Week
  • 17
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    ...It supports two major modes: Voice Chat (interactive, voice-only input) and Audio Analysis (audio plus text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound classification, emotion, etc.) and offers pretrained models (e.g. 7B) released via ModelScope and Hugging Face. Code and examples are provided for Hugging Face transformers, with usage via AutoProcessor and the model classes. It reports high performance on many standard benchmarks, including ASR, speech-emotion recognition, vocal sound classification, and speech translation.
    Downloads: 0 This Week
  • 18
    pix2pixHD

    Synthesizing and manipulating 2048x1024 images with conditional GANs

    pix2pixHD is a PyTorch-based implementation of a conditional generative adversarial network designed for high-resolution image-to-image translation, capable of producing photorealistic outputs at resolutions up to 2048×1024. It is widely used to convert structured inputs such as semantic label maps into realistic images, making it particularly valuable in applications like autonomous driving simulation, face synthesis, and scene generation. The model improves upon earlier GAN approaches by introducing multi-scale generators and discriminators that enable stable training and fine detail generation at large resolutions. ...
    Downloads: 0 This Week
  • 19
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. The tool then time-stretches or compresses each TTS clip to match the original speech duration exactly, which preserves lip-sync and rhythm as closely as possible without manual editing. ...
    Downloads: 0 This Week
  • 20
    FramePack

    Let's make video diffusion practical

    FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking steps, making it straightforward to integrate into preprocessing pipelines. ...
    Downloads: 13 This Week
  • 21
    CodeGeeX4

    CodeGeeX4-ALL-9B, a versatile model for all AI software development

    ...Compared to its predecessors, CodeGeeX4 introduces improved reasoning, stronger alignment with developer needs, and better performance on real-world programming benchmarks. It supports tasks such as code completion, generation from natural language descriptions, code translation, bug fixing, and explanation. The repository provides model checkpoints, inference examples, and fine-tuning guides, making it adaptable for both research and practical software development workflows. With its open release, CodeGeeX4 aims to provide a transparent alternative to proprietary coding assistants while advancing the field of AI-assisted programming.
    Downloads: 0 This Week
  • 22
    Lingvo

    Framework for building neural networks

    ...It has been used to implement state-of-the-art architectures such as recurrent neural networks, Transformer models, variational autoencoder hybrids, and multi-task systems. Lingvo includes reference models and configurations for domains like machine translation, automatic speech recognition, language modeling, image understanding, and 3D object detection. Centralized hyperparameter configuration files allow researchers to share exact experiment setups so others can retrain and compare results reliably.
    Downloads: 0 This Week
  • 23
    Code-Graph-RAG

    The ultimate RAG for your monorepo

    ...The system integrates with graph databases such as Memgraph to store and manage relationships, enabling efficient querying and visualization of complex dependencies. It also supports AI-driven query translation, converting natural language into graph queries for deeper analysis and interaction.
    Downloads: 15 This Week
  • 24
    Transformers

    State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX

    ...Using pre-trained models can reduce your compute costs and carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities: text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages; images, for tasks like image classification, object detection, and segmentation; and audio, for tasks like speech recognition and audio classification. Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets, and then share them with the community on our model hub. ...
    Downloads: 23 This Week
  • 25
    ShortGPT

    AI framework for automated short video creation and editing tools

    ShortGPT is an experimental AI-powered framework designed to automate the creation of short-form and long-form video content. It provides a structured system that handles multiple stages of the content creation workflow, including script generation, asset sourcing, voiceover synthesis, and video editing. ShortGPT uses large language models to generate scripts and prompts that guide the automated editing and production process. ShortGPT includes specialized content engines that manage...
    Downloads: 12 This Week