Showing 11 open source projects for "nvidia"

View related business solutions
  • Wiz: #1 Cloud Security Software for Modern Cloud Protection Icon
    Wiz: #1 Cloud Security Software for Modern Cloud Protection

    Protect Everything You Build and Run in the Cloud

    Use the Wiz Cloud Security Platform to build faster in the cloud, enabling security, dev and devops to work together in a self-service model built for the scale and speed of your cloud development.
    Learn More
  • Secure Cloud Storage for Files, Photos and Documents | pCloud Icon
    Secure Cloud Storage for Files, Photos and Documents | pCloud

    Store, access, and manage your files on your own terms, from anywhere.

    Store, sync, and share your files securely with pCloud. Get up to 10 GB of free secure cloud storage and access your files from any device, anywhere.
    Learn More
  • 1
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    NVIDIA NeMo Framework

    NVIDIA NeMo Framework

    Scalable generative AI framework built for researchers and developers

    NVIDIA NeMo is a scalable, cloud-native generative AI framework aimed at researchers and PyTorch developers working on large language models, multimodal models, and speech AI (ASR and TTS), with growing support for computer vision. It provides collections of domain-specific modules and reference implementations that make it easier to pre-train, fine-tune, and deploy very large models on multi-GPU and multi-node infrastructure.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    clone-voice

    clone-voice

    A sound cloning tool with a web interface, using your voice

    ...The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. It does not require an NVIDIA GPU to run basic tasks, although GPU acceleration can be used when available, making it accessible on modest machines. The tool supports around sixteen languages, including Chinese, English, Japanese, Korean, French, German, Italian, and others, and can capture reference voices directly from a microphone or from uploaded audio.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 4
    FastKoko

    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple languages and voicepacks and allows phoneme based generation for more accurate pronunciation and prosody. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Intelligent testing agents | Checksum.ai Icon
    Intelligent testing agents | Checksum.ai

    Checksum generates, runs, and maintains end-to-end tests automatically so your team ships with confidence as code output grows.

    Coding agents write the code. Checksum runs it—continuously testing against real APIs, real data, real edge cases—before it ever reaches production.
    Learn More
  • 5
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    ...It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 6
    Style-Bert-VITS2

    Style-Bert-VITS2

    Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles

    ...It includes a full GUI editor to script dialogue, set different styles per line, edit dictionaries, and save/load projects, plus a separate web UI and Colab notebooks for training and experimentation. For those who only need synthesis, the project is published as a Python library (pip install style-bert-vits2) and can run on CPU without an NVIDIA GPU, though training still requires GPU hardware.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 7
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions. For best quality, the model is designed to work with a reference speaker clip and will inherit emotion, style, and accent from that reference.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    Parallel WaveGAN

    Parallel WaveGAN

    Unofficial Parallel WaveGAN

    ...Its main goal is to provide a real-time neural vocoder that can turn mel spectrograms into high-quality speech audio efficiently. The repository is designed to work hand-in-hand with ESPnet-TTS and NVIDIA Tacotron2-style front ends, so you can build complete TTS or singing voice synthesis pipelines. It includes a large collection of “Kaldi-style” recipes for many datasets such as LJSpeech, LibriTTS, VCTK, JSUT, CMU Arctic, and multiple singing voice corpora in Japanese, Mandarin, Korean, and more. The project provides pre-trained models, Colab demos, and example configurations, allowing researchers to quickly evaluate vocoder quality or adapt models to new datasets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    HiFi-GAN

    HiFi-GAN

    Generative Adversarial Networks for Efficient and High Fidelity Speech

    ...In experiments on LJSpeech, HiFi-GAN was shown to achieve mean opinion scores close to human recordings while synthesizing 22.05 kHz audio up to ~168× faster than real time on an NVIDIA V100 GPU. A smaller configuration trades a bit of quality for even higher speed and can run more than 13× faster than real time on CPU, making it suitable for deployment scenarios without powerful GPUs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Streamline Hiring with Skill Assessments Icon
    Streamline Hiring with Skill Assessments

    Say goodbye to hiring guesswork. Use Canditech’s job simulation tests to assess real-world skills and make data-driven decisions.

    Canditech offers innovative, cheat-proof skill assessments and job simulations to transform your hiring process. From technical skills to soft skills, we help you assess candidates on actual job performance. With over 500 customizable tests and powerful video interview features, you can evaluate real-world capabilities, streamline your hiring, and reduce biases. Whether you’re hiring for remote roles, mass hiring, or looking to expand your diversity pool, Canditech’s data-driven platform ensures the right candidates are chosen for the job every time.
    Get a Free Demo
  • 10
    Bangla TTS

    Bangla TTS

    Bangla text to speech synthesis in python

    ...Installation -------------------------------------- * Install Anaconda * conda create -n new_virtual_env python==3.6.8 * conda activate new_virtual_env * pip install -r requirements.txt * While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (>500 MB) * For fast inference, you must install tensorflow-gpu and have a NVidia GPU. Project link: https://github.com/zabir-nabil/bangla-tts
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    OpenSeq2Seq

    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    ...It supports multi-GPU and multi-node data-parallel training, and integrates with Horovod to scale out across large GPU clusters. Mixed-precision support (float16) is optimized for NVIDIA Volta and Turing GPUs, allowing significant speedups and memory savings without sacrificing model quality. The project comes with configuration-driven training scripts, documentation, and examples that demonstrate how to set up pipelines for tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB