Showing 18 open source projects for "data"

View related business solutions
  • PairSoft | AP Automation and Doc Management Icon
    PairSoft | AP Automation and Doc Management

    Free your team from manual processes.

    Streamline operations and elevate your team's efficiency with PairSoft. Our AP automation, procurement, and document management solutions eliminate manual processes, cut costs, and free your team to focus on strategic initiatives. Experience our state-of-the-art invoice-to-pay solution, now integrated with advanced AI technology for faster, smarter results. Our customers report a significant 70% reduction in approval times and annual savings of $62,000 in employee hours. At PairSoft, we aim to transform your business operations through automation. Explore the future of automation at pairsoft.com, where you can leverage cutting-edge features like invoice capture, OCR, and comprehensive AP automation to transform your workflow. Whether you are a small business or a large enterprise, our solutions are designed to scale with your needs, providing robust functionality and ease of use. Join the growing number of businesses that trust PairSoft.
    Learn More
  • Time tracking software for the global workforce Icon
    Time tracking software for the global workforce

    Teams of all sizes and in various industries that want the best time tracking and employee monitoring solution.

    It's easy with Hubstaff, a time-tracking and workforce management platform that automates almost every aspect of running or growing a business. Teams can track time to projects and to-dos using Hubstaff's desktop, web, or mobile applications. You'll be able to see how much time your team spends on different tasks, plus productivity metrics like activity rates and app usage through Hubstaff's online dashboard. Most of the available features are customizable on a per-user basis, so you can create the team management tool you need.
    Learn More
  • 1
    ESPnet

    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while still benefiting from the robust data preparation practices developed in the speech community. ESPnet provides many ready-to-run recipes for popular academic benchmarks, making it straightforward to reproduce published results or serve as baselines for new research. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    Chatterbox

    Chatterbox

    SoTA open-source TTS

    Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 3
    ChatTTS

    ChatTTS

    A generative speech model for daily dialogue

    ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    ...NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • All-in-One Mental Health EHR Icon
    All-in-One Mental Health EHR

    Simplify your systems. Strengthen your cash flow. Start fresh with Ensora Health

    Ensora Health’s Mental Health EHR is designed for mental health professionals, therapists, and practice managers looking for a secure, user-friendly solution to streamline administrative tasks and improve efficiency in their practice management
    Learn More
  • 5
    AI Runner

    AI Runner

    Offline inference engine for art, real-time voice conversations

    ...It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 6
    NVIDIA NeMo Framework

    NVIDIA NeMo Framework

    Scalable generative AI framework built for researchers and developers

    ...NeMo 2.0 introduces a Python-based configuration system, replacing YAML with more flexible, programmable configs that can be versioned and composed for different experiments. The framework builds on PyTorch Lightning–style modular abstractions, so training scripts are composed from reusable components for data loading, models, optimizers, and schedulers, which simplifies experimentation and adaptation. NeMo is designed to scale: with tools like NeMo-Run, users can orchestrate large-scale experiments across thousands of GPUs.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    gTTS

    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    LuxTTS

    LuxTTS

    A high-quality rapid TTS voice cloning model

    ...It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports zero-shot voice cloning, meaning it can adapt to a reference speaker’s voice with minimal example data, enabling realistic and personalized synthetic speech. Intended for developers, hobbyists, and creators, the repository includes installation instructions, usage examples, and Python APIs that make it feasible to integrate the model in local workflows, web demos, or production systems. Its design emphasizes efficiency and practicality, fitting within modest GPU memory footprints.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    MetaVoice-1B

    MetaVoice-1B

    Foundational model for human-like, expressive TTS

    ...Specifically, the base model (MetaVoice-1B) uses around 1.2 billion parameters and has been trained on a massive dataset — reportedly around 100,000 hours of speech data. The goal is to provide human-like, expressive, and flexible TTS: able to generate natural-sounding speech that can handle diverse inputs and likely generalize over voice styles, intonation, prosody, and perhaps multiple languages or accents. With that scale and dataset volume, MetaVoice aims to push the boundary of what open-source TTS models can achieve: high fidelity, natural prosody, and robustness even for edge cases. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Airlock Digital - Application Control (Allowlisting) Made Simple Icon
    Airlock Digital - Application Control (Allowlisting) Made Simple

    Airlock Digital delivers an easy-to-manage and scalable application control solution to protect endpoints with confidence.

    For organizations seeking the most effective way to prevent malware and ransomware in their environments. It has been designed to provide scalable, efficient endpoint security for organizations with even the most diverse architectures and rigorous compliance requirements. Built by practitioners for the world’s largest and most secure organizations, Airlock Digital delivers precision Application Control & Allowlisting for the modern enterprise.
    Learn More
  • 10
    Orpheus TTS

    Orpheus TTS

    Towards Human-Sounding Speech

    ...It is designed to produce human-like speech with natural intonation, emotion, and rhythm, targeting quality comparable to or better than many closed-source systems. The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. Inference is provided through a Python package that uses vLLM under the hood for high-throughput, low-latency generation, including streaming examples that show how to generate audio chunks in real time. The maintainers provide Colab notebooks, a standardized prompting format, and one-click deployment via Baseten for production-grade, FP8/FP16 optimized inference with ~200 ms streaming latency.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    WhisperSpeech

    WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper

    ...The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. The repository includes notebooks and scripts for inference, long-form synthesis, and finetuning, as well as pre-trained models and converted datasets hosted on Hugging Face. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    shuyuan

    shuyuan

    Reading book source

    shuyuan is a project oriented around reading and knowledge consumption, especially targeting large-scale text content such as books, articles, or educational material. The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Bert-VITS2

    Bert-VITS2

    VITS2 backbone with multilingual-bert

    ...It provides emotional modeling through “emo embeddings,” allowing voices to be conditioned on different affective states during synthesis. Releases include optimizations for Japanese and English alignment, expanded training data, spec caching and pre-generation tools, as well as ONNX export for more lightweight inference deployments.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    EmotiVoice

    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    ...EmotiVoice provides multiple ways to interact with it, including a web interface, a Docker image, an HTTP API (including an OpenAI-compatible TTS API), and Python scripts for batch synthesis. It also supports voice cloning with your own data, backed by recipes for popular datasets like DataBaker and LJSpeech, so you can train or adapt voices to custom personas.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    WaveRNN

    WaveRNN

    WaveRNN Vocoder + TTS

    ...The repository includes scripts and code for preprocessing datasets such as LJSpeech, training Tacotron to produce mel spectrograms, training WaveRNN on those spectrograms (with optional GTA data), and finally generating audio. A quick_start.py script allows users to immediately synthesize example sentences from a pretrained model and inspect both generated audio and attention plots. For custom TTS, the project guides you through training Tacotron, forcing GTA spectrogram export when desired, training WaveRNN with or without GTA, and then running joint generation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Transformer TTS

    Transformer TTS

    Implementation of a Transformer based neural network

    ...This design addresses common autoregressive issues such as repetition, skipped words, and unstable attention, and results in robust, fast synthesis where all frames are predicted in parallel. The repository ships with tooling to build datasets (especially LJSpeech) and create training data, plus scripts to train both the aligner and the TTS model, monitor training with TensorBoard, and resume or reset training runs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    OpenSeq2Seq

    OpenSeq2Seq

    Toolkit for efficient experimentation with Speech Recognition

    ...The toolkit includes ready-made models for neural machine translation, automatic speech recognition, speech synthesis, language modeling, and additional NLP tasks such as sentiment analysis. It supports multi-GPU and multi-node data-parallel training, and integrates with Horovod to scale out across large GPU clusters. Mixed-precision support (float16) is optimized for NVIDIA Volta and Turing GPUs, allowing significant speedups and memory savings without sacrificing model quality. The project comes with configuration-driven training scripts, documentation, and examples that demonstrate how to set up pipelines for tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    DC-TTS

    DC-TTS

    TensorFlow Implementation of DC-TTS: yet another text-to-speech model

    ...The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN (spectrogram super-resolution network), which converts low-resolution mel-spectrograms into high-resolution magnitude spectrograms suitable for waveform synthesis. Training scripts, data loaders, and hyperparameter configurations are provided to reproduce results on several datasets, including LJ Speech for English, a Korean single-speaker dataset, and audiobook data from Nick Offerman and Kate Winslet.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB