Showing 12 open source projects for "text to speech"

View related business solutions
  • The Most Powerful Software Platform for EHSQ and ESG Management Icon
    The Most Powerful Software Platform for EHSQ and ESG Management

    Addresses the needs of small businesses and large global organizations with thousands of users in multiple locations.

    Choose from a complete set of software solutions across EHSQ that address all aspects of top performing Environmental, Health and Safety, and Quality management programs.
    Learn More
  • The AI workplace management platform Icon
    The AI workplace management platform

    Plan smart spaces, connect teams, manage assets, and get insights with the leading AI-powered operating system for the built world.

    By combining AI workflows, predictive intelligence, and automated insights, OfficeSpace gives leaders a complete view of how their spaces are used and how people work. Facilities, IT, HR, and Real Estate teams use OfficeSpace to optimize space utilization, enhance employee experience, and reduce portfolio costs with precision.
    Learn More
  • 1
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 69 This Week
    Last Update:
    See Project
  • 2
    SpeechRecognition

    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    FireRedASR

    FireRedASR

    Open-source industrial-grade ASR models

    FireRedASR is an industrial-grade family of open-source automatic speech recognition models designed to provide high-precision speech-to-text performance across languages including Mandarin, English, and various Chinese dialects, achieving new state-of-the-art benchmarks on public test sets. The project includes multiple model variants to meet different application needs, such as high-accuracy end-to-end interaction using an encoder-adapter-LLM framework and efficient real-time recognition using attention-based encoder-decoder architectures, giving developers flexibility in balancing performance and resource constraints. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Iris Powered By Generali - Iris puts your customer in control of their identity. Icon
    Iris Powered By Generali - Iris puts your customer in control of their identity.

    Increase customer and employee retention by offering Onwatch identity protection today.

    Iris Identity Protection API sends identity monitoring and alerts data into your existing digital environment – an ideal solution for businesses that are looking to offer their customers identity protection services without having to build a new product or app from scratch.
    Learn More
  • 5
    Mice MX OS speech to text Voice Control

    Mice MX OS speech to text Voice Control

    Mice speech to text with MX Cinnamon OS ISO

    Note about this image This image contains a system based on Linux MX, which was created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, which is configured to be operated using voice commands and outputs. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6

    Mice TTM

    mice stt tts

    Dieses Tool wird speziell für die Barrierefreiheit unter Linux entwickelt. Es ermöglicht das umwandeln/konvertieren/parsen von Texten die aus einer Spracherkennung stammen, in Diktate sowie das Ausführen von Makros. Dies funktioniert ohne Internet, da die Spracherkennung auf dem PC selbst erfolgt. Mausbewegungen auf benannte Wörter und dann entsprechend auswählen oder per Sprachbefehl klicken. Außerdem können Textpassagen z.B. unter Libreoffice Wirter per Sprachbefehl entsprechend...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Maia
    MAIA (MyApp Intelligence Artificial) is designed to provide a foundation for building your own voice-controlled assistant with Python. It uses various libraries and modules for speech recognition, text-to-speech synthesis, and custom functionality.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    FM2TXT

    FM2TXT

    RtlSdr listen to radio, recognize audio, and writes text file log

    Just log your favorite FM station speech to a text file using rtl-sdr dongle and speech recognition. Cross-platform tool. Follow the README on the download page for Windows installation. https://sourceforge.net/projects/fm2txt-rtlsdr/files/ If you prefer GitHub source, not SF: https://github.com/randaller/fm2txt For those, who want to recognize from soundcard, not from rtl-sdr (this allows to transcribe NFM etc): https://github.com/randaller/souncard2txt
    Downloads: 0 This Week
    Last Update:
    See Project
  • Inventory and Order Management Software for Multichannel Sellers Icon
    Inventory and Order Management Software for Multichannel Sellers

    Avoid stockouts, overselling, and losing control as your business grows.

    We are the most powerful inventory and order management platform for Amazon, Walmart, and multichannel product sellers. Centralize orders, product information, and fulfillment operations to run more efficiently, sell more products, and stay compliant with marketplace requirements so you can grow profitably.
    Learn More
  • 10
    JAVT - Just Another Voice Transformer

    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT or Just Another Voice Transformer (formerly, it is called Just Another Video Transcriber) is a Speech Recognition software that also support text to Speech and simple media conversion. JAVT allows you to convert from video files to audio wav file using ffmpeg, and then transcribe the audio file to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and allow JAVT to read it out for you through text to speech conversion.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Pythia is a natural language question answering system, which uses Speech Recognition and Text To Speech technologies to communicate with the user.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. Can be hosted on public servers
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB