Showing 61 open source projects for "python user interface"

View related business solutions
  • Network Discovery Software | JDisc Discovery Icon
    Network Discovery Software | JDisc Discovery

    JDisc Discovery supports the IT organizationss of medium-sized businesses and large-scale enterprises.

    JDisc Discovery is a comprehensive network inventory and IT asset management solution designed to help organizations gain clear, up-to-date visibility into their IT environment. It automatically scans and maps devices across the network, including servers, workstations, virtual machines, and network hardware, to create a detailed inventory of all connected assets. This includes critical information such as hardware configurations, software installations, patch levels, and relationshipots between devices.
    Learn More
  • Polygon Software | Apparel Software | PLM and ERP Solutions Icon
    Polygon Software | Apparel Software | PLM and ERP Solutions

    Small to mid-sized sewn goods manufacturers and textile mills.

    PolyPM is an integrated enterprise resource planning (ERP) and product lifecycle management (PLM) solution developed by Polygon Software. Built for small to medium-sized apparel manufacturers, PolyPM enables businesses to integrate all aspects of the product development, supply chain and production processes, as well as instantly access all their style and manufacturing information anywhere in the world. This allows businesses to shorten time-to-market, incur lower development costs, and improve customer service and worker productivity.
    Learn More
  • 1
    MLX-Audio

    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI (mlx_audio.tts.generate) as well as a Python API for programmatic generation of audio, including parameters for voice choice, speed, language hints, output format, and sample rate. ...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 2
    kokoro-onnx

    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. ...
    Downloads: 122 This Week
    Last Update:
    See Project
  • 3
    Applio

    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through...
    Downloads: 90 This Week
    Last Update:
    See Project
  • 4
    gTTS

    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, either written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Managed Cybersecurity Platform Built for MSPs Icon
    Managed Cybersecurity Platform Built for MSPs

    Discover the cyber platform that secures and insures SMEs

    In a world that lives and breathes all things digital, every business is at risk. Cybersecurity has become a major problem for small and growing businesses due to limited budgets, resources, time, and training. Hackers are leveraging these vulnerabilities, and most of the existing cybersecurity solutions on the market are too cumbersome, too complicated, and far too costly.
    Learn More
  • 5
    OuteTTS

    OuteTTS

    Interface for OuteTTS models

    ...The project supports multiple backends including llama.cpp (Python bindings and server), Hugging Face Transformers, ExLlamaV2, VLLM and a JavaScript interface via Transformers.js, allowing it to run on CPUs, NVIDIA CUDA GPUs, AMD ROCm, Vulkan-capable GPUs, and Apple Metal. It also includes a notion of speaker profiles: you can create a speaker from a short audio sample, save it as JSON, and reuse it for consistent voice identity across generations and sessions.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    clone-voice

    clone-voice

    A sound cloning tool with a web interface, using your voice

    Clone-voice is a local voice-cloning tool that lets you synthesize speech in any target voice or convert one recording into another voice using the same timbre. It is built around Coqui’s XTTS-v2 model, so it inherits multilingual support and modern neural TTS quality while wrapping it in a user-friendly desktop workflow. The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. It does not require an NVIDIA GPU to run basic tasks, although GPU acceleration can be used when available, making it accessible on modest machines. ...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 7
    EPUB to Audiobook Converter

    EPUB to Audiobook Converter

    EPUB to audiobook converter, optimized for Audiobookshelf

    ...The project supports multiple TTS providers, including Microsoft Azure TTS, EdgeTTS, OpenAI TTS, local Piper, and Kokoro via an OpenAI-compatible endpoint, allowing users to choose between cloud and self-hosted voices. A recent addition is a Gradio-based WebUI, which wraps all configuration options in a graphical interface for users who prefer not to work with the command line. The tool offers advanced options such as controlling chapter ranges, handling paragraph detection via newline modes, removing endnote markers, and using regex-based search-and-replace files to tweak pronunciations. It can be run directly with Python or via Docker.
    Downloads: 18 This Week
    Last Update:
    See Project
  • 8
    ChatTTS webUI & API

    ChatTTS webUI & API

    A simple native web interface that uses ChatTTS to synthesize text

    ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    FastRTC

    FastRTC

    The python library for real-time communication

    FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application).
    Downloads: 6 This Week
    Last Update:
    See Project
  • A privacy-first API that predicts global consumer preferences Icon
    A privacy-first API that predicts global consumer preferences

    Qloo AI adds value to a wide range of Fortune 500 companies in the media, technology, CPG, hospitality, and automotive sectors.

    Through our API, we provide contextualized personalization and insights based on a deep understanding of consumer behavior and more than 575 million people, places, and things.
    Learn More
  • 10
    ebook2audiobook

    ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages

    ebook2audiobook is a tool to convert legally obtained eBooks (non-DRM) into fully narrated audiobooks, complete with chapters and metadata. It automates the pipeline: it reads the eBook file, splits it into appropriate segments (chapters, paragraphs), uses text-to-speech (TTS) models to synthesize audio, optionally applies voice cloning, and outputs a final audiobook — ideal for people who prefer listening over reading, or for accessibility purposes. The tool supports a wide array of...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 11
    SoniTranslate

    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 12
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    ...It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. The system is customizable: you can define your own avatar appearance and voice, and it supports voice cloning so you can generate a new voice from a short 3–10 second reference sample. The tech stack integrates FunASR for speech recognition, Qwen for language understanding, multiple TTS engines like GPT-SoVITS, CosyVoice, or edge-tts, and MuseTalk for talking-head generation.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Audiblez

    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    Matcha-TTS

    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 15
    TTS WebUI

    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 16
    AI Runner

    AI Runner

    Offline inference engine for art, real-time voice conversations

    AI Runner is an offline inference engine designed to run a collection of AI workloads on your own machine, including image generation for art, real-time voice conversations, LLM-powered chatbots and automated workflows. It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. ...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 17
    Lingvo

    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    OpenAI-Compatible Edge-TTS API

    OpenAI-Compatible Edge-TTS API

    Free, high-quality text-to-speech API endpoint to replace OpenAI

    OpenAI-Compatible Edge-TTS API is a local, OpenAI-compatible text-to-speech API that uses edge-tts—Microsoft Edge’s online TTS service—as the backend. The project emulates the /v1/audio/speech endpoint used by OpenAI, so any client that can talk to the OpenAI TTS API can be redirected to this service with minimal changes. It exposes parameters for input text, voice selection, audio format, and playback speed, mirroring the OpenAI interface while mapping popular OpenAI voice names to...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Bailing

    Bailing

    Bailing is a voice dialogue robot similar to GPT-4o

    Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users. The project is modular: each core function — ASR, VAD, LLM, TTS — exists as a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    shuyuan

    shuyuan

    Reading book source

    shuyuan is a project oriented around reading and knowledge consumption, especially targeting large-scale text content such as books, articles, or educational material. The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Mice MX OS speech to text Voice Control

    Mice MX OS speech to text Voice Control

    Mice speech to text with MX Cinnamon OS ISO

    Note about this image This image contains a system based on Linux MX, which was created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, which is configured to be operated using voice commands and outputs. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only German settings are currently implemented. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Simple TTS Reader

    Simple TTS Reader

    A small clipboard reader

    Simple TTS Reader is a small utility that reads text from your clipboard using Microsoft Speech API. Whenever you copy any text, the app instantly converts it into spoken words. Select your preferred speech engine from those installed on your system, such as Microsoft Zira, and adjust speed and volume for personalized playback. The application can also be minimized to the system tray. Plus, it is free and comes with an intuitive interface that makes it accessible to everyone.
    Leader badge
    Downloads: 90 This Week
    Last Update:
    See Project
  • 24
    EasyTTS

    EasyTTS

    Text to Speech Utility

    EasyTTS is a text to speech app for 64 bit Windows that offers online and offline text-to-speech, with settings for how fast the voice is. It supports languages other than English but only if you are connected to the Internet. These are Spanish, Portuguese, Russian, French, and Mandarin (?) Chinese.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    QChartist

    QChartist

    Free and Open Source Technical Analysis Charting Software

    QChartist is a free and open source technical analysis charting software. Its purpose is to provide a complete set of tools to perform technical analysis on charts and data. It helps to make forecasts mainly for markets but can also be used for weather or any quantifiable data. The program is flexible and its functionalities can be easily extended. You can draw geometrical shapes on your charts or plot programmable indicators from your data. It is also possible to filter or merge data. I got...
    Downloads: 13 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB