Search Results for "python voice synthesis" - Page 5

Showing 448 open source projects for "python voice synthesis"

1

VCClient

Software that uses AI to perform real-time voice conversion

VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output. It provides both a...

Downloads: 12 This Week

Last Update: 2026-03-23
2

Everywhere

Context-aware desktop AI assistant that understands screen content

Everywhere is a context-aware desktop AI assistant designed to interact directly with the content displayed on a user’s screen. It distinguishes itself from traditional AI tools by eliminating the need for manual input methods such as copying text or taking screenshots, instead allowing users to invoke assistance instantly through a shortcut. It can analyze on-screen information in real time and provide contextual responses, making it useful for tasks like troubleshooting errors, summarizing...

Downloads: 7 This Week

Last Update: 4 hours ago
3

ChatTTS

A generative speech model for daily dialogue

ChatTTS is an open-source conversational text-to-speech model optimized for dialogue, developed by 2Noise. Trained on 100,000+ hours of English and Chinese conversation data, it excels at generating expressive prosody—pauses, interjections, laughter—for more natural-sounding speech synthesis in assistant and chatbot applications.

Downloads: 4 This Week

Last Update: 2026-04-10
4

TTS WebUI

A single Gradio + React WebUI with extensions for ACE-Step

TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all necessary dependencies, so users can focus on experimenting with voices instead of managing tooling. ...

Downloads: 2 This Week

Last Update: 2026-04-05
5

ChemCrow

Chemcrow

ChemCrow is an AI-powered framework designed to assist in chemical research and discovery. It integrates AI models with chemical knowledge bases to provide intelligent recommendations for synthesis planning, reaction prediction, and material discovery. This tool helps automate and accelerate research in computational chemistry and drug development.

Downloads: 7 This Week

Last Update: 2025-02-25
6

CowAgent

AI assistant based on large models that can actively think and plan

CowAgent, based on the chatgpt-on-wechat project, is an open-source AI agent framework that integrates large language models into the WeChat ecosystem to create intelligent conversational assistants. It enables automated message handling by connecting WeChat accounts with AI models that can generate contextual replies, process voice messages, and produce images directly inside chats. The platform has evolved beyond a simple chatbot into a more autonomous agent capable of planning complex...

Downloads: 5 This Week

Last Update: 2026-04-14
7

AgentScope

Build and run agents you can see, understand and trust

AgentScope is a production-ready agent framework designed to help developers build, deploy, and scale intelligent agentic applications. It provides essential abstractions that evolve with advancing LLM capabilities, emphasizing reasoning, tool use, and flexible orchestration rather than rigid prompt constraints. With built-in support for ReAct agents, memory, planning, human-in-the-loop control, and real-time voice interaction, developers can create powerful agents in minutes. AgentScope...

Downloads: 0 This Week

Last Update: 2 days ago
8

Podcastfy.ai

Transforming Multimodal Content into Captivating Multilingual Audio

Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos, as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.

Downloads: 5 This Week

Last Update: 2024-11-16
9

LiveKit Agents

Framework for building realtime multimodal voice AI agent apps

LiveKit Agents is an open source framework designed for building realtime AI agents that can participate as programmable entities within communication sessions. It enables developers to create conversational and multimodal agents capable of processing voice, audio, and other inputs in realtime environments. These agents can join LiveKit rooms as participants and interact with users or systems through speech, text, and other modalities. LiveKit Agents provides libraries and tooling that allow...

Downloads: 3 This Week

Last Update: 2 days ago
10

Flowly AI

Flowly is 100x faster than OpenClaw

Flowly is an open-source personal AI assistant that runs locally on your machine and connects to multiple communication platforms like Telegram, WhatsApp, Discord, and Slack. It acts as a centralized AI system that can perform tasks such as web browsing, file management, command execution, scheduling, and more—all while keeping your data private. Designed for flexibility, Flowly supports multiple AI providers and models through LiteLLM, allowing users to customize how their assistant...

Downloads: 5 This Week

Last Update: 2026-03-29
11

Synthetic Data Generator

SDG is a specialized framework

Synthetic Data Generator is an open-source framework designed to generate high-quality synthetic tabular datasets that replicate the statistical characteristics of real data while avoiding privacy risks. The platform enables developers and data scientists to create artificial datasets that preserve important relationships between variables without containing sensitive personal information. This makes the generated data suitable for tasks such as machine learning model training, testing...

Downloads: 1 This Week

Last Update: 2026-03-06
12

PyTorch3D

PyTorch3D is FAIR's library of reusable components for deep learning

PyTorch3D is a comprehensive library for 3D deep learning that brings differentiable rendering, geometric operations, and 3D data structures into the PyTorch ecosystem. It’s designed to make it easy to build and train neural networks that work directly with 3D data such as meshes, point clouds, and implicit surfaces. The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through...

Downloads: 1 This Week

Last Update: 2025-11-27
13

myGPTReader

AI Slack bot for reading, summarizing, and chatting with content

myGPTReader is an AI-powered Slack bot designed to help users read, summarize, and interact with various types of digital content through conversational interfaces. It enables users to quickly understand web pages, documents, and even video content by transforming them into interactive discussions rather than static reading experiences. myGPTReader supports a wide range of file formats, including eBooks, PDFs, and text-based documents, making it flexible for both casual and professional use...

Downloads: 2 This Week

Last Update: 6 days ago
14

MARS5

MARS5 speech model (TTS) from CAMB.AI

MARS5-TTS is CAMB.AI’s open-source English speech model designed for high-quality text-to-speech and voice emulation. It uses a two-stage architecture that combines an autoregressive (AR) model with a non-autoregressive (NAR) model, giving it both expressiveness and speed. The model is built to handle prosodically challenging content such as sports commentary, anime dialogue, and other high-energy or highly varied speech patterns with realistic rhythm and intonation. To control speaker...

Downloads: 0 This Week

Last Update: 2025-11-28
15

OpenAI Agents JS

A lightweight, powerful framework for multi-agent workflows

...The repo includes examples showing how to build agents that call local functions, chain between agents, validate input/output, stream responses, and interact in real time (e.g. voice agents via WebRTC). It also has tracing and debugging support so you can introspect how agents executed their workflows. Because it aligns closely with the Python Agents SDK, it aims for cross-language parity so that JS/TS devs can adopt similar agent architectures.

Downloads: 0 This Week

Last Update: 2 days ago
16

OpenaiBot

Refactoring ChatBot+LLM, Gpt-3.5-turbo, ChatGPT Bot/Voice Assistant

If you don't have the instant messaging platform you need or you want to develop a new application, you are welcome to contribute to this repository. You can develop a new Controller by using Event.py. Compatibility with multiple LLMs and integration with GPT and third-party systems is handled by our llm-kira project on GitHub. It can accurately limit billing, with limits and ID binding. Supports asynchronous operations and can handle multiple requests simultaneously. Allows for private and...

Downloads: 0 This Week

Last Update: 2024-04-29
17

MetaScreener

AI-powered tool for efficient abstract and PDF screening

MetaScreener is an open-source AI-assisted tool designed to streamline the screening process in systematic literature reviews and academic research workflows. The system helps researchers analyze large collections of academic abstracts and research papers to determine which studies are relevant for inclusion in evidence synthesis projects. Instead of manually reviewing hundreds or thousands of documents, researchers can use MetaScreener to apply machine learning techniques that assist with...

Downloads: 0 This Week

Last Update: 2026-03-09
18

Stable Virtual Camera

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up...

Downloads: 0 This Week

Last Update: 2025-03-20
19

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 7 This Week

Last Update: 2025-11-30
20

Harbor LLM

Run a full local LLM stack with one command using Docker

Harbor is an open source, containerized toolkit designed to simplify running local large language model (LLM) environments. It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user...

Downloads: 3 This Week

Last Update: 5 days ago
21

MiniCPM-o

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...

Downloads: 1 This Week

Last Update: 2025-05-15
22

SEO Machine

A specialized Claude Code workspace for creating long-form

SEO Machine is an AI-powered content production system built as a structured workspace for generating long-form, SEO-optimized blog content through automated workflows. It integrates research, writing, analysis, and optimization into a single pipeline, allowing users to produce high-quality articles tailored to search engine performance. The system uses specialized commands and agents to perform tasks such as keyword research, competitor analysis, content drafting, and optimization. It...

Downloads: 0 This Week

Last Update: 2026-04-10
23

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model

PaddleSpeech is an open-source toolkit on the PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-the-art and influential models. Via an easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and the deployment process. With a low barrier to installation, a CLI, Server, and Streaming Server are available to quick-start your journey. We provide...

Downloads: 0 This Week

Last Update: 2025-03-04
24

FAY

Framework for building AI-powered interactive digital humans and agent

Fay is an open source framework designed to build and deploy interactive digital humans powered by large language models. It acts as a middleware layer that connects digital character technologies with conversational AI systems and business applications. Fay supports various types of digital humans, including 2.5D and 3D avatars, and can be integrated with applications running on mobile devices, PCs, web platforms, and embedded systems. Its architecture allows developers to combine different...

Downloads: 2 This Week

Last Update: 5 days ago
25

AI Voice Interface

Downloads: 0 This Week

Last Update: 2026-02-17