A Python library for audio
AudioCraft is a library for audio processing and generation
Multimodal Diffusion with Representation Alignment
Official Python inference and LoRA trainer package
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official repository for LTX-Video
Workflow and speech recognition app
Python inference and LoRA trainer package for the LTX-2 audio–video
Speech-to-text, text-to-speech, and speaker recognition
Spring AI Alibaba examples for building and testing AI apps
Transform your voice in real time with Voxal Voice Changer
Build AI-powered semantic search applications
Customizable AI chat component for websites with API support
The Triton Inference Server provides an optimized cloud
Java app for chatting with a generative AI (using TTS and STT)
elevenlabs-api is an open source Java wrapper around the ElevenLabs
Easy tools for PDF, Image, File, Network, Data, and Media
Integrate with the latest language models, image generation and speech
Common Resource Grep
Beauty can be applied to live broadcasts, short videos, and selfies
AlphaPlayer is a video animation engine
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Speech Recognition System