An extensive node suite that enables ComfyUI to process 3D inputs
Ship AI Agents to Google Cloud in minutes, not months
Synchronized Translation for Videos
Automated translation solution for visual novels
Comprehensive Gradio WebUI for audio processing
Hunyuan Translation Model Version 1.5
StreamSpeech is a seamless model for offline speech recognition
Machine Learning Systems: Design and Implementation
Robust Speech Recognition via Large-Scale Weak Supervision
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
A Web UI for easy subtitle generation using the Whisper model
AI-powered tool for generating, optimizing, and translating subtitles
Fast multimodal LLM for real-time voice interaction and AI apps
ChatGPT extension for scientific research work
End-to-end speech processing toolkit
Reading book source
Repo of Qwen2-Audio chat & pretrained large audio language model
Synthesizing and manipulating 2048x1024 images with conditional GANs
Automatically translates the text of a video based on a subtitle file
Let's make video diffusion practical
CodeGeeX4-ALL-9B, a versatile model for all AI software development
Framework for building neural networks
The ultimate RAG for your monorepo
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
AI framework for automated short video creation and editing tools