Synchronized Translation for Videos
Automated translation solution for visual novels
Comprehensive Gradio WebUI for audio processing
Hunyuan Translation Model Version 1.5
StreamSpeech is a seamless model for offline speech recognition
Machine Learning Systems: Design and Implementation
Robust Speech Recognition via Large-Scale Weak Supervision
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
A Web UI for easy subtitle generation using the Whisper model
AI-powered tool for generating, optimizing, and translating subtitles
Fast multimodal LLM for real-time voice interaction and AI apps
ChatGPT extension for scientific research work
End-to-end speech processing toolkit
Reading book source
Repo of Qwen2-Audio chat & pretrained large audio language model
Synthesizing and manipulating 2048x1024 images with conditional GANs
Automatically translates the text of a video based on a subtitle file
CodeGeeX4-ALL-9B, a versatile model for all AI software development
Framework for building neural networks
The ultimate RAG for your monorepo
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
AI framework for automated short video creation and editing tools
Easy-to-use and high-performance NLP and LLM framework
A nearly-live implementation of OpenAI's Whisper
OCR expert VLM powered by Hunyuan's native multimodal architecture