Showing 12 open source projects for "ffmpeg"

  • 1
    JavaCV

    Java interface to OpenCV, FFmpeg, and more

    JavaCV uses wrappers from the JavaCPP Presets of libraries commonly used by computer vision researchers (OpenCV, FFmpeg, libdc1394, FlyCapture, Spinnaker, OpenKinect, librealsense, CL PS3 Eye Driver, videoInput, ARToolKitPlus, flandmark, Leptonica, and Tesseract) and provides utility classes that make their functionality easier to use on the Java platform, including Android. JavaCV also comes with hardware-accelerated full-screen image display (CanvasFrame and GLCanvasFrame), easy-to-use methods to execute code in parallel on multiple cores (Parallel), user-friendly geometric and color calibration of cameras and projectors (GeometricCalibrator, ProCamGeometricCalibrator, ProCamColorCalibrator), detection and matching of feature points (ObjectFinder), a set of classes that implement direct image alignment of projector-camera systems (mainly GNImageAligner, ProjectiveTransformer, ProjectiveColorTransformer, ProCamTransformer, and ReflectanceInitializer), and more.
    Downloads: 23 This Week
  • 2
    ChatTTS webUI & API

    A simple native web interface that uses ChatTTS to synthesize text

    ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. From version 0.96 onward, ffmpeg installation is required for deployment, and previous CSV/PT voice tables are no longer valid, so users instead work with updated “voice value” parameters. ...
    Downloads: 10 This Week
  • 3
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    ...It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. It separates client-side media handling from backend AI processing, reducing data exposure while still enabling transcription and document generation. AI-Media2Doc supports flexible customization through prompts, allowing users to tailor output styles based on their needs. ...
    Downloads: 2 This Week
  • 4
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 6 This Week
  • 5
    StemRoller

    Isolate vocals, drums, bass, and other instrumental stems from songs

    StemRoller is the first free app that enables you to separate vocal and instrumental stems from any song with a single click! StemRoller uses Facebook's state-of-the-art Demucs algorithm for demixing songs and integrates search results from YouTube. Simply type the name/artist of any song into the search bar and click the Split button that appears in the results! You'll need to wait several minutes for splitting to complete. Once stems have been extracted, you'll see an Open button next to...
    Downloads: 20 This Week
  • 6
    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 18 This Week
  • 7
    ChatGPT Telegram Bot

    A Telegram bot that integrates with OpenAI's official ChatGPT APIs

    A Telegram bot that integrates with OpenAI's official ChatGPT, DALL·E and Whisper APIs to provide answers. Ready to use with minimal configuration required.
    Downloads: 0 This Week
  • 8
    EasyVoice

    Open source text-to-speech tool, supports extra-long text

    ...Under the hood, easyVoice uses a modern stack with Vue 3 and Element Plus on the front end, Node.js and Express on the back end, and TTS engines such as Microsoft Azure TTS and OpenAI-compatible APIs, orchestrated through ffmpeg.
    Downloads: 1 This Week
  • 9
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 5 This Week
  • 10
    MyBox

    Easy tools for PDF, image, file, network, data, and media

    Tags: javafx-desktop-apps, pdf, image, ocr, icc, barcode, color-palette, text, bytes, markdown, html, archive, compress, digest, video, audio, editor, converter, media. Source: https://github.com/Mararsh/MyBox. Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 0 This Week
  • 11
    DCVGAN

    DCVGAN: Depth Conditional Video Generation, ICIP 2019.

    This paper proposes a new GAN architecture for video generation with depth videos and color videos. The proposed model explicitly uses the depth information in a video sequence as additional input to a GAN-based video generation scheme, so that the model understands scene dynamics more accurately. The model trains on pairs of color and depth videos and generates a video in two steps: generate the depth video to model the scene dynamics based on the geometrical...
    Downloads: 0 This Week
  • 12
    3D ResNets for Action Recognition

    3D ResNets for Action Recognition (CVPR 2018)

    We uploaded the pretrained models described in this paper including ResNet-50 pretrained on the combined dataset with Kinetics-700 and Moments in Time. We significantly updated our scripts. If you want to use older versions to reproduce our CVPR2018 paper, you should use the scripts in the CVPR2018 branch.
    Downloads: 0 This Week