Showing 53 open source projects for "duplicate"

  • 1
    Mihomo

    A simple Python Pydantic model for Honkai

    Mihomo is a Python client library leveraging Pydantic to model parsed Honkai: Star Rail user data from the Mihomo public API. It provides structured types, type hints, and convenience methods to fetch and transform player profiles, daily stats, and character details efficiently.
    Downloads: 194 This Week
  • 2
    ...To choose the path to the image directory, save the path in the 'settings.txt' file. The program can then be run with: -an, -mnb, -c, -i. GitHub link: https://github.com/Duke-Axer/Duplicate-Finder. Please send all questions to b.gabka.nkn@gmail.com
    Downloads: 0 This Week
  • 3
    supabase-py

    Python Client for Supabase. Query Postgres from Flask, Django

    Python Client for Supabase. Query Postgres from Flask, Django, FastAPI. Python user authentication, security policies, edge functions, file storage, and realtime data streaming. Good first issue.
    Downloads: 4 This Week
  • 4
    SortPhotos

    SortPhotos is a Python script that organizes photos and videos

    ...SortPhotos includes options for copying versus moving files, recursive searches, silent or test modes, and customizable start times for when a “day” begins. It also prevents duplicate files by comparing content, with an option to keep duplicates if needed. With support for automation through launch agents or cron jobs, SortPhotos is well-suited for photographers, archivists, and anyone looking to streamline large personal or professional media collections.
    Downloads: 3 This Week
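    The content comparison mentioned in this description can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not SortPhotos's actual code; the helper names `same_content` and `demo` are hypothetical.

```python
import filecmp
import os
import tempfile


def same_content(path_a, path_b):
    """Return True when both files hold identical bytes (hypothetical helper)."""
    # Cheap check first: files of different sizes cannot be identical.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # shallow=False forces a byte-by-byte comparison instead of trusting os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)


def demo():
    """Write three small files and compare the first against the other two."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "a.jpg")
        copy = os.path.join(tmp, "b.jpg")
        other = os.path.join(tmp, "c.jpg")
        samples = ((original, b"pixels"), (copy, b"pixels"), (other, b"pixelz"))
        for path, data in samples:
            with open(path, "wb") as handle:
                handle.write(data)
        return same_content(original, copy), same_content(original, other)


if __name__ == "__main__":
    print(demo())
```

    Comparing sizes before bytes keeps the common case (clearly different files) cheap; a tool handling very large collections might instead cache a hash per file.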
  • 5
    DefectDojo

    DefectDojo is a DevSecOps and vulnerability management tool

    DefectDojo is a security orchestration and vulnerability management platform. DefectDojo allows you to manage your application security program, maintain product and application information, triage vulnerabilities and push findings to systems like JIRA and Slack. DefectDojo enriches and refines vulnerability data using a number of heuristic algorithms that improve with the more you use the platform. DefectDojo integrates with 85+ security tools. DefectDojo has bi-directional integration with...
    Downloads: 16 This Week
  • 6
    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    ...By indexing file metadata from sources such as local file systems, network shares like NFS and SMB, and cloud storage, the tool provides a centralized way to analyze heterogeneous storage environments. Diskover also helps identify outdated or unused files, duplicate data, and inefficient storage usage that can waste resources or increase operational costs. A Python-based indexing engine performs the scanning and indexing tasks.
    Downloads: 0 This Week
  • 7
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    Video hard subtitle extraction, generate srt file. There is no need to apply for a third-party API, and text recognition can be implemented locally. A deep learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu...
    Downloads: 68 This Week
  • 8
    ydata-profiling

    Create HTML profiling reports from pandas DataFrame objects

    ydata-profiling's primary goal is to provide a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame while allowing the analysis to be exported in formats such as HTML and JSON.
    Downloads: 12 This Week
  • 9
    Meta Package Manager

    Wraps all package managers with a unifying CLI

    Meta Package Manager (MPM) wraps all package managers behind a unifying CLI. MPM is like yt-dlp, but for package managers instead of videos. MPM solves XKCD #1654 - Universal Install Script. List installed packages. List duplicate installed packages. Search for packages. Install a package, remove a package, and list outdated packages. Sync local package info. Upgrade all outdated packages. Back up the list of installed packages to a TOML file. Restore/install a list of packages from TOML files. Pin-point commands to a subset of package managers (include/exclude selectors). ...
    Downloads: 40 This Week
  • 10
    FramePack

    Let's make video diffusion practical

    FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking steps, making it straightforward to integrate into preprocessing pipelines. ...
    Downloads: 13 This Week
  • 11
    Pylint

    It's not just a linter that annoys you!

    Pylint is a static code analyzer for Python 2 or 3. The latest version supports Python 3.7.2 and above. Pylint analyses your code without actually running it. It checks for errors, enforces a coding standard, looks for code smells, and can make suggestions about how the code could be refactored. Projects that you might want to use alongside pylint include flake8 (faster and simpler checks with very few false positives), mypy, pyright or pyre (typing checks), bandit (security-oriented...
    Downloads: 19 This Week
  • 12
    SuperDuperDB

    Integrate, train and manage any AI models and APIs with your database

    ...A single scalable deployment of all your AI models and APIs which is automatically kept up-to-date as new data is processed immediately. No need to introduce an additional database and duplicate your data to use vector search and build on top of it. SuperDuperDB enables vector search in your existing database. Integrate and combine models from Sklearn, PyTorch, HuggingFace with AI APIs such as OpenAI to build even the most complex AI applications and workflows. Train models on your data in your datastore simply by querying without additional ingestion and pre-processing.
    Downloads: 7 This Week
  • 13
    tumblr-crawler

    Python crawler to download photos and videos from Tumblr blogs

    tumblr-crawler is an open source Python-based utility designed to download media content from Tumblr blogs. It provides a script that automatically retrieves photos and videos from specified Tumblr sites and saves them locally for offline access. Users can specify one or multiple blogs to crawl by editing a configuration file or by passing parameters through the command line. Once executed, the script fetches media from the Tumblr API and stores the downloaded files in folders named after...
    Downloads: 2 This Week
  • 14
    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 1 This Week
  • 15
    Swirl

    Swirl queries any number of data sources with APIs

    Swirl queries any number of data sources with APIs and uses spaCy and NLTK to re-rank the unified results without extracting and indexing anything! Includes zero-code configs for Apache Solr, ChatGPT, Elastic Search, OpenSearch, PostgreSQL, Google BigQuery, RequestsGet, Google PSE, NLResearch.com, Miro & more! SWIRL adapts and distributes queries to anything with a search API - search engines, databases, noSQL engines, cloud/SaaS services etc - and uses AI (Large Language Models) to re-rank...
    Downloads: 0 This Week
  • 16

    littleutils

    Various small and useful command-line utilities

    The littleutils include duplicate file finders (repeats, repeats.pl, repeats.py), image optimizers (opt-jpg, opt-png, opt-gif, recomp-jpg), file rename tools (lowercase, uppercase, pren), archive recompressors (to-gzip, to-bzip2, to-bzip3, to-7zip, to-lzma, to-lzip, to-xz), a tempfile utility (tempname), file property tools (filedate, filemode, filenode, fileown, filesize, and lrealpath), and others.
    Downloads: 10 This Week
  • 17
    The Timeline Project

    Cross-platform app for displaying and navigating events on a timeline.

    The Timeline Project aims to create a free, cross-platform application for displaying and navigating events on a timeline.
    Downloads: 145 This Week
  • 18
    Web Link Collector 1000

    Automatically collect all links from websites to a clean txt file

    .... Features:
    - Two Collection Modes: Single page or multiple pages of specific website section, or even the entire domain!
    - Smart Filtering: Include only same-domain links or gather external links too
    - Duplicate Prevention: Automatically removes duplicate links
    - Website-Friendly: Uses respectful delays between requests
    - Custom File Naming: Save your collections with custom meaningful names
    - Modern Interface: Clean design with status updates
    - Link Normalization: Standardizes URLs for proper formatting
    Downloads: 0 This Week
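    The link normalization and duplicate prevention features listed for this project could look something like the sketch below. It is an assumption-laden illustration built on the standard `urllib.parse` module, not the project's own code; the functions `normalize` and `unique_links` and the specific normalization rules (lowercase scheme and host, drop the fragment, trim trailing slashes) are hypothetical choices.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to a canonical form (hypothetical rules, see lead-in)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/docs/" and "/docs" as the same page; keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Fragments ("#section") point inside a page, so they are dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def unique_links(urls):
    """Return normalized links in first-seen order without duplicates."""
    seen = set()
    result = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


if __name__ == "__main__":
    links = [
        "https://Example.com/docs/",
        "https://example.com/docs#intro",
        "https://example.com/docs?page=2",
    ]
    for link in unique_links(links):
        print(link)
```

    Query strings are kept because they often select different content; a stricter collector might also sort or filter query parameters.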
  • 19

    SortMyShit

    A tool designed to help you organize and manage your files

    SortMyShit is an open-source Python project designed to help you organize and manage your files effortlessly. It provides customizable sorting rules to keep your directories clean and structured.
    Downloads: 0 This Week
  • 20
    MiniMonitor

    Lightweight capture card & mic monitoring with minimal resources

    MiniMonitor is a lightweight Windows application designed for monitoring capture cards and microphones with minimal system impact. Ideal for Elgato and other capture devices, it detects connected video and audio inputs, tests microphones, and provides real-time video and audio playback. Users can select devices through a simple GUI, toggle Fullscreen display, and quickly check functionality without heavy software overhead. Built with Python, OpenCV, PyAudio, and Tkinter, MiniMonitor is...
    Downloads: 3 This Week
  • 21
    MediaCrate — Video/Audio Downloader

    Download video and audio from 1,000+ websites with one click

    MediaCrate is a lightweight desktop application for downloading video and audio from various websites, including YouTube, Instagram, TikTok, Facebook and many others. It's rather simple to use. Paste a link, select format and quality, and download. MediaCrate is designed with performance and simplicity in mind, maintaining minimal CPU usage while idle and a small memory footprint during downloads. Project links: Website: justagwas.com/projects/mediacrate GitHub:...
    Downloads: 9 This Week
  • 22
    TextureAtlas Toolbox

    A powerful, free and open-source tool for TextureAtlases/Spritesheets

    TextureAtlas Toolbox is an all-in-one solution for working with texture atlases and sprite sheets. Extract sprites into organized frame collections and GIF/WebP/APNG animations, generate optimized atlases from individual frames, or convert between 15+ atlas formats. Perfect for game developers, modders, and anyone creating showcases of game sprites. Formerly known as TextureAtlas to GIFs and Frames Licensed under AGPL-3.0 Third-party licenses: See...
    Downloads: 53 This Week
  • 23
    garysfm

    An advanced file manager with qss themes and iso and folder previews

    garysfm, which stands for Gary's File Manager, is a file manager with some advanced features. Those features include bulk renaming and folder image previews. It has rather advanced search functions and tab browsing with persistence between launches. It remembers your folder sorting and view options in icon view. It also remembers your active tabs between sessions. It shows a progress dialog during large operations such as copying large files or folders with many files. The Python version works on...
    Downloads: 15 This Week
  • 24
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible deduplication strategies, making it ideal for cleaning web-scraped data, language model training datasets, or document archives.
    Downloads: 0 This Week
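    To make the MinHash and Jaccard thresholding idea in this description concrete, here is a small standard-library sketch. It does not use text-dedup's API; the function names, the 3-word shingles, the 64 hash functions, and the 0.7 threshold are all assumptions chosen for illustration, and the all-pairs comparison would not scale the way an LSH-based pipeline does.

```python
import hashlib


def shingles(text, k=3):
    """Split text into overlapping k-word shingles (hypothetical helper)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_perm=64):
    """Keep the minimum hash of the set under num_perm salted hash functions."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def near_duplicate_pairs(docs, threshold=0.7):
    """Return index pairs whose estimated similarity meets the threshold."""
    sigs = [minhash_signature(shingles(doc)) for doc in docs]
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if estimated_jaccard(sigs[i], sigs[j]) >= threshold
    ]
```

    Raising the number of hash functions tightens the similarity estimate at the cost of more hashing per document.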
  • 25
    DuplicaErase
    Introducing a lightweight duplicate file remover tool for Windows. With this software, you can efficiently identify duplicate files, determine the amount of storage occupied by these files, and seamlessly remove them. Free up valuable disk space by eliminating redundant copies and optimizing your storage usage. Some antivirus programs may show a false virus alert; ignore it. This tool deletes files, which is probably why they flag it as a virus.
    Downloads: 1 This Week