Showing 253 open source projects for "data"

  • 1
    Changelog CI

    Changelog CI is a GitHub Action that enables a project

    ...First, it tries to get the latest release from the repository (if available). Then, it checks all the pull requests/commits merged after the last release using the GitHub API. After that, it parses the data and generates the changelog. It can generate the changelog in either Markdown or reStructuredText. Finally, it writes the generated changelog at the beginning of the CHANGELOG.md/CHANGELOG.rst (or user-provided filename) file. In addition, if a user provides a configuration file (JSON/YAML), Changelog CI parses it and renders the changelog according to the user's configuration.
    Downloads: 0 This Week
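    The final step described above, writing a new entry at the beginning of an existing changelog file, can be sketched as follows. This is a minimal illustration only, not Changelog CI's own code: the function names `render_markdown_entry` and `prepend_changelog` and the heading format are assumptions made for the example.

```python
from pathlib import Path


def render_markdown_entry(version: str, titles: list[str]) -> str:
    """Render a release heading followed by one bullet per merged change.

    The heading format here is an assumption for illustration.
    """
    lines = [f"# Version: {version}", ""]
    lines += [f"* {title}" for title in titles]
    return "\n".join(lines) + "\n"


def prepend_changelog(path: Path, new_entry: str) -> str:
    """Write new_entry at the top of the file, keeping older entries below it."""
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    # Separate the new entry from older entries with one blank line.
    combined = new_entry.rstrip("\n") + "\n"
    if existing:
        combined += "\n" + existing
    path.write_text(combined, encoding="utf-8")
    return combined
```

    Calling `prepend_changelog(Path("CHANGELOG.md"), render_markdown_entry("1.1.0", ["Add feature"]))` would place the new release above any entries already in the file.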
  • 2
    Trafilatura

    Python & command-line tool to gather text on the Web

    ...Going from raw HTML to essential parts can alleviate many problems related to text quality, first by avoiding the noise caused by recurring elements (headers, footers, links/blogroll etc.) and second by including information such as author and date in order to make sense of the data. The extractor tries to strike a balance between limiting noise (precision) and including all valid parts (recall). It also has to be robust and reasonably fast: it runs in production on millions of documents.
    Downloads: 0 This Week
  • 3
    autocrawler

    Multiprocess Selenium crawler for downloading images by keywords

    AutoCrawler is a Python-based image crawling tool designed to automatically download large numbers of images from search engines using automated browser interaction. It uses Selenium and a Chrome browser driver to navigate image search pages and collect image sources based on keywords provided by the user. AutoCrawler supports multiprocess and multithreaded downloading, which allows it to retrieve images faster by running several tasks simultaneously. Users provide search terms through a...
    Downloads: 1 This Week
  • 4
    Echo HTML Viewer

    Fast offline HTML viewer for opening local HTML files on Windows

    Echo HTML Viewer is a lightweight desktop app for viewing local HTML files without a browser or internet connection. Designed for simplicity and privacy, it lets you open saved web pages, documentation, and archived content in a clean, distraction-free interface. Key features: • Open HTML files instantly • Drag & drop support • Fast startup and low resource usage • Fully offline — no telemetry, no tracking • No background services Use cases: • View saved websites...
    Downloads: 60 This Week
  • 5
    apache-logs-to-mysql

    Apache Log Parser and Data Normalization Application

    Apache Log Parser and Data Normalization Application. Python handles file processing and MySQL handles data processing. ApacheLogs2MySQL consists of two Python modules and one MySQL schema that automate importing access and error log files and normalizing the data into a database designed for reports and data analysis. Runs on Windows, Linux and macOS; tested with MySQL versions 8.0.39, 8.4.3, 9.0.0 and 9.1.0. 4 LogFormats and 2 ErrorLogFormats can be loaded, and 5 MySQL stored procedures can be processed, in a single execution of the Python `ProcessLogs` function. ...
    Downloads: 0 This Week
  • 6
    TOMUSS

    TOMUSS: The Online Multi User Simple Spreadsheet

    TOMUSS is an interactive web application (groupware) allowing multiple concurrent users to edit data tables. Its primary goal is the management of students' grades.
    Downloads: 2 This Week
  • 7
    Plum Cave Twofish

    A version of Plum Cave that uses the ChaCha20 and Twofish ciphers

    A version of Plum Cave that employs the "ChaCha20 + Twofish-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange.
    Downloads: 0 This Week
  • 8
    Plum Cave

    A cloud backup solution that employs advanced cryptography

    A cloud backup solution that employs the "ChaCha20 + Serpent-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange. Check it out at https://plum-cave.netlify.app/ GitHub page: https://github.com/Northstrix/plum-cave
    Downloads: 0 This Week
  • 9
    fojecleno_downloader

    Free video downloader, simple, fast, bilingual and ad-free

    fojecleno_downloader is a simple, fast, and completely free Windows application for downloading royalty-free videos from numerous platforms. No ads, no data collection, no complications—just a clear, efficient, and accessible tool for everyone. Main features: - Responsible one-click video downloading - Bilingual interface (French/English) - Automatic pasting from the clipboard - Choice of download folder - Activity log - Light/dark mode - Zoom and smooth scrolling - Collapsible sections - Portable version (no installation required) - Installable version with shortcuts and clean uninstallation - Compatible with Windows 10 and 11
    Downloads: 1 This Week
  • 10
    S3cmd

    Command line tool for managing Amazon S3 and CloudFront services

    S3cmd (s3cmd) is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage or DreamHost DreamObjects. It is best suited for power users who are familiar with command-line programs. It is also ideal for batch scripts and automated backup to S3, triggered from cron, etc. S3cmd is written in Python.
    Downloads: 2 This Week
  • 11
    bilili

    Command-line Bilibili video and danmaku downloader with batch support

    ...It focuses on enabling users to retrieve user-uploaded videos as well as serialized content such as bangumi episodes directly from the terminal environment. It provides automated downloading capabilities that handle video streams and associated data efficiently while minimizing manual interaction. bilili supports retrieving both the video files and danmaku comments, which are the scrolling overlay comments commonly associated with the platform’s videos. These danmaku comments can be automatically converted into ASS subtitle format for playback compatibility with media players. bilili also implements multi-threaded and segmented downloading techniques to improve download performance and reliability. ...
    Downloads: 0 This Week
  • 12
    Python Laboratory Operations Toolkit

    Many useful snippets for using Python in a laboratory

    A toolkit of Python software useful in a laboratory data acquisition and analysis environment. Includes support for such protocols as VXI-11 (and its extension, LXI), Vernier LabPro (now very old), and National Instruments DSTP (now very old). Also includes data analysis and modelling tidbits. Python 3 updates for the biggest packages are on the way in the very near future. The vxi11 package is fully up to date, although see the blog post about Python 3.13.
    Downloads: 0 This Week
  • 13
    dirhunt

    Web crawler that finds hidden web directories without brute force

    Dirhunt is an open source security tool designed to discover web directories and analyze website structures without relying on brute-force techniques. Instead of sending large numbers of guess-based requests, it operates as a specialized crawler that intelligently explores websites to identify accessible or hidden directories. Dirhunt can detect directories that expose “Index Of” listings, which may reveal files and other resources that were not intended to be publicly visible. It can also...
    Downloads: 5 This Week
  • 14
    Crawlab

    Distributed web crawler admin platform for spiders management

    ...Tasks are scheduled by the task scheduler module in the master node, and received by the task handler module in worker nodes, which executes these tasks in task runners. Task runners are actually processes running spider or crawler programs, and can also send data through gRPC (integrated in SDK) to other data sources, e.g. MongoDB.
    Downloads: 5 This Week
  • 15
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 1 This Week
  • 16
    Cinemagoer

    Python package to retrieve and manage data of the IMDb

    Cinemagoer is a Python package for retrieving and managing IMDb movie database data about movies, people, characters and companies. It is platform-independent and can retrieve data from both IMDb's web server and a local copy of the whole database.
    Downloads: 22 This Week
  • 17
    Security Log Generator

    Generates logs of typical formats that would often be found in a SOC

    ...As of 31st January 2023, it supports IDS, Web Access and Endpoint log formats. It can generate a specific number of events in a linear fashion or use a waveform to add 'bumpiness' to your data. The code is modular and extensible; additional formats can be added with relative ease.
    Downloads: 2 This Week
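    The difference between linear generation and waveform-shaped generation can be sketched as below. This is an illustrative guess at the idea, not the project's code: the function names, the choice of a sine wave, and the `base`, `amplitude` and `period` parameters are all assumptions.

```python
import math


def linear_counts(intervals: int, base: int) -> list[int]:
    """Linear mode: the same number of events in every interval."""
    return [base] * intervals


def waveform_counts(intervals: int, base: int, amplitude: int, period: int) -> list[int]:
    """Waveform mode: event counts rise and fall around base along a sine wave."""
    counts = []
    for i in range(intervals):
        wave = math.sin(2 * math.pi * i / period)
        # Clamp at zero so a large amplitude never yields a negative count.
        counts.append(max(0, round(base + amplitude * wave)))
    return counts
```

    With `base=10`, `amplitude=5` and `period=8`, the counts peak a quarter of the way through each period and dip three quarters of the way through, which gives generated logs busy and quiet stretches instead of a flat rate.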
  • 18
    DDPM-CD

    Remote sensing change detection using denoising diffusion models

    This is the PyTorch implementation of Remote Sensing Change Detection using Denoising Diffusion Probabilistic Models. The generated images contain objects that we commonly see in real remote sensing images, such as buildings, trees, roads, vegetation, water surfaces, etc., demonstrating the powerful ability of the diffusion models to extract key semantics that can be further used in remote sensing change detection. We fine-tune a lightweight change detection head which takes multi-level...
    Downloads: 10 This Week
  • 19
    DecryptLogin

    Python library providing APIs for automated website login workflows

    ...DecryptLogin supports a wide variety of online services and platforms, including social media sites, developer platforms, cloud services, and other web portals. Developers can integrate these login routines into automation scripts, crawlers, or data collection tools that require authenticated sessions. It also provides example utilities and automation scripts demonstrating how the login APIs can be used in practical scenarios.
    Downloads: 0 This Week
  • 20
    AutoScraper

    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

    This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.
    Downloads: 0 This Week
  • 21
    mlscraper

    ML-based HTML scraper that learns extraction rules from examples

    mlscraper is a Python library designed to automatically extract structured data from HTML pages without requiring developers to manually write CSS selectors or XPath rules. Instead of defining extraction logic by hand, users provide a few examples of the data they want to retrieve from a webpage. It analyzes those examples within the HTML document and determines patterns or rules that can be used to extract the same type of information from similar pages.
    Downloads: 1 This Week
  • 22
    pspider

    Simple Python framework for building multithreaded web crawlers

    ...It focuses on providing an easy-to-understand architecture while still supporting concurrent crawling for improved performance. It uses a multithreaded model that separates the crawling workflow into several components responsible for fetching, parsing, and saving data. Tasks are managed through queues, allowing different parts of the crawler to process work asynchronously and efficiently. PSpider defines a set of modules and utility classes that help developers manage crawling tasks, filter URLs, and process scraped content. By organizing crawling tasks into structured stages, PSpider allows developers to build scalable spiders while keeping the codebase relatively compact and readable. ...
    Downloads: 1 This Week
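    The fetch, parse and save stages connected by queues can be sketched with the standard library alone. This is a generic illustration of the pattern, not PSpider's actual API: `stage`, `run_pipeline` and the `STOP` sentinel are hypothetical names, and the sketch uses one thread per stage for simplicity.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


def stage(worker, inbox, outbox):
    """Apply worker to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            # Forward the shutdown signal so the next stage also exits.
            if outbox is not None:
                outbox.put(STOP)
            break
        result = worker(item)
        if outbox is not None:
            outbox.put(result)


def run_pipeline(urls, fetch, parse, save):
    """Run fetch -> parse -> save, each stage in its own thread."""
    fetch_q, parse_q, save_q = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=stage, args=(fetch, fetch_q, parse_q)),
        threading.Thread(target=stage, args=(parse, parse_q, save_q)),
        threading.Thread(target=stage, args=(save, save_q, None)),
    ]
    for t in threads:
        t.start()
    for url in urls:
        fetch_q.put(url)
    fetch_q.put(STOP)
    for t in threads:
        t.join()
```

    The caller supplies the three callables, so the pipeline can be tried with stub functions before any real network fetching is plugged in. Because each stage reads from its own queue, a slow stage does not block the others from working on items already queued.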
  • 23
    Ungoogled Chromium Android

    Android build for ungoogled-chromium

    ...The goal is to offer an Android browser that feels familiar in capability and rendering fidelity but does not phone home, engage with proprietary APIs, or leak usage data to third-party providers. Because Android’s ecosystem is heavily tied to Google’s app services, this effort focuses on ensuring core browsing functionality, tab management, extension support (as available), and performance optimizations.
    Downloads: 79 This Week
  • 24
    CC-attack

    Using SOCKS4/5 or HTTP proxies to make a multithreaded HTTP flood

    Uses SOCKS4/5 or HTTP proxies to make a multithreaded HTTP-flood/HTTPS-flood (CC) attack.
    Downloads: 4 This Week
  • 25
    Scylla

    Intelligent proxy pool for collecting and managing public proxies

    ...In addition to the API, the system provides a web-based interface where users can view available proxies and monitor their global distribution through a visual dashboard. It is commonly used by developers who need scalable proxy management when gathering data from the internet or building datasets for machine learning.
    Downloads: 10 This Week