Showing 22 open source projects for "python web crawler"

  • 1
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 4 This Week
  • 2
    Rockstor

    BTRFS based NAS and private cloud storage solution

    ...These Rock-ons, combined with advanced NAS features, turn Rockstor into a private cloud storage solution accessible from anywhere, giving users complete control of cost, ownership, privacy and data security. The Rockstor UI is written in JavaScript, making it simple to manage everything from your web browser. The backend is written in Python and exposes RESTful APIs to easily extend functionality.
    Downloads: 41 This Week
  • 3
    Plum Cave

    A cloud backup solution that employs advanced cryptography

    A cloud backup solution that employs the "ChaCha20 + Serpent-256 CBC + HMAC-SHA3-512" authenticated encryption scheme for data encryption and ML-KEM-1024 for quantum-resistant key exchange. Check it out at https://plum-cave.netlify.app/ GitHub page: https://github.com/Northstrix/plum-cave
    Downloads: 0 This Week
  • 4
    Datahosting ipfs kubo-cluster

    Managed IPFS Kubo pinning with IPFS Cluster replication

    This project provides an open-source setup for IPFS Kubo (public and private pinning) combined with IPFS Cluster replication, designed for reliable, production-ready decentralized storage. It is suitable for developers, infrastructure operators, and Web3 projects that need predictable IPFS availability without maintaining complex node infrastructure. The system supports multiple retention plans for IPFS Kubo pinning, allowing users to choose how long data remains pinned, with transparent...
    Downloads: 0 This Week
  • 5
    bitfarm-Archiv Document Management - DMS
    bitfarm-Archiv is a powerful Document Management (DMS), Enterprise Content Management (ECM) and Knowledge Management System (KMS) with workflow components. Help us! As we live in the internet age, the best way you can help is to write a short statement about your scenario and your use of the DMS, along with your experiences, and post it on your own website or in a blog or forum. It would help us most if you also add a hyperlink to our site http://www.bitfarm-archiv.com. By this...
    Downloads: 11 This Week
  • 6
    Configuration Backup (ConfiBack)

    Project for backing up network device configuration

    Using this project, you can back up the configuration of network devices such as switches and routers and track changes to it.
    Downloads: 3 This Week
  • 7
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without losing any data.
    Downloads: 5 This Week
  • 8
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
  • 9
    Cloud Export is a tool to automatically extract your data from web applications and save it to your local file system for backup purposes, but more extensive than Google Takeout. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
  • 10
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 10 This Week
  • 11
    Backup and restore of files to web mail systems, FTP, and SFTP. Uses the free storage of Gmail, Hotmail, etc. Archives files, splits large files, encrypts and uploads. Requires Python (tested with Python 2.5).
    Downloads: 0 This Week
  • 12
    Mutualized distant storage space management tool (using a distributed system).
    Downloads: 0 This Week
  • 13
    Arrowbase is a collection of tools for backup purposes. Together they form a backup system that can be used on more than one operating system. This makes the project not only widespread but portable as well.
    Downloads: 0 This Week
  • 14
    A configurable knowledge management framework. It works out of the box, but it's meant mainly as a framework to build complex information retrieval and analysis systems. The 3 major components: Crawler, Analyzer and Indexer can also be used separately.
    Downloads: 0 This Week
  • 15
    XSDB XML is to DATA as HTML is to DOCUMENT. Publish and combine data as easily as HTML format and web browsers publish and view documents. Implementations in Python, javascript, java, C#/.NET.
    Downloads: 0 This Week
  • 16
    CAIRN is a modular copy and restore program for the imaging of a computer. It copies every file on a computer and figures out how to recreate it from scratch. It is primarily network oriented but is also flexible enough to boot from any possible method.
    Downloads: 0 This Week
  • 17
    Agile Author is a framework for developing networked repositories of digital information such as digital libraries and content management systems.
    Downloads: 0 This Week
  • 18
    An Open Source fork of the Redhat up2date client and the NRH-up2date server.
    Downloads: 0 This Week
  • 19
    Cat-photo makes administration and web pages with photos easy.
    Downloads: 1 This Week
  • 20
    idyuts is "I Dare You to Use This Shell"; a pre-hibernate approach to replacing an ORM written with jython functors into a pure-Java language command pattern. The "pipeline codegen artifacts" are simple IoC templates, and trivial to adapt
    Downloads: 0 This Week
  • 21
    View, track, filter, archive, alert, group, rotate logs through a GUI, CLI, or WebUI.
    Downloads: 0 This Week
  • 22
    Rescuezilla

    The Swiss Army Knife of System Recovery

    Rescuezilla is an easy-to-use disk cloning and imaging application that's fully compatible with Clonezilla, the industry standard trusted by tens of millions. Yes, Rescuezilla is the Clonezilla GUI (graphical user interface) that you might have been looking for. See https://rescuezilla.com/ for download links. New: weekly rolling release downloads at https://github.com/rescuezilla/rescuezilla/releases. Rescuezilla is a fork of Redo Backup and Recovery (now called Redo...
    Downloads: 0 This Week