Showing 664 open source projects for "python web crawler"

  • 1
    List of independent blogs in Chinese

    List of independent blogs in Chinese is a curated open repository that aggregates and maintains a large list of independent Chinese-language blogs across technology, design, and personal knowledge domains. The project aims to promote the independent blogging ecosystem by making it easier for readers to discover high-quality personal sites outside major content platforms. It is community-driven, allowing contributors to submit and update blog entries so the directory remains current and...
    Downloads: 2 This Week
  • 2
    Maltrail

    Malicious traffic detection system

    Maltrail is a malicious traffic detection system that uses publicly available (black)lists containing malicious and/or generally suspicious trails, along with static trails compiled from various AV reports and custom user-defined lists. A trail can be anything from a domain name, URL, or IP address (e.g. 185.130.5.231 for the known attacker) to an HTTP User-Agent header value (e.g. sqlmap for automatic SQL injection and database takeover tool). Also, it uses (optional) advanced heuristic...
    Downloads: 6 This Week
  • 3
    OSRFramework

    OSRFramework, the Open Sources Research Framework, is an AGPLv3+ project

    OSRFramework is a GNU AGPLv3+ set of libraries developed by i3visio to perform Open Source Intelligence collection tasks. They include references to a bunch of different applications related to username checking, DNS lookups, information leaks research, deep web search, regular expressions extraction and many others. At the same time, by means of ad-hoc Maltego transforms, OSRFramework provides a way of making these queries graphically as well as several interfaces to interact with like...
    Downloads: 3 This Week
  • 4
    Weblate

    Web based localization tool with tight version control integration

    Weblate is a copylefted libre software web-based continuous localization system, used by over 2,500 libre projects and companies in more than 165 countries. Hosted service and standalone tool with tight version control integration. Simple and clean user interface, propagation of translations across components, quality checks and automatic linking to source files. There is infrastructure...
    Downloads: 3 This Week
  • 5
    AutoPkg

    Automating packaging and software distribution on macOS

    AutoPkg is a system that automatically prepares software for distribution to managed clients. Recipes allow you to specify a series of simple actions which combined together can perform complex tasks, similar to Automator workflows or Unix pipes.
    Downloads: 2 This Week
  • 6
    GPT All Star

    AI-powered code generation tool for scratch development of web apps

    AI-powered code generation tool for scratch development of web applications with a team collaboration of autonomous AI agents. This is a research project, and its primary value is to explore the possibility of autonomous AI agents.
    Downloads: 0 This Week
  • 7
    Slack Machine

    A simple, yet powerful and extendable Slack bot

    Slack Machine is a simple, yet powerful and extendable Slack bot framework. More than just a bot, Slack Machine is a framework that helps you develop your Slack workspace into a ChatOps powerhouse. Slack Machine is built with an intuitive plugin system that lets you build bots quickly but also allows for easy code organization.
    Downloads: 6 This Week
  • 8
    Mezzanine

    CMS framework for Django

    Mezzanine is a powerful open source content management platform built using the Django framework. In many ways it is like many other content management tools, offering an intuitive interface for managing all of your content. But Mezzanine is different in that it provides most of its functionality by default. While other platforms rely heavily on modules or reusable applications, Mezzanine comes ready with all the functionality you need, making it the more efficient choice. Mezzanine has a...
    Downloads: 6 This Week
  • 9
    Healthchecks

    A cron monitoring tool written in Python & Django

    We notify you when your nightly backups, weekly reports, cron jobs, and scheduled tasks don't run on time. Healthchecks is a cron job monitoring service. It listens for HTTP requests and email messages ("pings") from your cron jobs and scheduled tasks ("checks"). When a ping does not arrive on time, Healthchecks sends out alerts. Healthchecks comes with a web dashboard, API, 25+ integrations for delivering notifications, monthly email reports, WebAuthn 2FA support, and team management...
    Downloads: 6 This Week
  • 10
    SeleniumBase

    A framework for browser automation and testing with Selenium

    SeleniumBase automatically handles common WebDriver actions such as launching web browsers before tests, saving screenshots during failures, and closing web browsers after tests. SeleniumBase lets you customize test runs from the command line. SeleniumBase uses simple syntax for commands. pytest includes automatic test discovery. If you don't specify a specific file or folder to run, pytest will automatically search through all subdirectories for tests to run. No More Flaky Tests!...
    Downloads: 0 This Week
  • 11
    Awesome Free ChatGPT

    List of free ChatGPT mirror sites, continuously updated

    This is a curated, continually updated directory of freely accessible ChatGPT-style services and mirror sites that offer AI chatbot interfaces without login or payment requirements. Resources often support multiple models such as GPT-4, Claude, and Gemini, and some include image upload and drawing capabilities. Entries are collected from multiple independent sites and carry descriptions and tags.
    Downloads: 1 This Week
  • 12
    cheat.sh

    The only cheat sheet you need

    cheat.sh is a compact, network-accessible cheat-sheet service that serves concise examples and usage notes for hundreds of shell commands, programming languages, and tools via a simple HTTP interface. You can query it from the terminal (for example curl cht.sh/rsync or curl cheat.sh/ls) or browse the web front page; it also supports a shorthand hostname (cht.sh) and provides both online and standalone/local installation modes. The repository contains the server and client code, instructions to run a local standalone instance (including Python virtualenv setup), and tooling to fetch or maintain the upstream cheat-sheet data; installation documentation explains disk-space needs and dependency setup for offline use. ...
    Downloads: 1 This Week
  • 13
    Flask-Caching

    A caching extension for Flask

    Flask-Caching is an extension to Flask that adds caching support for various backends to any Flask application. By running on top of cachelib it supports all of werkzeug’s original caching backends through a uniform API. It is also possible to develop your own caching backend by subclassing the flask_caching.backends.base.BaseCache class. Flask’s pluggable view classes are also supported. To cache them, use the same cached() decorator on the dispatch_request method. Using the same @cached...
    Downloads: 0 This Week
  • 14
    Google CTF

    Google CTF is the public repository that houses most of the challenges from Google’s Capture-the-Flag competitions since 2017 and the infrastructure used to run them. It’s a learning and practice archive: competitors and educators can replay tasks across categories like pwn, reversing, crypto, web, sandboxing, and forensics. The code and binaries intentionally contain vulnerabilities—by design—so users can explore exploit chains and patching in realistic settings. The repo also includes...
    Downloads: 1 This Week
  • 15
    Flask App Builder

    Simple and rapid application development framework

    Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google Charts and much more. Automatic permissions lookup, based on exposed methods. Inserts into the database all the detailed permissions possible in your application. Public (no authentication needed) and private permissions. Role-based permissions. Authentication support for OpenID, Database and LDAP. Support for self-user registration. Automatic,...
    Downloads: 1 This Week
  • 16
    Wagtail

    A Django content management system focused on flexibility & UX

    Wagtail is a powerful, open source content management system that’s focused on flexibility and user experience. Built on Django, Wagtail offers precise control and flexibility for designers, developers and editors. Designed by developers for developers, Wagtail plays nicely with everything else in your tech stack so you can do more and focus on perfecting your site. Designers will find Wagtail’s simple templating system ideal for building beautiful websites just the way they want, without...
    Downloads: 0 This Week
  • 17
    aws-cli

    Universal Command Line Interface for Amazon Web Services

    The AWS CLI is the universal command-line interface for managing AWS services, automating tasks, and scripting cloud workflows. It exposes nearly every public API from EC2 and S3 to IAM, Lambda, and beyond, providing parity with the service SDKs in a tool you can run anywhere. Profiles, regions, single-sign-on, and credential helpers make it straightforward to switch contexts securely across accounts and environments. Its output controls and JMESPath querying let you slice, filter, and...
    Downloads: 7 This Week
  • 18
    react2shell-scanner

    High Fidelity Detection Mechanism for RSC/Next.js RCE

    react2shell-scanner is a security-oriented tool that bridges modern JavaScript (React) applications and shell scripting by auditing web front-ends for exposed interfaces that could be manipulated or controlled through command execution. It scans React codebases, identifies places where user input interacts with shell-executable contexts, and flags risky patterns that might lead to command injection, unvalidated arguments, or unsafe bindings between UI controls and underlying system actions....
    Downloads: 0 This Week
  • 19
    Apprise

    Apprise - Push Notifications that work with just about every platform!

    Take advantage of Apprise through your network with a user-friendly API. Apprise API was designed to easily fit into existing (and new) eco-systems that are looking for a simple notification solution. There is a small built-in Configuration Manager that can be optionally accessed through your web browser allowing you to create and save as many configurations as you'd like. Each configuration is differentiated by a unique {KEY} that you decide on. Once you've saved your configuration, you'll...
    Downloads: 2 This Week
  • 20
    wger

    Self hosted FLOSS fitness/workout, nutrition and weight tracker

    wger Workout Manager is a free and open web application that manages your exercises, routines and nutrition. It started out as a personal project to replace my growing collection of spreadsheets but has turned into something that other people may find useful. You can create and manage flexible training routines for whatever goals you have. Select exactly what exercises you are going to do and how many repetitions, time or distance you want to do. You can also combine different workouts in...
    Downloads: 8 This Week
  • 21
    Tencent Cloud Code Analysis

    Static code analysis

    Tencent Cloud Code Analysis (TCA for short, known internally by the R&D code name CodeDog) is a cloud-native, distributed, high-performance code analysis and tracking platform made up of three components: server, web, and client. It integrates a number of self-developed tools and also supports dynamic integration of analysis tools for various programming languages. Obtain the Tencent Cloud code analysis platform by...
    Downloads: 0 This Week
  • 22
    CineCLI

    CineCLI is a cross-platform command-line movie browser

    CineCLI is a command-line utility designed to help movie lovers quickly browse, search, and access film information from the terminal without needing a graphical interface. It connects to popular online movie databases to fetch metadata such as titles, release dates, ratings, genres, casts, posters, and plot summaries, presenting all of that in a concise, text-friendly format suitable for terminals or scripts. Users can search by keyword, year, or exact title and then drill into detailed...
    Downloads: 2 This Week
  • 23
    Shumai

    Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

    ...Built on Bun and Flashlight, with ArrayFire as its numerical backend, Shumai brings GPU-accelerated tensor operations, automatic differentiation, and scientific computing tools directly to JavaScript developers. It allows seamless integration of machine learning, deep learning, and custom differentiable programs into web-based or server-side environments without relying on Python frameworks. The library supports matrix operations, gradient computation, and tensor conversions with intuitive APIs and near-native speed, thanks to Bun’s low-overhead FFI bindings. It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.
    Downloads: 0 This Week
  • 24
    Graph Notebook

    Library extending Jupyter notebooks to integrate with Apache TinkerPop

    The graph notebook provides an easy way to interact with graph databases using Jupyter notebooks. Using this open-source Python package, you can connect to any graph database that supports the Apache TinkerPop, openCypher or the RDF SPARQL graph models. These databases could be running locally on your desktop or in the cloud. Graph databases can be used to explore a variety of use cases including knowledge graphs and identity graphs. This project includes many examples of Jupyter notebooks....
    Downloads: 0 This Week
  • 25
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, Elasticsearch, and more. Deploy via Docker or Kubernetes with full data sovereignty. Build...
    Downloads: 3 This Week