Showing 478 open source projects for "python web crawler"

  • 1
    crawler

    Collection of JS reverse engineering examples for web scraping study

    crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to reproduce API calls programmatically. ...
    Downloads: 3 This Week
  • 2
    EasySpider

    A visual no-code/code-free web crawler/spider

    A visual code-free/no-code web crawler/spider, supporting both Chinese and English.
    Downloads: 8 This Week
  • 3
    spider_collection

    Collection of Python web scraping scripts for data extraction tasks

    spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. ...
    Downloads: 1 This Week
  • 4
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...
    Downloads: 356 This Week
  • 5
    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. It also integrates with the Aria2 download utility to enable large-scale downloading of videos and images associated with collected content. ...
    Downloads: 8 This Week
  • 6
    katana

    Fast CLI web crawler for discovering endpoints in modern web apps

    Katana is an open source command-line web crawling and spidering framework developed by ProjectDiscovery. It is designed to efficiently crawl websites and web applications in order to discover endpoints, resources, and other useful information that may not be easily visible through manual browsing. Katana focuses on speed and automation, making it suitable for use in security reconnaissance workflows and automated pipelines. Katana supports both standard HTTP crawling and headless browser...
    Downloads: 44 This Week
  • 7
    Stable Diffusion web UI for AMDGPUs

    Stable Diffusion WebUI optimized for AMD GPUs with editing tools

    Stable Diffusion WebUI AMDGPU is a browser-based interface for generating images using Stable Diffusion, built with Gradio and adapted for AMD graphics hardware. It provides both text-to-image and image-to-image workflows, allowing users to create, refine, and upscale visuals within a single interface. It includes tools such as inpainting and outpainting for editing specific areas of an image, along with features like prompt matrix generation and attention controls to fine-tune outputs....
    Downloads: 13 This Week
  • 8
    Web Dev for Beginners

    About 24 Lessons, 12 Weeks, Get Started as a Web Developer

    Web-Dev-For-Beginners is Microsoft’s open source, project-based curriculum for learning web development from scratch. Designed as a 12-week, 24-lesson course, it covers HTML, CSS, and JavaScript fundamentals through hands-on projects like terrariums, browser extensions, and space games. Each lesson includes a mix of pre-lecture quizzes, written content, assignments, challenges, and post-lecture quizzes to reinforce learning. The course also offers global accessibility with translations in...
    Downloads: 0 This Week
  • 9
    fess

    Open source enterprise search server for websites, files, and data

    ...Fess is built on top of OpenSearch and offers an integrated solution for crawling, indexing, and searching documents from websites, file systems, and various data stores. Fess includes a built-in crawler that can collect content from sources such as databases, CSV files, and shared storage, making it suitable for centralized knowledge discovery. It supports indexing and searching across many document formats including office documents, PDFs, and compressed archives. It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 14 This Week
  • 10
    diskover-community

    Open source file indexing & storage analytics powered by Elasticsearch

    Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across...
    Downloads: 0 This Week
  • 11
    Pydoll

    Async Python library for automating Chromium browsers without WebDriver

    Pydoll is a Python library designed for automating Chromium-based web browsers such as Chrome and Edge without relying on a traditional WebDriver layer. Instead of using external drivers, it connects directly to the Chrome DevTools Protocol through WebSocket, allowing scripts to control browser behavior more efficiently and with fewer compatibility issues.
    Downloads: 7 This Week
  • 12
    FastRTC

    The Python library for real-time communication

    FastRTC is a Python library designed to simplify real-time communication (RTC), especially for audio and video streaming applications. It abstracts away much of the complexity that typically comes with implementing WebRTC by providing a simple interface — e.g. a Stream class — that can be mounted within a web backend (for example a FastAPI application).
    Downloads: 10 This Week
  • 13
    Mercury Browser

    Privacy-focused web browser fork of Firefox

    Mercury Browser is an optimized, privacy-focused web browser that is a fork of Mozilla Firefox. It incorporates compiler optimizations such as AVX, AES, LTO, and PGO to enhance performance and security. With features derived from projects like LibreWolf, Waterfox, and Ghostery, Mercury disables telemetry and debugging elements by default, ensuring a more private browsing experience. It also includes usability patches that bring back features like the classic top bar and supports unsigned...
    Downloads: 76 This Week
  • 14
    JS Beautifier

    Beautifier for JavaScript

    ...The beautifier can be added to your page as a web library. JS Beautifier is hosted on two CDN services: cdnjs and rawgit. You can beautify JavaScript using JS Beautifier in your web browser, or on the command line using Node.js or Python.
    Downloads: 10 This Week
  • 15
    videodl

    Lightweight Python tool for downloading videos from many platforms

    Videodl is a lightweight video downloader implemented entirely in Python that allows users to retrieve videos from a wide range of online media platforms. It focuses on providing a fast and simple way to parse video pages and download media files, often prioritizing high-definition versions without watermarks when available. It supports numerous video platforms across both Chinese and international streaming ecosystems, enabling users to fetch content from many popular services through a...
    Downloads: 6 This Week
  • 16
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and...
    Downloads: 39 This Week
  • 17
    UFONet

    UFONet - Denial of Service Toolkit

    UFONet is a powerful and controversial Python-based toolkit for testing and conducting Distributed Denial of Service (DDoS) attacks using unconventional methods, such as leveraging third-party web applications as attack vectors. It automates the discovery of vulnerable targets and enables attackers or researchers to launch large-scale amplification attacks without directly using botnets.
    Downloads: 24 This Week
  • 18
    LaVague

    Framework for building AI agents that automate complex web tasks

    LaVague is an open source framework designed to help developers build AI-powered web agents capable of automating tasks across websites and web applications. It implements the concept of a Large Action Model framework, allowing agents to interpret a user-provided objective and translate it into a sequence of actions performed in a browser. These agents can navigate web pages, retrieve information, fill out forms, and execute multi-step workflows automatically. LaVague is centered around a...
    Downloads: 7 This Week
  • 19
    Search with Lepton

    Lightweight demo to build a conversational AI search engine quickly

    ...It includes both a backend service written in Python and a web interface that allows users to interact with the search engine in a conversational format. Developers can configure different search providers and language models through environment variables, making it flexible for experimentation and prototyping.
    Downloads: 3 This Week
  • 20
    reNgine

    Automated framework for web application reconnaissance and scanning

    reNgine is an automated reconnaissance framework designed to simplify and enhance the process of gathering information about web applications during security assessments. It provides a streamlined workflow for penetration testers, bug bounty hunters, and security teams who need to perform reconnaissance efficiently and at scale. The platform integrates multiple open-source reconnaissance tools into a unified environment with a configurable scanning engine and an intuitive web interface....
    Downloads: 3 This Week
  • 21
    Scalene

    High-performance CPU, GPU, and memory profiler for Python

    Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than other profilers while delivering far more detailed information. Once Scalene has profiled your program, it will launch a web browser with an interactive user interface (all processing is done locally). Hover over bars to see breakdowns of CPU and memory consumption, and click on underlined column headers to sort the columns. ...
    Downloads: 1 This Week
  • 22
    Seeker

    Accurately Locate Smartphones using Social Engineering

    Seeker is an open source project that demonstrates how to obtain precise location information from devices using social engineering and web-based techniques. The tool sets up a phishing page that asks for location permissions, allowing GPS and other device data to be shared if the user consents. It can capture latitude, longitude, accuracy, altitude, direction, and even speed, with results displayed in a terminal. The project supports both manual deployment and tunneling services like Ngrok...
    Downloads: 11 This Week
  • 23
    Odoo

    Odoo. Open Source Apps To Grow Your Business

    Odoo is a comprehensive suite of open source business applications designed to manage and streamline various organizational operations. It provides an integrated ecosystem of tools that cover core business functions such as CRM, accounting, eCommerce, inventory, HR, project management, and manufacturing. Each Odoo app can be deployed individually to meet specific business needs or combined to form a powerful all-in-one ERP system. The platform’s modular architecture allows businesses to...
    Downloads: 22 This Week
  • 24
    SwarmUI

    Modular AI image and video generation web UI with extensible tools

    SwarmUI is a modular web-based user interface designed for AI-driven image generation, with a strong focus on usability, performance, and extensibility. It serves as a unified environment for working with multiple AI models, including Stable Diffusion and newer image and video generation systems, allowing users to create and manage outputs through a browser interface. SwarmUI is built to accommodate both beginners and advanced users by offering a simple “Generate” interface alongside more...
    Downloads: 12 This Week
  • 25
    Harbor LLM

    Run a full local LLM stack with one command using Docker

    Harbor is an open source, containerized toolkit designed to simplify running local large language model (LLM) environments. It combines a CLI and companion app to launch backends, frontends, and supporting services with minimal setup. With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user...
    Downloads: 16 This Week