Showing 304 open source projects for "python web crawler"

View related business solutions
  • Next-Gen Encryption for Post-Quantum Security | CLEAR by Quantum Knight Icon
    Next-Gen Encryption for Post-Quantum Security | CLEAR by Quantum Knight

    Lock Down Any Resource, Anywhere, Anytime

    CLEAR by Quantum Knight is a FIPS-140-3 validated encryption SDK engineered for enterprises requiring top-tier security. Offering robust post-quantum cryptography, CLEAR secures files, streaming media, databases, and networks with ease across over 30 modern platforms. Its compact design, smaller than a single smartphone image, ensures maximum efficiency and low energy consumption.
    Learn More
  • The Most Powerful Software Platform for EHSQ and ESG Management Icon
    The Most Powerful Software Platform for EHSQ and ESG Management

    Addresses the needs of small businesses and large global organizations with thousands of users in multiple locations.

    Choose from a complete set of software solutions across EHSQ that address all aspects of top performing Environmental, Health and Safety, and Quality management programs.
    Learn More
  • 1
    AI-Crawler

    AI-Crawler

    Crawl a website starting from a URL, find relevant pages

    AI Crawler is an experimental AI-powered web crawling and data extraction tool that uses natural language prompts to guide the discovery and retrieval of relevant information across websites. Unlike traditional web scrapers that rely on static selectors and manual scripting, it uses AI to dynamically identify and prioritize pages based on user intent, making it more flexible and resilient to changes in website structure.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    GPT Crawler

    GPT Crawler

    Crawl a site to generate knowledge files to create your own custom GPT

    GPT Crawler is an open-source tool designed to automatically crawl websites and generate structured knowledge that can be used to build AI assistants and retrieval systems. It focuses on extracting high-quality textual content from web pages and preparing it in formats suitable for embedding, indexing, or fine-tuning workflows. The project is especially useful for teams that want to turn documentation sites or knowledge bases into conversational AI backends without building custom scrapers from scratch. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...
    Downloads: 259 This Week
    Last Update:
    See Project
  • 4
    Amazing-Python-Scripts

    Amazing-Python-Scripts

    Curated collection of Amazing Python scripts

    Amazing-Python-Scripts is a collaborative repository that collects a wide variety of Python scripts designed to demonstrate practical programming techniques and automation tasks. The project includes scripts ranging from beginner-level utilities to more advanced applications involving machine learning, data processing, and system automation. Its goal is to provide developers with useful coding examples that can solve everyday problems, automate repetitive tasks, or serve as learning exercises. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • The AI workplace management platform Icon
    The AI workplace management platform

    Plan smart spaces, connect teams, manage assets, and get insights with the leading AI-powered operating system for the built world.

    By combining AI workflows, predictive intelligence, and automated insights, OfficeSpace gives leaders a complete view of how their spaces are used and how people work. Facilities, IT, HR, and Real Estate teams use OfficeSpace to optimize space utilization, enhance employee experience, and reduce portfolio costs with precision.
    Learn More
  • 5
    Python Code Tutorials

    Python Code Tutorials

    The Python Code Tutorials

    Python Code Tutorials is a large educational repository that aggregates programming tutorials from the “The Python Code” website into a structured collection of Python projects and learning materials. The repository covers a wide range of programming topics including cybersecurity, networking, web scraping, machine learning, GUI development, and automation scripts.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Text Generation Web UI

    Text Generation Web UI

    Oobabooga - The definitive Web UI for local AI, with powerful features

    A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS)....
    Downloads: 44 This Week
    Last Update:
    See Project
  • 7
    python-whatsapp-bot

    python-whatsapp-bot

    Build AI WhatsApp Bots with Pure Python

    python-whatsapp-bot is an open-source framework that demonstrates how to build AI-powered WhatsApp bots using pure Python and the official WhatsApp Cloud API. The project provides a practical implementation of a messaging automation system using the Flask web framework to handle webhook events and process incoming messages in real time.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Portia SDK Python

    Portia SDK Python

    Portia Labs Python SDK for building agentic workflows

    portia‑sdk‑python is an open-source Python SDK by Portia Labs for creating reliable, stateful, authenticated multi-agent AI workflows. It supports tool-backed agents capable of real-world interactions—like web browsing, API access, and human-in-the-loop clarifications—while maintaining transparency and auditability through structured plans and execution hooks.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    PaSa

    PaSa

    An advanced paper search agent powered by large language models

    PaSa is an open-source “paper search agent” built around large language models (LLMs), designed to automate the process of academic literature retrieval with human-like decision making. Instead of simply translating a query into keywords and returning a flat list of matching papers, PaSa uses a dual-agent architecture (Crawler + Selector) that can iteratively search, read, analyze, and filter academic publications — simulating how a researcher might dig through citation networks, expand...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Premier Construction Software Icon
    Premier Construction Software

    Premier is a global leader in financial construction ERP software.

    Rated #1 Construction Accounting Software by Forbes Advisor in 2022 & 2023. Our modern SAAS solution is designed to meet the needs of General Contractors, Developers/Owners, Homebuilders & Specialty Contractors.
    Learn More
  • 10
    DeerFlow

    DeerFlow

    Deep Research framework, combining language models with tools

    DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”) that collaborate in a structured workflow, allowing tasks like literature reviews, data gathering, data analysis, code execution, and final report generation to be largely automated. ...
    Downloads: 535 This Week
    Last Update:
    See Project
  • 11
    web-eval-agent MCP Server

    web-eval-agent MCP Server

    An MCP server that autonomously evaluates web applications

    web-eval-agent is a Model Context Protocol (MCP) server that spins up a browser-use–capable debugging agent to autonomously run and evaluate web apps straight from your editor. It’s positioned as a “let the coding agent debug itself” companion: the agent launches the app, navigates flows, captures evidence, and iterates on failures without manual copy-pasting of logs. The repository focuses on developer ergonomics, exposing typed MCP tools so clients like Claude Desktop can start sessions,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Gemini-API

    Gemini-API

    Reverse-engineered Python API for Google Gemini web app

    Gemini-API is a community-created asynchronous Python wrapper for the web interface of Google’s Gemini models (formerly Bard). It is the result of reverse-engineering the Gemini web app and exposing its functionality through a programmatic API. This enables developers to incorporate Gemini into Python applications, scripts, bots, or tools without relying solely on official SDKs. The wrapper supports streaming responses, model selection, and handling of the web-based authentication/session mechanisms used by Google’s interface. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    Agent Development Kit (ADK)

    Agent Development Kit (ADK)

    Open-source, code-first Python toolkit for building, evaluating, etc.

    ADK Python helps developers verify hardware-backed keys, work with JSON Web Tokens (JWT), and integrate with Android’s Key Attestation infrastructure.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 14
    ScrapeGraphAI

    ScrapeGraphAI

    Python scraper based on AI

    Extracting content from websites and local documents using LLM. ScrapeGraphAI is a web scraping python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Browser Use

    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    Browser Use MCP Server

    Browser Use MCP Server

    Browse the web, directly from Cursor etc.

    A browser automation server implementing the Model Context Protocol, designed to allow AI assistants to browse the web directly from applications like Cursor. It supports natural language commands for web navigation and interaction. ​
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    Memvid

    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 18
    edge-tts

    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications.
    Downloads: 19 This Week
    Last Update:
    See Project
  • 19
    FastSD CPU

    FastSD CPU

    Fast stable diffusion on CPU and AI PC

    FastSD CPU is an optimized fork of Stable Diffusion designed to run efficiently on CPUs and devices without dedicated GPUs by leveraging Latent Consistency Models and Adversarial Diffusion Distillation techniques that accelerate inference. It focuses on bringing fast text-to-image generation to mainstream hardware like desktop CPUs, lower-end laptops, or edge devices without requiring high-end graphics processors. The repository contains multiple interfaces including a desktop GUI for simple...
    Downloads: 43 This Week
    Last Update:
    See Project
  • 20
    GPT-SoVITS

    GPT-SoVITS

    1 min voice data can also be used to train a good TTS model

    GPT‑SoVITS is a state-of-the-art voice conversion and TTS system that enables zero‑shot and few‑shot synthesis based on a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It's powered by VITS architecture enhanced for few‑sample adaptation and real‑time usability.
    Downloads: 55 This Week
    Last Update:
    See Project
  • 21
    Dendrite

    Dendrite

    Tools to build web AI agents that can authenticate

    Dendrite Python SDK is a toolkit for building web AI agents that can authenticate, interact with, and extract data from any website, facilitating web automation tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    ChatTTS webUI & API

    ChatTTS webUI & API

    A simple native web interface that uses ChatTTS to synthesize text

    ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. ...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 23
    Taipy

    Taipy

    Turns Data and AI algorithms into production-ready web applications

    From simple pilots to production-ready web applications in no time. No more compromise on performance, customization, and scalability. Taipy enhances performance with caching control of graphical events, optimizing rendering by selectively updating graphical components only upon interaction. Effortlessly manage massive datasets with Taipy's built-in decimator for charts, intelligently reducing the number of data points to save time and memory without losing the essence of your data's shape....
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Substra

    Substra

    Low-level Python library used to interact with a Substra network

    An open-source framework supporting privacy-preserving, traceable federated learning and machine learning orchestration. Offers a Python SDK, high-level FL library (SubstraFL), and web UI to define datasets, models, tasks, and orchestrate secure, auditable collaborations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    AgenticSeek

    AgenticSeek

    Fully Local Manus AI. No APIs, No $200 monthly bills

    AgenticSeek is a fully local autonomous AI assistant designed as a privacy-focused alternative to cloud-based agent platforms. It runs entirely on the user’s hardware and can autonomously browse the web, write code, and plan multi-step tasks without sending data to external services. The system is optimized for local reasoning models and emphasizes zero cloud dependency to maintain full user control. AgenticSeek includes intelligent agent selection, allowing it to determine the best internal...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB