python crawler free download

43 projects for "python crawler" with 1 filter applied: BSD

1

crawler

Collection of JS reverse engineering examples for web scraping study

crawler is a collection of web scraping and JavaScript reverse engineering examples designed for learning how modern websites protect their data and how those protections can be analyzed. It contains many case studies that demonstrate how to analyze and replicate request parameters, cookies, and encryption logic used by real websites. Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to...

Downloads: 0 This Week

Last Update: 1 day ago
See Project
2

tumblr-crawler

Python crawler to download photos and videos from Tumblr blogs

tumblr-crawler is an open source Python-based utility designed to download media content from Tumblr blogs. It provides a script that automatically retrieves photos and videos from specified Tumblr sites and saves them locally for offline access. Users can specify one or multiple blogs to crawl by editing a configuration file or by passing parameters through the command line.

Downloads: 2 This Week

Last Update: 3 days ago
See Project
3

Weibo Crawler

Python crawler for collecting and downloading Sina Weibo user data

weibo-crawler is a Python-based data collection tool designed to retrieve information from Sina Weibo user accounts. It automates the process of gathering posts, user profile details, and engagement metrics from one or more target accounts. weibo-crawler can extract comprehensive information about users, including profile attributes such as nickname, follower count, following count, and account metadata.

Downloads: 0 This Week

Last Update: 3 days ago
See Project
4

Python API for JMComic

Python crawler and API for downloading JMComic albums and images

JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes.

Downloads: 3 This Week

Last Update: 2026-04-07
See Project
5

dxy-covid-19-crawler

Realtime crawler for COVID-19 outbreak statistics from DXY data

DXY-COVID-19-Crawler is a Python-based project designed to collect real-time COVID-19 infection data from the public dataset provided by Ding Xiang Yuan (DXY). The crawler periodically retrieves pandemic statistics and stores them in a database so that historical changes in the outbreak can be preserved and analyzed later. It was created to make up-to-date infection data more accessible for developers, researchers, and analysts who wanted to build visualizations or conduct data analysis during the early stages of the pandemic. ...

Downloads: 8 This Week

Last Update: 2 days ago
See Project
6

Python-Spider

Python3 web crawler practice

Python-Spider is a repository intended to teach or provide examples for writing web spiders / crawlers in Python — part of a broader learning and resource collection by its author. The code and documentation are oriented toward beginners or intermediate learners who want to learn how to fetch, parse, and extract data from websites programmatically. As part of the author’s public learning-path repositories, python-spider likely includes examples of HTTP requests, HTML parsing, maybe...

Downloads: 0 This Week

Last Update: 2025-12-08
See Project
7

FEAPDER

Powerful Python crawler framework for scalable web scraping tasks

...It also integrates monitoring and alerting capabilities to help developers track crawler performance and detect issues during execution. feapder includes browser rendering support for handling dynamic web pages and provides mechanisms for large-scale data deduplication during crawling.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
8

spider_collection

Collection of Python web scraping scripts for data extraction tasks

spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages. ...

Downloads: 2 This Week

Last Update: 3 days ago
See Project
9

douyin

Open source Douyin crawler for collecting and downloading public data

DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. It also integrates with the Aria2 download utility to enable large-scale downloading of videos and images associated with collected content. ...

Downloads: 6 This Week

Last Update: 2026-03-13
See Project
10

autocrawler

Multiprocess Selenium crawler for downloading images by keywords

...Users provide search terms through a simple keyword file, and the crawler organizes downloaded images into directories for each keyword. It can download either thumbnails or full resolution images and supports multiple image formats such as JPG, GIF, and PNG. It also includes configuration options such as headless mode, download limits, proxy usage, and thread count to customize crawling behavior.

Downloads: 1 This Week

Last Update: 3 days ago
See Project
11

PaSa

An advanced paper search agent powered by large language models

PaSa is an open-source “paper search agent” built around large language models (LLMs), designed to automate the process of academic literature retrieval with human-like decision making. Instead of simply translating a query into keywords and returning a flat list of matching papers, PaSa uses a dual-agent architecture (Crawler + Selector) that can iteratively search, read, analyze, and filter academic publications — simulating how a researcher might dig through citation networks, expand...

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
12

news-please

Python tool for crawling and extracting structured data from news sites

news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a site. ...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
13

diskover-community

Open source file indexing & storage analytics powered by Elasticsearch

Diskover Community Edition is an open source file system indexing and storage analytics platform designed to help organizations understand and manage large volumes of file data. It crawls file systems and indexes metadata using Elasticsearch, enabling fast search, analysis, and organization of files stored across different storage systems. It allows administrators and users to explore file structures, monitor storage usage, and gain insights into how data is distributed across...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
14

IPRanges

Daily updated lists of cloud, bot, and service IP ranges

ipranges is an open source repository that provides continuously updated lists of IP address ranges associated with major cloud providers, search engine crawlers, and online services. ipranges collects IP ranges from publicly available sources and organizes them into structured files that can be easily used in security, networking, and automation workflows. It includes address ranges from providers such as Google Cloud, Amazon AWS, Microsoft, Oracle Cloud, and DigitalOcean, as well as well...

Downloads: 2 This Week

Last Update: 5 days ago
See Project
15

watercrawl

AI-ready web crawler that extracts and structures website content

WaterCrawl is an open source web crawling and data extraction platform designed to transform website content into structured data suitable for machine learning and AI workflows. It enables developers and researchers to crawl web pages, extract meaningful information, and convert it into formats that are easier to process and analyze. It provides a modern crawling system that can automatically navigate links, control crawl depth, and collect content from targeted sections of a website....

Downloads: 4 This Week

Last Update: 2026-03-11
See Project
16

python-fxxk-spider

Collection of 100+ Python web scraping projects and crawler examples

python-fxxk-spider is a curated collection of Python web scraping and crawler projects gathered in a single repository for reference and learning. It aggregates many independent scraping examples that target a wide range of websites, online services, and public data sources. Instead of being a single crawler tool, it functions as a catalog of ready-made Python spider implementations that demonstrate different scraping techniques. python-fxxk-spider includes scrapers for social media, e-commerce platforms, job listings, music services, video platforms, and various content sites. ...

Downloads: 3 This Week

Last Update: 3 days ago
See Project
17

dirhunt

Web crawler that finds hidden web directories without brute force

Dirhunt is an open source security tool designed to discover web directories and analyze website structures without relying on brute-force techniques. Instead of sending large numbers of guess-based requests, it operates as a specialized crawler that intelligently explores websites to identify accessible or hidden directories. Dirhunt can detect directories that expose “Index Of” listings, which may reveal files and other resources that were not intended to be publicly visible. It can also...

Downloads: 6 This Week

Last Update: 2026-03-11
See Project
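The "Index Of" detection mentioned in the Dirhunt description can be illustrated with a rough heuristic. This is only a sketch and not Dirhunt's actual logic: the function name looks_like_index_listing and the sample pages are hypothetical, and it assumes that auto-generated directory listings carry a title beginning with "Index of", as common web servers produce.

```python
import re

# Matches the contents of the first <title> element, case-insensitively.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def looks_like_index_listing(html):
    """Return True if the page title starts with 'Index of' (heuristic only)."""
    match = TITLE_RE.search(html)
    if not match:
        return False
    return match.group(1).strip().lower().startswith("index of")


# Hypothetical sample pages used purely for illustration.
listing_page = "<html><head><title>Index of /backup</title></head><body></body></html>"
normal_page = "<html><head><title>Welcome</title></head><body></body></html>"

is_listing = looks_like_index_listing(listing_page)
is_normal_listing = looks_like_index_listing(normal_page)
```

A title check like this is cheap but easy to fool, so a real tool would likely combine it with other signals.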
18

DecryptLogin

Python library providing APIs for automated website login workflows

DecryptLogin is a Python library designed to simplify automated login processes for many popular websites by providing ready-to-use APIs that simulate authentication behavior. It focuses on implementing login mechanisms through HTTP requests, allowing developers to programmatically authenticate with supported services without manually replicating complex login flows. It includes modules that handle different authentication modes such as PC login, mobile login, and QR code login depending on...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
19

grab-site

Web crawler for archiving and backing up sites into WARC archives

grab-site is an open source web crawling tool designed to archive and back up websites by recursively downloading their content. It works by taking a starting URL and systematically following links across the site, capturing pages and resources and saving them into WARC archive files for long-term preservation. Internally, the crawler uses a fork of the wpull engine to fetch and process web pages efficiently during large-scale crawls. grab-site includes a built-in dashboard that displays...

Downloads: 0 This Week

Last Update: 3 days ago
See Project
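The link-following behavior in the grab-site description (start from one URL, then systematically follow links across the site) can be sketched as a breadth-first crawl. This is a minimal illustration using only the Python standard library, not grab-site's or wpull's implementation, and it does not write WARC files. The names LinkParser, crawl, and SITE are hypothetical, and an in-memory dictionary stands in for real HTTP fetches so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl of same-host pages; returns URLs in visit order."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or fetch failed
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical in-memory site standing in for real HTTP responses.
SITE = {
    "https://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.test/a": '<a href="/b">B</a> <a href="https://other.test/">x</a>',
    "https://example.test/b": '<a href="/#top">home</a>',
}

pages = crawl("https://example.test/", SITE.get)
```

The seen set keeps the crawl from revisiting pages, and the host check keeps it on the starting site. An archiving crawler would additionally record each response as it is fetched.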
20

pspider

Simple Python framework for building multithreaded web crawlers

...Its modular design also makes it easier to extend the framework with additional features or integrate it into existing Python projects.

Downloads: 1 This Week

Last Update: 3 days ago
See Project
21

instagram-profilecrawl

Instagram profile crawler that extracts posts, tags, and stats

...It also provides scripts for downloading images from crawled profiles and logging statistics into CSV files for tracking metrics like followers, likes, and comments. Authentication is optional, meaning the crawler can access public profile data without logging in.

Downloads: 3 This Week

Last Update: 3 days ago
See Project
22

lxspider

Educational Python web scraping case collection for many sites

lxSpider is a collection of web scraping examples designed primarily for learning and experimentation with data extraction techniques. It gathers numerous crawler implementations that demonstrate how to collect data from a wide range of websites and online services. It focuses heavily on practical cases that illustrate how different platforms handle requests, authentication parameters, and anti-scraping protections. lxSpider includes examples targeting areas such as e-commerce platforms,...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
23

ECommerceCrawlers

Collection of Python ecommerce and website crawler example projects

ECommerceCrawlers is a collection of practical Python web crawler projects designed to gather data from a variety of ecommerce platforms, websites, and online services. It aggregates many independent crawler examples created by contributors and organized into separate subprojects that target specific sites or data sources. These examples demonstrate how to build and operate web scrapers capable of collecting structured information such as product listings, news content, job postings, social media data, and other publicly available web data. ...

Downloads: 8 This Week

Last Update: 3 hours ago
See Project
24

Photon

Incredibly fast crawler designed for OSINT

Photon is an extremely fast web crawler built specifically for OSINT and reconnaissance use cases. It is designed to extract URLs, endpoints, files, and other intelligence artifacts from target websites with minimal overhead. The crawler prioritizes speed and breadth, making it suitable for mapping web attack surfaces and discovering hidden resources. Photon is commonly used during early reconnaissance phases to build a comprehensive inventory of reachable assets.

Downloads: 6 This Week

Last Update: 2026-03-03
See Project
25

mzitu

Python crawler that downloads image galleries and analyzes titles

mzitu is a Python-based web crawling project designed to automatically download and organize image galleries from a specific photography site. It demonstrates how to build a scraper that navigates gallery pages, retrieves image links, and saves the images locally in a structured directory layout. It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that...

Downloads: 3 This Week

Last Update: 3 days ago
See Project