Scrapyd can manage multiple projects, and each project can have multiple versions uploaded, but only the latest version is used to launch new spiders. A common and useful convention is to name each version after the revision number of the version control tool you use to track your Scrapy project code, for example r23. Versions are not compared alphabetically but with a smarter algorithm (the same one packaging uses), so r10 compares greater than r9. Scrapyd is an application (typically run as a daemon) that listens for requests to run spiders and spawns a process for each one. It runs multiple processes in parallel, allocating them to a fixed number of slots given by the max_proc and max_proc_per_cpu options and starting as many processes as possible to handle the load.
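
The difference between alphabetical and number-aware version ordering can be illustrated with a minimal sketch. This is not Scrapyd's own comparison code; the helper name version_key is hypothetical, and the sketch only shows why a revision label such as r10 should rank above r9.

```python
import re


def version_key(version):
    """Build a sort key that compares digit runs as integers.

    Illustrative helper only (not part of Scrapyd): "r10" becomes
    ((0, 0, "r"), (1, 10, "")), so it sorts after "r9".
    """
    parts = [p for p in re.split(r"(\d+)", version) if p]
    return tuple(
        (1, int(p), "") if p.isdigit() else (0, 0, p)
        for p in parts
    )


versions = ["r9", "r10", "r2"]

# Plain string sorting compares character by character, so "r10" < "r2".
alphabetical = sorted(versions)

# Number-aware sorting treats 10 as larger than 9.
natural = sorted(versions, key=version_key)
latest = max(versions, key=version_key)
```

With alphabetical ordering r9 would wrongly be treated as the latest version; with the number-aware key, r10 is.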

Features

  • Scrapyd is a service for running Scrapy spiders
  • It allows you to deploy your Scrapy projects and control their spiders using an HTTP JSON API
  • Documentation available
  • Scrapyd comes with a minimal web interface for monitoring running processes and accessing logs
  • You can use ScrapydWeb to manage your Scrapyd cluster
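
As a rough illustration of the HTTP JSON API mentioned in the list above, the sketch below builds and sends a request to Scrapyd's schedule.json endpoint to start a spider. The server address (localhost on port 6800), the project name myproject, and the spider name myspider are assumptions for the example; the helper functions are hypothetical and use only the Python standard library.

```python
import json
from urllib import parse, request

# Assumed address of a locally running Scrapyd instance.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, base_url=SCRAPYD_URL):
    """Prepare (but do not send) a POST request to schedule.json."""
    data = parse.urlencode({"project": project, "spider": spider}).encode("utf-8")
    return request.Request(f"{base_url}/schedule.json", data=data, method="POST")


def schedule_spider(project, spider, base_url=SCRAPYD_URL):
    """Send the request and return the decoded JSON response.

    Requires a reachable Scrapyd server with the project already deployed.
    """
    req = build_schedule_request(project, spider, base_url)
    with request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical project and spider names, for illustration only.
req = build_schedule_request("myproject", "myspider")
```

Calling schedule_spider("myproject", "myspider") against a running server would submit the job; building the request separately makes it easy to inspect what will be sent first.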

Categories

Web Scrapers

License

BSD License

Additional Project Details

Programming Language

Python

Related Categories

Python Web Scrapers

Registered

2023-04-10