python module free download

Showing 3 open source projects for "python module"

1

Scrapy-Redis

Redis-based components for Scrapy

...Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Components: Scheduler + Duplication Filter, Item Pipeline, Base Spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version 0.3 changed the requests serialization from marshal to cPickle, so requests persisted with version 0.2 will not work on 0.3. The class scrapy_redis.spiders.RedisSpider enables a spider to read URLs from Redis. The URLs in the Redis queue are processed one after another; if the first request yields more requests, the spider processes those requests before fetching another URL from Redis.
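The "any module with loads and dumps functions" contract can be illustrated with a minimal JSON-based sketch. The two function names come from the description above; everything else is an assumption for illustration: the sample request dict is hypothetical, and the payload is assumed to be JSON-serializable (real request data may contain bytes, which JSON cannot encode directly). How such a module is wired into the project's settings should be checked against the scrapy-redis documentation for the installed version.

```python
import json


def dumps(obj):
    """Serialize a request-like dict to UTF-8 encoded JSON bytes."""
    # sort_keys gives a stable byte representation for equal dicts.
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def loads(data):
    """Inverse of dumps: decode UTF-8 JSON bytes back into a dict."""
    return json.loads(data.decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical request payload, used only to show the round trip.
    request = {"url": "http://example.com/", "method": "GET", "meta": {"depth": 1}}
    restored = loads(dumps(request))
    print(restored == request)
```

Unlike pickle, a text format such as JSON does not depend on the Python version that wrote it, at the cost of supporting fewer object types.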

Downloads: 2 This Week

Last Update: 2024-07-06
2

Crawlab

Distributed web crawler admin platform for spiders management

...The master node and worker nodes communicate with each other via gRPC (an RPC framework). Tasks are scheduled by the task scheduler module in the master node and received by the task handler module in worker nodes, which executes them in task runners. Task runners are processes running spider or crawler programs; they can also send data through gRPC (integrated in the SDK) to other data sources, e.g. MongoDB.
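The scheduler / handler / runner split described above can be sketched as a toy model. This is not Crawlab code: the class names `Task`, `Master`, and `Worker` are hypothetical, an in-process `queue.Queue` stands in for the gRPC channel, and the "spider" is a one-line Python command. The only thing it tries to show is that the master enqueues tasks and the worker runs each one as a separate process.

```python
import queue
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    command: list  # argv of the spider program to run


class Master:
    """Toy task scheduler: assigns ids and puts tasks on the channel."""

    def __init__(self, channel):
        self.channel = channel
        self._next_id = 1

    def schedule(self, command):
        task = Task(self._next_id, command)
        self._next_id += 1
        self.channel.put(task)
        return task.task_id


class Worker:
    """Toy task handler: takes one task and runs it in a child process."""

    def __init__(self, channel):
        self.channel = channel

    def handle_one(self):
        task = self.channel.get_nowait()
        # The task runner is a separate OS process, as in the description.
        proc = subprocess.run(
            task.command, capture_output=True, text=True, check=True
        )
        return task.task_id, proc.stdout.strip()


if __name__ == "__main__":
    channel = queue.Queue()  # stand-in for the gRPC link (assumption)
    master, worker = Master(channel), Worker(channel)
    master.schedule([sys.executable, "-c", "print('crawled 3 pages')"])
    print(worker.handle_one())
```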

Downloads: 8 This Week

Last Update: 2023-07-26
3

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project aims to make web scraping easy by automating it. It takes a URL or the HTML content of a web page, plus a list of sample data that we want to scrape from that page. This data can be text, a URL, or any HTML tag value on that page. It learns the scraping rules and returns similar elements. You can then use the learned object with new URLs to get similar content, or the exact same element, from those new pages.
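The "learn rules from samples, then reuse them on new pages" idea can be sketched with the standard library alone. This is a simplified stand-in, not AutoScraper's implementation or API: the `ToyScraper` class and its `learn` and `find_similar` methods are hypothetical, and the "rule" is assumed to be nothing more than the tag name plus `class` attribute of the element that directly contains a sample text.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect (signature, text) pairs; signature = (tag, class attribute)."""

    def __init__(self):
        super().__init__()
        self._stack = []  # currently open elements as (tag, class) tuples
        self.found = []   # (signature, stripped text) in document order

    def handle_starttag(self, tag, attrs):
        self._stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; tolerate unclosed tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            self.found.append((self._stack[-1], text))


def _collect(html):
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.found


class ToyScraper:
    def __init__(self):
        self.rules = set()

    def learn(self, html, wanted_list):
        """Remember the signatures of elements whose text is a sample."""
        wanted = set(wanted_list)
        self.rules = {sig for sig, text in _collect(html) if text in wanted}
        return self.find_similar(html)

    def find_similar(self, html):
        """Return texts of all elements matching a learned signature."""
        return [text for sig, text in _collect(html) if sig in self.rules]


if __name__ == "__main__":
    page = '<ul><li class="title">Alpha</li><li class="title">Beta</li></ul>'
    scraper = ToyScraper()
    print(scraper.learn(page, ["Alpha"]))
```

A single sample ("Alpha") is enough for the toy rule to pick up its sibling ("Beta"), which mirrors the "returns similar elements" behaviour described above in a much cruder form.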

Downloads: 0 This Week

Last Update: 2023-04-12