web scraping free download

Showing 7 open source projects for "web scraping"

1

Rod

A DevTools driver for web automation and scraping

Rod is a high-level driver for the DevTools Protocol, widely used for web automation and scraping. It can automate most things in the browser that can be done manually. Its chained context design makes it intuitive to time out or cancel a long-running task. It waits automatically for elements to be ready, is debugging friendly (automatic input tracing, remote monitoring of headless browsers), is thread-safe for all operations, and can automatically find or download a browser.

Downloads: 0 This Week

Last Update: 2024-07-12
2

Happy DOM

Happy DOM is a JavaScript implementation of a web browser

Happy DOM is a JavaScript implementation of a web browser without its graphical user interface. It includes many web standards from WHATWG DOM and HTML. The goal of Happy DOM is to emulate enough of a web browser to be useful for testing, scraping web sites, and server-side rendering. Happy DOM focuses heavily on performance and can be used as an alternative to JSDOM. Happy DOM now supports Declarative Shadow DOM which can be used for server-side rendering of web components. ...

Downloads: 3 This Week

Last Update: 2026-04-13
3

WebHarvest - web data extraction tool

Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.

14 Reviews

Downloads: 3 This Week

Last Update: 2025-10-27
4

unfluff

Automatically extract body content (and other cool stuff) from HTML

unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...

Downloads: 0 This Week

Last Update: 2025-11-14
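
The "raw HTML in, structured object out" idea in the unfluff description can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library's `html.parser`; it is not unfluff's API or its actual heuristics. The names `ContentExtractor`, `extract`, and `SKIP_TAGS`, and the choice of which tags count as boilerplate, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate in this sketch.
# This list is an assumption, not unfluff's real heuristic.
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collects the <title> text and <p> text found outside boilerplate tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._skip_depth = 0    # > 0 while inside a boilerplate element
        self._in_title = False
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "p" and self._skip_depth == 0:
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace inside the paragraph.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p and self._skip_depth == 0:
            self._buffer.append(data)


def extract(html):
    """Return a dict with the page title and the main paragraph text."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(),
            "text": "\n\n".join(parser.paragraphs)}
```

Calling `extract(html)` on a page returns a dictionary with `title` and `text` keys. A real content extractor would score candidate blocks rather than rely on a fixed tag list, so treat this only as a picture of the input and output shape.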
5

Simple-Scrape

Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.

Downloads: 0 This Week

Last Update: 2017-04-28
6

datalus

PHP web API designed to simplify object handling (loading, saving, querying, displaying, and editing), abstract the data from its display structure and layout, and allow the target data to be delivered in any supported format without special logic.

Downloads: 0 This Week

Last Update: 2016-05-28
7

Xidel

Xidel is a CLI webpage scraping tool supporting XPath/XQuery 3 and CSS

Xidel is a command line tool to download web pages and extract data from them. This data can be extracted using XPath/XQuery 3.0 (with compatibility modes for XPath 2.0 and XQuery 1.0), JSONiq, CSS 3 selectors, and custom pattern-matching templates that are like an annotated version of the processed page. It can download files over HTTP/S connections, follow redirections, links, or extracted values, and also process local files. The extracted values can then be exported as...

3 Reviews

Downloads: 0 This Week

Last Update: 2017-05-12
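
The path-expression style of extraction that the Xidel description mentions can be illustrated with a small sketch. This example does not use Xidel or its command-line flags; it uses Python's standard `xml.etree.ElementTree`, which supports only a limited XPath subset and requires well-formed markup. The sample page `PAGE` and the helper `select` are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# A small, well-formed sample page (hypothetical data for this sketch).
PAGE = """<html><body>
<ul id="links">
  <li><a href="/a" class="item">Alpha</a></li>
  <li><a href="/b" class="item">Beta</a></li>
  <li><a href="/c">Gamma</a></li>
</ul>
</body></html>"""


def select(markup, path):
    """Parse the markup and return all elements matching the path expression."""
    root = ET.fromstring(markup)
    return root.findall(path)


# Select every <a> element whose class attribute equals "item".
items = select(PAGE, ".//a[@class='item']")
hrefs = [a.get("href") for a in items]
texts = [a.text for a in items]
```

The expression `.//a[@class='item']` matches anchors anywhere below the root that carry the given attribute value, so `hrefs` and `texts` hold the link targets and labels of the first two list entries only. Real-world HTML is often not well-formed XML, which is one reason dedicated scraping tools ship their own tolerant parsers.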