Search Results for "web crawler source code"

Showing 80 open source projects for "web crawler source code"

  • 1
    Every Code

    Local AI coding agent CLI with multi-agent orchestration tools

    Every Code (often referred to simply as Code) is a fast, local AI-powered coding agent designed to run directly in the terminal environment. It is a community-driven fork of the Codex CLI, with a strong emphasis on improving real-world developer ergonomics and workflows. Every Code enhances the traditional coding assistant model by introducing multi-agent orchestration, allowing multiple AI agents to collaborate, compare solutions, and refine outputs in parallel. It supports integration with...
    Downloads: 23 This Week
  • 2
    fess

    Open source enterprise search server for websites, files, and data

    ...Fess includes a built-in crawler that can collect content from sources such as databases, CSV files, and shared storage, making it suitable for centralized knowledge discovery. It supports indexing and searching across many document formats including office documents, PDFs, and compressed archives. It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 6 This Week
  • 3
    mslearn-tailspin-spacegame-web

    Code used in Microsoft Learn modules to support Azure DevOps

    The Tailspin Space Game Web project is a sample application created by Microsoft as part of its learning resources. It’s a web-based game application used in Microsoft Learn modules and documentation to demonstrate concepts such as Azure App Services, continuous integration and delivery (CI/CD) pipelines, and DevOps practices with GitHub Actions and Azure Pipelines. The project is intentionally lightweight and easy to deploy so learners can quickly experiment with cloud deployment, testing,...
    Downloads: 1 This Week
  • 4
    watercrawl

    AI-ready web crawler that extracts and structures website content

    WaterCrawl is an open source web crawling and data extraction platform designed to transform website content into structured data suitable for machine learning and AI workflows. It enables developers and researchers to crawl web pages, extract meaningful information, and convert it into formats that are easier to process and analyze. It provides a modern crawling system that can automatically navigate links, control crawl depth, and collect content from targeted sections of a website....
    Downloads: 3 This Week
  • 5
    WWWBasic

    wwwBASIC is an implementation of BASIC that runs on Node.js & the Web

    wwwBASIC is a JavaScript-based implementation of the classic BASIC programming language designed to run seamlessly in web browsers and Node.js environments. Created by Google, it allows developers and enthusiasts to write and execute BASIC programs directly within HTML pages or via command-line tools. The interpreter compiles BASIC source code into JavaScript at load time, enabling efficient execution within modern web environments without requiring external emulators or plugins. ...
    Downloads: 3 This Week
  • 6
    multiOTP open source

    PHP strong authentication library, web interface & CLI, OATH certified

    multiOTP is a PHP class, a powerful command-line utility, and a web interface developed by SysCo systèmes de communication sa to provide a completely free, easy-to-use, operating-system-independent server-side implementation of strong two-factor authentication. multiOTP supports hardware and software tokens with different One-Time Password algorithms such as OATH/HOTP, OATH/TOTP, and mOTP (Mobile-OTP). QR code generation is also built in to support provisioning of Google...
    Downloads: 50 This Week
  • 7
    Lotion

    Unofficial Notion.so app for Linux

    Welcome! This is an unofficial version of the Notion.so Electron app. Since NotionHQ is busy with other feature development, Linux is low on its priority list. Before you go ahead and install Lotion, I've found a better implementation called notion-enhancer which works seamlessly. You can try it out, and if that solution works for you, please use that instead. Lotion is not actively maintained at this point; I might start working on it again at a later time. Thanks for all your support! During set...
    Downloads: 7 This Week
  • 8
    ChatGLM3

    ChatGLM3 series: Open Bilingual Chat LLMs | Open Source Bilingual Chat

    ChatGLM3 is ZhipuAI & Tsinghua KEG’s third-gen conversational model suite centered on the 6B-parameter ChatGLM3-6B. It keeps the series’ smooth dialog and low deployment cost while adding native tool use (function calling), a built-in code interpreter, and agent-style workflows. The family includes base and long-context variants (8K/32K/128K). The repo ships Python APIs, CLI and web demos (Gradio/Streamlit), an OpenAI-format API server, and a compact fine-tuning kit. Quantization (4/8-bit),...
    Downloads: 4 This Week
  • 9
    DeepAnalyze

    Autonomous LLM agent for end-to-end data science workflows

    ...It integrates execution-based reasoning by generating and running code as part of its analysis process, allowing it to iteratively refine results and produce more accurate outputs. DeepAnalyze provides multiple interaction interfaces, including a web-based UI, a command-line interface, and a Jupyter-style notebook environment for interactive workflows.
    Downloads: 2 This Week
  • 10
    kimuraframework

    AI-first Ruby framework for building fast, flexible web scraping spiders

    Kimurai is an open source web scraping framework written in Ruby that simplifies the process of building automated data extraction tools. It provides a clean domain-specific language that allows developers to define scraping logic and data schemas with minimal boilerplate code. Kimurai can use AI-assisted extraction to identify where data resides in HTML pages, automatically generating selectors that are cached for future use so subsequent scraping runs operate with pure Ruby performance. ...
    Downloads: 3 This Week
  • 11
    newspaper4k

    Python library for scraping and analyzing online news articles easily

    Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and...
    Downloads: 0 This Week
  • 12
    OSV.dev

    Open source vulnerability DB and triage service

    ...The platform includes a web UI, API, and a Go-based dependency scanner that checks software dependencies, container images, SBOMs (SPDX, CycloneDX), and Git repositories for known vulnerabilities. This repository contains the full infrastructure code for deploying osv.dev on Google Cloud Platform, including Terraform configurations, APIs, data pipelines, indexers, and background workers for vulnerability ingestion and impact analysis.
    Downloads: 3 This Week
  • 13
    Open SaaS

    Open source SaaS boilerplate for React, NodeJS apps with Wasp stack

    Open SaaS is a free and open source starter template designed to help developers quickly build and launch Software-as-a-Service applications. It is built on the Wasp full stack framework, which combines React, NodeJS, and Prisma to manage both client and server code within a unified architecture. Open SaaS provides a production-ready foundation that includes common SaaS functionality such as authentication, payments, analytics, and file uploads.
    Downloads: 0 This Week
  • 14
    GitDiagram

    AI tool that converts GitHub repositories into interactive diagrams

    GitDiagram is an open source web application designed to help developers quickly understand the structure and architecture of GitHub repositories by automatically generating interactive diagrams. It analyzes repository metadata such as the file tree and project documentation to build a visual representation of how different components of a project relate to one another. It uses an AI-powered pipeline to interpret repository structure and transform that information into system design diagrams...
    Downloads: 0 This Week
  • 15
    wappskafander_t2

    Wraps a HTTP server that might optionally run PHP by using FastCGI.

    wappskafander_t2 wraps an old version of a Hiawatha web server (hiawatha-webserver.org). If FastCGI and PHP are available, then the web server probably can execute PHP. As of 2022_11 this "branch" of the wappskafander_t2 will not be incrementally updated, because it, the wrapping code, NOT the wrapped web server, is an old mess that needs a total rewrite. The current version of the wrapping code is usable as a functional "blob" that serves HTTP and optionally PHP generated content from...
    Downloads: 0 This Week
  • 16

    twitch-batch-downloader

    Automate the download of entire Twitch.tv channels

    Automate the download of entire Twitch.tv channels with its metadata. Save each Twitch video into its own folder, with date and time values, video ID, stream metadata, frame screenshot, .ts parts list and sha256 hash. Keep the original ts files and generate mp4 files from them. It requires a shell and some command line utilities. See README.md for details in the Code/git section.
    Downloads: 6 This Week
  • 17
    ZestISO

    An easy to use Arch Linux based OS

    UPDATE: New distro name coming soon. Daily builds are temporarily suspended while I fix network issues. Is your Windows PC or Mac slow or unsupported? Don't throw it away, try ZestISO today! ZestISO is an easy to use, highly customisable Arch Linux based OS. It's fast, secure, and supports all your favourite software and games*. No ads, bloat or telemetry here! Editions: KDE Plasma Gaming (for beginners and gamers), XFCE (for low-end PCs) and IceWM (for servers and ultra low-end...
    Downloads: 123 This Week
  • 18

    phpMariaEdit

    Continuation of phpMyEdit

    phpMyEdit development stopped several years ago at version 5.7.1, with no support for newer MySQL, MariaDB, or PHP versions, and many open TODOs. This kind of software is still widely used today, but it needed an update to support current technologies. As the original repository is frozen, new work has started in this repository. Version 5.7.2 supports PHP 7.x, uses mysqli_, and is otherwise very close to 5.7.1. Version 5.7.3 is a bugfix release with a lot...
    Downloads: 1 This Week
  • 19
    QBPWCF

    PHP library for web-based and other applications on Fedora Linux

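    The goal of this project is to build a PHP component library that is simple, easy to use, thoroughly documented in its parameters, and highly adjustable, so that PHP developers can easily build highly customized applications. In contemporary terms, it is meant to serve as a function library for low-code platforms.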
    Downloads: 0 This Week
  • 20

    Kobo XCSoar Launcher

    a customizable boot menu for your Kobo Mini

    Mainly this project aims at easing startup and use of XCSoar (see xcsoar.org) on Kobo Mini. But the scope of use should not be limited solely to XCSoar. The Launcher should be small in code size, reuse most of the original libraries, provide flexible configuration to any application that might be ported to the Kobo Mini. It supports customizable fonts, toolbox-pages, buttons (also graphical), labels, autostart and sleep timers.
    Downloads: 14 This Week
  • 21
    Open Crypto Tracker

    Bitcoin Alts portfolio tracker, email / text / alexa / telegram alerts

    100% FREE / open source / PRIVATE cryptocurrency portfolio tracker. Email / text / alexa / telegram price alerts, price charts, mining calcs, leverage / gain / loss / balance stats, news feeds +more. Privately track Bitcoin / Ethereum / unlimited cryptocurrencies. Customize as many assets / markets / alerts / charts as you want. Over 50 Exchanges / 40 Trading Pairs Supported (exchanges / pairings list at bottom of README.txt): https://tinyurl.com/ct-readme Nearly Unlimited Assets...
    Downloads: 3 This Week
  • 22
    Reflex Platform

    A curated package set and set of tools that let you build Haskell

    Reflex Platform is a curated package set and set of tools that let you build Haskell packages so they can run on a variety of platforms. Reflex Platform is built on top of the Nix package manager. The core packages in Reflex Platform are known to work together and are tested together. They are also cached, so you can download prebuilt binaries from the public cache instead of building from scratch. Nix locks down dependencies even outside the Haskell ecosystem...
    Downloads: 0 This Week
  • 23
    LlamaGPT

    Self-hosted ChatGPT-like chatbot powered by Llama models locally

    LlamaGPT is a self-hosted chatbot application designed to provide a conversational AI experience similar to ChatGPT while running entirely on local hardware. It uses Llama-based large language models to generate responses and operate without requiring external AI services. Because the system runs locally, it keeps all interactions and data on the user's device, enabling a fully private environment for experimentation with AI chat interfaces. LlamaGPT includes both a user interface and an API...
    Downloads: 8 This Week
  • 24
    WordPress Hardened

    Secure and performant WordPress installation on a Kubernetes cluster

    Hardened version of the official WordPress container, with special support for Kubernetes. You can skip the installation wizard by installing WordPress on container startup. This container uses wp-cli to install WordPress and plugins, allowing you to prepare a fully automated website. git-clone-controller is a Kubernetes controller that clones a Git repository before a Pod is launched; it can be used to fetch your website theme automatically within a few seconds before the Pod starts.
    Downloads: 0 This Week
  • 25
    Moriarty Project

    Web-based OSINT tool for investigating phone number information

    Moriarty Project is an open source web-based investigation tool designed to gather publicly available information about phone numbers. It allows users to input a phone number and analyze various details related to that number through multiple investigation features. It performs information gathering by scraping data from online sources to retrieve insights such as owner information, spam risk, and related web references.
    Downloads: 10 This Week