Search Results for "web crawler source code"

Showing 782 open source projects for "web crawler source code"

  • 1
    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a simple but scalable crawler framework that covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence. It has a simple core with high flexibility and a simple API for HTML extraction, which simplifies the development of a specific crawler. It also supports annotated POJOs for customizing a crawler, with no configuration needed. Some other...
    Downloads: 7 This Week
  • 2
    Heritrix

    Internet Archive's open-source, web-scale, web crawler project

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or missaid as heratrix/heritix/heretix/heratix) is an archaic word for heiress (woman who inherits). Since our crawler seeks to collect and preserve the digital artifacts of our culture for the benefit of future researchers and generations, this name seemed apt.
    Downloads: 8 This Week
  • 3
    fess

    Open source enterprise search server for websites, files, and data

    ...Fess includes a built-in crawler that can collect content from sources such as databases, CSV files, and shared storage, making it suitable for centralized knowledge discovery. It supports indexing and searching across many document formats including office documents, PDFs, and compressed archives. It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 14 This Week
  • 4
    QR Code generator library

    High-quality QR Code generator library in Java, TypeScript/JavaScript

    This project aims to be the best, clearest library for generating QR Codes. My primary goals are flexible options and absolute correctness. The secondary goals are compact implementation size and good documentation comments. This work is an independent implementation based on reading the official ISO specification documents. I believe that my library has a more intuitive API and shorter code length than competing libraries out there. The library is designed first in Java and then ported to...
    Downloads: 17 This Week
  • 5
    The Apache Struts web framework

    Mirror of Apache Struts

    The Apache Struts web framework is a free open-source solution for creating Java web applications. Web applications differ from conventional websites in that web applications can create a dynamic response. Many websites deliver only static pages. A web application can interact with databases and business logic engines to customize a response. Web applications based on JavaServer Pages sometimes commingle database code, page design code, and control flow code. ...
    Downloads: 0 This Week
  • 6
    IntelliJ Community Edition

    IntelliJ IDEA & IntelliJ Platform

    IntelliJ Community is the open source upstream of JetBrains’ IntelliJ IDEA, forming the core of a powerful, extensible, and intelligent development environment. It provides foundational features like a robust editor with code completion, syntax highlighting, refactoring tools, version control integrations, terminal, debugger, and plugin architecture. Since it’s open, community developers can contribute to language supports, UI tweaks, and platform enhancements.
    Downloads: 2,163 This Week
  • 7
    Odigos

    Distributed tracing without code changes

    ...Manage and configure collectors via a convenient web UI. Installing Odigos takes less than 5 minutes, and requires no code changes.
    Downloads: 36 This Week
  • 8
    Brokk

    Brokk brings code intelligence to AI

    Brokk is a code intelligence assistant framework designed to let large language models (LLMs) understand code semantically (not just as raw text) so that they can work effectively on large codebases that don’t fit wholly in a prompt context. It helps bridge the gap between LLMs and real-world engineering code by offering tooling to index, analyze, query, and augment code context, so that AI can meaningfully reason about existing code, suggest edits, and navigate across projects. Modular...
    Downloads: 7 This Week
  • 9
    SQLite JDBC Driver

    SQLite JDBC Driver

    SQLite JDBC is a library for accessing and creating SQLite database files in Java through the JDBC API. It requires no configuration, since native libraries for major OSs, including Windows, Mac OS X, and Linux, are assembled into a single JAR (Java Archive) file. Usage is simple: download the sqlite-jdbc library, then add the JAR file to your classpath. SQLite supports in-memory database...
    Downloads: 492 This Week
  • 10
    Codename One

    Cross-platform framework for building truly native mobile apps

    An open-source mobile-first toolkit for building high-quality, cross-platform native apps for Android, iOS, Desktop & Web. Rapid cross-platform app development using Java or Kotlin with 100% code reuse. Apps are compiled down to native code for maximum performance and a smooth user experience. Write, debug, and test apps all inside your IDE (IntelliJ, Eclipse, VSCode or NetBeans) using the Codename One simulator.
    Downloads: 19 This Week
  • 11
    OpenRefine

    A free, open source, powerful tool for working with messy data

    OpenRefine is a powerful Java-based tool designed to work with messy data and improve it. With OpenRefine you can load data, understand it, clean it up, transform it, reconcile it, and augment it with web services and external data. It allows you to do this all from a web browser and in the convenience and privacy of your own computer. OpenRefine keeps all data securely in your computer by running a small server on it, using your web browser to interact with it. When you're ready, then that...
    Downloads: 13 This Week
  • 12
    DevoxxGenie

    DevoxxGenie is a plugin for IntelliJ IDEA that uses local LLMs

    Devoxx Genie is a fully Java-based LLM Code Assistant plugin for IntelliJ IDEA, designed to integrate with local LLM providers such as Ollama, LMStudio, GPT4All, Llama.cpp, and Exo but also cloud-based LLMs such as OpenAI, Anthropic, Mistral, Groq, Gemini, DeepInfra, DeepSeek, OpenRouter and Azure OpenAI.
    Downloads: 14 This Week
  • 13
    Eclipse Open VSX

    An open-source registry for VS Code extensions

    Open VSX is a vendor-neutral open-source alternative to the Visual Studio Marketplace. It provides a server application that manages VS Code extensions in a database, a web application similar to the VS Code Marketplace, and a command-line tool for publishing extensions similar to vsce. The default frontend is the one bundled in the Docker image, and is also used for testing in the development environment.
    Downloads: 2 This Week
  • 14
    SikuliX

    SikuliX version 2.0.0+ (2019+)

    ...It uses image recognition powered by OpenCV to identify GUI components and can act on them with mouse and keyboard actions. This is handy in cases when there is no easy access to a GUI's internals or the source code of the application or web page you want to act on.
    Downloads: 152 This Week
  • 15
    J2CL

    Java to Closure JavaScript transpiler

    J2CL is a lightweight transpiler developed by Google that converts Java source code into highly optimized JavaScript designed to work seamlessly with the Closure Compiler. It allows developers to write applications in Java while targeting web environments, enabling strong type safety and code reuse across platforms. Unlike monolithic frameworks, J2CL focuses purely on transpilation, leaving optimization, bundling, and runtime concerns to the broader toolchain, which provides flexibility in modern development workflows. ...
    Downloads: 2 This Week
  • 16
    JeecgBoot

    Low-code enterprise web development platform

    JeecgBoot is a low-code platform built on Spring Boot that accelerates enterprise application development with online forms, code generation, and a modern Vue-based frontend. It can generate CRUD screens, data dictionaries, and menu structures from database schemas, producing clean starter code that developers can extend. The platform integrates common enterprise features—RBAC permissions, data scopes, dictionary management, logging, and file/OSS integration—so teams start from a...
    Downloads: 4 This Week
  • 17
    Java JWT JSON

    Java JWT: JSON Web Token for Java and Android

    JJWT aims to be the easiest-to-use and understand library for creating and verifying JSON Web Tokens (JWTs) and JSON Web Keys (JWKs) on the JVM and Android. JJWT is a pure Java implementation based exclusively on the JOSE Working Group RFC specifications.
    Downloads: 5 This Week
  • 18
    Hutool

    A set of tools that keep Java sweet

    Hutool is a small but comprehensive Java tool class library. Through static method encapsulation, it reduces the learning cost of related APIs, improves work efficiency, makes Java as elegant as a functional language, and makes the Java language "sweet". The tools and methods in Hutool come from each user's meticulous attention to detail. It covers all aspects of the underlying code of Java development. It is not only a sharp tool to solve small problems in large-scale project development,...
    Downloads: 7 This Week
  • 19
    J2ObjC

    A Java to iOS Objective-C translation tool and runtime

    J2ObjC is an open-source command-line tool from Google that translates Java source code to Objective-C for the iOS (iPhone/iPad) platform. This tool enables Java source to be part of an iOS application's build, as no editing of the generated files is necessary. The goal is to write an app's non-UI code (such as application logic and data models) in Java, which is then shared by web apps (using GWT), Android apps, and iOS apps.
    Downloads: 3 This Week
  • 20
    AtlantaFX

    Modern JavaFX CSS theme collection with additional controls

    Modern JavaFX CSS theme collection with additional controls.
    Downloads: 12 This Week
  • 21
    AstronRPA

    Agent-ready RPA suite with visual workflow automation tools engine

    Astron RPA is an enterprise-grade robotic process automation platform designed to help organizations and developers build automated workflows for desktop and web applications. It provides a visual workflow designer that supports low-code and no-code development, allowing users to create automation processes through a drag-and-drop interface instead of writing extensive code. It enables automation of common desktop software and browser-based tasks, making it suitable for repetitive business...
    Downloads: 3 This Week
  • 22
    Testcontainers Java

    Testcontainers is a Java library that supports JUnit tests

    Testcontainers for Java is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container. Use a containerized instance of a MySQL, PostgreSQL or Oracle database to test your data access layer code for complete compatibility, but without requiring complex setup on developers' machines and safe in the knowledge that your tests will always start with a known DB state. Any other...
    Downloads: 8 This Week
  • 23
    Takes

    True object-oriented Java web framework without NULLs

    Takes is a true object-oriented and immutable Java 8 web development framework. Make sure that UTF-8 encoding is set on the command line. The entire framework relies on your default Java encoding, which is not necessarily UTF-8 by default. To be sure, always set it on the command line with the file.encoding Java argument. We decided not to hard-code "UTF-8" in our code mostly because this would be against the entire idea of Java localization, according to which a user always should have a...
    Downloads: 0 This Week
  • 24
    Jenkins

    Build great things at any scale

    Jenkins is the leading open-source automation server that allows you to build great things at any scale. Jenkins is built with Java and provides hundreds of plugins for building, deploying and automating virtually anything, allowing you to focus on more important things. Jenkins is often used for building projects, running tests, analyzing static code and deployment. Whatever is done repetitively, Jenkins can most likely execute and execute well, saving you time and optimizing your...
    Downloads: 15 This Week
  • 25
    Eclipse Che

    Next-gen container development platform, workspace server & cloud IDE

    Eclipse Che is a Kubernetes-native IDE that makes Kubernetes development accessible for development teams. It places everything a developer could need into containers in Kube pods including dependencies, embedded containerized runtimes, a web IDE, and project code. With the Kubernetes application in your development environment and an in-browser IDE, you can code, build, test and run applications exactly as they run on production from any machine.
    Downloads: 1 This Week