Showing 213 open source projects for "web crawler source code"

  • 1
    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a simple but scalable crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence. It simplifies the development of a specific crawler. WebMagic has a simple core with high flexibility and a simple API for HTML extraction. It also provides annotations with POJOs to customize a crawler, and no configuration is needed. Some other...
    Downloads: 5 This Week
  • 2
    fess

    Open source enterprise search server for websites, files, and data

    ...Fess includes a built-in crawler that can collect content from sources such as databases, CSV files, and shared storage, making it suitable for centralized knowledge discovery. It supports indexing and searching across many document formats including office documents, PDFs, and compressed archives. It also provides a web-based administrative interface that allows administrators to configure crawling targets, manage indexing tasks, and adjust search settings from a graphical dashboard.
    Downloads: 13 This Week
  • 3
    Takes

    True object-oriented Java web framework without NULLs

    Takes is a true object-oriented and immutable Java 8 web development framework. Make sure that UTF-8 encoding is set on the command line. The entire framework relies on your default Java encoding, which is not necessarily UTF-8. To be sure, always set it on the command line with the file.encoding Java argument. We decided not to hard-code "UTF-8" in our code mostly because this would be against the entire idea of Java localization, according to which a user should always have a...
    Downloads: 0 This Week
  • 4
    Eclipse Che

    Next-gen container development platform, workspace server & cloud IDE

    Eclipse Che is a Kubernetes-native IDE that makes Kubernetes development accessible for development teams. It places everything a developer could need into containers in Kube pods including dependencies, embedded containerized runtimes, a web IDE, and project code. With the Kubernetes application in your development environment and an in-browser IDE, you can code, build, test and run applications exactly as they run on production from any machine.
    Downloads: 1 This Week
  • 5
    HTTP Kit

    Clojure HTTP server/client library with WebSocket support

    http-kit is a minimalist, event-driven, high-performance Clojure HTTP server/client library with WebSocket and asynchronous support. http-kit is an (almost) drop-in replacement for the standard Ring Jetty adapter, so you can use it with all your current libraries (e.g. Compojure) and middleware. Using an event-driven architecture like Nginx, http-kit is very, very fast. It comfortably handles tens of thousands of...
    Downloads: 7 This Week
  • 6
    Lobo Evolution - Java Web Browser

    Lobo Evolution is an extensible all-Java web browser and RIA platform

    ...I'm waiting for your first commit! Source code: https://github.com/LoboEvolution/LoboEvolution
    Downloads: 8 This Week
  • 7
    Brill Software

    A faster way to develop React Web Applications

    The Brill Framework allows React web applications to be built quickly using a "Low Code" approach. A Content Management System (CMS) supports editing of pages containing React components. The React components communicate with each other and the server using middleware that's based on WebSockets. With a "No Code" solution, there's always something you require that's not supported. You spend ages bending the product to your requirements or pay the supplier to provide the components...
    Downloads: 0 This Week
  • 8
    GeoNetwork opensource - Metadata Catalog
    ...You can also connect directly with the companies supporting the development. Source code is available on GitHub: https://github.com/geonetwork/
    Downloads: 172 This Week
  • 9
    JForum2

    Bug fixes and enhancements for JForum 2.x

    JForum is a powerful and robust discussion board system implemented in Java. It provides an attractive interface, an efficient forum engine, an easy-to-use administrative panel, an advanced permission control system and much more. Built from the ground up around an MVC framework, it can be deployed on any Servlet 3.1 container or application server running at least Java 8, such as Tomcat, Jetty and JBoss/WildFly. Its clean design and implementation make JForum easy to customize and...
    Downloads: 66 This Week
  • 10
    JVx - Enterprise Application Framework

    Java Application Framework

    Develop professional, high-performance database applications with little source code. JVx is a full-stack application framework for creating multi-tier applications with Single Sourcing for different technologies (Swing, Vaadin, React, ...).
    Nightly builds are available: https://dev.sibvisions.com/jvx.nightly/
    Maven snapshots are available: https://oss.sonatype.org/content/repositories/snapshots
    Eclipse plugin is available: http://marketplace.eclipse.org/search/site/eplug
    Downloads: 8 This Week
  • 11
    MGB OpenSource Guestbook
    MGB is a free open source guestbook written entirely in PHP, using JavaScript and a MySQL database. It is easy to use, flexible, and customizable with templates so that it fits your homepage 100%.
    Downloads: 20 This Week
  • 12
    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium. Use docker-compose for a one-click startup. By doing so, you don't even have to configure the MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate...
    Downloads: 7 This Week
  • 13
    Stripes
    Stripes is a Java web framework with the goal of making Servlet/JSP-based web development in Java as easy, intuitive and straightforward as possible. Stripes has always been guided by the following principles:
      * Convention over configuration (CoC).
      * Extremely lightweight with very few external dependencies.
      * Quick, iterative code/deploy/test experience for developers.
      * Application stack agnostic. Developers can integrate Stripes into their existing application stacks.
      * Do a few...
    Downloads: 0 This Week
  • 14
    ACHE Focused Crawler

    ACHE is a web crawler for domain-specific search

    ACHE is a focused web crawler. It collects web pages that satisfy some specific criteria, e.g., pages that belong to a given domain or that contain a user-specified pattern. ACHE differs from generic crawlers in the sense that it uses page classifiers to distinguish between relevant and irrelevant pages in a given domain. A page classifier can be defined as a simple regular expression (e.g., one that matches every page that contains a specific word) or a machine-learning-based classification model....
    Downloads: 0 This Week
  • 15
    ftserver-android

    Self-hosted search engine with web service to share discoveries with

    Full-text search engine for Android mobile, Windows desktop, and Linux server. You can use keywords to find related websites, dig into important information, and search for answers. It has a built-in web server that you can use to share discoveries with other people. The app's source code is included and can be freely distributed over the internet in unchanged or changed form. Check the file size after downloading the Android APK: https://sourceforge.net/projects/ftserver-android/files/ The code repository includes the FTServer Android version source code (Android), the FTServer Java server version source code (Linux, Windows), and the FTServer .NET server version source code (Linux, Windows): https://sourceforge.net/p/ftserver-android/code/
    Downloads: 1 This Week
  • 16
    JAMon API

    Monitor Java applications - SQL, HTTP, Methods, Exceptions and more.

    JAMon API is a free, simple, high-performance, thread-safe Java API that allows developers to easily monitor the performance and scalability of production applications. JAMon tracks hits, execution times (total, avg, min, max, std dev), and more. * JAMon User's Manual: For more on JAMon, including installing, configuring, and using it, see http://jamonapi.sourceforge.net/. * Support: If you have any questions about usage, please post a question on the forum at ...
    Downloads: 44 This Week
  • 17
    SiteofSiteIDE

    Static site IDE is a static site generator, aka static site editor

    A static website generator used instead of PHP/ASP for maximum speed (a factor valued by SEO strategies). In reality, a minimum of PHP/ASP code is used to detect the browser language and manage cookies. Support for the GDPR is included as an example (it should be modified according to how the website owner processes data).
    Downloads: 1 This Week
  • 18
    JDynamiTe, Dynamic Template in Java

    Dynamically generate documents from templates

    JDynamiTe is a tool that allows you to dynamically create documents in any format from "template" documents. Very few lines of code (or none at all!) are needed to do that. Some typical usage domains of JDynamiTe are: - dynamic web page creation, - text document generation, - source code generation... In fact, it can be useful in any case where pre-defined documents (templates) have to be dynamically populated with data. The main benefit of JDynamiTe is to allow a true separation between data (content), presentation (container) and content generation code (written in Java). ...
    Downloads: 0 This Week
  • 19
    This is software to create web dictionaries, esp. for Esperanto like Reta Vortaro (http://reta-vortaro.de). A dictionary is made from articles written in a special XML dialect by transformations using XSLT, ant and some Java code.
    Downloads: 0 This Week
  • 20
    lxspider

    Educational Python web scraping case collection for many sites

    lxSpider is a collection of web scraping examples designed primarily for learning and experimentation with data extraction techniques. It gathers numerous crawler implementations that demonstrate how to collect data from a wide range of websites and online services. It focuses heavily on practical cases that illustrate how different platforms handle requests, authentication parameters, and anti-scraping protections. lxSpider includes examples targeting areas such as e-commerce platforms,...
    Downloads: 0 This Week
  • 21
    SonarQube Plugin for Swift

    Open source Swift plugin for SonarQube (also supports Objective-C)

    sonar-swift is an open-source plugin for integrating Swift and Objective-C code analysis into SonarQube. It allows developers to detect code quality issues, bugs, and vulnerabilities in iOS and macOS projects, offering automated insights to improve and maintain code standards.
    Downloads: 5 This Week
  • 22
    Muon SSH Terminal/SFTP client

    Graphical SFTP client and terminal emulator with helpful utilities

    Easy and fun way to work with remote servers over SSH. This project is being renamed because the previous name, "Snowflake", is confusing: there is already a popular product with the same name. Muon is a graphical SSH client. It has an enhanced SFTP file browser, SSH terminal emulator, remote resource/process manager, server disk space analyzer, remote text editor, huge remote log viewer, and lots of other helpful tools, which make it easy to work with remote servers. Muon provides functionality...
    Downloads: 8 This Week
  • 23
    Cicada

    Fast lightweight HTTP service framework

    Fast, lightweight web framework based on Netty, with few dependencies; the core jar package is only 30KB. Configuration files can also be read in multiple environments: just add VM parameters and ensure that the parameter name and file name are consistent.
    Downloads: 0 This Week
  • 24
    Cerberus Content Management System

    Cerberus Content Management System is a Monolithic and Modular Content Management System that is written in 100% Pure PHP code with 100% Pure HTML output, and it supports multiple Database Management Systems. Cerberus Content Management System source code is completely handwritten by the author(s). The CerberusCMS project is focused on data security and ease of use, therefore we have decided to make very little use of JavaScript in the PurePHP Releases. The still-secure, and...
    Downloads: 1 This Week
  • 25
    panFMP
    panFMP is a generic framework for making harvested XML metadata searchable through Apache Lucene without any additional RDBMS. Fields can be defined by XPath, allowing for full-text queries on all types of fields, including numerical ranges. The code was moved to GitHub: https://github.com/pangaea-data-publisher/panfmp
    Downloads: 0 This Week