Showing 38 open source projects for "python web crawler"

  • 1
    Pholcus

    Distributed high-concurrency crawler software written in pure golang

    Pholcus is a distributed, high-concurrency crawler written in pure Go, intended only for programming study and research. It supports three operating modes (stand-alone, server, and client) and three interfaces (Web, GUI, and command line). It offers simple and flexible rules, concurrent batch tasks, and rich output methods (mysql/mongodb/kafka/csv/excel, etc.). It also supports horizontal and vertical crawling modes and a series of advanced functions such as simulated login and task suspension and cancellation. ...
    Downloads: 3 This Week
  • 2
    JupyterLab

    JupyterLab computational environment

    JupyterLab is the next-generation web-based user interface for Project Jupyter. Try it on Binder. JupyterLab follows the Jupyter Community Guides. JupyterLab enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner. You can arrange multiple documents and activities side by side in the work area using tabs and splitters. Documents and activities integrate with each other,...
    Downloads: 253 This Week
  • 3
    Render Farm Manager, Project Tracker.

    CGRU: Afanasy render farm manager and RULES project tracker.

    CGRU is an open source CG tools pack that includes the Afanasy render farm manager and the RULES project tracker.
    Downloads: 31 This Week
  • 4

    Ganglia

    Scalable, distributed monitoring system for high-performance computing

    Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters and supports clusters of up to 2000 nodes.
    Downloads: 25 This Week
  • 5
    GloVe

    GloVe model for distributed word representation

    ...It collects unigram counts, constructs and shuffles cooccurrence data, and trains a simple version of the GloVe model. It also runs a word analogy evaluation script in Python to verify word vector quality.
    Downloads: 0 This Week
  • 6
    The goal of this project is to make it possible to access a Progress database from any external program that can use sockets. The server components (broker and agents) are written in Progress 4GL and make use of the socket capabilities of Progress V9.
    Downloads: 0 This Week
  • 7

    ReorJS

    Distributed Computing with JavaScript

    Create your own distributed computer that can distribute JavaScript-based applications to any computer with a web browser, headless browser, or Node.js installation. For more information and updates, please see our website: http://reorjs.com.
    Downloads: 0 This Week
  • 8

    RainforestCluster

    Dynamically manage Amazon EC2 clusters

    RainforestCluster is a Python program for Amazon EC2 that manages and load-balances dynamic clusters to allow for maximum workflow flexibility and speed at minimal cost. It lets you quickly and cheaply create dynamic compute clusters in the cloud, which can then run computational pipelines generically. It can also optimize the use of spot instances - idle computers in Amazon's cloud that are available at drastically reduced cost (5x-10x cheaper) - but can be terminated at any...
    Downloads: 0 This Week
  • 9

    Ganglia Job Monarch

    Batch system monitoring and archiving

    Job Monarch is an add-on to the Ganglia Monitoring System that provides batch job monitoring and archiving plus a graphical overview of clusters and assorted batch systems. Fully supported batch systems: Torque, PBS, and SLURM. Experimental: LSF, SGE.
    Downloads: 0 This Week
  • 10
    Portable Linux

    Portable Ubuntu Linux for Scientific Computing

    Released August 22, 2013. Lubuntu Blends: Biochemistry 13.04 (Raring) v5.44, Linux Kernel Image 3.8.0-29. Lubuntu Blends are pre-installed Wubi disk image remixes of Ubuntu and Debian Science meta packages. A custom boot loader allows installations to be copied and automatically booted from most external or USB flash drives. Once up and running, use the earlier Lubuntu Remix README instructions until documentation is updated....
    Downloads: 4 This Week
  • 11
    WatchTower

    WatchTower is a cloud server monitoring and management tool

    WatchTower is a cloud server monitoring and management tool. This is actually a suite of tools that includes a dashboard and the associated RESTful web services required for managing the servers and services. The dashboard uses PHP/MySQL (requires PHP 5+), HTML, and CSS. It's all open source and very easy to work with and modify. The client I'm using is included and is written in Python. Currently tested with Python 2.4 and 2.6, but should work with any version. I'm not using anything special. ...
    Downloads: 0 This Week
  • 12
    Ex-Crawler
    Ex-Crawler is divided into three subprojects (crawler daemon, distributed GUI client, and web search engine) which together provide a flexible and powerful search engine supporting distributed computing. More information: http://ex-crawler.sourceforge.net
    Downloads: 0 This Week
  • 13
    Spyse is a software framework for building multi-agent systems. It allows Python developers to build distributed intelligent systems of multiple cooperative agents based on FIPA, OWL, SOA and many others. Spyse is designed for ease-of-use and fun.
    Downloads: 0 This Week
  • 14
    A highly modular client remote/web services library written in Python supporting multiple protocols and transports through a unified interface. All modules are as independent as possible from each other to ensure high re-usability.
    Downloads: 0 This Week
  • 15
    IDEAIS is an enterprise service bus integration platform for software development tools and activities. It uses Web Services (SOAP/HTTP) to integrate best-of-breed software development tools (Eclipse, Subversion, Bugzilla, dotProject, vTiger).
    Downloads: 0 This Week
  • 16
    An implementation model for unifying Aspect Oriented Programming and Service Oriented Architecture.
    Downloads: 0 This Week
  • 17
    iROS is a meta-operating system for technology-rich "interactive rooms". The core components (Event Heap, DataHeap, iCrafter) provide communication, data storage, and service management for an iRoom.
    Downloads: 0 This Week
  • 18
    GRIDportal is a web-based application portal for High Performance Computing. It facilitates easier access to GRID applications through a comfortable web interface. It was designed for use with NorduGrid, but is very generic by nature.
    Downloads: 0 This Week
  • 19
    A console-based BitTorrent client with a built-in scheduler for handling multiple sessions. It is designed to manage queued sessions easily without a heavyweight GUI. An external module can search trackers for new torrents and submit them automatically.
    Downloads: 9 This Week
  • 20
    OSE is a C++ library, with some Python wrappers, containing generic classes, as well as support for event driven systems, interprocess communications and a request/reply, publish/subscribe service agent framework with RPC over HTTP interface.
    Downloads: 17 This Week
  • 21
    phpMyLibrary is a PHP/MySQL library automation application. The program consists of the cataloging, circulation, and webpac modules, and also has an import/export feature. It strictly follows the USMARC standard for adding materials.
    Downloads: 1 This Week
  • 22
    LemFS is a distributed, redundant data storage system, designed to utilize unused disk space on networked work stations and desktop PCs.
    Downloads: 0 This Week
  • 23
    Cheshire3 is a fast Z39.50, SRW, and XML search engine, written in Python for extensibility and using C libraries for speed. It is the next generation of the Cheshire system (http://cheshire.berkeley.edu) and is designed around a distributable, object-oriented model.
    Downloads: 2 This Week
  • 24
    Dowser is a research tool for the web. It clusters results from search engines, associates words that appear in previous searches, and keeps a local cache of all the results you click on in a searchable database. It helps you keep track of what you find.
    Downloads: 1 This Week
  • 25
    A framework for software component integration, interoperability, and adoptability through an XML-based vocabulary: Software Component Integration Mark-up Language (SCIML).
    Downloads: 0 This Week