Showing 54 open source projects for "python web crawler"

  • 1
    A collection of tools for working with the comparative data analysis ontology including import/export facilities for common phylogenetic file formats, and also a triple-store framework.
    Downloads: 0 This Week
  • 2
    This software enables easy creation and sharing of district maps online.
    Downloads: 0 This Week
  • 3
    The Virtual Commons (http://commons.asu.edu) is an open software initiative devoted to computational experiments on collective action and resource governance and funded by Arizona State University's Center for Behavior, Institutions, and the Environment (http://cbie.asu.edu). NOTE: we've moved our development to GitHub at https://github.com/virtualcommons - please look for the latest versions there.
    Downloads: 0 This Week
  • 4
    PowerTalk automatically speaks Microsoft PowerPoint presentations. It is intended for presenters who find speaking difficult, for audiences that include people with visual impairments, and for fun educational uses. It uses the synthesised computer speech provided with Windows.
    Downloads: 24 This Week
  • 5
    Web-as-corpus tools in Java: a simple crawler (with Nutch and Heritrix integration), an HTML cleaner to remove boilerplate code, language recognition, and a corpus builder.
    Downloads: 0 This Week
  • 6
    Crawl a set of files, accumulating information on the temporal and spatial extent of the data in each file, for later search and retrieval.
    Downloads: 0 This Week
  • 7
    This project has moved to https://sourceforge.net/projects/synbiowave/ because the name GeneWave is a registered trademark... Please do not use this project anymore.
    Downloads: 0 This Week
  • 8
    This project aims to provide an open source fleet management system with special focus on modularity and integration.
    Downloads: 0 This Week
  • 9
    The Stats Jam project is an extension to MediaWiki that allows users to embed database queries and visualisations in their wiki pages.
    Downloads: 0 This Week
  • 10
    JLink lets users author flowcharts based on ISO 5807 and IBM standards. Developers can use JLink to add flowcharts to applications, serve a flowchart over the web as PDF or PNG, or dynamically create a flowchart with JavaScript, Python, or Ruby scripts.
    Downloads: 2 This Week
  • 11
    iDocs is an intellectual document workflow project with text-mining options.
    Downloads: 0 This Week
  • 12
    Crow - Computational Representation Of Whatever. A platform for the integration and mining of complex and distributed data. It represents cross-linked semantic web documents as a network of software objects and offers easy ways to filter and sort them.
    Downloads: 0 This Week
  • 13
    Wattos is a collection of mostly Java programs for Structural Biology and NMR Spectroscopy. Its programs analyze, annotate, parse, archive, and disseminate experimental NMR data deposited by authors worldwide into the PDB and BMRB.
    Downloads: 0 This Week
  • 14
    The goal of the zAutomation project is to design and implement hardware, firmware, and software for remote control and monitoring of physical objects using ZigBee technology and the internet. The fields of application are robotics and domotics.
    Downloads: 0 This Week
  • 15
    Design and develop recommendation and adaptive prediction engines to address eCommerce opportunities. Build a portfolio of engines by creating and porting algorithms from multiple disciplines to a usable form. Try to solve the Netflix challenge and others.
    Downloads: 0 This Week
  • 16
    Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.
    Downloads: 0 This Week
  • 17
    iROS is a meta-operating system for technology-rich "interactive rooms". The core components (Event Heap, DataHeap, iCrafter) provide communication, data storage, and service management for an iRoom.
    Downloads: 0 This Week
  • 18
    BioMa is a specimen-based Biodiversity database Manager. It is designed to store, organize, and manipulate biodiversity-related scientific data for museums, scientific collections, or research projects.
    Downloads: 0 This Week
  • 19
    A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components (Crawler, Analyzer, and Indexer) can also be used separately.
    Downloads: 0 This Week
  • 20
    A dialect of XUL implementing most of Mozilla XUL's Fourth Draft. XML User Interface Language (XUL) is a method for easily creating GUI applications. Lux XUL supports Python scripting via Jython 2.1.
    Downloads: 0 This Week
  • 21
    Pödznsatch is an open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis, and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine, and a query interface.
    Downloads: 0 This Week
  • 22
    A.I. security app. Development ceased.
    Downloads: 2 This Week
  • 23
    Python enterprise integration framework project. A powerful class library based on EAI patterns, plus a modeling and simulation tool.
    Downloads: 0 This Week
  • 24
    Healthcare Xchange Protocol for interoperable communications. Data exchange/transfer, platform independent, XML-RPC, HL7, SOAP, EDIFACT, simple, easy, authenticated, secure, transparent, no geo-restrictions, open source, peer reviewed, collaborative development.
    Downloads: 2 This Week
  • 25
    The Comparative Toxicogenomics Database (under development) will be a publicly available, web-based database of genes and proteins of human toxicological significance. It is being developed using an Oracle 9i database, Tomcat, and Python.
    Downloads: 0 This Week