Search Results for "web crawler source code" - Page 5

Showing 373 open source projects for "web crawler source code"

  • 1
    Mezzanine

    CMS framework for Django

    Mezzanine is a powerful open source content management platform built using the Django framework. In many ways it is like many other content management tools, offering an intuitive interface for managing all of your content. But Mezzanine is different in that it provides most of its functionality by default. While other platforms rely heavily on modules or reusable applications, Mezzanine comes ready with all the functionality you need, making it the more efficient choice. Mezzanine has a...
    Downloads: 5 This Week
  • 2
    Composio

    Composio equips your AI agents & LLMs

    Empower your AI agents with Composio - a platform for managing and integrating tools with LLMs & AI agents using Function Calling. Equip your agent with high-quality tools & integrations without worrying about authentication, accuracy, and reliability in a single line of code.
    Downloads: 6 This Week
  • 3
    nanochat

    The best ChatGPT that $100 can buy

    nanochat is a from-scratch, end-to-end “mini ChatGPT” that shows the entire path from raw text to a chatty web app in one small, dependency-lean codebase. The repository stitches together every stage of the lifecycle: tokenizer training, pretraining a Transformer on a large web corpus, mid-training on dialogue and multiple-choice tasks, supervised fine-tuning, optional reinforcement learning for alignment, and finally efficient inference with caching. Its north star is approachability and...
    Downloads: 1 This Week
  • 4
    Eel

    A Python library for making simple Electron-like HTML/JS GUI apps

    Eel is a little Python library for making simple Electron-like offline HTML/JS GUI apps, with full access to Python capabilities and libraries. Eel hosts a local webserver, then lets you annotate functions in Python so that they can be called from Javascript, and vice versa. Eel is designed to take the hassle out of writing short and simple GUI applications. If you are familiar with Python and web development, probably just jump to this example which picks random file names out of the given...
    Downloads: 3 This Week
  • 5
    SWE-agent

    SWE-agent takes a GitHub issue and tries to automatically fix it

    SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can resolve issues in real GitHub repositories. On the SWE-bench, the SWE-agent resolves 12.47% of issues, achieving state-of-the-art performance on the full test set. We accomplish our results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, and view, edit, and execute code files. We call this an Agent-Computer Interface (ACI).
    Downloads: 8 This Week
  • 6
    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...
    Downloads: 4 This Week
  • 7
    Mosec

    A high-performance ML model serving framework, offers dynamic batching

    Mosec is a high-performance and flexible model-serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.
    Downloads: 7 This Week
  • 8
    VectorVein

    No-code AI workflow

    Use the power of AI to build your personal knowledge base and automated workflows. No programming is needed: just drag and drop to create powerful workflows and automate tasks. VectorVein is a no-code AI workflow tool inspired by LangChain and langflow. It aims to combine the powerful capabilities of large language models with simple drag-and-drop editing so that users can make their daily workflows intelligent and automated. After the software is opened normally,...
    Downloads: 7 This Week
  • 9
    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks is a collection of tools for LLMs, such as a web browser, computer access, and a code runner.
    Downloads: 0 This Week
  • 10
    Lagent

    A lightweight framework for building LLM-based agents

    Lagent is a lightweight open-source framework designed to help developers build autonomous agents powered by large language models. The framework provides tools and abstractions that allow language models to interact with external tools, execute tasks, and perform multi-step reasoning processes. Instead of using LLMs only for text generation, Lagent enables developers to transform models into agents capable of performing actions such as retrieving data, executing code, or interacting with APIs. ...
    Downloads: 7 This Week
  • 11
    Google CTF

    Google CTF

    Google CTF is the public repository that houses most of the challenges from Google’s Capture-the-Flag competitions since 2017 and the infrastructure used to run them. It’s a learning and practice archive: competitors and educators can replay tasks across categories like pwn, reversing, crypto, web, sandboxing, and forensics. The code and binaries intentionally contain vulnerabilities—by design—so users can explore exploit chains and patching in realistic settings. The repo also includes...
    Downloads: 1 This Week
  • 12
    cheat.sh

    The only cheat sheet you need

    cheat.sh is a compact, network-accessible cheat-sheet service that serves concise examples and usage notes for hundreds of shell commands, programming languages, and tools via a simple HTTP interface. You can query it from the terminal (for example curl cht.sh/rsync or curl cheat.sh/ls) or browse the web front page; it also supports a shorthand hostname (cht.sh) and provides both online and standalone/local installation modes. The repository contains the server and client code, instructions...
    Downloads: 2 This Week
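
The `curl cht.sh/rsync` query described above maps onto a plain HTTP GET, so the same lookup can be sketched from Python with only the standard library. This is a minimal sketch under stated assumptions: the helper names `cheat_url` and `fetch_cheat_sheet` are hypothetical, the use of `https` is an assumption (the `curl` examples omit the scheme), and the curl-like `User-Agent` header is assumed to be what yields a terminal-friendly text response.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def cheat_url(topic: str, host: str = "cht.sh") -> str:
    """Build the URL that a query like `curl cht.sh/<topic>` would request."""
    # Percent-encode the topic but keep "/" so nested paths still work.
    return f"https://{host}/{quote(topic.strip('/'), safe='/')}"


def fetch_cheat_sheet(topic: str, host: str = "cht.sh", timeout: float = 10.0) -> str:
    """Fetch a cheat sheet as text. This performs a network request."""
    # Assumption: a curl-like User-Agent gets a text reply rather than HTML.
    request = Request(cheat_url(topic, host), headers={"User-Agent": "curl/8.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_cheat_sheet("rsync")` would then return the sheet as a string, and passing `host="cheat.sh"` targets the long hostname instead of the shorthand.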
  • 13
    newspaper4k

    Python library for scraping and analyzing online news articles easily

    Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and...
    Downloads: 0 This Week
  • 14
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...
    Downloads: 0 This Week
  • 15
    Gradio

    Create UIs for your machine learning model in Python in 3 minutes

    Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere! Gradio can be installed with pip. Creating a Gradio interface only requires adding a couple of lines of code to your project. You can choose from a variety of interface types to wrap your function. Gradio can be embedded in Python notebooks or presented as a webpage. A Gradio interface can automatically generate a public link you can share with colleagues that...
    Downloads: 10 This Week
  • 16
    AI Agents Masterclass

    Follow along with my AI Agents Masterclass videos

    AI Agents Masterclass is an educational open-source repository designed to teach developers how to build, train, and deploy intelligent AI agents using modern tooling and workflow patterns. The project includes structured lessons, code examples, and practical exercises that cover foundational concepts like prompt engineering, chaining agents, tool usage, plan execution, evaluation, and safety considerations.
    Downloads: 0 This Week
  • 17
    self-llm

    Tutorial tailored for Chinese-speaking beginners on rapid fine-tuning

    self-llm is an open source educational project created by the Datawhale community that serves as a practical guide for deploying, fine-tuning, and using open-source large language models on Linux systems. The repository focuses on helping beginners and developers understand how to run and customize modern LLMs locally rather than relying solely on hosted APIs. It provides step-by-step tutorials covering environment setup, model deployment, inference workflows, and efficient fine-tuning...
    Downloads: 0 This Week
  • 18
    Rhino

    On-device Speech-to-Intent engine powered by deep learning

    Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real-time. The end-to-end platform for embedding private voice AI into any software in a few lines of code. Design with no limits on top of a modular platform. Create use-case-specific voice AI models in seconds. Develop voice features with a few lines of code using intuitive and cross-platform SDKs. Deliver voice AI everywhere: on-device, mobile, web browsers,...
    Downloads: 0 This Week
  • 19
    Helium

    Lighter web automation with Python

    Helium is a Python library built on top of Selenium to make browser automation more intuitive and human-friendly. It replaces verbose boilerplate code with natural language-like API calls such as click("Login") or write("hello", into="Name"). Helium manages browser setup, waits, and teardown, enabling quick development of scripts for testing, scraping, or task automation without requiring deep Selenium knowledge.
    Downloads: 0 This Week
  • 20
    WhatsApp MCP Server

    WhatsApp MCP server enabling AI access to chats and messaging

    whatsapp-mcp is an open source Model Context Protocol (MCP) server that enables AI agents to interact directly with a user’s WhatsApp account through a structured interface. It acts as a bridge between WhatsApp and large language models, allowing controlled access to messages, chats, and contacts. whatsapp-mcp is composed of two main components: a Go-based bridge that connects to the WhatsApp Web API and stores data locally, and a Python-based MCP server that exposes tools for AI...
    Downloads: 3 This Week
  • 21
    Motor

    The async Python driver for MongoDB and Tornado or asyncio

    Motor is an asynchronous Python driver for MongoDB that enables developers to work with MongoDB using non-blocking I/O patterns, making it ideal for high-performance and scalable applications. Built on top of Python’s Tornado and asyncio frameworks, Motor lets you issue database operations without blocking the event loop, enabling concurrency in web servers, real-time systems, and microservices. It provides a familiar API surface similar to the official synchronous PyMongo driver, so you can...
    Downloads: 0 This Week
  • 22
    OWL

    Optimized Workforce Learning for General Multi-Agent Assistance

    OWL (Optimized Workforce Learning) is a sophisticated open-source framework built on the CAMEL-AI ecosystem for orchestrating teams of AI agents to collaboratively solve complex, real-world tasks with dynamic planning and automation capabilities. Unlike single-agent systems, it treats task completion as a collaborative workforce where agents take on specialized roles (planning, execution, analysis) and coordinate via a modular multi-agent architecture that supports flexible teamwork across...
    Downloads: 0 This Week
  • 23
    OSV.dev

    Open source vulnerability DB and triage service

    ...The platform includes a web UI, API, and a Go-based dependency scanner that checks software dependencies, container images, SBOMs (SPDX, CycloneDX), and Git repositories for known vulnerabilities. This repository contains the full infrastructure code for deploying osv.dev on Google Cloud Platform, including Terraform configurations, APIs, data pipelines, indexers, and background workers for vulnerability ingestion and impact analysis.
    Downloads: 2 This Week
  • 24
    django-helpdesk

    A Django application to manage tickets for an internal helpdesk

    django-helpdesk is a Django application to manage tickets for an internal helpdesk. It was formerly known as Jutda Helpdesk, named after the company which originally created it. As of January 2011 the name has been changed to reflect what it really is: a Django-powered ticket tracker with contributors reaching far beyond Jutda. django-helpdesk includes a basic demo Django project so that you may easily get started with testing or developing django-helpdesk. The...
    Downloads: 2 This Week
  • 25
    GitDiagram

    AI tool that converts GitHub repositories into interactive diagrams

    GitDiagram is an open source web application designed to help developers quickly understand the structure and architecture of GitHub repositories by automatically generating interactive diagrams. It analyzes repository metadata such as the file tree and project documentation to build a visual representation of how different components of a project relate to one another. It uses an AI-powered pipeline to interpret repository structure and transform that information into system design diagrams...
    Downloads: 0 This Week