Showing 27 open source projects for "data processing"

  • 1
    ChatLab

    Local-first AI chat analysis tool for insights from conversation data

    ...ChatLab emphasizes a local-first approach, meaning all chat data is processed and stored on the user’s device rather than being uploaded to external servers. It supports large-scale datasets through streaming parsing and multi-worker processing, allowing it to handle millions of messages efficiently. ChatLab also includes visualization features that present trends, activity patterns, and interaction metrics in a clear and accessible format.
    Downloads: 7 This Week
  • 2
    Jimp

    An image processing library written entirely in JavaScript for Node

    An image processing library for Node written entirely in JavaScript, with zero native dependencies. If you're using this library with TypeScript, the method of importing differs slightly from JavaScript. Instead of using require, you must import it with the ES6 default import scheme. If you're using a web bundler (webpack, rollup, parcel), you can benefit from using the module build of jimp. Using the module build will allow your bundler to understand your code better and exclude things you aren't...
    Downloads: 10 This Week
  • 3
    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 12 This Week
  • 4
    compromise

    Modest natural-language processing

    Language is complicated and there's a gazillion words. Compromise is a JavaScript library that interprets and pre-parses text and makes some reasonable decisions so things are way easier. Compromise tries its best to parse text. It is small, quick, and often good enough. It is not as smart as you'd think. Conjugate and negate verbs in any tense. Play between plural, singular and possessive forms. Interpret plain-text numbers. Handle implicit terms. Use it on the client-side or as an...
    Downloads: 6 This Week
  • 5
    AI-Media2Doc

    AI tool converting video/audio into structured documents instantly

    ...It is designed to transform multimedia inputs into formats such as knowledge notes, summaries, mind maps, and social-style articles, making content easier to review and reuse. AI-Media2Doc emphasizes privacy by processing media locally in the browser using WebAssembly-based ffmpeg, ensuring that original video files are not uploaded externally. It separates client-side media handling from backend AI processing, reducing data exposure while still enabling transcription and document generation. AI-Media2Doc supports flexible customization through prompts, allowing users to tailor output styles based on their needs. ...
    Downloads: 9 This Week
  • 6
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    ...It integrates text mining and analytics capabilities that allow users to examine relationships, topics, and structured data within document collections.
    Downloads: 5 This Week
  • 7
    Lemon AI

    Full-stack Open-source Self-Evolving General AI Agent

    LemonAI is an open-source full-stack framework for building autonomous AI agents capable of performing complex tasks such as research, programming, data analysis, and document processing. The platform is designed to run primarily on local infrastructure, providing a privacy-focused alternative to cloud-dependent agent platforms. It integrates with local large language models through tools such as Ollama, vLLM, and other model runtimes while also allowing optional connections to external cloud models. ...
    Downloads: 7 This Week
  • 8
    Preswald

    Python tool for browser-based interactive data apps in one file

    Preswald is an open source Python-based framework and static-site generator designed for building interactive data applications that run entirely in the browser. It packages application logic, data processing, and user interface components into a single self-contained output, enabling easy sharing and deployment without requiring local dependencies. Preswald leverages a WebAssembly runtime along with technologies like Pyodide and DuckDB to execute Python code directly in the browser environment. ...
    Downloads: 4 This Week
  • 9
    Kener

    Kener is a modern self-hosted status page, batteries included

    Kener is an open-source Node.js status page tool designed to make service monitoring and incident handling a breeze. It offers a sleek and user-friendly interface that simplifies tracking service outages and improves how we communicate during incidents. And the best part? Kener integrates seamlessly with GitHub, making incident management a team effort and making it easier for us to track and fix issues together in a collaborative and friendly environment.
    Downloads: 4 This Week
  • 10
    Deep Research

    Use any LLMs (Large Language Models) for Deep Research

    Deep Research is a local-first research agent that orchestrates multiple LLMs to generate in-depth reports in minutes. It combines “thinking” and “task” model roles with live internet access to plan, search, read, and synthesize findings into structured outputs. The project emphasizes privacy: processing and storage happen locally, avoiding server-side retention of your queries and notes. A simple web UI lets you enter topics and configure models, while the backend streams progress as...
    Downloads: 4 This Week
  • 11
    Vectorize MCP Server

    Official Vectorize MCP Server

    The Vectorize MCP Server is a Model Context Protocol server that integrates with Vectorize, offering advanced vector retrieval and text extraction capabilities.
    Downloads: 1 This Week
  • 12
    Markdownify MCP Server

    Convert files and web content into clean, usable Markdown easily

    ...By standardizing content into Markdown, it helps unify inputs across different sources for better processing and integration with AI tools and developer environments.
    Downloads: 12 This Week
  • 13
    Rivet

    Visual AI IDE for building agents with prompt chains and graphs

    ...Rivet supports multiple large language model providers and integrates with services such as embeddings and transcription systems, allowing developers to create richer AI-powered features. Its architecture emphasizes composability, where different components like prompts, APIs, and data processing steps can be combined into reusable pipelines.
    Downloads: 15 This Week
  • 14
    TensorFlow

    TensorFlow is an open source library for machine learning

    Originally developed by Google for internal use, TensorFlow is an open source platform for machine learning. Available across all common operating systems (desktop, server and mobile), TensorFlow provides stable APIs for Python and C as well as APIs that are not guaranteed to be backwards compatible or are 3rd party for a variety of other languages. The platform can be easily deployed on multiple CPUs, GPUs and Google's proprietary chip, the tensor processing unit (TPU). TensorFlow...
    Downloads: 11 This Week
  • 15
    PapersGPT

    A powerful Zotero AI and MCP plugin with ChatGPT, Gemini 3.1, Claude

    PapersGPT is an AI-powered plugin that integrates directly into Zotero to transform how researchers interact with academic papers and literature collections. It enables users to chat with individual PDFs or entire collections, allowing them to extract insights, generate summaries, and explore connections between documents without leaving the Zotero environment. The plugin supports a wide range of state-of-the-art language models, including GPT, Claude, Gemini, and open-source alternatives,...
    Downloads: 4 This Week
  • 16
    AstronRPA

    Agent-ready RPA suite with visual workflow automation tools engine

    ...It enables automation of common desktop software and browser-based tasks, making it suitable for repetitive business operations and system integrations. Astron RPA includes a large library of reusable components that handle tasks such as user interface operations, data processing, and system interactions, allowing workflows to be assembled from modular building blocks. Astron RPA also integrates with intelligent agent systems so that automated processes and AI-driven workflows can work together in broader automation scenarios.
    Downloads: 3 This Week
  • 17
    RuoYi AI

    Enterprise AI platform for building, deploying, and managing apps

    RuoYi AI is a full-stack enterprise-oriented AI development platform designed to help developers rapidly build, deploy, and manage intelligent applications using modern large language models and AI ecosystems. It provides a unified framework for integrating multiple AI models from different providers, allowing teams to switch or combine models through a consistent interface without vendor lock-in. RuoYi AI includes built-in support for retrieval-augmented generation, enabling organizations...
    Downloads: 5 This Week
  • 18
    AI Deadlines

    AI conference deadline countdowns

    ...The repository includes configuration files and data sources that allow contributors to add or update conferences through pull requests, enabling community-driven maintenance.
    Downloads: 0 This Week
  • 19

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    ...The goal of MAT is not to help you configure your training engine (in the default case, the Carafe CRF system) to achieve the best possible performance on your data. MAT is for "everything else": all the tools you end up wishing you had.
    Downloads: 5 This Week
  • 20
    Parsr

    Transforms PDF, Documents and Images into Enriched Structured Data

    Parsr is an open-source document parsing tool that converts PDFs, scanned images, and other structured documents into structured, machine-readable data formats.
    Downloads: 6 This Week
  • 21
    SQLFlow

    SQL compiler bridging databases and machine learning workflows

    SQLFlow is an open source project designed to bridge the gap between traditional SQL-based data processing and modern machine learning workflows by extending SQL syntax with AI capabilities. It acts as a compiler that translates SQL programs into executable workflows, enabling users to train, evaluate, and deploy machine learning models directly from SQL statements. It integrates with multiple database engines such as MySQL, Hive, and MaxCompute, while also supporting machine learning frameworks like TensorFlow and XGBoost. ...
    Downloads: 10 This Week
  • 22
    fastText

    Library for fast text classification and representation

    FastText is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices. Text classification is a core problem in many applications, like spam detection, sentiment analysis or smart replies. In this tutorial, we describe how to build a text classifier with the fastText tool. The goal of text classification is to assign documents (such as...
    Downloads: 0 This Week
  • 23
    Rasa-UI

    Rasa UI is a frontend for the Rasa Framework

    Rasa UI is a web application built on top of, and for, Rasa. It lets you quickly and easily create and manage bots, NLU components (Regex, Examples, Entities, Intents, etc.) and Core components (Stories, Actions, Responses, etc.) through a web interface. It also provides some convenience features for Rasa, like training and loading your models, monitoring usage or viewing logs.
    Downloads: 3 This Week
  • 24
    tracking.js

    A modern approach for Computer Vision on the web

    The tracking.js library brings different computer vision algorithms and techniques into the browser environment. By using modern HTML5 specifications, we enable you to do real-time color tracking, face detection and much more, all that with a lightweight core (~7 KB) and intuitive interface. To get started, download the project. This project includes all of the tracking.js examples, source code dependencies you'll need to get started. Unzip the project somewhere on your local drive. The...
    Downloads: 0 This Week
  • 25

    WebDjVuTextEd

    Edit the OCR text layer of DjVu documents in a web browser

    WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, modify their container box with the mouse, and run a spellchecker. The program does not directly read the DjVu files; it requires exported XML text data and images. When used without a web server, you can open and save local files, but cannot take advantage of auto-save and spell checking. Note that current SVN...
    Downloads: 0 This Week