Showing 23 open source projects for "text pattern analyser"

  • 1
    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source OCR (optical character recognition) engine and command line program. OCR is a technology that recognizes text characters within a digital image. The latest version of Tesseract places a greater focus on line recognition; however, it still supports the legacy Tesseract OCR engine, which recognizes character patterns. Tesseract can recognize over 100 languages out-of-the-box, and can be trained to recognize other languages. It supports...
    Downloads: 3,184 This Week
  • 2
    Regex

    Generate matching and non matching strings based on regex patterns

    Generate matching and non-matching strings. This is a Java library that, given a regex pattern, generates matching strings. Iterate through unique matching strings. Generate non-matching strings. Follow the link to the online IDE with a prepared project: JDoodle. Enter your pattern and see the results. By design, the a+, a* and a{n,} patterns in regex imply that an infinite number of characters should be matched. When generating data, that would mean values of infinite length might be...
    Downloads: 1 This Week
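
The point about unbounded quantifiers can be illustrated with a minimal Python sketch. This is not the library's API (the library is written in Java); `bounded_repeats` and its `cap` parameter are hypothetical names, and the idea of capping the repetition count is an assumption about how a generator could avoid infinite output.

```python
import re


def bounded_repeats(token, minimum, cap):
    """Yield `token` repeated minimum..cap times.

    `cap` is an assumed upper limit that stands in for the missing
    upper bound of quantifiers such as a+, a* or a{n,}.
    """
    for count in range(minimum, cap + 1):
        yield token * count


# a+ means "one or more a"; cap the expansion at 4 repetitions.
samples = list(bounded_repeats("a", 1, 4))

# Every generated string still matches the original, unbounded pattern.
assert all(re.fullmatch(r"a+", s) for s in samples)
print(samples)
```

Any finite cap yields a subset of the strings the pattern accepts, so the output stays valid while remaining enumerable.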
  • 3
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 2 This Week
  • 4
    IronClaw

    IronClaw is OpenClaw-inspired but focused on privacy and security

    IronClaw is a security-first, open-source personal AI assistant built in Rust and designed to keep your data fully under your control. It operates on the principle that your AI should work for you, not external vendors, ensuring all data is stored locally, encrypted, and never shared. The platform emphasizes transparency, offering auditable code with no hidden telemetry or data harvesting. IronClaw runs untrusted tools inside isolated WebAssembly (WASM) sandboxes with strict capability-based...
    Downloads: 56 This Week
  • 5
    Matter AI

    Matter AI is an open-source AI code reviewer agent

    Matter AI is an AI-powered platform designed to enhance productivity through automated content generation, data analysis, and decision support. It leverages machine learning models to process text, analyze patterns, and generate insights, making it suitable for businesses looking to optimize data-driven decision-making. Matter AI integrates with various data sources and provides customizable AI workflows tailored to different industries.
    Downloads: 0 This Week
  • 6
    KubeAI

    Private Open AI on Kubernetes

    Get inferencing running on Kubernetes: LLMs, Embeddings, Speech-to-Text. KubeAI serves an OpenAI compatible HTTP API. Admins can configure ML models by using the Model Kubernetes Custom Resources. KubeAI can be thought of as a Model Operator (See Operator Pattern) that manages vLLM and Ollama servers.
    Downloads: 0 This Week
  • 7
    yek

    Serialize repositories into LLM-ready context w/ smart prioritization

    Yek is a Rust-based CLI tool designed to serialize text-based files from a repository or directory into a single structured output for large language model use. It scans projects using .gitignore rules to exclude irrelevant files and automatically filters out binary or oversized content. Yek prioritizes files based on Git history, placing more important content later in the output to align with how language models process context. Yek supports multiple directories, individual files, and glob...
    Downloads: 1 This Week
  • 8
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. Fast deployment to Kubernetes, Docker Compose and Jina Cloud. ...
    Downloads: 0 This Week
  • 9
    Pattern

    Web mining module for Python, with tools for scraping

    Pattern is an open-source Python library that provides tools for web mining, natural language processing, machine learning, and network analysis. The project integrates multiple capabilities into a single framework that allows developers to collect, process, and analyze textual data from the web. It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases.
    Downloads: 0 This Week
  • 10
    Pattern Recognition and Machine Learning

    Repository of notes, code and notebooks in Python

    Pattern Recognition and Machine Learning is an open-source repository that provides Python implementations and interactive notebooks for algorithms presented in the book Pattern Recognition and Machine Learning by Christopher Bishop. The project recreates many of the mathematical concepts and diagrams from the book using executable Jupyter notebooks, allowing readers to experiment directly with the algorithms described in the text.
    Downloads: 0 This Week
  • 11
    NLP.js

    An NLP library for building bots

    NLP.js is an NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and much more. "NLP.js" is a general natural language utility for Node.js. Find the substring of a string with the smallest Levenshtein distance to a given pattern. Get stemmers and tokenizers for several languages. Sentiment analysis for phrases (with negation support). Named entity recognition and management, with multi-language support and acceptance of similar strings, so the entered text does not need to be exact. Natural language processing classifier, to classify an utterance into intents. ...
    Downloads: 0 This Week
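
The substring search by Levenshtein distance mentioned above can be sketched in Python. This is a brute-force illustration of the idea, not NLP.js code or its API; `levenshtein` and `best_substring` are hypothetical helper names, and the exhaustive scan over all substrings is an assumption chosen for clarity rather than speed.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_substring(text, pattern):
    """Return (substring, distance) for the substring of text closest to pattern.

    Ties are resolved in favour of the first substring found.
    """
    # The empty substring costs len(pattern) insertions.
    best = ("", len(pattern))
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            candidate = text[start:end]
            distance = levenshtein(candidate, pattern)
            if distance < best[1]:
                best = (candidate, distance)
    return best


print(best_substring("the quick brown fox", "quick"))
```

The scan evaluates every substring, so it is only practical for short inputs; a dynamic-programming approximate matcher would avoid the repeated distance computations.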
  • 12
    cocoNLP

    A Chinese information extraction tool

    cocoNLP is a lightweight natural-language processing toolkit geared toward practical information extraction from raw text, especially for Chinese and mixed Chinese–English content. Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. ...
    Downloads: 0 This Week
  • 13
    Welsh Natural Language Toolkit
    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules: a) Word Segmentation, for separating text into words; b) Sentence Boundary Disambiguation, for finding sentence boundaries; c) Part of Speech Tagger, for determining the part of speech of each word; d) Morphological Analyser, for identifying the root form (lemma) of words. The modules are written in Java and 'wrapped' for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
  • 14

    First-Algerian-Sentiment-Analyser

    Sentiment Analysis System for Vernacular Algerian Language

    This project is a free, GPL-licensed, lexicon-based sentiment analysis system for the vernacular Algerian language. It contains 4 lexicons (L1, L2, L3 and L4) and a data set. It aims to give the polarity and the subjectivity of a given text.
    Downloads: 0 This Week
  • 15
    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules: a) Word Segmentation, for separating text into words; b) Sentence Boundary Disambiguation, for finding sentence boundaries; c) Part of Speech Tagger, for determining the part of speech of each word; d) Morphological Analyser, for identifying the root form (lemma) of words. The modules are written in Java and 'wrapped' for execution under the General Architecture for Text Engineering (GATE) framework. ...
    Downloads: 0 This Week
  • 16
    Vision2u

    Free image processing software

    Vision2u offers free image processing software for personal use and research. Basic image processing tasks can be carried out through simple operation of the software. Any webcam owner can perform simple measuring, counting, or monitoring tasks without a large capital outlay.
    Downloads: 1 This Week
  • 17
    OntologyManager

    Ontology-Manager is an ontological application system

    Ontology-Manager is an ontological application system; that is, an application system consisting of an ontological database (a NoSQL database with an ontological data model) and a growing collection of modules. The modules support interactive and automated work with the ontological data.
    Downloads: 0 This Week
  • 18
    DrawPad

    Pattern recognition tool for images, PDFs and handwriting

    The tool is an optical recognition tool that runs in the following three modes: 1. Drawing Pad: the user draws a character and the tool recognizes which character it is. 2. Image OCR: an image-based OCR tool that recognizes text and barcodes present in the image. It also supports saving the OCR output. 3. PDF OCR: an advanced form of OCR in which the PDF is parsed into images and OCR is run on the result. At present, PDF OCR is at a low level of maturity.
    Downloads: 0 This Week
  • 19
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and translation. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
    Downloads: 0 This Week
  • 20
    TexLexAn is an open source text analyser for Linux, able to estimate the readability and reading time, to classify and summarize texts. It has some learning abilities and accepts html, doc, pdf, ppt, odt and txt documents. Written in C and Python.
    Downloads: 0 This Week
  • 21
    Collection of Statistical Language Processing Tools and Modules for Information Retrieval, Document Classification, Vectorization, Pattern Matching, Knowledge/Text Mining related problems.
    Downloads: 0 This Week
  • 22
    Neural network based pattern recognition software. The aim is to provide a program that can find pictures from a line of text input, for example: find all pictures with a red car.
    Downloads: 0 This Week
  • 23
    GraphSpider is a pattern matcher which searches parsed text in phrase-structure tree or dependency graph format for syntactic structures matching a set of patterns in MPL, a regexp-like pattern language. Applications: information extraction, text mining.
    Downloads: 0 This Week