Search Results for "unstructured data" - Page 2

Showing 48 open source projects for "unstructured data"

  • 1
    LangKit

    An open-source toolkit for monitoring Large Language Models (LLMs)

    LangKit is an open-source text metrics toolkit for monitoring language models. It offers an array of methods for extracting relevant signals from the input and/or output text, which are compatible with the open-source data logging library whylogs. Productionizing language models, including LLMs, comes with a range of risks due to the infinite number of input combinations, which can elicit an infinite number of outputs. The unstructured nature of text poses a challenge in the ML observability space, and it is a challenge worth solving, since a lack of visibility into the model's behavior can have serious consequences.
    Downloads: 2 This Week
  • 2
    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions...
    Downloads: 5 This Week
  • 3
    DocArray

    The data structure for multimodal data

    DocArray is a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh, etc. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer multimodal data with a Pythonic API. Door to multimodal world: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data.
    Downloads: 0 This Week
  • 4
    Milvus Bootcamp

    Dealing with all unstructured data, such as reverse image search

    Milvus Bootcamp is a collection of tutorials, examples, and best practices for using Milvus, an open-source vector database designed for AI-powered similarity search and retrieval applications.
    Downloads: 0 This Week
  • 5
    AI Powered Knowledge Graph Generator

    AI Powered Knowledge Graph Generator

    AI-Powered Knowledge Graph is an open-source project focused on building knowledge graph systems that integrate artificial intelligence and machine learning to represent complex relationships between data entities. Knowledge graphs organize information as networks of nodes and relationships, allowing applications to analyze connections between concepts, datasets, or real-world entities. By incorporating AI techniques such as natural language processing and semantic reasoning, the project...
    Downloads: 0 This Week
  • 6
    MindsDB

    Making Enterprise Data Intelligent and Responsive for AI

    MindsDB is an AI data solution that enables humans, AI, agents, and applications to query data in natural language and SQL, and get highly accurate answers across disparate data sources and types. MindsDB connects to diverse data sources and applications, and unifies petabyte-scale structured and unstructured data. Powered by an industry-first cognitive engine that can operate anywhere (on-prem, VPC, serverless), it empowers both humans and AI with highly informed decision-making capabilities. ...
    Downloads: 1 This Week
  • 7
    docext

    An on-premises, OCR-free unstructured data extraction toolkit

    docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual...
    Downloads: 2 This Week
  • 8
    spacy-llm

    Integrating LLMs into structured NLP pipelines

    ...With only a few (and sometimes no) examples, an LLM can be prompted to perform custom NLP tasks such as text categorization, named entity recognition, coreference resolution, information extraction and more. This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various NLP tasks, no training data required.
    Downloads: 2 This Week
  • 9
    Wanwu AI Agent Platform

    Enterprise AI agent platform for workflows, models, and RAG apps

    ...It includes comprehensive model lifecycle management capabilities, enabling users to configure, monitor, and manage different models efficiently. Wanwu also supports knowledge base construction, allowing organizations to incorporate structured and unstructured data into their AI applications. With a focus on openness and extensibility, it encourages developers to build on top of its ecosystem while maintaining a secure and compliant architecture for business use cases.
    Downloads: 3 This Week
  • 10
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. ...
    Downloads: 4 This Week
  • 11
    marqo

    Tensor search for humans

    ...Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and text-to-image search and analytics. Marqo adapts and stores your data in a fully schemaless manner. It combines tensor search with a query DSL that provides efficient pre-filtering. Tensor search allows you to go beyond keyword matching and search based on the meaning of text, images and other unstructured data. Be a part of the tribe and help us revolutionize the future of search. Whether you are a contributor, a user, or simply have questions about Marqo, we got your back.
    Downloads: 0 This Week
  • 12
    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 0 This Week
  • 13
    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents. Built using FastAPI and the LangChain framework, the application exposes a REST API that can process documents and return structured outputs that match user-defined JSON schemas. ...
    Downloads: 0 This Week
  • 14
    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. ...
    Downloads: 0 This Week
  • 15
    Obsei

    Obsei is a low code AI powered automation tool

    Obsei is an automated no-code/low-code AI-powered text observation and analysis framework, designed for extracting insights from unstructured text data such as social media, reviews, and logs.
    Downloads: 3 This Week
  • 16
    PromethAI

    Open-source framework that gives you AI Agents

    PromethAI-Backend is a backend framework for AI-driven automation and knowledge extraction. It is designed to integrate with large language models (LLMs) to provide AI-enhanced workflows, including content generation, summarization, and data analysis.
    Downloads: 0 This Week
  • 17
    Pytorch Points 3D

    PyTorch framework for doing deep learning on point clouds

    Torch Points 3D is a framework for developing and testing common deep learning models to solve tasks related to unstructured 3D spatial data, i.e., point clouds. The framework currently integrates some of the best published architectures along with the most common public datasets for ease of reproducibility. It relies heavily on PyTorch Geometric and Facebook's Hydra library; thanks for the great work! We aim to build a tool that can be used for benchmarking SOTA models, while also allowing practitioners to efficiently pursue research into point cloud analysis, with the end goal of building models that can be applied to real-life applications. ...
    Downloads: 0 This Week
  • 18
    cocoNLP

    A Chinese information extraction tool

    ...Instead of requiring a heavy pipeline, it focuses on quick wins such as extracting names, places, organizations, emails, phone numbers, and dates directly from unstructured sentences. The project blends pattern-based methods with NLP heuristics, giving developers dependable results for real-world texts like chats, comments, and user-generated content. Its API is intentionally simple, so you can drop it into scripts, ETL jobs, or dashboards without deep ML expertise. Because it aims at utility over complexity, it’s useful for prototyping data products or building lightweight text analytics where large models would be overkill. ...
    Downloads: 0 This Week
  • 19
    Advanced Numerical Instruments 2D

    Advanced numerical instruments: adaptive meshing, FE methods, solvers

    Ani2D provides portable libraries for each step in the numerical solution of systems of PDEs with variable tensorial coefficients: (1) unstructured adaptive mesh generation, (2) metric-based mesh adaptation, (3) finite element discretization and interpolation, (4) algebraic solvers.
    Downloads: 1 This Week
  • 20
    Twisted Storage is open source software that converts any number of storage systems, legacy or green-field, into a single petabyte-scale cloud. A Twisted Storage cloud is ideal for unstructured data, digital media storage, and archiving.
    Downloads: 0 This Week
  • 21
    Securely store unstructured data in (keyword, record) pairs. Both keywords and record data are encrypted. Ideal for storing passwords, account information, etc. Thus, srd is a general-purpose password vault or a secure Rolodex application.
    Downloads: 0 This Week
  • 22
    bitHull is a simple unstructured data store-and-share mechanism. It is part experimental graph-based task/note/idea management system and part data aggregator.
    Downloads: 0 This Week
  • 23
    litersta

    Litersta - textual analytics - software

    Unstructured text is no match for Litersta. See further details here: https://litersta.com. Working with text becomes effortless when paired with Litersta textual analytics software. Unlike database fields, which are easily queried, text contains unstructured data that must be parsed for key objects that can be transformed into powerful metrics.
    Downloads: 0 This Week