  • 1
    Data-Juicer

    Data processing for and with foundation models

    Data-Juicer is an open-source data processing and augmentation framework designed to enhance the quality and diversity of datasets for machine learning tasks. It includes a modular pipeline for scalable data transformation.
    Downloads: 2 This Week
  • 2
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library for advanced Natural Language Processing (NLP) in Python and Cython, built on recent research. It has been designed from the start for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration, and more. According to the project, independent benchmarks ranked spaCy as the fastest syntactic parser, with accuracy within 1% of the best available. ...
    Downloads: 92 This Week
  • 3
    HanLP

    Han Language Processing

    HanLP is a multilingual Natural Language Processing (NLP) library composed of a series of models and algorithms. Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. ...
    Downloads: 12 This Week
  • 4
    SciSpaCy

    A full spaCy pipeline and models for scientific/biomedical documents

    ScispaCy is a spaCy extension optimized for processing biomedical and scientific text, providing domain-specific NLP models for tasks like named entity recognition (NER) and dependency parsing.
    Downloads: 2 This Week
  • 5
    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    Dolma is both an open pre-training dataset and a toolkit for curating and inspecting large-scale text data for language modeling, developed alongside the OLMo models.
    Downloads: 10 This Week
  • 6
    ExtractThinker

    ExtractThinker is a Document Intelligence library for LLMs

    ExtractThinker is a document intelligence library that uses LLMs to extract and classify structured information from files and documents.
    Downloads: 7 This Week
  • 7
    deepdoctection

    A Repo For Document AI

    ...It does not implement models itself, but lets you build pipelines using well-established libraries for object detection, OCR, and selected NLP tasks, and it provides an integrated framework for fine-tuning, evaluating, and running models. For more specific text processing tasks, use one of the many other NLP libraries.
    Downloads: 3 This Week
  • 8
    Classical Language Toolkit (CLTK)

    The Classical Language Toolkit

    The Classical Language Toolkit (CLTK) is a Python library offering natural language processing support for classical languages, including Latin, Greek, and others.
    Downloads: 6 This Week
  • 9
    STORM

    An LLM-powered knowledge curation system that researches topics

    STORM is an open-source, LLM-powered knowledge curation system developed by Stanford's OVAL lab. It researches a topic through internet search and generates a long-form, Wikipedia-style report with citations.
    Downloads: 5 This Week
  • 10
    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency.
    Downloads: 6 This Week
  • 11
    MindNLP

    Easy-to-use and high-performance NLP and LLM framework

    MindNLP is a natural language processing library built on the MindSpore framework, providing tools and models for various NLP tasks.
    Downloads: 0 This Week
  • 12
    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
  • 13
    Chonkie

    The no-nonsense RAG chunking library

    Chonkie is a lightweight library for splitting text into chunks for retrieval-augmented generation (RAG) pipelines, offering several chunking strategies such as token-, sentence-, and semantic-based chunking.
    Downloads: 8 This Week
  • 14
    PaperAI

    Semantic search and workflows for medical/scientific papers

    PaperAI is an open-source framework for searching and analyzing scientific papers, particularly useful for researchers looking to extract insights from large-scale document collections.
    Downloads: 9 This Week
  • 15
    Pyreft

    ReFT: Representation Finetuning for Language Models

    pyreft is a library from Stanford NLP for representation fine-tuning (ReFT), a parameter-efficient method that learns interventions on a language model's hidden representations instead of updating its weights.
    Downloads: 5 This Week
  • 16
    BEIR

    A Heterogeneous Benchmark for Information Retrieval

    BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.
    Downloads: 6 This Week
  • 17
    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 9 This Week
  • 18
    FastRAG

    Efficient Retrieval Augmentation and Generation Framework

    fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool set for advancing retrieval augmented generation.
    Downloads: 7 This Week
  • 19
    DeepSparse

    Sparsity-aware deep learning inference runtime for CPUs

    A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).
    Downloads: 1 This Week
  • 20
    Hazm

    Persian NLP Toolkit

    Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.
    Downloads: 0 This Week
  • 21
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 8 This Week
  • 22
    NVIDIA NeMo

    Toolkit for conversational AI

    ...Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.
    Downloads: 3 This Week
  • 23
    AdalFlow

    The library to build & auto-optimize LLM applications

    AdalFlow is a library for building LLM applications, such as chatbots, RAG pipelines, and agents, and for automatically optimizing their prompts.
    Downloads: 3 This Week
  • 24
    NNCF

    Neural Network Compression Framework for enhanced OpenVINO

    NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.
    Downloads: 5 This Week
  • 25
    Detoxify

    Trained models & code to predict toxic comments

    Detoxify provides trained Transformer-based models and code for predicting whether a comment is toxic, for example to flag toxic language in online conversations.
    Downloads: 2 This Week
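
As a rough illustration of the "RAG chunking" idea named in the Chonkie entry above, the following is a minimal sketch of fixed-size chunking with overlap in plain Python. It does not use Chonkie's API or that of any other listed project: the function name `chunk_words` and its parameters are hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into overlapping chunks of whitespace-separated words.

    Hypothetical helper for illustration only; real chunking libraries
    usually count model tokens rather than words.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no chunk is made up entirely of already-covered words.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap repeats the last few words of one chunk at the start of the next, which is a common way to avoid cutting a relevant passage in half at a chunk boundary before the chunks are embedded and indexed for retrieval.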