Industrial-strength Natural Language Processing (NLP)
A natural language interface for computers
Semantic search and workflows for medical/scientific papers
ExtractThinker is a Document Intelligence library for LLMs
Han Language Processing
The no-nonsense RAG chunking library
The Classical Language Toolkit
ReFT: Representation Finetuning for Language Models
An LLM-powered knowledge curation system that researches topics
A Repo For Document AI
Trained models & code to predict toxic comments
Data and tools for generating and inspecting OLMo pre-training data
Efficient Retrieval Augmentation and Generation Framework
Easy-to-use and powerful NLP library with an awesome model zoo
Training data (data labeling, annotation, workflow) for all data types
Large Language Model Text Generation Inference
Toolkit for conversational AI
Hub of ready-to-use datasets for ML models
Build AI-powered semantic search applications
A Heterogeneous Benchmark for Information Retrieval
The library to build & auto-optimize LLM applications
Data processing for and with foundation models
Efficient few-shot learning with Sentence Transformers
Extract schema, statistics and entities from datasets
Module for automatic summarization of text documents and HTML pages