MII makes low-latency and high-throughput inference possible
Topic Modelling for Humans
Implementation for MatMul-free LM
Big Model Application Development Practice 1
Book_4_Matrix Power | The Iris Book: From Addition, Subtraction
BitNet: Scaling 1-bit Transformers for Large Language Models
Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
A simple, easy-to-hack GraphRAG implementation
A PyTorch-based Speech Toolkit
Superfast AI decision making and processing of multi-modal data
Tensor search for humans
High-performance Inference and Deployment Toolkit for LLMs and VLMs
GLM-4 series: Open Multilingual Multimodal Chat LMs
Revolutionizing Database Interactions with Private LLM Technology
SOTA discrete acoustic codec models with 40/75 tokens per second
21 Lessons, Get Started Building with Generative AI
Hindsight: Agent Memory That Learns
GeoAI: Artificial Intelligence for Geospatial Data
Composable transformations of Python+NumPy programs
Superduper: Integrate AI models and machine learning workflows
Data Infrastructure providing an approach to multimodal AI workloads
Multimodal embedding and reranking models built on Qwen3-VL
Designed for text embedding and ranking tasks
A tool for learning vector representations of words and entities
Open-source tools for prompt testing and experimentation