nanoGPT is a minimal yet capable reimplementation of GPT-style transformers, created by Andrej Karpathy for educational and research use. It distills the GPT architecture into a few hundred lines of Python, which makes it far easier to follow than large, production-scale implementations. The repository provides a training pipeline (dataset preprocessing, model definition, optimizer, and training loop) and an inference script, so you can train a small GPT on text datasets such as Shakespeare or on a custom corpus. The code favors readability: the training loop is written plainly and heavy abstractions are avoided, so students can follow the architecture step by step. Despite its simplicity, it can train non-trivial models on modern GPUs and generate coherent text. The project is widely used in tutorials, courses, and experiments by people learning how transformers work under the hood.
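
As a rough illustration of the dataset preprocessing step mentioned above, the sketch below shows one simple way to turn raw text into integer token ids at the character level and split them into training and validation sets. The function names, the 90/10 split, and the sample string are assumptions made for this example; they are not taken from the project's own scripts.

```python
# Minimal character-level preprocessing sketch.
# All names here are hypothetical and chosen for illustration only.

def build_vocab(text):
    """Map every distinct character to an integer id, and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    """Convert a string into a list of integer ids."""
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    """Convert a list of integer ids back into a string."""
    return "".join(itos[i] for i in ids)

def train_val_split(ids, train_fraction=0.9):
    """Split the id sequence into a training part and a validation part."""
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# A short stand-in corpus; a real run would read a text file instead.
corpus = "To be, or not to be"
stoi, itos = build_vocab(corpus)
ids = encode(corpus, stoi)
train_ids, val_ids = train_val_split(ids)
```

A character-level vocabulary keeps the example dependency-free; a subword tokenizer could be substituted without changing the overall flow of encode, split, and train.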

Features

  • Compact GPT transformer implementation in plain Python/PyTorch
  • Data preprocessing pipeline for text datasets (e.g. Shakespeare)
  • Training loop with clear optimizer and scheduler setup
  • Inference script for text generation after training
  • Readable, educational codebase (a few hundred lines)
  • Supports running on modern GPUs for small to mid-sized models
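
To make the "scheduler setup" item above more concrete, here is a small sketch of a learning-rate schedule of the kind often used for GPT training: linear warmup followed by cosine decay to a floor. It is an assumption that this shape matches the project's configuration; the function name, parameter names, and default values are hypothetical and chosen only for illustration.

```python
import math

def lr_at(step, max_lr=6e-4, min_lr=6e-5, warmup_steps=100, decay_steps=1000):
    """Return the learning rate for a given training step.

    Hypothetical schedule: linear warmup, then cosine decay, then a constant floor.
    """
    # 1) Linear warmup from 0 up to max_lr.
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    # 2) After the decay window, hold the minimum learning rate.
    if step > decay_steps:
        return min_lr
    # 3) In between, follow half a cosine wave from max_lr down to min_lr.
    progress = (step - warmup_steps) / (decay_steps - warmup_steps)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return min_lr + coeff * (max_lr - min_lr)
```

In a training loop, the returned value would typically be written into each optimizer parameter group before the optimizer step.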

License

MIT License

Additional Project Details

Operating Systems

Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software, Python Research Software

Registered

2025-10-01