What does this repo do, in one paragraph?
About karpathy/nanogpt · 2 messages · last activity 5/22/2026
nanoGPT is a minimal, fast codebase for training and fine-tuning medium-sized GPT models. It favors simplicity and speed: the training loop lives in a roughly 300-line train.py, and the GPT model definition in a roughly 300-line model.py. The repository can reproduce GPT-2 (124M parameters) on OpenWebText in about 4 days on a single node with eight A100 40GB GPUs. It also supports training character-level models on small datasets such as Shakespeare, and it can load OpenAI's pre-trained GPT-2 weights. Because the code is short and easy to modify, it suits researchers and practitioners who want to train new models from scratch or fine-tune existing checkpoints without extra complexity.
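To make the "124M parameters" figure concrete, here is a rough sketch that counts the weights of a GPT-2-small-shaped transformer by hand. It is not code from the repository. The class name GPT2SmallShape and the function count_params are hypothetical, and the sketch assumes the published GPT-2 small hyperparameters (12 layers, 768-dimensional embeddings, a 50257-token vocabulary, a 1024-token context), bias terms on every linear layer and LayerNorm, and an output head that shares its weights with the token embedding.

```python
from dataclasses import dataclass


@dataclass
class GPT2SmallShape:
    """Hypothetical container for the published GPT-2 small hyperparameters."""
    n_layer: int = 12        # transformer blocks
    n_embd: int = 768        # embedding / hidden width
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    block_size: int = 1024   # maximum context length


def count_params(cfg: GPT2SmallShape) -> int:
    """Count weights, assuming biases everywhere and a tied output head."""
    d = cfg.n_embd
    tok_emb = cfg.vocab_size * d   # token embedding (shared with output head)
    pos_emb = cfg.block_size * d   # learned position embedding
    layer_norm = 2 * d             # scale + bias
    # Attention: fused q/k/v projection plus the output projection.
    attn = (d * 3 * d + 3 * d) + (d * d + d)
    # MLP: expand to 4*d, then project back to d.
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    block = 2 * layer_norm + attn + mlp
    # One final LayerNorm follows the last block.
    return tok_emb + pos_emb + cfg.n_layer * block + layer_norm


total = count_params(GPT2SmallShape())
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the count comes to 124,439,808, which rounds to the 124M quoted above. Changing n_layer or n_embd in the dataclass shows how quickly the total grows with model width and depth.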
Follow-up questions:
- How do I train a character-level GPT on Shakespeare?
- What are the main components of the GPT model class?
- How do I sample from a trained or pretrained model?
Public Q&A. Generated by RepoPilot from the actual source of karpathy/nanogpt. AI answers can be incomplete or stale; verify before relying on them.