RepoPilot

What does this repo do, in one paragraph?

About karpathy/nanogpt · 2 messages · last activity 5/22/2026

Question

What does this repo do, in one paragraph?

RepoPilot

nanoGPT is a minimal, fast codebase for training and fine-tuning medium-sized GPT models. It prioritizes simplicity and speed: the training loop is a ~300-line train.py and the GPT model definition is a ~300-line model.py. The repository can reproduce GPT-2 (124M parameters) on OpenWebText on a single node with eight A100 40GB GPUs in about 4 days of training. It also supports training character-level models on small datasets such as Shakespeare and loading pre-trained GPT-2 weights released by OpenAI. The code is deliberately straightforward and hackable, so it suits researchers and practitioners who want to train new models from scratch or fine-tune existing checkpoints without unnecessary complexity.
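To make the "124M parameters" figure concrete, here is a rough sketch of the arithmetic. It assumes the standard GPT-2 small configuration (12 layers, 768-dimensional embeddings, a 50,257-token vocabulary, a 1,024-token context, biases enabled, and the output head sharing weights with the token embedding). The function name `count_gpt2_params` is hypothetical and is not taken from the repository.

```python
def count_gpt2_params(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024):
    """Count the parameters of a GPT-2 style model (illustrative helper, not repo code)."""
    # Token and position embedding tables.
    token_emb = vocab_size * n_embd
    pos_emb = block_size * n_embd

    # One transformer block.
    layer_norm = 2 * n_embd                          # weight + bias
    attn_qkv = n_embd * 3 * n_embd + 3 * n_embd      # fused query/key/value projection
    attn_out = n_embd * n_embd + n_embd              # attention output projection
    mlp_up = n_embd * 4 * n_embd + 4 * n_embd        # MLP expansion to 4x width
    mlp_down = 4 * n_embd * n_embd + n_embd          # MLP projection back to model width
    block = 2 * layer_norm + attn_qkv + attn_out + mlp_up + mlp_down

    # Final layer norm; the output head is assumed tied to token_emb, so it adds nothing.
    final_norm = 2 * n_embd

    return token_emb + pos_emb + n_layer * block + final_norm


if __name__ == "__main__":
    total = count_gpt2_params()
    print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the total comes to roughly 124.4M, which matches the rounded figure quoted above. A count that leaves out the position embeddings would be slightly smaller.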

Follow-up questions:

  • How do I train a character-level GPT on Shakespeare?
  • What are the main components of the GPT model class?
  • How do I sample from a trained or pretrained model?

Public Q&A. Generated by RepoPilot from the actual source of karpathy/nanogpt. AI answers can be incomplete or stale — verify before relying on them.