Skip to content

Supervised Learning

Quick start

We've provided an implementation of supervised learning in supervised/train.py. To use this training loop, you'll need to create a Config object with the data and parameters.

We've provided a ready-to-run example that fine-tunes Qwen3-4B on a small instruction-following dataset in sl_basic.py:

python -m logits_cookbook.recipes.sl_basic

This script fine-tunes the base (pretrained) model on a small dataset called NoRobots, created by Hugging Face.

What you'll see during training

  • Each step prints train and test loss, along with timing stats.
  • Predicted tokens (weight=1) are shown in green; context tokens (weight=0) in yellow.
  • Logs and checkpoints are written to the log_path directory (/tmp/logits-examples/sl_basic by default).

Output files

File Contents
metrics.jsonl Train/test loss and other metrics per step
checkpoints.jsonl Checkpoint paths (sampler and full-state)
config.json Serialized training config
# Plot train and test loss
df = pandas.read_json("/tmp/logits-examples/sl_basic/metrics.jsonl", lines=True)
plt.plot(df['train_mean_nll'], label='train_loss')
plt.plot(df['test/nll'].dropna(), label='test_loss')
plt.legend()
plt.show()

To use your own dataset, see the commented-out section in sl_basic.py using conversations.jsonl format.

Minimal training loop

For a more self-contained example without the dataset abstractions, see sl_loop.py:

python -m logits_cookbook.recipes.sl_loop

This script defines data loading inline and is useful for understanding how the training loop works under the hood, or as a starting point for writing your own loop.