TL;DR

The fastest way to build your own storyteller with a batch run on VESSL.

Description

This repository provides a practical, efficient setup for training medium-sized GPTs, prioritizing simplicity and speed over extensive educational scaffolding. The code is concise and easy to customize: you can train new models from scratch or fine-tune pretrained checkpoints, up to the 1.5B-parameter GPT-2 XL, as sketched below.
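
For instance, fine-tuning from a pretrained GPT-2 checkpoint only requires a different config file and an override flag. A minimal sketch, assuming the upstream nanoGPT layout (data/shakespeare/prepare.py and config/finetune_shakespeare.py ship with nanoGPT, and --init_from is its standard command-line config override):

python data/shakespeare/prepare.py                                # tokenize Tiny Shakespeare with the GPT-2 BPE
python train.py config/finetune_shakespeare.py --init_from=gpt2   # start from the 124M GPT-2; use gpt2-xl for the 1.5B model

The override works because nanoGPT's configurator lets any variable in the config file be replaced from the command line.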

YAML

name: nanogpt
description: "The fastest way to build your own storyteller with a batch run on VESSL."
image: nvcr.io/nvidia/pytorch:22.03-py3  # NVIDIA NGC PyTorch container
resources:
  cluster: aws-apne2  # VESSL-managed cluster in AWS ap-northeast-2
  preset: v1.v100-1.mem-52  # 1x V100 GPU, 52 GB memory
import:
  /root/examples: git://github.com/vessl-ai/examples  # clone the examples repo into the container
export:
  /output: vessl-artifact://  # keep /output as a VESSL artifact after the run
run:
  - workdir: /root/examples/nanogpt
    command: |
      pip install torchaudio -f https://download.pytorch.org/whl/cu111/torch_stable.html
      pip install transformers datasets tiktoken wandb tqdm
      python data/shakespeare_char/prepare.py           # download and tokenize Tiny Shakespeare at the character level
      python train.py config/train_shakespeare_char.py  # train a character-level GPT from scratch
      python sample.py --out_dir=out-shakespeare-char   # print generated samples from the trained checkpoint
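
To launch this batch run, save the manifest (for example as nanogpt.yaml; the file name here is arbitrary) and submit it with the VESSL CLI. A minimal sketch, assuming the CLI is installed and authenticated; the exact subcommand may vary across CLI versions, so check vessl run --help if yours differs:

pip install vessl                  # install the VESSL CLI
vessl configure                    # authenticate and pick a default organization and project
vessl run create -f nanogpt.yaml   # submit the run defined in the YAML above

Once the run completes, anything the commands wrote under /output is stored as a run artifact (per the export section), and the samples printed by sample.py appear in the run logs.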