The fastest way to build your own storyteller with a batch run on VESSL.
This repository provides a practical, efficient setup for training medium-sized GPTs, prioritizing simplicity and speed over extensive educational scaffolding. The code is concise and easy to customize, so you can train new models from scratch or fine-tune existing checkpoints, up to and including the 1.5B-parameter GPT-2 model.
```yaml
name: nanogpt
description: "The fastest way to build your own storyteller with a batch run on VESSL."
image: nvcr.io/nvidia/pytorch:22.03-py3
resources:
  cluster: aws-apne2
  preset: v1.v100-1.mem-52
import:
  /root/examples: git://github.com/vessl-ai/examples
export:
  /output: vessl-artifact://
run:
  - workdir: /root/examples/nanogpt
    command: |
      pip install torchaudio -f https://download.pytorch.org/whl/cu111/torch_stable.html
      pip install transformers datasets tiktoken wandb tqdm
      python data/shakespeare_char/prepare.py
      python train.py config/train_shakespeare_char.py
      python sample.py --out_dir=out-shakespeare-char
```
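The `prepare.py` step builds a character-level dataset: every unique character in the Shakespeare corpus gets an integer id, and the text is encoded into arrays of those ids for training. A minimal sketch of that idea (the variable names and toy corpus here are illustrative, not copied from the script):

```python
# Minimal sketch of character-level data prep, in the spirit of
# data/shakespeare_char/prepare.py. The corpus below is a stand-in
# for the real Shakespeare text the script downloads.
text = "First Citizen: Before we proceed any further, hear me speak."

# Build the vocabulary: every unique character maps to an integer id.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s: str) -> list[int]:
    """Turn a string into a list of integer ids."""
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    """Turn a list of integer ids back into a string."""
    return "".join(itos[i] for i in ids)

ids = encode(text)
assert decode(ids) == text  # encoding must round-trip losslessly
```

The real script additionally splits the encoded ids into `train.bin` and `val.bin`, which `train.py` then samples blocks from.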