modelzoo.transformers.pytorch.gpt2.scripts

fold_mup

This script takes the path to a muP GPT-2 checkpoint and folds the muP scaling constants into the model's weights, producing a standard-parameterization (sP) checkpoint with approximately equivalent behavior for non-training workloads.
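
The sketch below illustrates the general folding idea under stated assumptions; it is not the actual script. The parameter names (`transformer.wte.weight`, `lm_head.weight`, `attn.q_proj`), the specific muP constants (`emb_alpha`, `output_alpha`, `width_mult`), and the checkpoint layout are all hypothetical; the real script reads the muP configuration from the checkpoint's run config.

```python
# Minimal sketch: fold assumed muP runtime multipliers into the weights so a
# standard (sP) forward pass reproduces the muP model's outputs at inference.
import math
import torch


def fold_mup_into_sp(state_dict: dict, width_mult: float, d_head: int,
                     emb_alpha: float = 1.0, output_alpha: float = 1.0) -> dict:
    folded = {k: v.clone() for k, v in state_dict.items()}

    # Embedding multiplier: muP applies a constant scale to token embeddings
    # at runtime; bake that factor into the embedding table instead.
    folded["transformer.wte.weight"] *= emb_alpha

    # Attention scaling: muP divides attention scores by d_head rather than
    # sqrt(d_head). Scaling the query projection by 1/sqrt(d_head) lets a
    # standard 1/sqrt(d_head) attention reproduce the same scores.
    for name in folded:
        if name.endswith(("attn.q_proj.weight", "attn.q_proj.bias")):  # hypothetical naming
            folded[name] *= 1.0 / math.sqrt(d_head)

    # Output logits: muP scales logits by output_alpha / width_mult; fold the
    # factor into the LM head weights.
    folded["lm_head.weight"] *= output_alpha / width_mult

    return folded


# Usage (hypothetical paths and config values):
# ckpt = torch.load("mup_gpt2_checkpoint.mdl", map_location="cpu")
# ckpt["model"] = fold_mup_into_sp(ckpt["model"], width_mult=8.0, d_head=64)
# torch.save(ckpt, "sp_gpt2_checkpoint.mdl")
```

Because the folding only rescales weights, the resulting sP checkpoint matches the muP model for forward passes; training behavior is not preserved, since muP's learning-rate and initialization rules are not representable as weight rescaling.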