How-to Guides

Train an LLM using Maximal Update Parameterization

Learn how to enable μP when training models on the Cerebras Wafer-Scale cluster
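
The guide covers the Cerebras-specific switches; as a rough, hedged sketch of what μP changes, the snippet below applies the commonly cited μP scaling rules for Adam (hidden-weight learning rate and initialization scaled by the width multiplier). All names and numbers are illustrative assumptions, not Cerebras Model Zoo API.

```python
# Minimal sketch of the commonly cited muP scaling rules for Adam, assuming a
# Transformer whose hidden size is scaled up from a small tuned proxy model.
# Names and values below are illustrative, not Cerebras Model Zoo API.
import torch

base_width = 256          # hidden size of the small proxy model that was tuned
width = 2048              # hidden size of the target model
m = width / base_width    # width multiplier

base_lr = 1e-3
base_init_std = 0.02

# Commonly cited muP rules under Adam:
#   * hidden (matrix-like) weights: init std scaled by 1/sqrt(m), LR scaled by 1/m
#   * embeddings / biases:          unchanged
#   * output logits:                multiplied by 1/m in the forward pass
hidden_lr = base_lr / m
hidden_init_std = base_init_std / (m ** 0.5)
output_logit_multiplier = 1.0 / m

def init_hidden_weight(weight: torch.Tensor) -> None:
    """Initialize a hidden projection with the muP-scaled standard deviation."""
    torch.nn.init.normal_(weight, mean=0.0, std=hidden_init_std)
```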

Extend Context Length Using Position Interpolation

Learn how to use Position Interpolation to efficiently extend the context length of models that use RoPE or ALiBi
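
Position Interpolation rescales position indices so that a longer sequence maps back into the positional range seen during pretraining. A minimal sketch for RoPE follows; the function and argument names are illustrative, not taken from the Cerebras Model Zoo.

```python
import torch

def rope_angles(seq_len: int, dim: int, train_context_len: int,
                target_context_len: int, base: float = 10000.0) -> torch.Tensor:
    """RoPE rotation angles with Position Interpolation.

    Positions are scaled by train_context_len / target_context_len so a
    sequence of length target_context_len covers the same angular range the
    model saw during pretraining at train_context_len.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(seq_len).float()
    if target_context_len > train_context_len:
        positions = positions * (train_context_len / target_context_len)
    return torch.outer(positions, inv_freq)  # shape: (seq_len, dim // 2)

# Example: extend a model pretrained at 2048 tokens to an 8192-token window.
angles = rope_angles(seq_len=8192, dim=128,
                     train_context_len=2048, target_context_len=8192)
cos, sin = angles.cos(), angles.sin()
```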

Train an LLM with a large or small context window

Learn how to use the CS-X to train an LLM with a large or small context window


Instruction fine-tune an LLM

Learn how to fine-tune LLMs on datasets with instructions and corresponding responses

Train a model with weight sparsity

Learn how to train with weight sparsity to produce a sparse model that requires fewer FLOPs to train and fewer nonzero parameters to store
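
For orientation, the sketch below shows generic magnitude-based (unstructured) weight sparsity using stock PyTorch pruning utilities; it is not the Cerebras sparsity API, which the guide itself describes.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model: a single linear layer standing in for a transformer projection.
layer = nn.Linear(1024, 1024)

# Zero out the 80% of weights with the smallest magnitude (unstructured sparsity).
prune.l1_unstructured(layer, name="weight", amount=0.8)

# The mask is applied on every forward pass; make it permanent when done.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.2%}")  # ~80%
```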

Restart a dataloader

Learn how to resume training from the same point in the input-generating dataloader
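
As a generic illustration of the idea (not the Cerebras dataloader-checkpointing mechanism), one way to resume mid-epoch is to record how many batches were consumed and fast-forward a freshly built, identically seeded dataloader on restart:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(10_000, 16))
loader = DataLoader(dataset, batch_size=32, shuffle=True,
                    generator=torch.Generator().manual_seed(0))

batches_consumed = 175          # value restored from the training checkpoint
data_iter = iter(loader)

# Fast-forward past batches already seen before the interruption. Reseeding the
# generator above keeps the shuffle order identical across the restart.
for _ in range(batches_consumed % len(loader)):
    next(data_iter)

next_batch = next(data_iter)    # training resumes from here
```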


Port a trained and fine-tuned model to Hugging Face

Learn how to port an LLM trained on the Cerebras Wafer-Scale cluster to Hugging Face to generate outputs
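
Once a checkpoint has been converted to the Hugging Face format, generation follows the standard transformers workflow; a minimal sketch, with a placeholder checkpoint path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to the converted checkpoint directory (placeholder).
checkpoint_dir = "path/to/converted_checkpoint"

tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)
model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)

inputs = tokenizer("The Cerebras Wafer-Scale cluster is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```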

Port a Hugging Face model to Cerebras Model Zoo

Learn how to port a Hugging Face model to the Cerebras Model Zoo to generate outputs

Control numerical precision level

Learn how to control the level of numerical precision used in training runs for large NLP models
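
The guide describes the Cerebras-specific precision settings; for reference, the stock PyTorch analogue is autocast, e.g. running compute in bfloat16 while parameters stay in float32:

```python
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()
x = torch.randn(8, 1024, device="cuda")

# Run the forward pass with bfloat16 compute while parameters remain float32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16
```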


Enable Dynamic Loss Scaling

Learn how to enable dynamic loss scaling using Cerebras's custom PyTorch module
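
Cerebras ships its own module for this; for reference, the equivalent mechanism in stock PyTorch is GradScaler, which scales the loss before backward and dynamically lowers the scale when gradient overflows are detected:

```python
import torch
from torch.cuda.amp import GradScaler, autocast

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = GradScaler()  # dynamic loss scaling: grows the scale, backs off on overflow

x = torch.randn(8, 1024, device="cuda")

with autocast(dtype=torch.float16):
    loss = model(x).float().pow(2).mean()

scaler.scale(loss).backward()   # backward on the scaled loss
scaler.step(optimizer)          # unscales grads, skips the step if inf/nan found
scaler.update()                 # adjust the loss scale for the next iteration
```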

Run Cerebras Model Zoo on a GPU

Learn how to run models in the Cerebras Model Zoo on GPUs and which packages to install