Cerebras Model Zoo supported config parameters

The Cerebras Model Zoo supports an extensive range of models, each with its own set of parameters. This document provides a comprehensive list of these parameters as defined in the Model Zoo.

With the introduction of Config classes in the Model Zoo, each parameter is defined in the model’s config file. These classes, implemented as Python dataclasses, organize and validate the parameters necessary for model definition and training.
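To make the "organize and validate" role concrete, here is a minimal sketch of a dataclass that checks its own values when it is constructed. The class name `ExampleModelConfig` and its fields are hypothetical and chosen only for illustration; they are not taken from the Model Zoo, whose validation mechanism may differ.

```python
from dataclasses import dataclass


@dataclass
class ExampleModelConfig:
    """Hypothetical config class; field names are illustrative only."""

    hidden_size: int = 768
    num_heads: int = 12
    dropout_rate: float = 0.1

    def __post_init__(self):
        # Runs right after the generated __init__, so invalid values
        # are rejected as soon as the config object is created.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError(
                f"hidden_size ({self.hidden_size}) must be divisible by "
                f"num_heads ({self.num_heads})"
            )
        if not 0.0 <= self.dropout_rate < 1.0:
            raise ValueError("dropout_rate must be in [0, 1)")


config = ExampleModelConfig(hidden_size=1024, num_heads=16)
print(config.hidden_size // config.num_heads)  # per-head dimension
```

Grouping the checks in `__post_init__` keeps each parameter's type, default, and constraints in one place, which is the main appeal of a dataclass-based config.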

Common parameters across models

1. RunConfig Parameters:


2. Sparsity Parameters:


3. Optimizer Parameters:


Model Specific Parameters

Large Language Model (LLM) parameters

BERT config

Bloom config

BTLM config

DPO config

DPR config

Falcon config

GPT2 config

GPT3 config

GPTJ config

Llama config

Mistral config

MPT config

SantaCoder config

StarCoder config

T5 config

Transformer config

Vision model parameters

DiT config

Vision Transformer config

Multimodal model parameters

LLaVA config

Understanding Config classes structure

Each of these parameters is defined as part of a Config class. A Config class, implemented as a Python dataclass, serves as a container for the settings and parameters needed to define and train a model.

Each attribute of a Config class corresponds to a section of the YAML file that defines the parameters for a training run.

A config class looks like this:

class <ConfigClass>:
    train_input: Optional[DataConfig] = None
    eval_input: Optional[DataConfig] = None
    model: <ModelConfigClass> = required
    sparsity: Optional[SparsityConfig] = None
    optimizer: OptimizerConfig = required
    runconfig: RunConfig = required

For more information about Config classes, refer to the Model Zoo config classes documentation.
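As a rough illustration of how YAML sections could map onto the attributes of such a class, the sketch below builds a top-level config from a dictionary shaped like a parsed YAML file. Everything here is an assumption made for the example: the `Toy*` classes are simplified stand-ins, not the Model Zoo's `DataConfig`, `OptimizerConfig`, or `RunConfig`; the field names are invented; the `sparsity` section is omitted for brevity; and `build_config` is a hypothetical helper, not a Model Zoo function.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


# Simplified stand-ins with invented fields (not the Model Zoo classes).
@dataclass
class ToyDataConfig:
    data_dir: str
    batch_size: int = 32


@dataclass
class ToyModelConfig:
    hidden_size: int = 768


@dataclass
class ToyOptimizerConfig:
    optimizer_type: str
    learning_rate: float = 1e-4


@dataclass
class ToyRunConfig:
    max_steps: int


@dataclass
class ToyConfig:
    # Required sections come first because dataclass fields without
    # defaults must precede fields with defaults.
    model: ToyModelConfig
    optimizer: ToyOptimizerConfig
    runconfig: ToyRunConfig
    train_input: Optional[ToyDataConfig] = None
    eval_input: Optional[ToyDataConfig] = None


def build_config(params: Dict[str, Dict[str, Any]]) -> ToyConfig:
    """Build a ToyConfig from a dict shaped like a parsed YAML file."""

    def section(name, cls, required=True):
        if name not in params:
            if required:
                raise KeyError(f"missing required section: {name}")
            return None
        # Each YAML section becomes the keyword arguments of its class.
        return cls(**params[name])

    return ToyConfig(
        model=section("model", ToyModelConfig),
        optimizer=section("optimizer", ToyOptimizerConfig),
        runconfig=section("runconfig", ToyRunConfig),
        train_input=section("train_input", ToyDataConfig, required=False),
        eval_input=section("eval_input", ToyDataConfig, required=False),
    )


# Dictionary standing in for the result of parsing a YAML params file.
params = {
    "train_input": {"data_dir": "./train", "batch_size": 64},
    "model": {"hidden_size": 1024},
    "optimizer": {"optimizer_type": "adamw", "learning_rate": 3e-4},
    "runconfig": {"max_steps": 1000},
}
config = build_config(params)
print(config.model.hidden_size, config.eval_input)
```

In this sketch, a missing required section fails early with a `KeyError`, and an unknown key inside a section fails with a `TypeError` from the dataclass constructor, so typos in the parameter file surface before training starts.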

Additional notes

Each model has a designated ModelConfigClass. In cases where a model is a variant of another, it may inherit the ModelConfigClass from the parent model. To understand more about this inheritance and the hierarchy of config classes, visit the Config class hierarchy documentation.
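One way such inheritance can work with Python dataclasses is sketched below. The class names `ParentModelConfig` and `VariantModelConfig` and all field names are hypothetical; the actual Model Zoo hierarchy is described in the documentation referenced above and may be organized differently.

```python
from dataclasses import dataclass, fields


@dataclass
class ParentModelConfig:
    """Hypothetical parent model config."""

    hidden_size: int = 768
    num_hidden_layers: int = 12


@dataclass
class VariantModelConfig(ParentModelConfig):
    """Hypothetical variant that reuses the parent's parameters."""

    # Override the default of an inherited field.
    num_hidden_layers: int = 24
    # Add a field that only the variant needs (invented for illustration).
    use_rotary_embeddings: bool = True


variant = VariantModelConfig()
print([f.name for f in fields(variant)])
print(variant.hidden_size, variant.num_hidden_layers)
```

The variant declares only what differs from its parent, and every other parameter, along with its default, is inherited unchanged.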