TensorFlow Dynamic Loss Scaling
Attention
This document presents dynamic loss scaling for TensorFlow. For PyTorch, see PyTorch Dynamic Loss Scaling.
See also
Dynamic Loss Scaling on Cerebras system.
Enabling dynamic loss scaling#
To enable dynamic loss scaling (DLS) with TensorFlow, use the Trainer optimizer, which is supported on the CS system.
Trainer#
The Trainer optimizer builds the train ops from the given configuration parameters. It also initializes several parameters that apply to DLS, such as the initial loss scaling factor and the number of steps before the loss scaling factor is changed. These settings are optimized for the CS system.
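To make the role of these settings concrete, the following is a minimal sketch of a generic dynamic loss scaling scheme in plain Python. It is not the CSDynamicLossScale implementation. The class name DynamicLossScaleSketch, its field names, and the default values are assumptions chosen for illustration only. The sketch grows the scale after a fixed number of overflow-free steps and shrinks it when a non-finite gradient is seen.

```python
import math
from dataclasses import dataclass


@dataclass
class DynamicLossScaleSketch:
    """Generic dynamic loss scaling state (illustrative, NOT CSDynamicLossScale)."""

    loss_scale: float = 2.0 ** 15   # assumed initial loss scaling factor
    steps_per_increase: int = 2000  # assumed overflow-free steps before growing
    multiplier: float = 2.0         # assumed factor used to grow or shrink
    min_loss_scale: float = 1.0     # assumed lower bound for the scale
    good_steps: int = 0             # overflow-free steps since the last change

    def update(self, grads):
        """Update the scale; return True if the step should be applied."""
        if all(math.isfinite(g) for g in grads):
            self.good_steps += 1
            if self.good_steps >= self.steps_per_increase:
                # Enough stable steps: try a larger scale.
                self.loss_scale *= self.multiplier
                self.good_steps = 0
            return True
        # Overflow (inf or NaN): shrink the scale and skip this step.
        self.loss_scale = max(self.loss_scale / self.multiplier, self.min_loss_scale)
        self.good_steps = 0
        return False


scaler = DynamicLossScaleSketch(loss_scale=1024.0, steps_per_increase=2)
scaler.update([0.5, -1.25])      # finite gradients, scale unchanged
scaler.update([0.1, 0.2])        # second stable step, scale is doubled
scaler.update([float("inf")])    # overflow, scale is halved and step skipped
```

In a real training loop the loss is multiplied by loss_scale before the backward pass and the gradients are divided by it before the weight update. The Trainer optimizer handles this internally, so none of the names above need to appear in your model code.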
Parameters#
params: Input. Datatype dict. Configuration parameters for the Trainer optimizer.
tf_summary: Input. Datatype bool. The flag for summaries. Defaults to False.
mixed_precision: Input. Datatype bool. The flag for mixed precision. Defaults to False.
Example#
The following example shows how to use the Trainer optimizer in your code.
First, create an instance of the Trainer optimizer in the __init__(self) section of your code.
# Model trainer
self.trainer = Trainer(
    params=params["optimizer"],
    tf_summary=tf_summary,
    mixed_precision=params["training"]["mixed_precision"],
)
Then build the train ops.
def build_train_ops(self, total_loss):
    """
    Setup optimizer and build train ops.
    """
    return self.trainer.build_train_ops(total_loss)
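The two fragments above live in the same model class. The sketch below shows one way they might fit together. The Trainer class here is a hypothetical stand-in that only records its arguments, so that the example runs without the Cerebras Model Zoo. The Model class name, the empty "optimizer" dictionary, and the returned dictionary are likewise assumptions, not the real API. The real Trainer returns TensorFlow train ops.

```python
class Trainer:
    """Hypothetical stand-in for the Model Zoo Trainer (records arguments only)."""

    def __init__(self, params, tf_summary=False, mixed_precision=False):
        self.params = params
        self.tf_summary = tf_summary
        self.mixed_precision = mixed_precision

    def build_train_ops(self, total_loss):
        # The real Trainer builds TensorFlow train ops; this stub only
        # returns a description so the wiring can be inspected.
        return {"loss": total_loss, "mixed_precision": self.mixed_precision}


class Model:
    """Hypothetical model class showing where the two snippets belong."""

    def __init__(self, params, tf_summary=False):
        # Model trainer
        self.trainer = Trainer(
            params=params["optimizer"],
            tf_summary=tf_summary,
            mixed_precision=params["training"]["mixed_precision"],
        )

    def build_train_ops(self, total_loss):
        """Setup optimizer and build train ops."""
        return self.trainer.build_train_ops(total_loss)


params = {
    "optimizer": {},  # placeholder; the real keys are defined by the Model Zoo
    "training": {"mixed_precision": True},
}
model = Model(params)
train_ops = model.build_train_ops(total_loss=0.25)
```

With the real Trainer, replace the stub class with the import from the Cerebras Model Zoo and pass the actual loss tensor as total_loss.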
For more details on CSDynamicLossScale and the Trainer optimizer, refer to the code in the Cerebras Model Zoo repository.
Note
To access the Python code for CSDynamicLossScale and the Trainer optimizer, you need read permission for the Cerebras Model Zoo Git repository.
The CSDynamicLossScale object in the Cerebras Graph Compiler (CGC) implements dynamic loss scaling. See LossScale.py.
This CSDynamicLossScale object is used by the Trainer optimizer. See Trainer.py.