Supported PyTorch Learning Rate Schedulers

LRScheduler

Cerebras specific learning rate scheduler base class.

ConstantLR

Maintains a constant learning rate for each parameter group (no decaying).

PolynomialLR

Decays the learning rate of each parameter group using a polynomial function in the given decay_steps.
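
As a rough illustration of what a polynomial decay schedule computes, the sketch below interpolates from the initial rate down to a final rate over `decay_steps`. This is illustrative math only, not the Cerebras API; the parameter names `end_lr` and `power` are assumptions.

```python
def polynomial_lr(base_lr, end_lr, decay_steps, power, step):
    # Polynomial interpolation from base_lr down to end_lr over decay_steps;
    # the rate is held at end_lr once decay_steps is exceeded.
    frac = min(step, decay_steps) / decay_steps
    return (base_lr - end_lr) * (1.0 - frac) ** power + end_lr
```

With `power=1.0` this reduces to a linear ramp from `base_lr` to `end_lr`.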

ExponentialLR

Decays the learning rate of each parameter group by decay_rate every step.
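
The closed form of this schedule is simply a geometric decay. A minimal sketch (illustrative math only, not the Cerebras API):

```python
def exponential_lr(base_lr, decay_rate, step):
    # Learning rate after `step` steps of exponential decay:
    # each step multiplies the rate by decay_rate.
    return base_lr * decay_rate ** step
```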

InverseExponentialTimeDecayLR

Decays the learning rate of each parameter group inverse-exponentially over time.

InverseSquareRootDecayLR

Decays the learning rate of each parameter group with the inverse square root of time.
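
One common form of inverse-square-root decay is sketched below. This is an assumption about the shape of the schedule, not the Cerebras implementation, which may include warmup or scaling terms.

```python
import math

def inverse_sqrt_lr(base_lr, step):
    # One common form: the rate falls off as 1/sqrt(step),
    # with step clamped to >= 1 to avoid division by zero.
    return base_lr / math.sqrt(max(step, 1))
```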

CosineDecayLR

Applies a cosine decay schedule to the learning rate of each parameter group.

SequentialLR

Receives a list of schedulers that are expected to be called sequentially during optimization, together with milestone points that specify exactly which scheduler is active at a given step.
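
The dispatch logic can be sketched as follows, with each schedule represented as a plain callable of the global step. This is a simplified illustration of the milestone semantics, not the Cerebras API.

```python
def sequential_lr(schedules, milestones, step):
    # Dispatch to the schedule whose interval contains `step`:
    # schedules[i] is active until milestones[i] is reached, and the
    # last schedule runs from the final milestone onward.
    for i, m in enumerate(milestones):
        if step < m:
            return schedules[i](step)
    return schedules[-1](step)
```

For example, a linear warmup followed by a constant rate would use two schedules and a single milestone marking the end of warmup.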

PiecewiseConstantLR

Adjusts the learning rate to a predefined constant at each milestone and holds this value until the next milestone.
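
The milestone lookup can be sketched as below (illustrative only, not the Cerebras API): `values` holds one more entry than `milestones`, and each value is held until the next milestone is reached.

```python
def piecewise_constant_lr(values, milestones, step):
    # The rate is values[i] while step is below milestones[i];
    # after the last milestone, the final value is held.
    for value, milestone in zip(values, milestones):
        if step < milestone:
            return value
    return values[-1]
```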

MultiStepLR

Decays the learning rate of each parameter group by gamma once the number of steps reaches one of the milestones.
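
The closed form multiplies in one factor of gamma per milestone passed. A minimal sketch of that math (not the Cerebras API):

```python
def multi_step_lr(base_lr, gamma, milestones, step):
    # Decay by `gamma` once for each milestone already reached.
    passed = sum(1 for m in milestones if step >= m)
    return base_lr * gamma ** passed
```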

StepLR

Decays the learning rate of each parameter group by gamma every step_size steps.

CosineAnnealingLR

Sets the learning rate of each parameter group using a cosine annealing schedule, where eta_max is set to the initial lr and T_cur is the number of steps since the last restart in SGDR.
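
The annealing formula from SGDR is eta_min + (eta_max - eta_min) * (1 + cos(pi * T_cur / T_max)) / 2, sketched below (illustrative math only, not the Cerebras API):

```python
import math

def cosine_annealing_lr(eta_max, eta_min, t_max, t_cur):
    # Cosine anneal from eta_max at t_cur = 0 down to eta_min at t_cur = t_max.
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_max)) / 2
```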

LambdaLR

Sets the learning rate of each parameter group to the initial lr times a given function (which is specified by overriding set_lr_lambda).
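
The underlying computation is a multiplicative scaling of the initial lr by the user-supplied function of the step, sketched here as a plain function (illustrative only, not the Cerebras API):

```python
def lambda_lr(base_lr, lr_lambda, step):
    # Scale the initial lr by a user-supplied multiplicative factor.
    return base_lr * lr_lambda(step)
```

For example, `lr_lambda=lambda step: 0.95 ** step` yields an exponential decay.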

CosineAnnealingWarmRestarts

Sets the learning rate of each parameter group using a cosine annealing schedule, where eta_max is set to the initial lr, T_cur is the number of steps since the last restart, and T_i is the number of steps between two warm restarts in SGDR.

MultiplicativeLR

Multiplies the learning rate of each parameter group by the supplied coefficient.

ChainedScheduler

Chains a list of learning rate schedulers.

CyclicLR

Sets the learning rate of each parameter group according to the cyclical learning rate (CLR) policy.

OneCycleLR

Sets the learning rate of each parameter group according to the 1cycle learning rate policy.