cerebras_pytorch.sparse#
Configuration routines#
The highest-level entry point to enabling sparsity is configure_sparsity_wrapper, which returns a drop-in replacement optimizer that automatically applies the sparsity algorithm according to a high-level configuration dictionary. These config dictionaries (actually passed as **kwargs) follow the same form as given in Sparsity.
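As a rough sketch of this flow (the wrapper's positional arguments and the config keys shown here are assumptions; see Sparsity for the authoritative configuration format):

```python
import torch
import cerebras_pytorch as cstorch

model = torch.nn.Linear(128, 64)
optimizer = cstorch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical high-level config; its keys follow the Sparsity config
# format and are forwarded to the wrapper as **kwargs.
config = {"sparsity": 0.9}

# Drop-in replacement: use it anywhere the original optimizer was used.
optimizer = cstorch.sparse.configure_sparsity_wrapper(
    model, optimizer, **config
)
```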
For more control, the helper configure_sparsity_optimizer can construct a BaseSparsityOptimizer from the same high-level configuration dictionary.
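A sketch of this lower-level path, with the same caveat that the argument order and names are assumptions to be checked against the API reference:

```python
# Assumed signature: algorithm name, the parameters to sparsify, then
# the same high-level config keys passed as **kwargs.
sparsity_optimizer = cstorch.sparse.configure_sparsity_optimizer(
    "static",
    model.named_parameters(),
    sparsity=0.9,
)
```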
These optimizers can be used manually like any other PyTorch optimizer, but see hook_module for having them automatically apply sparsity during a module's forward() and backward(). The same high-level drop-in replacement optimizer wrapper from configure_sparsity_wrapper can also be directly constructed and used.
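Continuing the sketch above, and assuming hook_module accepts the module and the sparsity optimizer:

```python
# Manual use: step the sparsity optimizer explicitly, like any other
# PyTorch optimizer.
sparsity_optimizer.step()

# Automatic use (assumed signature): hook the module so sparsity is
# applied around forward() and backward() without explicit step() calls.
cstorch.sparse.hook_module(model, sparsity_optimizer)
```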
Sparsity Optimizers#
These classes are the built-in sparsity algorithms.
StaticSparsityOptimizer is an "optimizer" that maintains a static sparsity pattern throughout training. The rest implement published dynamic sparsity algorithms. These are the objects returned from configure_sparsity_optimizer et al.
Even though static sparsity never updates its sparsity pattern during training, it is still implemented as an "Optimizer" to provide a consistent API and allow static and dynamic sparsity to be easily swapped via configuration.
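For example, continuing the sketches above, switching from static to dynamic sparsity is purely a configuration change (the "rigl" type name and keys are assumptions for illustration, not confirmed config values):

```python
# Static: the mask is computed once at initialization and never changes.
static_config = {"type": "static", "sparsity": 0.9}

# Dynamic: a published algorithm updates the mask during training.
dynamic_config = {"type": "rigl", "sparsity": 0.9}

# Either dict works at the same call site.
optimizer = cstorch.sparse.configure_sparsity_wrapper(
    model, optimizer, **dynamic_config
)
```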
Customizing Sparsity & Reference#
Several building blocks can be inherited from or composed to help build new dynamic sparsity algorithms or customize the behavior of existing ones.
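As a self-contained illustration of the kind of building block involved, here is a magnitude-based mask computation written in plain PyTorch (not the library's API): dynamic sparsity algorithms typically compose a scoring step like this with a schedule that decides when to recompute the mask.

```python
import torch

def magnitude_mask(param: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the largest-magnitude weights, zeroing the rest.

    A self-contained sketch of the score-and-select step that dynamic
    sparsity algorithms compose; not the library's building blocks.
    """
    num_pruned = int(sparsity * param.numel())
    if num_pruned == 0:
        return torch.ones_like(param, dtype=torch.bool)
    # Threshold at the num_pruned-th smallest magnitude.
    threshold = param.abs().flatten().kthvalue(num_pruned).values
    return param.abs() > threshold

# Example: build a 90% sparse mask for a weight tensor and apply it.
weight = torch.randn(64, 128)
mask = magnitude_mask(weight, sparsity=0.9)
sparse_weight = weight * mask
```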