cerebras_pytorch.sparse#
Configuration routines#
The highest-level entry point to enabling sparsity is configure_sparsity_wrapper, which returns a drop-in replacement optimizer that automatically applies the sparsity algorithm according to a high-level configuration dictionary. These config dictionaries (actually passed as **kwargs) follow the same form as given in Sparsity.
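As a rough sketch of this flow (the wrapper's positional arguments and the config keys shown here are assumptions; see Sparsity for the authoritative configuration format):

```python
import torch
import cerebras_pytorch as cstorch

model = torch.nn.Linear(128, 64)
optimizer = cstorch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical high-level config; its keys follow the Sparsity config
# format and are forwarded to the wrapper as **kwargs.
config = {"sparsity": 0.9}

# Drop-in replacement: use it anywhere the original optimizer was used.
optimizer = cstorch.sparse.configure_sparsity_wrapper(
    model, optimizer, **config
)
```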
For more control, the helper configure_sparsity_optimizer can construct a BaseSparsityOptimizer from the same high-level configuration dictionary.
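A sketch of this lower-level path, with the same caveat that the argument order and names are assumptions to be checked against the API reference:

```python
# Assumed signature: algorithm name, the parameters to sparsify, then
# the same high-level config keys passed as **kwargs.
sparsity_optimizer = cstorch.sparse.configure_sparsity_optimizer(
    "static",
    model.named_parameters(),
    sparsity=0.9,
)
```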
These optimizers can be used manually like any other PyTorch optimizer, but see hook_module for having them automatically apply sparsity during a module's forward() and backward(). The same high-level drop-in replacement optimizer wrapper from configure_sparsity_wrapper can also be directly constructed and used.
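Continuing the sketch above, and assuming hook_module accepts the module and the sparsity optimizer:

```python
# Manual use: step the sparsity optimizer explicitly, like any other
# PyTorch optimizer.
sparsity_optimizer.step()

# Automatic use (assumed signature): hook the module so sparsity is
# applied around forward() and backward() without explicit step() calls.
cstorch.sparse.hook_module(model, sparsity_optimizer)
```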
Sparsity Optimizers#
These classes are the built-in sparsity algorithms.
StaticSparsityOptimizer is an "optimizer" that maintains a static sparsity pattern throughout training. The rest implement published dynamic sparsity algorithms. These are the objects returned from configure_sparsity_optimizer et al.
Even though static sparsity never updates its sparsity pattern during training, it is still implemented as an "Optimizer" to provide a consistent API and allow static and dynamic sparsity to be easily swapped via configuration.
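For example, continuing the sketches above, switching from static to dynamic sparsity is purely a configuration change (the "rigl" type name and keys are assumptions for illustration, not confirmed config values):

```python
# Static: the mask is computed once at initialization and never changes.
static_config = {"type": "static", "sparsity": 0.9}

# Dynamic: a published algorithm updates the mask during training.
dynamic_config = {"type": "rigl", "sparsity": 0.9}

# Either dict works at the same call site.
optimizer = cstorch.sparse.configure_sparsity_wrapper(
    model, optimizer, **dynamic_config
)
```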
Customizing Sparsity & Reference#
Several building blocks can be inherited from or composed to help build new dynamic sparsity algorithms or customize the behavior of existing ones.
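As a self-contained illustration of the kind of building block involved, here is a magnitude-based mask computation written in plain PyTorch (not the library's API): dynamic sparsity algorithms typically compose a scoring step like this with a schedule that decides when to recompute the mask.

```python
import torch

def magnitude_mask(param: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the largest-magnitude weights, zeroing the rest.

    A self-contained sketch of the score-and-select step that dynamic
    sparsity algorithms compose; not the library's building blocks.
    """
    num_pruned = int(sparsity * param.numel())
    if num_pruned == 0:
        return torch.ones_like(param, dtype=torch.bool)
    # Threshold at the num_pruned-th smallest magnitude.
    threshold = param.abs().flatten().kthvalue(num_pruned).values
    return param.abs() > threshold

# Example: build a 90% sparse mask for a weight tensor and apply it.
weight = torch.randn(64, 128)
mask = magnitude_mask(weight, sparsity=0.9)
sparse_weight = weight * mask
```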