.. _supported-pytorch-ops:


Cerebras PyTorch Layer API
==========================

The Cerebras PyTorch Layer API reimplements a subset of PyTorch layers on top of our high-performance kernels and adds functionality that the native PyTorch versions do not have. The extra functionality is opt-in: if you don’t use it, each layer in the Layer API is equivalent to its native PyTorch counterpart.

- :ref:`pytorch-ops-torch.nn.multihead-attention` is the replacement for ``torch.nn.MultiheadAttention``

- :ref:`pytorch-ops-torch.nn.transformer-decoder-layer` is the replacement for ``torch.nn.TransformerDecoderLayer``

- :ref:`pytorch-ops-torch.nn.transformer-decoder` is the replacement for ``torch.nn.TransformerDecoder``

- :ref:`pytorch-ops-torch.nn.transformer-encoder-layer` is the replacement for ``torch.nn.TransformerEncoderLayer``

- :ref:`pytorch-ops-torch.nn.transformer-encoder` is the replacement for ``torch.nn.TransformerEncoder``

Supported PyTorch Ops
---------------------

If your model implementation requires PyTorch ops beyond the layer APIs above, Cerebras also supports the following PyTorch operations.

.. attention::

	The following list of supported PyTorch ops is preliminary. We cannot guarantee that mixing and matching them in your models will work. Support is provided only for the ways these ops are used in the `Cerebras Model Zoo <https://github.com/Cerebras/modelzoo>`_.

nn
--

- `torch.nn.BCEWithLogitsLoss <https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html#bcewithlogitsloss>`_
- `torch.nn.CrossEntropyLoss <https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html>`_
	Known limitation: ``ignore_index`` must be ``-100``.
- `torch.nn.Dropout <https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout>`_
- `torch.nn.Embedding <https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#torch.nn.Embedding>`_
	Known limitation: ``num_embeddings`` must be less than 65536.
- `torch.nn.functional.dropout <https://pytorch.org/docs/stable/generated/torch.nn.functional.dropout.html#torch-nn-functional-dropout>`_
- `torch.nn.functional.gelu <https://pytorch.org/docs/stable/generated/torch.nn.functional.gelu.html#torch-nn-functional-gelu>`_
	Known limitation: may have precision issues when ``approximate`` is not ``"tanh"``.
- `torch.nn.functional.pad <https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html#torch-nn-functional-pad>`_
- `torch.nn.functional.softmax <https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html#torch-nn-functional-softmax>`_
- `torch.nn.LayerNorm <https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html#torch.nn.LayerNorm>`_
- `torch.nn.Linear <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear>`_
- `torch.nn.MSELoss <https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#mseloss>`_
- `torch.nn.NLLLoss <https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html#torch.nn.NLLLoss>`_
	Known limitation: ``ignore_index`` must be ``-100``.
- `torch.nn.ReLU <https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU>`_
- `torch.nn.Softmax <https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#softmax>`_
- `torch.nn.Tanh <https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html#torch.nn.Tanh>`_
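
The known limitations listed above can be respected in ordinary model code. The following is a minimal sketch in stock PyTorch, not a Cerebras-specific recipe: the sizes and variable names are hypothetical, and it assumes a PyTorch version in which ``torch.nn.functional.gelu`` accepts the ``approximate`` argument. It keeps ``num_embeddings`` below 65536, leaves ``ignore_index`` at its default of ``-100``, and selects the ``tanh`` approximation for GELU.

.. code-block:: python

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    vocab_size = 1000  # hypothetical size, chosen to stay below the 65536 limit
    hidden = 16        # hypothetical embedding width

    embedding = nn.Embedding(num_embeddings=vocab_size, embedding_dim=hidden)
    proj = nn.Linear(hidden, vocab_size)
    loss_fn = nn.CrossEntropyLoss()  # ignore_index keeps its default of -100

    tokens = torch.randint(0, vocab_size, (2, 8))
    labels = tokens.clone()
    labels[:, -2:] = -100  # positions labelled -100 are excluded from the loss

    # Use the tanh approximation, per the GELU precision note above.
    x = F.gelu(embedding(tokens), approximate="tanh")
    logits = proj(x)
    loss = loss_fn(logits.view(-1, vocab_size), labels.view(-1))

Masking labels with ``-100`` rather than a custom value means the loss never needs a non-default ``ignore_index``.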


Functional
----------

- `torch.nn.functional.log_softmax <https://pytorch.org/docs/stable/generated/torch.nn.functional.log_softmax.html#torch.nn.functional.log_softmax>`_
- `torch.nn.functional.relu <https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu>`_
- `torch.nn.functional.silu <https://pytorch.org/docs/stable/generated/torch.nn.functional.silu.html#torch.nn.functional.silu>`_

Other ops
---------


- `torch.abs <https://pytorch.org/docs/stable/generated/torch.abs.html#torch-abs>`_
- `torch.all <https://pytorch.org/docs/stable/generated/torch.all.html#torch-all>`_
- `torch.arange <https://pytorch.org/docs/stable/generated/torch.arange.html#torch-arange>`_
- `torch.broadcast_to <https://pytorch.org/docs/stable/generated/torch.broadcast_to.html>`_
- `torch.cat <https://pytorch.org/docs/stable/generated/torch.cat.html#torch-cat>`_
- `torch.einsum <https://pytorch.org/docs/stable/generated/torch.einsum.html#torch-einsum>`_
- `torch.flatten <https://pytorch.org/docs/stable/generated/torch.flatten.html#torch.flatten>`_
- `torch.full <https://pytorch.org/docs/stable/generated/torch.full.html#torch-full>`_
- `torch.full_like <https://pytorch.org/docs/stable/generated/torch.full_like.html#torch-full-like>`_
- `torch.gather <https://pytorch.org/docs/stable/generated/torch.gather.html#torch-gather>`_
- `torch.log <https://pytorch.org/docs/stable/generated/torch.log.html#torch-log>`_
- `torch.matmul <https://pytorch.org/docs/stable/generated/torch.matmul.html#torch-matmul>`_
- `torch.min <https://pytorch.org/docs/stable/generated/torch.min.html#torch-min>`_
- `torch.ones <https://pytorch.org/docs/stable/generated/torch.ones.html#torch-ones>`_
- `torch.rsqrt <https://pytorch.org/docs/stable/generated/torch.rsqrt.html#torch-rsqrt>`_
- `torch.sigmoid <https://pytorch.org/docs/stable/generated/torch.sigmoid.html>`_
- `torch.sum <https://pytorch.org/docs/stable/generated/torch.sum.html>`_
- `torch.tanh <https://pytorch.org/docs/stable/generated/torch.tanh.html>`_
- `torch.where <https://pytorch.org/docs/stable/generated/torch.where.html>`_
- `torch.zeros <https://pytorch.org/docs/stable/generated/torch.zeros.html#torch-zeros>`_
- `torch.zeros_like <https://pytorch.org/docs/stable/generated/torch.zeros_like.html#torch-zeros-like>`_
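
As an illustration of how several of the ops above fit together, the following sketch computes causally masked attention weights in stock PyTorch. It is an assumption-laden example rather than a supported pattern: the shapes and names are hypothetical, the inputs are created with ``torch.randn`` purely for demonstration, and, as the note above says, combinations of ops are only supported in the ways the Cerebras Model Zoo uses them.

.. code-block:: python

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)

    batch, seq, dim = 2, 4, 8  # hypothetical sizes
    q = torch.randn(batch, seq, dim)
    k = torch.randn(batch, seq, dim)

    # Scale factor 1 / sqrt(dim), computed with torch.full and torch.rsqrt.
    scale = torch.rsqrt(torch.full((), float(dim)))
    scores = torch.einsum("bqd,bkd->bqk", q, k) * scale

    # Boolean mask that is True where a query may attend to a key (key <= query).
    positions = torch.arange(seq)
    causal = positions.view(seq, 1) >= positions.view(1, seq)
    causal = torch.broadcast_to(causal, scores.shape)

    # Replace disallowed scores with a large negative value before the softmax.
    masked = torch.where(causal, scores, torch.full_like(scores, -1e4))
    weights = F.softmax(masked, dim=-1)

Each row of ``weights`` sums to one, and entries for future positions are driven to zero by the large negative fill value.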

Layers
------

.. toctree::
  :maxdepth: 2

  pytorch-ops-torch.nn.multihead-attention.rst
  pytorch-ops-torch.nn.transformer-decoder-layer.rst
  pytorch-ops-torch.nn.transformer-decoder.rst
  pytorch-ops-torch.nn.transformer-encoder-layer.rst
  pytorch-ops-torch.nn.transformer-encoder.rst