.. _cs-pytorch-qs:

PyTorch: Getting Started
========================

This quick-start guide describes the first-time user setup and the user workflow for running PyTorch jobs on a Cerebras Wafer-Scale Cluster.

A Cerebras Wafer-Scale Cluster is composed of CS-2 system(s), MemoryX, SwarmX, and management and input worker nodes. The cluster supports two execution modes to enable ML models of different sizes:

- **Pipelined**: In this mode, all the layers of the network are loaded together onto the Cerebras WSE. This mode is selected for neural network models that fit entirely on the WSE, up to approximately 1B parameters.
- **Weight Streaming**: In this mode, one layer of the neural network model is loaded at a time. This layer-by-layer mode is used to run extremely large models (>1B parameters).

Perform the following steps to run your PyTorch jobs on the Wafer-Scale Cluster:

1. Ensure that the admin setup is complete. See :ref:`admin-checklist`.
2. Follow the first-time user setup procedure for PyTorch below. This includes creating and configuring your virtual environment. Perform this step only once.
3. Launch your jobs. Wafer-Scale Clusters now use the same workflow to launch jobs for Pipelined execution and Weight Streaming execution. See :ref:`cs-pytorch-pl-ws-unified-appliance`.

.. note::

    If you are on the Original Cerebras Installation and have not upgraded to the Wafer-Scale Cluster, you can still use the Slurm-based workflow to launch jobs for small to medium models with Pipelined execution. Large models with Weight Streaming execution are not supported on the Original Cerebras Installation. To get started on the Original Cerebras Installation, see `PyTorch: Getting Started `_.

.. admonition:: If you are ready to start developing / adapting your own PyTorch code for CS System

    Skip to `Workflow for PyTorch `_ on CS for an in-depth development guide using PyTorch for Cerebras.

First-time user setup for PyTorch
---------------------------------

The first time you use the Wafer-Scale Cluster for your PyTorch runs, you must set up a virtual environment as shown below.

.. Note::

    Make sure that you have the TLS certificate available from your sysadmin. You need it for communication between the user node and the Wafer-Scale Cluster. Your admin will have shared the path to this file during the setup.

1. Set up the Python virtual environment using Python 3.7. Create the environment named ``venv_cerebras_pt`` using the following command:

   .. code-block:: bash

       python3.7 -m venv venv_cerebras_pt

2. Cerebras provides three main packages to set up virtual environments: the ``cerebras_appliance`` software package, the ``cerebras_tensorflow`` package if you are using TensorFlow, and the ``cerebras_pytorch`` package if you are using PyTorch. To set up your PyTorch environment, you need two of these three packages: ``cerebras_appliance`` and ``cerebras_pytorch``. Enter the following commands on the user node to install them, making sure to install the appliance wheel first:

   .. code-block:: bash

       source venv_cerebras_pt/bin/activate
       pip install /cerebras_appliance-___-py3-none-any.whl --find-links=
       pip install /cerebras_pytorch-___-py3-none-any.whl --find-links=

   .. Note::

       With the ``--find-links`` option, pip finds the correct ``cerebras_appliance`` version if you place all the wheels in the same directory.

Running PyTorch jobs
--------------------

After you have completed the first-time user setup, see :ref:`cs-pytorch-pl-ws-unified-appliance` to get started.

.. toctree::
    :maxdepth: 2

    cs-pytorch-pl-ws-unified-appliance.rst
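
Before launching a job, it can help to confirm that the environment from the first-time setup is active and that both packages are installed. The following is a minimal sketch, not a Cerebras-provided tool. The function names (``check_venv_active``, ``check_package``, ``preflight``) are hypothetical, and the sketch assumes that the installed distribution names match the wheel file names (``cerebras_appliance`` and ``cerebras_pytorch``) and that the environment is named ``venv_cerebras_pt`` as in the setup steps.

.. code-block:: bash

    #!/usr/bin/env bash
    # Pre-flight sketch (hypothetical helpers, not part of the Cerebras packages).

    check_venv_active() {
        # The venv "activate" script exports VIRTUAL_ENV with the environment path.
        # Assumption: the environment directory is named venv_cerebras_pt.
        case "${VIRTUAL_ENV:-}" in
            */venv_cerebras_pt) return 0 ;;
            *) return 1 ;;
        esac
    }

    check_package() {
        # "pip show" exits with a non-zero status when the package is not installed.
        python -m pip show "$1" >/dev/null 2>&1
    }

    preflight() {
        local status=0
        if ! check_venv_active; then
            echo "venv_cerebras_pt is not active; run: source venv_cerebras_pt/bin/activate"
            status=1
        fi
        # Assumption: distribution names match the wheel file names.
        for pkg in cerebras_appliance cerebras_pytorch; do
            if check_package "$pkg"; then
                echo "found: $pkg"
            else
                echo "missing: $pkg"
                status=1
            fi
        done
        return "$status"
    }

After sourcing these definitions on the user node, run ``preflight``. A non-zero exit status suggests that the virtual environment is not active or that one of the two wheels still needs to be installed.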