PyTorch Quickstart

This quickstart is a step-by-step guide to compiling a PyTorch FC-MNIST model (already ported to Cerebras) for your CS system.

Prerequisites

Attention

Before you proceed, review the Checklist Before You Quickstart.

Compile the model

  1. Log in to your CS system cluster.

  2. Clone the reference samples repository to your preferred location in your home directory.

    git clone https://github.com/Cerebras/cerebras_reference_implementations.git
    

    The reference samples directory contains several PyTorch model examples.

    This quickstart uses the FC MNIST model. Navigate to the fc_mnist model directory.

    cd cerebras_reference_implementations/fc_mnist/pytorch/
    
  3. Compile the model targeting the CS system.

    The following csrun_cpu command compiles the code in train mode for the CS system. This step only compiles the code; it does not run training on the CS system.

    csrun_cpu python-pt run.py --mode train \
        --compile_only \
        --params configs/<name-of-the-params-file.yaml> \
        --cs_ip <specify your CS_IP>:<port>
    

    Note

    The parameters can also be set in the params.yaml file.
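
If you script the compile step, it can help to assemble the command as an argument list first so the flags stay in one place. The following is a minimal sketch, not part of the Cerebras tooling: the helper name `build_compile_command` and the params file name `configs/params.yaml` are assumptions for illustration, and only the flags already shown in the csrun_cpu command above are used.

```python
import shlex


def build_compile_command(params_file, cs_ip):
    """Assemble the compile-only csrun_cpu command as an argument list."""
    return [
        "csrun_cpu", "python-pt", "run.py",
        "--mode", "train",
        "--compile_only",        # compile only; no training is run
        "--params", params_file,
        "--cs_ip", cs_ip,        # must be given as <IP>:<port>
    ]


# Hypothetical values for illustration; substitute your own params file and CS address.
cmd = build_compile_command("configs/params.yaml", "192.168.1.1:9000")

# Print a shell-safe command line that can be reviewed before running it.
print(shlex.join(cmd))
```

The list form can also be passed to `subprocess.run(cmd, check=True)` on a node where csrun_cpu is available.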

Train on GPU

To train on a GPU, run:

python run.py --mode train --params configs/<name-of-the-params-file.yaml>
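
Before launching a GPU run, you may want to confirm that PyTorch can see a GPU in the current environment. The sketch below assumes the GPU is a CUDA device and that run.py relies on PyTorch's standard CUDA support; neither is stated in this quickstart. The helper name `cuda_gpu_available` is hypothetical.

```python
import importlib.util


def cuda_gpu_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed in this environment
    import torch
    return torch.cuda.is_available()


if cuda_gpu_available():
    print("CUDA GPU detected.")
else:
    print("No CUDA GPU visible to PyTorch in this environment.")
```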

Train on CS system

  1. Run the csrun_wse command to train on the CS system, using the command format below:

    Attention

    For PyTorch models only, the cs_ip flag must include both the IP address and the port number of the CS system. The IP address alone (for example, --cs_ip 192.168.1.1) is not sufficient. You must also include the port number (for example, --cs_ip 192.168.1.1:9000).
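
The IP-and-port requirement can be checked before a job is submitted. The following is a minimal sketch under stated assumptions: the helper name `parse_cs_ip` is hypothetical, it accepts IPv4 addresses only (as in the example above), and it is not part of the Cerebras tooling.

```python
import ipaddress


def parse_cs_ip(value):
    """Split '<IP>:<port>' into (ip, port); raise ValueError if either part is missing or invalid."""
    host, sep, port_text = value.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected <IP>:<port>, got {value!r}")
    # Assumption: IPv4 only. A malformed address raises a ValueError subclass.
    ipaddress.IPv4Address(host)
    if not port_text.isdigit():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


# The two examples from the text above: the first is accepted, the second is rejected.
print(parse_cs_ip("192.168.1.1:9000"))
try:
    parse_cs_ip("192.168.1.1")
except ValueError as err:
    print(f"rejected: {err}")
```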

    csrun_wse python-pt run.py --mode train \
        --cs_ip <IP:port-number> \
        --params configs/<name-of-the-params-file.yaml> \