Set up your environment

To launch any job in the Original Cerebras Installation, you will use:

  • A Singularity Image Format (SIF) file that contains the Cerebras Software Platform (CSoft), as well as TensorFlow 2.2, PyTorch 1.11, and Python 3.8

  • Slurm workload manager (v 19.05.1-2) for resource allocation and orchestration.

These tools are accessed through two wrapper scripts, csrun_cpu and csrun_wse, which are available on your chief node.

Note

Wrapper scripts (csrun_wse and csrun_cpu) may be customized for your particular environment by your system administrators and may look different from the default template. Check whether your system administrator's local documentation is available and whether it contains any special instructions for your CS-2.

To start using the csrun_cpu and csrun_wse wrappers, follow these steps:

  1. Obtain the location of the csrun_cpu and csrun_wse wrappers from your system administrator. Your system administrator has already set up the information related to the Cerebras SIF image and the default Slurm configuration inside csrun_cpu.

    # All that needs to be set by system admins for different systems is here
    ########################################################################
    # sif image location
    SINGULARITY_IMAGE=
    
    # Comma separated string of directories to mount.
    # ex: MOUNT_DIRS="/data/,/home/"
    # Note that the current directory is always mounted. So no need to add ${pwd}
    MOUNT_DIRS=
    
    # Default slurm cluster settings (must be set)
    DEF_NODES=
    DEF_TASKS_PER_NODE=
    DEF_CPUS_PER_TASK=
    
    #### More slurm configurations (recommended but not required) #####
    # The name of the GRES resource.
    GRES_RESOURCE=
    
    # The GRES node associated with the gres resource
    GRES_NODE=
    ########################################################################
    
  2. Confirm that csrun_cpu and csrun_wse scripts are accessible in your path.

    echo $PATH
    

    If $PATH does not include the parent directory of csrun_cpu and csrun_wse, then add it with

    export PATH=$PATH:/path/to/parent/directory/csrun_wse/and/csrun_cpu
    
  3. (Optional) For convenience, initialize the environment variable CS_IP with the IP address of the attached CS-2 system. Your system administrator can provide this value.

    export CS_IP=ip.to.cs-2
    

    To verify the value of the environment variable, run

    echo $CS_IP
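
    The checks in steps 2 and 3 can be combined into one small shell function. This is a minimal sketch, not part of the Cerebras tooling: the function name check_cerebras_env is hypothetical, and it relies only on the wrapper names and the CS_IP variable described above.

```shell
# Hypothetical helper (not shipped with the wrappers): verifies steps 2 and 3.
check_cerebras_env() {
    local rc=0 wrapper

    # Step 2: both wrapper scripts must resolve through PATH.
    for wrapper in csrun_cpu csrun_wse; do
        if command -v "$wrapper" >/dev/null 2>&1; then
            echo "ok: $wrapper -> $(command -v "$wrapper")"
        else
            echo "missing: $wrapper is not on PATH" >&2
            rc=1
        fi
    done

    # Step 3 is optional, so an unset CS_IP is reported but is not an error.
    if [ -n "${CS_IP:-}" ]; then
        echo "ok: CS_IP=$CS_IP"
    else
        echo "note: CS_IP is not set (optional)"
    fi

    return "$rc"
}
```

    After defining the function in your shell, run check_cerebras_env. It returns a non-zero status only when one of the wrappers cannot be found on PATH.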
    

Now you are all set and ready to train your first model on the Original Cerebras Installation!