Set up your environment
To launch any job in the Original Cerebras Installation, you will use:
A Singularity Image Format (SIF) file that contains the Cerebras Software Platform (CSoft), as well as TensorFlow 2.2, PyTorch 1.11, and Python 3.8
Slurm workload manager (v 19.05.1-2) for resource allocation and orchestration.
These tools are available through two wrapper scripts, csrun_cpu and csrun_wse, located on your chief node.
Note
Wrapper scripts (csrun_wse and csrun_cpu) may be customized for your particular environment by your sysadmins and may look different from the default template. Check whether your sysadmin's local documentation is available and whether there are any special instructions for your CS-2.
To start using the csrun_cpu and csrun_wse wrappers, follow these steps:
Obtain from your system administrator the location of the csrun_cpu and csrun_wse wrappers. Your system administrator has already set up the information related to the Cerebras SIF image and the default Slurm configuration inside csrun_cpu:
    # All that needs to be set by system admins for different systems is here
    ########################################################################
    # sif image location
    SINGULARITY_IMAGE=
    # Comma separated string of directories to mount.
    # ex: MOUNT_DIRS="/data/,/home/"
    # Note that the current directory is always mounted. So no need to add ${pwd}
    MOUNT_DIRS=
    # Default slurm cluster settings (must be set)
    DEF_NODES=
    DEF_TASKS_PER_NODE=
    DEF_CPUS_PER_TASK=
    #### More slurm configurations (recommended but not required) #####
    # The name of the GRES resource.
    GRES_RESOURCE=
    # The GRES node associated with the gres resource
    GRES_NODE=
    ########################################################################
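If you want to confirm that the settings in the template have been filled in, a small shell helper can list the ones that are still empty. This is a minimal sketch, not part of the Cerebras tooling: the function name unset_settings is hypothetical, it assumes your wrapper keeps the template's one-per-line NAME=value layout (a customized wrapper may not), and it treats SINGULARITY_IMAGE as required alongside the three settings the template marks "must be set".

```shell
# Hypothetical helper: print each required setting that is still empty
# in a wrapper file. Assumes the template's one-per-line NAME=value layout.
unset_settings() {
    wrapper_file=$1
    for name in SINGULARITY_IMAGE DEF_NODES DEF_TASKS_PER_NODE DEF_CPUS_PER_TASK; do
        # Take the first line that starts with NAME= and strip the name part.
        value=$(sed -n "s/^${name}=//p" "$wrapper_file" | head -n 1)
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}
```

Once the wrappers are on your path, it could be called as unset_settings "$(command -v csrun_cpu)"; no output means all four settings have a value.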
Confirm that the csrun_cpu and csrun_wse scripts are accessible in your path:
    echo $PATH
If $PATH does not include the parent directory of csrun_cpu and csrun_wse, then add it with:
    export PATH=$PATH:/path/to/parent/directory/csrun_wse/and/csrun_cpu
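The path check and update above can also be scripted so that the directory is not appended twice and each wrapper is resolved explicitly. The following is a minimal sketch for a POSIX shell; the function names ensure_on_path and check_wrappers are hypothetical, and the directory you pass in is whatever location your system administrator gave you.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
ensure_on_path() {
    dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$dir" ;;     # append the wrapper directory
    esac
    export PATH
}

# Hypothetical helper: report whether each wrapper resolves through PATH.
# Returns 0 if both are found, 1 otherwise.
check_wrappers() {
    missing=0
    for wrapper in csrun_cpu csrun_wse; do
        if command -v "$wrapper" >/dev/null 2>&1; then
            echo "found: $wrapper"
        else
            echo "missing: $wrapper"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, ensure_on_path /path/to/parent/directory followed by check_wrappers reports the state of both wrappers. Using command -v rather than reading echo $PATH by eye also catches the case where the directory is listed but the scripts are not executable.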
(Optional) For convenience, initialize the environment variable $CS_IP with the IP address of the attached CS-2 system. Your system administrator can provide this value.
    export CS_IP=ip.to.cs-2
To verify the value of the environment variable, run
    echo $CS_IP
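To avoid launching a job with an empty address, the verification can be wrapped in a guard that fails when the variable is missing. This is a minimal sketch; the function name require_cs_ip is hypothetical and it only checks that CS_IP is non-empty, not that the address is reachable.

```shell
# Hypothetical helper: fail with a message if CS_IP is unset or empty,
# otherwise print the current value.
require_cs_ip() {
    if [ -z "${CS_IP:-}" ]; then
        echo "CS_IP is not set; ask your system administrator for the CS-2 IP" >&2
        return 1
    fi
    echo "CS_IP=$CS_IP"
}
```

Calling require_cs_ip at the top of a launch script stops early if the export was skipped in the current shell session.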
Now you are all set and ready to train your first model on the Original Cerebras Installation!