Software Requirements and Dependencies

This page lists the software dependencies of the Cerebras Software Platform (CSoft) for the Original Cerebras Installation, covering both the AI frameworks and the orchestration stack. The CSoft components for the CPU nodes in the Original Cerebras Installation are packaged as a Singularity Image Format (SIF) file. The Original Cerebras Installation relies on the Slurm workload manager for resource allocation and orchestration.

Dependencies

  1. Workflow
    • Slurm 19.05.1-2 (see the Slurm website)

    • Singularity (tested with version 3.8.7-1.el8)

  2. Inside the Cerebras Singularity container (listed for reference only; these packages are already included in the Cerebras SIF file, so you do not need to install anything when running with CSoft):
    • TensorFlow 2.2

    • PyTorch 1.11

    • Python (recommended version 3.8)
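
The workflow versions listed above can be compared against what is installed on a node. The following is a minimal sketch and not part of CSoft. It assumes that `sinfo --version` and `singularity --version` are available on the node's PATH. The names `REFERENCE_VERSIONS`, `parse_version`, `installed_version`, and `report` are illustrative. This page does not state whether the listed versions are minimums, so the sketch only reports whether the installed version matches or differs.

```python
import re
import shutil
import subprocess

# Versions named in the dependency list above (assumed to be reference
# versions; the page does not say whether they are minimums).
REFERENCE_VERSIONS = {
    "sinfo": (19, 5, 1),        # Slurm 19.05.1-2
    "singularity": (3, 8, 7),   # Singularity 3.8.7-1.el8
}


def parse_version(text):
    """Return the first MAJOR.MINOR.PATCH in text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_version(command):
    """Run `<command> --version` and parse its output; None if unavailable."""
    if shutil.which(command) is None:
        return None
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_version(result.stdout + result.stderr)


def report():
    """Print one line per tool comparing the installed and listed versions."""
    for command, listed in REFERENCE_VERSIONS.items():
        found = installed_version(command)
        if found is None:
            status = "not found or version not recognized"
        elif found == listed:
            status = "matches the listed version"
        else:
            status = "differs from the listed version"
        print(f"{command}: {found} ({status})")


if __name__ == "__main__":
    report()
```

A differing version is not necessarily a problem; check with your system administrator before changing anything on the cluster.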

Checklist Before You Quickstart

When the CS system is installed at your site as part of the Original Cerebras Installation, it is part of a cluster similar to the one shown in the following diagram. Before you start using the CS system, go over the following prerequisites with your system administrator.

Note

This checklist only applies to the Original Cerebras Installation workflow, which supports Pipelined execution only.

Note

The Slurm wrapper scripts (csrun_wse and csrun_cpu) may have been customized for your environment by your system administrators and may look different from what is shown below. Check whether your system administrator has local documentation and whether there are any special instructions for your CS-2.

[Figure: cs-getting-started.png, diagram of the CS system cluster with numbered callouts]
  1. Cerebras SIF container
     Singularity is installed on all the nodes, including the chief and the worker nodes, and can launch the Cerebras container, which consists of the Cerebras Graph Compiler (CGC) and other necessary libraries.

  2. Slurm orchestrator
     Slurm, the orchestrator software, is installed and running on all the CPU nodes: the chief node and all the worker nodes. Slurm coordinates between the CS system and the nodes in the CS cluster.

  3. Hostnames
     You have the hostnames of the chief and the worker nodes. You log in to the chief node and perform all your work there. You need the hostnames of the worker nodes for debugging.

  4. IP address of CS system
     You have the IP address and the port number of the network-attached CS system accelerator. You pass this IP address and port number to the --cs_ip flag of your runtime scripts when compiling and running your models.

  5. Login steps
     You know the steps to log in to the chief node of the CS system cluster. You log in to the chief node using ssh.

  7. Done
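
As an illustration of the "IP address of CS system" item, the value destined for the --cs_ip flag can be sanity-checked before a job is launched. The following is a minimal sketch under stated assumptions: it assumes the flag takes an IPv4 address and a port joined as ADDRESS:PORT, which this page does not specify, so confirm the exact format with your system administrator. The helper names `parse_cs_ip` and `cs_ip_argument` are hypothetical, and 192.0.2.10:9000 is a placeholder taken from the documentation address range, not a real CS system.

```python
import ipaddress


def parse_cs_ip(value):
    """Split 'ADDRESS:PORT' into (address, port); raise ValueError if malformed.

    Assumption: the value is an IPv4 address and a TCP port joined by a colon.
    """
    address_text, separator, port_text = value.rpartition(":")
    if not separator or not address_text:
        raise ValueError(f"expected ADDRESS:PORT, got {value!r}")
    # IPv4Address raises a ValueError subclass on an invalid address.
    address = ipaddress.IPv4Address(address_text)
    if not port_text.isdigit():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return str(address), port


def cs_ip_argument(address, port):
    """Build the command-line fragment for the --cs_ip flag (assumed format)."""
    return ["--cs_ip", f"{address}:{port}"]


if __name__ == "__main__":
    # Placeholder value for illustration only.
    address, port = parse_cs_ip("192.0.2.10:9000")
    print(" ".join(cs_ip_argument(address, port)))
```

Validating the value up front lets a typo in the address or port surface immediately instead of after a job has been queued through Slurm.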

Attention

Proceed to work on the CS system only after you have completed the above checklist.