Software Requirements and Dependencies

Users interact with the Cerebras Wafer-Scale Cluster as if it were an appliance: running large models on the cluster is as easy as running them on a single device.

To run a job on the Wafer-Scale Cluster, you set up a virtual environment, install the Cerebras packages in that environment, and launch your jobs from there. Installation amounts to installing a few Python wheels provided by Cerebras into the virtual environment. The wheels contain the dependencies needed.

We recommend a separate virtual environment for each framework. If you plan to run all your experiments with PyTorch, set up only the PyTorch environment. If you plan to work with TensorFlow only, set up only the TensorFlow environment. If you want to use both frameworks, set up two environments, one for PyTorch and one for TensorFlow.

Perform the following steps to run your jobs on the Cerebras Wafer-Scale Cluster:

  1. Ensure that the admin setup is complete. Check with your sysadmin, using the admin setup checklist below.

  2. Follow the first-time user setup procedure for your framework of choice. This includes creating and configuring your virtual environment. This step should be done only once.

  3. Activate your virtual environment at the beginning of your working session. Run the scripts within the activated environment to train or evaluate your model.
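The environment setup in steps 2 and 3 can be sketched with Python's standard `venv` module. This is a minimal illustration, not the official Cerebras procedure: the helper names (`create_environment`, `environment_python`, `pip_install_command`) are hypothetical, and the wheel paths are placeholders that your sysadmin must supply. The framework-specific Getting Started pages remain the authoritative instructions.

```python
import os
import venv
from pathlib import Path
from typing import List, Sequence


def environment_python(env_dir: str) -> Path:
    """Return the path of the Python interpreter inside the environment."""
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def create_environment(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir (one-time setup, step 2)."""
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    return environment_python(env_dir)


def pip_install_command(env_dir: str, wheels: Sequence[str]) -> List[str]:
    """Build the command that installs the given wheel files into the environment.

    The wheel paths are assumptions: use the package location provided
    by your sysadmin.
    """
    if not wheels:
        raise ValueError("at least one wheel path is required")
    return [str(environment_python(env_dir)), "-m", "pip", "install", *wheels]
```

To apply it, call `create_environment("venv_pt")` once, then pass the result of `pip_install_command("venv_pt", [...])` to `subprocess.run(cmd, check=True)`. Invoking the environment's own interpreter this way installs into that environment without activating it first. For interactive work you would still activate the environment at the start of each session, as step 3 describes.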

Admin setup checklist

Your admin should have set up the following:

  • Kubernetes is set up.

  • Cluster management software is already running on the Wafer-Scale Cluster and is ready to interact with the user node.

  • TLS certificate is generated, and you know its location.

  • Python 3.7 is available.

  • The path to the Cerebras packages (cerebras_appliance, cerebras_tensorflow, and cerebras_pytorch) is available to you. You need these packages to set up your virtual environment(s): the cerebras_appliance package for every environment, the cerebras_tensorflow package for Weight Streaming runs in the TensorFlow environment, and the cerebras_pytorch package for Weight Streaming runs in the PyTorch environment.

  • Your sysadmin has populated an admin-defaults.yaml file with the default distribution of resources to be used. (This is required for Pipelined execution only.)
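Some items on this checklist can be verified from the user node before you start. The sketch below is illustrative only: the function name `check_prerequisites` and both path arguments are hypothetical, and it assumes you run it with the interpreter you intend to use for your environment. It does not check Kubernetes, the cluster management software, or admin-defaults.yaml, which remain your sysadmin's responsibility.

```python
import sys
from pathlib import Path
from typing import List, Tuple

REQUIRED_PYTHON = (3, 7)  # version named in the checklist above


def check_prerequisites(
    cert_path: str,
    package_dir: str,
    python_version: Tuple[int, int] = tuple(sys.version_info[:2]),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed.

    cert_path and package_dir are placeholders for the locations your
    sysadmin gives you.
    """
    problems = []
    if tuple(python_version) != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python 3.7 is required, found {found}")
    if not Path(cert_path).is_file():
        problems.append(f"TLS certificate not found: {cert_path}")
    if not Path(package_dir).is_dir():
        problems.append(f"Cerebras package directory not found: {package_dir}")
    return problems
```

Calling `check_prerequisites(cert_path, package_dir)` with your real locations returns the list of failed checks, which you can print or pass on to your sysadmin.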

Note

If you are interested only in Pipelined execution mode (smaller models), you need only the cerebras_appliance package, and you can use a single environment with only this package installed for both PyTorch and TensorFlow runs. For runs with Weight Streaming execution, you need an environment with cerebras_pytorch for PyTorch runs and an environment with cerebras_tensorflow for TensorFlow runs. We expect most users to experiment with both Pipelined and Weight Streaming execution, so the steps we provide set up PyTorch and TensorFlow environments that support both modes of execution.

  • To get started with PyTorch on the Wafer-Scale Cluster and set up your PyTorch environment, follow the steps provided in PyTorch: Getting Started.

  • To get started with TensorFlow on the Wafer-Scale Cluster and set up your TensorFlow environment, follow the steps provided in TensorFlow: Getting Started.
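The package rules stated in the note above can be summarized as a small lookup. This is a sketch for orientation only: the function name `required_packages` and the mode labels `"pipelined"` and `"weight_streaming"` are hypothetical and are not identifiers from the Cerebras software. Only the three package names come from this page.

```python
from typing import List

# Framework-specific package needed for Weight Streaming runs.
FRAMEWORK_PACKAGE = {
    "pytorch": "cerebras_pytorch",
    "tensorflow": "cerebras_tensorflow",
}

# Hypothetical labels for the two execution modes.
MODES = ("pipelined", "weight_streaming")


def required_packages(framework: str, mode: str) -> List[str]:
    """Return the Cerebras packages an environment needs for a framework and mode."""
    if framework not in FRAMEWORK_PACKAGE:
        raise ValueError(f"unknown framework: {framework!r}")
    if mode not in MODES:
        raise ValueError(f"unknown execution mode: {mode!r}")
    # cerebras_appliance is needed in every environment.
    packages = ["cerebras_appliance"]
    # Weight Streaming additionally needs the framework-specific package.
    if mode == "weight_streaming":
        packages.append(FRAMEWORK_PACKAGE[framework])
    return packages
```

An environment built for Weight Streaming therefore also covers Pipelined runs for the same framework, which is why the Getting Started procedures set up environments that support both modes.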