Software Requirements and Dependencies
This page lists the dependencies for running the Cerebras Software Platform (CSoft) in pipeline mode and for running our model reference implementations on GPUs. It covers both the AI frameworks and the orchestration stack. Note that it applies only to the Slurm/Singularity workflow.
Dependencies#
- Workflow
  - Slurm 19.05.1-2 (see the SLURM website).
- Inside our container (if you use our SIF container, you don’t need to install anything)
  - TensorFlow 2.2
  - PyTorch 1.11
  - Python (recommended version 3.8)
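As a quick sanity check outside the container, the pinned versions listed above can be compared with what is installed in the active environment. The following is a minimal sketch, not part of the Cerebras tooling: the `PINNED` mapping, the helper names, and the assumption that the pip distribution names are `tensorflow-gpu` and `torch` are illustrative.

```python
import sys
from importlib import metadata  # standard library, Python 3.8+

# Hypothetical mapping from pip distribution name to the version pinned above.
PINNED = {"tensorflow-gpu": "2.2", "torch": "1.11"}


def version_matches(installed, wanted):
    """Return True if `installed` starts with the components of `wanted`."""
    # Drop a local build suffix such as "+cu102" before comparing.
    installed_parts = installed.split("+")[0].split(".")
    wanted_parts = wanted.split(".")
    return installed_parts[: len(wanted_parts)] == wanted_parts


def check_versions(pinned):
    """Map each distribution name to 'ok', 'mismatch (<version>)' or 'missing'."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if version_matches(installed, wanted):
            report[name] = "ok"
        else:
            report[name] = f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, status in check_versions(PINNED).items():
        print(f"{name}: {status}")
```

Comparing only the leading version components keeps `2.2.0` and `1.11.0+cu102` acceptable while still rejecting something like `2.20.0`.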
GPU Requirements#
The Cerebras Model Zoo git repository allows models to be run on GPUs as well as on the Cerebras WSE. To run the model code on a GPU, certain packages need to be installed. This is usually best done in a virtual environment (virtualenv) or a Conda environment. Below we provide instructions for setting up a virtualenv.
CUDA Requirements#
To run on a GPU, the CUDA libraries must be installed on the system. This includes both the CUDA toolkit and the cuDNN libraries. To install these packages, follow the instructions provided on the CUDA website, and make sure to include the cuDNN library installation. The TensorFlow and PyTorch models included in the Cerebras Model Zoo git repository have different requirements, so follow the framework-specific instructions below.
TensorFlow#
Currently, the Cerebras Model Zoo git repository only supports TensorFlow version 2.2, which requires CUDA version 10.1/10.2.
Once all the CUDA requirements are installed, create a virtualenv on your system, activate it, and install TensorFlow 2.2 for GPUs using the following commands:
virtualenv venv
source venv/bin/activate
pip install tensorflow-gpu==2.2.0
Note that the virtualenv may need to be created with a Python version older than 3.9 to be compatible with TensorFlow version 2.2.
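The Python version constraint in the note above can be checked from inside the activated virtualenv before installing anything. This is a small sketch that encodes only the "older than 3.9" bound stated here; the function name `python_ok_for_tf22` is hypothetical.

```python
import sys


def python_ok_for_tf22(version_info=None):
    """Return True if the interpreter is older than Python 3.9."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); the patch level does not matter here.
    return tuple(version_info[:2]) < (3, 9)


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    if python_ok_for_tf22():
        print(f"Python {current} is older than 3.9")
    else:
        print(f"Python {current} is 3.9 or newer; recreate the virtualenv with an older interpreter")
```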
Set the LD_LIBRARY_PATH environment variable to the location at which the CUDA 10.1/10.2 libraries are installed on your system:
export LD_LIBRARY_PATH=<path to cuda 10.1/10.2 lib64>:$LD_LIBRARY_PATH
For example:
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
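When the variable already contains other entries, it can help to generate the new value programmatically so that existing entries are kept and duplicates are dropped. The sketch below is illustrative only: `prepend_library_path` is a hypothetical helper, and the CUDA directory is the example path used above.

```python
import os


def prepend_library_path(new_dir, current=""):
    """Return a colon-separated path with `new_dir` first and no duplicates."""
    entries = [new_dir]
    for entry in current.split(":"):
        # Skip empty entries and directories that are already listed.
        if entry and entry not in entries:
            entries.append(entry)
    return ":".join(entries)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    value = prepend_library_path("/usr/local/cuda-10.1/lib64", current)
    # Print a line that can be pasted into the shell.
    print(f"export LD_LIBRARY_PATH={value}")
```

The script only prints the `export` line rather than modifying `os.environ`, because the dynamic loader reads LD_LIBRARY_PATH when a process starts; the variable needs to be set in the shell before Python is launched.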
To test whether TensorFlow is able to properly access the GPU, start a Python session and run the following TensorFlow commands:
python
>>> import tensorflow as tf
>>> # confirm that the TF version is 2.2
>>> tf.__version__
'2.2.0'
>>> tf.test.is_gpu_available()
The last command should return True if all libraries were correctly loaded; otherwise, the output should indicate which CUDA libraries did not load correctly. Note that some methods of installing CUDA 10.1/10.2 require installing the cuBLAS library from 10.2, while the rest of the CUDA libraries come from 10.1. This may require adding the lib64 directory of both installations to the LD_LIBRARY_PATH variable.
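To see which directory on LD_LIBRARY_PATH actually provides each library, for instance when cuBLAS comes from a different CUDA installation than the rest, the path can be scanned for the shared objects. This is a rough sketch under stated assumptions: the function name and the list of library stems (`cudart`, `cublas`, `cudnn`) are chosen for illustration, and it only looks for files named `lib<stem>.so*`.

```python
import glob
import os

# Illustrative set of libraries to look for; extend as needed.
CUDA_LIBRARY_STEMS = ("cudart", "cublas", "cudnn")


def find_cuda_libraries(path_value, stems=CUDA_LIBRARY_STEMS):
    """Map each library stem to the path directories containing lib<stem>.so*."""
    found = {stem: [] for stem in stems}
    for directory in path_value.split(":"):
        if not directory:
            continue
        for stem in stems:
            # Escape the directory so special characters are not treated as wildcards.
            pattern = os.path.join(glob.escape(directory), f"lib{stem}.so*")
            if glob.glob(pattern):
                found[stem].append(directory)
    return found


if __name__ == "__main__":
    report = find_cuda_libraries(os.environ.get("LD_LIBRARY_PATH", ""))
    for stem, directories in report.items():
        print(f"lib{stem}: {', '.join(directories) or 'not found on LD_LIBRARY_PATH'}")
```

A library reported as not found may still be installed in a system default location, so an empty result is a hint rather than proof of a missing installation.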
PyTorch#
Currently, the Cerebras Model Zoo git repository only supports PyTorch version 1.11, which requires CUDA version 10.1/10.2.
Once all the CUDA requirements are installed, create a virtualenv on your system with Python version 3.8 or newer, activate the virtualenv, and install PyTorch using the following commands:
virtualenv venv
source venv/bin/activate
pip install torch==1.11.0 torchvision==0.12.0 pyyaml numpy tensorboard nltk keras-preprocessing filelock huggingface_hub transformers
To test whether PyTorch is able to properly access the GPU, start a Python session and run the following commands:
>>> import torch
>>> torch.__version__
'1.11.0' # MAY INCLUDE A BUILD SUFFIX SUCH AS +cu102
>>> torch.cuda.is_available()
True # SHOULD RETURN TRUE
>>> torch.cuda.device_count()
1 # NUMBER OF DEVICES PRESENT
>>> torch.cuda.get_device_name(0)
# SHOULD RETURN THE PROPER GPU TYPE
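The same checks can be collected into a small script, which is convenient when validating several machines. This is a sketch rather than an official utility: `torch_gpu_report` is a hypothetical helper that uses only the `torch.cuda` calls shown in the session above, and it returns early if PyTorch is not installed.

```python
def torch_gpu_report():
    """Collect the PyTorch version and visible CUDA devices into a dict."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"installed": False}

    device_count = torch.cuda.device_count()
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "device_count": device_count,
        # One name per visible device; empty when no GPU is visible.
        "device_names": [torch.cuda.get_device_name(i) for i in range(device_count)],
    }


if __name__ == "__main__":
    for key, value in torch_gpu_report().items():
        print(f"{key}: {value}")
```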
Although it is not needed for GPU or CPU runs, the Cerebras container includes PyTorch/XLA because Cerebras depends on the XLA backend. For details, see the PyTorch/XLA website.