The Cerebras ML Workflow
When you are targeting the Cerebras CS system for your neural network jobs, start with the high-level workflow described here.
1. Port your code to CS
2. Prepare input data
3. Compile on CPU
4. Run on the CS system
Familiarize yourself with the Cerebras ML workflow:
Whether your preferred framework is PyTorch or TensorFlow, start by porting your ML code to Cerebras. For TensorFlow, use CerebrasEstimator; for PyTorch, use cerebras.framework.torch.
Preparing your input data is critical. Because the Cerebras accelerator delivers cluster-scale acceleration at very high speed, your input pipeline must be fast enough to keep it supplied with data.
You can achieve such high input data throughput by running the input pipeline on multiple CPU nodes simultaneously, all feeding the data to the CS system.
This means you must preprocess the input data (sharding, shuffling, prefetching, interleaving, repeating, batching, and so on) in the proper order.
Also, make sure to put your input data on a network file system so it can be accessed by all the CPU nodes in the cluster.
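The preprocessing order described above can be illustrated with a minimal, framework-free sketch. This is plain Python, not the Cerebras API and not an actual TensorFlow or PyTorch input pipeline; the function name worker_batches and all of its parameters are hypothetical. It shows one common ordering for a single worker: shard first so that each worker reads a disjoint slice, shuffle each epoch, repeat across epochs, and batch last. Prefetching and interleaving are omitted for brevity.

```python
import random
from itertools import islice


def worker_batches(samples, num_workers, worker_id, batch_size, epochs, seed=0):
    """Yield batches for one worker: shard -> shuffle -> repeat -> batch.

    Hypothetical helper for illustration only; not part of any Cerebras API.
    """
    # Shard first: each worker keeps every num_workers-th sample, so the
    # workers read disjoint slices of the dataset.
    shard = list(samples)[worker_id::num_workers]

    def stream():
        rng = random.Random(seed + worker_id)
        # Repeat: one pass over the shard per epoch.
        for _ in range(epochs):
            # Shuffle: reshuffle a copy of the shard at the start of every epoch.
            order = shard[:]
            rng.shuffle(order)
            yield from order

    it = stream()
    # Batch last, dropping a trailing partial batch so that every batch
    # has exactly batch_size samples.
    while True:
        batch = list(islice(it, batch_size))
        if len(batch) < batch_size:
            return
        yield batch
```

For example, calling worker_batches(range(16), num_workers=2, worker_id=0, batch_size=4, epochs=1) yields two batches drawn only from the even-numbered samples, while worker_id=1 sees only the odd-numbered ones. Sharding before shuffling is what keeps the workers from feeding duplicate samples.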
Compile your code first on a CPU node without running it on the CS system. With this approach, you can optimize your code for your specific CS system early on. You can then reuse the compiled artifacts when you run the network on the CS system, which saves time in your workflow.
Run your compiled code on the CS system. During runtime, the workers in the Cerebras server cluster stream the input data to the Cerebras accelerator. When the execution is done, the chief CPU node retrieves the results from the network-attached accelerator for you to review.
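The streaming pattern described above, in which several workers feed a single consumer, can be modeled with a toy sketch. It uses threads and a bounded queue from the Python standard library. It is not how the Cerebras runtime is implemented, and the function name run_streaming is hypothetical. The consumer loop stands in for the accelerator.

```python
import queue
import threading


def run_streaming(shards, queue_size=4):
    """Stream batches from several worker threads to a single consumer.

    Toy model for illustration only: the consumer loop stands in for the
    accelerator. Returns the list of batches the consumer received.
    """
    # Bounded queue: workers block when the consumer falls behind.
    feed = queue.Queue(maxsize=queue_size)
    done = object()  # sentinel a worker sends when its shard is exhausted

    def worker(batches):
        for batch in batches:
            feed.put(batch)
        feed.put(done)

    threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
    for t in threads:
        t.start()

    received = []
    finished = 0
    # Consume until every worker has signalled completion.
    while finished < len(threads):
        item = feed.get()
        if item is done:
            finished += 1
        else:
            received.append(item)

    for t in threads:
        t.join()
    return received
```

The arrival order of batches from different workers is not deterministic in this sketch; only the order within each worker's own shard is preserved.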
We recommend that you read the Cerebras basics and run the quickstart first. When you are ready:
Start your PyTorch to CS journey here.
Start your TensorFlow to CS journey here.