The cerebras.framework.torch interface¶
This section describes the methods of the cerebras.framework.torch module that you will use to convert your PyTorch code into a Cerebras workflow.
cerebras.framework.torch.initialize()¶
Initializes the Cerebras system and prepares it for the runtime. You can configure the runtime by passing configuration variables when calling this method.
Important
cerebras.framework.torch.initialize() must be called first in your program, before any other PyTorch code.
Usage¶
import cerebras.framework.torch as cbtorch
# Initialize with default values
cbtorch.initialize(cs_ip="12.0.0.0")
# Initialize with configuration variables
cbtorch.initialize(cs_ip="12.0.0.0", autostart_timeout=35, shutdown_timeout=25)
# Initialize for compiling the model only (no system required)
cbtorch.initialize(compile_only=True, autostart_timeout=35, shutdown_timeout=25)
# Initialize with service workdir to configure streamer workers
cbtorch.initialize(cs_ip="12.0.0.0", service_workdir="./service", autostart_timeout=35, shutdown_timeout=25)
# Initialize with cbfloat16 enabled
cbtorch.initialize(cs_ip="12.0.0.0", use_cbfloat16=True, autostart_timeout=35, shutdown_timeout=25)
Parameters¶
cs_ip¶
The IP address of the Cerebras system.
If not provided, a CPU workflow is assumed and configured accordingly.
compile_only¶
If True, configures the Cerebras backend for a compile-only flow. No Cerebras system is required and no execution will occur.
service_workdir¶
Optional. A string specifying the path to the service working directory that the streamer workers will use. Default: ./cerebras_wse.
use_cbfloat16¶
If True, configures the run to use the Cerebras cbfloat16 data type. The cbfloat16 is a Cerebras-specific 16-bit floating-point type that is optimized for performance.
autostart_timeout¶
Optional. An integer specifying the time in seconds that the process should wait for the auto-started services to finish starting. Default: 30, indicating 30 seconds. You may need to increase this depending on the host machine's CPU and networking performance.
shutdown_timeout¶
Optional. An integer specifying the time in seconds that the process should wait for the services to shut down before raising a timeout error. Default: 10, indicating 10 seconds. You may need to increase this depending on the host machine's CPU and networking performance.
cerebras.framework.torch.module()¶
Wrap your PyTorch model object, of the class torch.nn.Module, with this method as cerebras.framework.torch.module(Model). This enables the model object to be executed on the Cerebras system, and on any device type specified in the device_type parameter of the cerebras.framework.torch.initialize() method.
Usage¶
See the following example. This creates an object named model, of the cbtorch type, from your PyTorch model MNIST. This model object is now ready to be loaded onto the Cerebras system:
import cerebras.framework.torch as cbtorch

...

class MNIST(nn.Module):
    def __init__(self):
        super(MNIST, self).__init__()
        ...

...

def main():
    # Initialize Cerebras backend and configure for run
    cbtorch.initialize(cs_ip=args.cs_ip)

    # prepare to move the model and dataloader onto the Cerebras engine
    model = cbtorch.module(MNIST())

    ...

...
Parameters¶
model¶
Required. The PyTorch model, of the class torch.nn.Module, that you are targeting to run on the Cerebras system, and in general on any device_type device you specify.
cerebras.framework.torch.dataloader()¶
Wrap the torch.utils.data.DataLoader object in your PyTorch code with cerebras.framework.torch.dataloader. This enables the DataLoader object to be executed on the Cerebras system, and on any device type specified in the device_type parameter of the cerebras.framework.torch.initialize() method.
Usage¶
See the following example. Wrapping the object returned by the get_train_dataloader() method creates an object named train_dataloader, of the cbtorch type. The resulting train_dataloader object is now compatible with the Cerebras system.
import cerebras.framework.torch as cbtorch

...

def main():
    # Initialize Cerebras backend and configure for run
    cbtorch.initialize(cs_ip=args.cs_ip)

    # prepare to move the model and dataloader onto the Cerebras engine
    model = cbtorch.module(MNIST())
    train_dataloader = cbtorch.dataloader(get_train_dataloader())

    ...

def get_train_dataloader():
    batch_size = 64
    ...
    train_loader = torch.utils.data.DataLoader(
        train_dataset,
        batch_size=batch_size,
        sampler=None,
        drop_last=True,
        shuffle=True,
        num_workers=0,
    )
    return train_loader
Parameters¶
loader¶
Required. The PyTorch DataLoader object, of the class torch.utils.data.DataLoader, that you are targeting to use to stream data to the Cerebras system, and in general to any device_type device you specify.
cerebras.framework.torch.Session()¶
A context manager that enables running PyTorch code on a Cerebras system.
Usage¶
See the following example:
import cerebras.framework.torch as cbtorch

...

train_dataloader = cbtorch.dataloader(get_train_dataloader())

...

with cbtorch.Session(train_dataloader, mode="train") as session:
    for epoch in range(num_epochs):
        cm.master_print(f"Epoch {epoch} train begin")
        for step, batch in enumerate(train_dataloader):
            ...