cerebras.modelzoo.tools.checkpoint_converters.base_converter.convert_dataloader_checkpoint

cerebras.modelzoo.tools.checkpoint_converters.base_converter.convert_dataloader_checkpoint(checkpoint_state_dict: dict, data_checkpoints_dir: str, dataloader_type: str = 'map', shuffle_seed: int = 0)

Converts DataLoader state files saved in release 1.9 to the DataLoader checkpoint format used by the new map-style and iterable-style DataLoaders in the Model Zoo (MZ) in release 2.0. This provides backwards compatibility, so that 2.0 runs can restart deterministically from old DataLoader state files.

Parameters
  • checkpoint_state_dict – The state_dict of the 1.9 checkpoint.

  • data_checkpoints_dir – Path to the directory containing the data step file data_iter_checkpoint_state_file_global and the worker checkpoint files, which are named in the format data_iter_state_file_worker_*_step_*.txt.

  • dataloader_type – The MZ DataLoader for which state is being converted. Use map for the map-style dataloader and iterable for the iterable-style dataloader. Defaults to map-style dataloader.

  • shuffle_seed – The seed value to be captured in the DataLoader state for the map-style dataloader. Note that the seed is only relevant for deterministically restarting the map-style dataloader if dataset shuffling/mixing is enabled.
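
The parameters above can be checked before the converter is called. The following is a minimal sketch, not part of the Cerebras API: `check_conversion_inputs` and `convert_with_checks` are hypothetical helper names, and the checks assume that the converter needs both the global step file and at least one worker file to be present, which the parameter description implies but does not state. The return value of `convert_dataloader_checkpoint` is not documented here, so the wrapper passes it through unchanged.

```python
import glob
import os

# File names taken from the data_checkpoints_dir parameter description.
GLOBAL_STATE_FILE = "data_iter_checkpoint_state_file_global"
WORKER_STATE_PATTERN = "data_iter_state_file_worker_*_step_*.txt"
VALID_DATALOADER_TYPES = ("map", "iterable")


def check_conversion_inputs(data_checkpoints_dir, dataloader_type="map"):
    """Hypothetical pre-flight check (not part of the Cerebras API).

    Returns the sorted list of worker state files found in the directory.
    """
    if dataloader_type not in VALID_DATALOADER_TYPES:
        raise ValueError(
            f"dataloader_type must be one of {VALID_DATALOADER_TYPES}, "
            f"got {dataloader_type!r}"
        )

    global_file = os.path.join(data_checkpoints_dir, GLOBAL_STATE_FILE)
    if not os.path.isfile(global_file):
        raise FileNotFoundError(f"missing data step file: {global_file}")

    # Escape the directory so glob metacharacters in the path are literal.
    pattern = os.path.join(glob.escape(data_checkpoints_dir), WORKER_STATE_PATTERN)
    worker_files = sorted(glob.glob(pattern))
    if not worker_files:
        # Assumption: at least one worker file is required for conversion.
        raise FileNotFoundError(f"no worker state files match: {pattern}")
    return worker_files


def convert_with_checks(
    checkpoint_state_dict, data_checkpoints_dir, dataloader_type="map", shuffle_seed=0
):
    """Hypothetical wrapper: validate the inputs, then call the converter."""
    check_conversion_inputs(data_checkpoints_dir, dataloader_type)

    # Imported lazily so the checks above run without the Model Zoo installed.
    from cerebras.modelzoo.tools.checkpoint_converters.base_converter import (
        convert_dataloader_checkpoint,
    )

    return convert_dataloader_checkpoint(
        checkpoint_state_dict,
        data_checkpoints_dir,
        dataloader_type=dataloader_type,
        shuffle_seed=shuffle_seed,
    )
```

When converting for the map-style dataloader with shuffling or mixing enabled, `shuffle_seed` should presumably match the seed used in the original 1.9 run, since that seed is what makes the restart deterministic.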