cerebras.modelzoo.data_preparation.nlp.chunk_data_processing.utils.save_mlm_data_to_csv

cerebras.modelzoo.data_preparation.nlp.chunk_data_processing.utils.save_mlm_data_to_csv(filename, data)

Processes the given data and saves it to a CSV file. Each combined array is split into labels, masked_lm_positions, and masked_lm_weights using the actual_length indicator stored as the last element of that array.

Parameters
  • filename (str) – Path to the CSV file to write.

  • data (list) – A list of tokenized data arrays to be processed and written.
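
The splitting step described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper names split_combined and save_rows_sketch are hypothetical, and the layout of a combined array (labels, then masked_lm_positions, then masked_lm_weights, each of length actual_length, followed by actual_length itself) is an assumption made only for illustration. The CSV column names and the JSON encoding of each cell are likewise assumed.

```python
import csv
import json


def split_combined(combined):
    """Split one combined array into three parts.

    ASSUMED layout (hypothetical, not confirmed by the source):
    [labels..., masked_lm_positions..., masked_lm_weights..., actual_length]
    where each part holds exactly actual_length entries.
    """
    actual_length = int(combined[-1])   # indicator stored as the last element
    body = list(combined[:-1])          # everything before the indicator
    labels = body[:actual_length]
    positions = body[actual_length:2 * actual_length]
    weights = body[2 * actual_length:3 * actual_length]
    return labels, positions, weights


def save_rows_sketch(filename, data):
    """Write one CSV row per combined array (hypothetical helper)."""
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        # Column names are assumed for this sketch.
        writer.writerow(["labels", "masked_lm_positions", "masked_lm_weights"])
        for combined in data:
            # Each part is serialized as a JSON list so it fits in one cell.
            writer.writerow([json.dumps(part) for part in split_combined(combined)])
```

Under these assumptions, split_combined([7, 8, 1, 4, 1, 1, 2]) reads actual_length = 2 from the last element and returns the three two-element parts. Consult the linked source of save_mlm_data_to_csv for the actual array layout and CSV schema.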