cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.get_files

cerebras.modelzoo.data_preparation.nlp.hdf5_preprocessing.utils.get_files(input_dir=None, filetypes=None, metadata_files=None)

Get all files of the given file types from the input directory.

Parameters
  • input_dir (str) – Input directory to read files from.

  • filetypes (list) – File types to fetch from the given input directory. Defaults to None.

  • metadata_files (str) – Comma-separated string of metadata files.

Returns

A list of lists containing all file paths as strings.
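
To make the described behavior concrete, here is a minimal standard-library sketch of gathering files by type from a directory. It is not the Model Zoo implementation: the helper name collect_files_by_type is hypothetical, and the recursive search, the sorted order, and the choice of one inner list per file type are all assumptions made for illustration. The metadata_files argument is not covered.

```python
from pathlib import Path


def collect_files_by_type(input_dir, filetypes):
    """Hypothetical stand-in, NOT the library's get_files.

    Assumption: returns one inner list per entry in `filetypes`,
    each holding the matching file paths as strings.
    """
    root = Path(input_dir)
    grouped = []
    for suffix in filetypes:
        # rglob walks the directory tree recursively (an assumption here);
        # sorting keeps the result deterministic across platforms.
        matches = sorted(
            str(path) for path in root.rglob(f"*{suffix}") if path.is_file()
        )
        grouped.append(matches)
    return grouped


if __name__ == "__main__":
    # Example call on the current directory with two illustrative suffixes.
    for group in collect_files_by_type(".", [".txt", ".jsonl"]):
        print(group)
```

Grouping by file type is only one reading of "list of lists"; check the linked source before relying on the exact nesting or ordering of the real function's return value.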