cerebras.modelzoo.data_preparation.nlp.chunk_data_processing.chunk_data_preprocessor

This module implements a generic data preprocessor called ChunkDataPreprocessor. It internally uses DataFrame and DataReader to read and process data.

Functions

get_compression_factor

Calculate and return the compression factor based on a file's extension.

update_progress

Update the progress bar based on the current progress.
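
To illustrate the idea behind get_compression_factor, the following is a minimal sketch of how a compression factor might be looked up from a file's extension. It is not the modelzoo implementation: the names estimate_compression_factor, estimate_uncompressed_size, and ASSUMED_FACTORS are hypothetical, and the factor values are placeholders chosen for illustration only.

```python
from pathlib import Path

# Hypothetical ratios of uncompressed size to on-disk size.
# These values are illustrative assumptions, not taken from the modelzoo source.
ASSUMED_FACTORS = {
    ".gz": 3,
    ".zst": 3,
    ".bz2": 4,
    ".parquet": 3,
}


def estimate_compression_factor(filename: str) -> int:
    """Return the assumed factor for the file's final extension (1 if unknown)."""
    suffix = Path(filename).suffix.lower()  # e.g. "data.jsonl.zst" -> ".zst"
    return ASSUMED_FACTORS.get(suffix, 1)


def estimate_uncompressed_size(filename: str, size_on_disk: int) -> int:
    """Scale the on-disk size in bytes by the assumed compression factor."""
    return size_on_disk * estimate_compression_factor(filename)
```

A factor like this can be used to estimate how much data a compressed file will yield once read, for example when sizing chunks or reporting progress. Uncompressed or unrecognized extensions fall back to a factor of 1 in this sketch.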

Classes

ChunkDataPreprocessor

Generic data preprocessor. The constructor initializes the instance with the given parameters and a logger.