cerebras.modelzoo.data_preparation.nlp.slimpajama.preprocessing

datasets

filter

normalize_text

Script that normalizes text

shuffle_holdout
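
The normalize_text entry above does not say which normalization the script applies. As a minimal sketch, assuming it means Unicode normalization to NFC form (an assumption, not confirmed by this page), the idea can be illustrated with the Python standard library alone. The helper below is hypothetical and is not the module's actual API; its name and its `form` parameter are illustrative only.

```python
import unicodedata


def normalize_text(text: str, form: str = "NFC") -> str:
    """Hypothetical helper: return `text` in the given Unicode normalization form.

    Assumption: NFC is used as the default here for illustration; the real
    script may apply different or additional cleanup steps.
    """
    return unicodedata.normalize(form, text)


# 'e' followed by a combining acute accent (two code points)
decomposed = "cafe\u0301"
# NFC composes the pair into the single code point U+00E9
composed = normalize_text(decomposed)

print(len(decomposed), len(composed))  # prints "5 4"
```

Normalizing before deduplication or filtering means that strings that look identical but use different code point sequences compare as equal.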