cerebras.modelzoo.data_preparation.nlp.slimpajama.preprocessing.datasets

Functions

redpj_datasets

Classes

Dataset

RedPajamaArXivDataset

RedPajamaBooksDataset

RedPajamaC4Dataset

RedPajamaCommonCrawlDataset

RedPajamaGithubDataset

RedPajamaReplication

RedPajamaStackExchangeDataset

RedPajamaWikipediaDataset