cerebras.modelzoo.data_preparation.nlp.bert

bertsum_data_processor

Common pre-processing functions for BERTSUM data processing.

create_csv

Preprocessed CSV data generator for BERT pretraining from raw text documents.

create_csv_mlm_only

Preprocessed CSV data generator for BERT pretraining from raw text documents.

create_csv_mlm_only_static_masking

Preprocessed CSV data generator for BERT pretraining from raw text documents.

create_csv_static_masking

Preprocessed CSV data generator for BERT pretraining from raw text documents.

create_hdf5_files

Script to write HDF5 files for MLM_only and MLM + NSP datasets.

dynamic_processor

fine_tuning

mlm_only_processor

ner_data_processor

Common pre-processing functions, taken with minor modifications from: https://github.com/NVIDIA/DeepLearningExamples/blob/master/TensorFlow/LanguageModeling/BERT/run_ner.py

parser_utils

sentence_pair_processor