modelzoo.transformers.data_processing.scripts.pubmed.preprocess.TextSharding

Script to shard text into separate train and test dataset files.

Reference: https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT

Classes

NLTKSegmenter

Sharding
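
The general idea of segmenting articles into sentences and distributing them across train and test shards can be sketched as follows. This is a minimal illustration, not the module's actual implementation: `split_sentences`, `shard_articles`, the `test_fraction` parameter, and the shard file names are all hypothetical, and a naive regular expression stands in for an NLTK-based segmenter.

```python
import re


def split_sentences(text):
    """Naive regex stand-in for a real sentence segmenter (hypothetical helper)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def shard_articles(articles, n_train_shards, n_test_shards, test_fraction=0.1):
    """Assign articles to train and test shards (hypothetical helper).

    Returns a dict mapping an assumed shard file name to its text content,
    with one sentence per line and a blank line between articles.
    """
    # Hold out the first few articles for the test split (assumed policy).
    n_test = max(1, int(len(articles) * test_fraction)) if n_test_shards else 0
    test_docs, train_docs = articles[:n_test], articles[n_test:]

    shards = {}
    for name, docs, n_shards in (
        ("train", train_docs, n_train_shards),
        ("test", test_docs, n_test_shards),
    ):
        for i in range(n_shards):
            # Round-robin slicing keeps shard sizes within one article of each other.
            chunk = docs[i::n_shards]
            shards[f"{name}_{i}.txt"] = "\n\n".join(
                "\n".join(split_sentences(doc)) for doc in chunk
            )
    return shards


if __name__ == "__main__":
    demo = ["A one. A two.", "B one.", "C one. C two.", "D one."]
    for filename, content in shard_articles(demo, 2, 1, test_fraction=0.25).items():
        print(filename, repr(content))
```

The sketch keeps shard contents in memory so it is easy to inspect; writing each entry to disk under its key would produce the separate dataset files.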