modelzoo.transformers.data_processing.scripts.pubmed.preprocess.TextFormatting

Script to format PubMed Fulltext commercial, PubMed Baseline, and PubMed Update file abstracts.

Reference: https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT

Classes

TextFormatting

Parameters

pubmed_path (str): Path to the folder containing the PubMed files.
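To illustrate what formatting a folder of PubMed files can look like, here is a minimal sketch using only the Python standard library. It is not the TextFormatting implementation: the function name format_abstracts, the output_filename argument, the "*.xml" file pattern, and the assumption that abstract text sits in AbstractText elements inside an Abstract element are all hypothetical choices made for this example. Only pubmed_path corresponds to the parameter documented above.

```python
import glob
import os
import xml.etree.ElementTree as ET


def format_abstracts(pubmed_path, output_filename):
    """Hypothetical helper (not part of modelzoo).

    Writes every abstract found in the XML files under pubmed_path to
    output_filename, one abstract per line followed by a blank line.
    Returns the number of abstracts written.
    """
    count = 0
    with open(output_filename, "w", encoding="utf-8", newline="\n") as out:
        # Assumption: the files are uncompressed XML sitting directly in the folder.
        for xml_file in sorted(glob.glob(os.path.join(pubmed_path, "*.xml"))):
            root = ET.parse(xml_file).getroot()
            # Assumption: each abstract is an <Abstract> element whose text
            # is split across one or more <AbstractText> children.
            for abstract in root.iter("Abstract"):
                parts = [
                    "".join(node.itertext()).strip()
                    for node in abstract.iter("AbstractText")
                ]
                text = " ".join(p for p in parts if p)
                if text:  # skip records with an empty abstract
                    out.write(text + "\n\n")
                    count += 1
    return count
```

Calling format_abstracts("/path/to/pubmed", "formatted.txt") would then produce a single text file with one abstract per paragraph. Both paths here are placeholders.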