modelzoo.transformers.pytorch.bert.fine_tuning.qa.input.BertQADataProcessor.BertQADataProcessor#
- class modelzoo.transformers.pytorch.bert.fine_tuning.qa.input.BertQADataProcessor.BertQADataProcessor[source]#
Bases: torch.utils.data.IterableDataset
Reads a CSV file containing the input token ids and label_ids; creates attention_masks and segment_ids on the fly.
- Parameters
params – dict containing input parameters for creating the dataset.
Expects the following fields (an example configuration follows the list):
“data_dir” (str or list of str): Path to the metadata files.
“batch_size” (int): Batch size.
“shuffle” (bool): Flag to enable data shuffling.
“shuffle_buffer” (int): Shuffle buffer size.
“shuffle_seed” (int): Shuffle seed.
“num_workers” (int): Number of PyTorch data workers (see PyTorch docs).
“prefetch_factor” (int): How much data to prefetch for better performance (see PyTorch docs).
“persistent_workers” (bool): For a multi-worker dataloader, controls whether the workers are recreated at the end of each epoch (see PyTorch docs).
“max_sequence_length” (int): Maximum sequence length for the model.
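A minimal configuration sketch covering these fields; the values below are illustrative only and the data_dir path is hypothetical:

    # Example params dict for BertQADataProcessor (illustrative values only).
    params = {
        "data_dir": "./qa_data/train/",   # path(s) to the preprocessed metadata files
        "batch_size": 32,
        "shuffle": True,
        "shuffle_buffer": 16384,
        "shuffle_seed": 873,
        "num_workers": 4,
        "prefetch_factor": 2,
        "persistent_workers": True,
        "max_sequence_length": 384,
    }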
Methods
Classmethod to create the dataloader object (see the usage sketch below).
Generator to read the data in chunks of size data_buffer.
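A usage sketch, assuming the summarized classmethod is exposed as create_dataloader and that the constructor takes the params dict shown above; the method name is inferred from the summary and may differ in the actual source:

    from modelzoo.transformers.pytorch.bert.fine_tuning.qa.input.BertQADataProcessor import (
        BertQADataProcessor,
    )

    # Build the IterableDataset-backed processor from the params dict above.
    processor = BertQADataProcessor(params)

    # Assumed method name; creates the torch DataLoader with the configured
    # batch size, workers, prefetching, and shuffling behavior.
    dataloader = processor.create_dataloader()

    for batch in dataloader:
        # Each batch holds the input token ids and label_ids read from the CSV,
        # plus the attention_masks and segment_ids created on the fly.
        break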
- __call__(*args: Any, **kwargs: Any) → Any#
Call self as a function.
- static __new__(cls, *args: Any, **kwargs: Any) → Any#