cerebras.modelzoo.data_preparation.nlp.chunk_data_processing.summarization_vsl_data_token_generator

This module provides the VSLSummarizationTokenGenerator class, which extends SummarizationTokenGenerator to process tokenized text for variable-length sequence summarization (VSLS). The class includes methods for processing chunks of tokenized text, encoding documents for text summarization, and merging shorter sequences together so that each merged example fits within a specified maximum sequence length.
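The merging step can be pictured as a bin-packing problem. The following is a minimal sketch of one possible strategy (greedy first-fit), not the module's actual implementation; the function name `pack_sequences` and the choice to skip overlong sequences are assumptions made for illustration.

```python
from typing import List


def pack_sequences(
    seqs: List[List[int]], max_seq_length: int
) -> List[List[List[int]]]:
    """Greedy first-fit packing (illustrative only).

    Each sequence goes into the first bin that still has room for it;
    a new bin is opened when none does.
    """
    bins: List[List[List[int]]] = []  # each bin holds several sequences
    used: List[int] = []              # running token count per bin

    for seq in seqs:
        if len(seq) > max_seq_length:
            # Assumption: overlong sequences are skipped here. The real
            # module may truncate or handle them differently.
            continue
        for i, count in enumerate(used):
            if count + len(seq) <= max_seq_length:
                bins[i].append(seq)
                used[i] += len(seq)
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append([seq])
            used.append(len(seq))
    return bins


packed = pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]], max_seq_length=5)
print(packed)
```

With a maximum length of 5, the four input sequences fit into two bins instead of four padded examples, which is the saving that merging is meant to provide.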

Functions

create_features_summarization_vsl

Given a list of VSL sequences, generate input features and labels.
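As a rough illustration of what turning merged sequences into features and labels can involve, the sketch below concatenates (prompt, completion) pairs, shifts them by one token for next-token prediction, and pads to a fixed length. The function name `build_packed_features`, its parameters, and the returned keys are hypothetical and do not reflect the actual signature or output of create_features_summarization_vsl.

```python
from typing import Dict, List, Tuple


def build_packed_features(
    packed: List[Tuple[List[int], List[int]]],
    max_seq_length: int,
    pad_id: int = 0,
) -> Dict[str, List[int]]:
    """Build padded features from packed (prompt, completion) pairs.

    Hypothetical helper for illustration only.
    """
    tokens: List[int] = []
    loss_on: List[int] = []       # 1 where the token is a completion token
    position_ids: List[int] = []  # positions restart for each sequence

    for prompt, completion in packed:
        seq = prompt + completion
        tokens.extend(seq)
        loss_on.extend([0] * len(prompt) + [1] * len(completion))
        position_ids.extend(range(len(seq)))

    # Shift by one so that each input token predicts the next token.
    input_ids = tokens[:-1]
    labels = tokens[1:]
    loss_mask = loss_on[1:]  # loss only where the target is a completion token
    position_ids = position_ids[:-1]

    pad = max_seq_length - len(input_ids)
    if pad < 0:
        raise ValueError("packed sequences exceed max_seq_length")

    return {
        "input_ids": input_ids + [pad_id] * pad,
        "labels": labels + [pad_id] * pad,
        "loss_mask": loss_mask + [0] * pad,
        "position_ids": position_ids + [0] * pad,
    }


features = build_packed_features([([1, 2], [3]), ([4], [5, 6])], max_seq_length=8)
print(features)
```

Restarting the position IDs at each sequence boundary and masking the loss on prompt tokens keeps the merged sequences from being treated as one continuous document during training.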

Classes

VSLSummarizationTokenGenerator

Token generator for variable-length sequence summarization (VSLS).