cerebras.modelzoo.data_preparation.nlp.tokenizers.HFTokenizer

Classes

HFTokenizer

Designed to integrate the Hugging Face (HF) Tokenizers library.

Parameters

vocab_file (str): A vocabulary file to create the tokenizer from.

special_tokens (list, str): A list or a string representing the special tokens to be added to the tokenizer.
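
The sketch below illustrates how the two documented parameters might be passed. It is an assumption-laden example, not the library's implementation: `normalize_special_tokens` and `build_tokenizer` are hypothetical helpers introduced here, the import path is inferred from the module name in this page's title, and the keyword-argument call simply mirrors the parameter names listed above.

```python
from typing import List, Union


def normalize_special_tokens(special_tokens: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: coerce a str-or-list argument into a list of tokens.

    The documentation says special_tokens may be a list or a string; this shows
    one way a caller could treat both forms uniformly before passing them on.
    """
    if isinstance(special_tokens, str):
        return [special_tokens]
    return list(special_tokens)


def build_tokenizer(vocab_file: str, special_tokens: Union[str, List[str]]):
    """Hypothetical wrapper: construct HFTokenizer from the documented parameters.

    Requires the Cerebras Model Zoo package; the import is deferred so this
    module can still be loaded where the package is not installed.
    """
    # Assumed import path, taken from the module name in the page title.
    from cerebras.modelzoo.data_preparation.nlp.tokenizers.HFTokenizer import (
        HFTokenizer,
    )

    return HFTokenizer(
        vocab_file=vocab_file,
        special_tokens=normalize_special_tokens(special_tokens),
    )
```

With the package installed, a call such as `build_tokenizer("tokenizer.json", "<|endoftext|>")` would pass a one-element list of special tokens; the file name and token shown here are placeholders.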