modelzoo.transformers.data_processing.tokenizers.HFTokenizer.HFTokenizer

class modelzoo.transformers.data_processing.tokenizers.HFTokenizer.HFTokenizer

Bases: object

Designed to integrate the Hugging Face Tokenizers library.

Parameters

vocab_file (str): A vocabulary file to create the tokenizer from.

special_tokens: A list or a string representing the special tokens that are to be added to the tokenizer.

Methods

add_special_tokens

add_token

decode

encode

get_token

get_token_id

set_eos_pad_tokens

Attributes

eos

pad

__init__(vocab_file, special_tokens=None)
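This page lists only the method and attribute names, so the following is a rough sketch of the call pattern they suggest rather than the modelzoo implementation. `ToyTokenizer` is a hypothetical, dependency-free stand-in: the JSON vocabulary format, the whitespace splitting, every method signature other than `__init__`, and the `<|endoftext|>` and `<|padding|>` token names are all assumptions made for illustration.

```python
# Hypothetical stand-in, NOT the modelzoo class. It mirrors the method names
# listed on this page so the call pattern can be shown without the
# Hugging Face `tokenizers` dependency.
import json
import os
import tempfile


class ToyTokenizer:
    """Whitespace tokenizer over a JSON file mapping token -> id (assumed format)."""

    def __init__(self, vocab_file, special_tokens=None):
        with open(vocab_file, encoding="utf-8") as f:
            self.token_to_id = json.load(f)
        self.id_to_token = {i: t for t, i in self.token_to_id.items()}
        self.eos_id = None
        self.pad_id = None
        if special_tokens:
            self.add_special_tokens(special_tokens)
        self.set_eos_pad_tokens()

    def add_token(self, token):
        # Assign the next free id; adding an existing token changes nothing.
        if token not in self.token_to_id:
            new_id = max(self.id_to_token, default=-1) + 1
            self.token_to_id[token] = new_id
            self.id_to_token[new_id] = token
        return self.token_to_id[token]

    def add_special_tokens(self, special_tokens):
        # Accept either a single string or a list of strings.
        if isinstance(special_tokens, str):
            special_tokens = [special_tokens]
        for token in special_tokens:
            self.add_token(token)

    def set_eos_pad_tokens(self):
        # Assumed token names; pad falls back to EOS when no pad token exists.
        self.eos_id = self.token_to_id.get("<|endoftext|>")
        self.pad_id = self.token_to_id.get("<|padding|>", self.eos_id)

    def encode(self, text):
        # Out-of-vocabulary words raise KeyError in this sketch.
        return [self.token_to_id[word] for word in text.split()]

    def decode(self, token_ids):
        return " ".join(self.id_to_token[i] for i in token_ids)

    def get_token_id(self, token):
        return self.token_to_id[token]

    def get_token(self, token_id):
        return self.id_to_token[token_id]

    @property
    def eos(self):
        return self.eos_id

    @property
    def pad(self):
        return self.pad_id


# Write a tiny vocabulary file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.json")
    with open(vocab_path, "w", encoding="utf-8") as f:
        json.dump({"hello": 0, "world": 1}, f)
    tok = ToyTokenizer(vocab_path, special_tokens=["<|endoftext|>", "<|padding|>"])

ids = tok.encode("hello world")
text = tok.decode(ids)
```

Accepting either a string or a list for `special_tokens` follows the parameter description above; the stand-in normalizes a single string to a one-element list before adding it. The real class presumably delegates encoding and decoding to the Hugging Face Tokenizers library instead of splitting on whitespace.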