modelzoo.transformers.pytorch.gpt2.input.InferenceDataProcessor.get_token_ids

modelzoo.transformers.pytorch.gpt2.input.InferenceDataProcessor.get_token_ids(text: str, tokenizer: Union[tokenizers.Tokenizer, transformers.PreTrainedTokenizerBase]) → List[int]

Get encoded token ids from a string using the specified tokenizer.

Parameters
  • text (str) – The input string.

  • tokenizer (Union[tokenizers.Tokenizer, transformers.PreTrainedTokenizerBase]) – Tokenizer instance from the Hugging Face tokenizers or transformers library.

Returns

List of token ids.

Return type

List[int]
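
Example

The following is a minimal sketch of how a function with this signature could handle both accepted tokenizer types. It is not the modelzoo implementation. It relies on two known behaviors: tokenizers.Tokenizer.encode returns an Encoding object whose ids attribute holds the token ids, while transformers.PreTrainedTokenizerBase.encode returns a plain list of ints. The names get_token_ids_sketch, FakeEncoding, FakeFastTokenizer, and FakeHFTokenizer are hypothetical; the two stand-in tokenizers map each character to its code point so the sketch runs without either library installed.

```python
from typing import Any, List


def get_token_ids_sketch(text: str, tokenizer: Any) -> List[int]:
    """Hypothetical stand-in for get_token_ids; not the modelzoo source."""
    encoded = tokenizer.encode(text)
    # tokenizers.Tokenizer.encode returns an Encoding exposing `.ids`;
    # transformers tokenizers return a plain list of ints from encode().
    ids = getattr(encoded, "ids", encoded)
    return list(ids)


class FakeEncoding:
    """Hypothetical object mimicking tokenizers.Encoding (only `.ids`)."""

    def __init__(self, ids: List[int]) -> None:
        self.ids = ids


class FakeFastTokenizer:
    """Hypothetical stand-in for tokenizers.Tokenizer."""

    def encode(self, text: str) -> FakeEncoding:
        # One "token" per character, using its Unicode code point.
        return FakeEncoding([ord(c) for c in text])


class FakeHFTokenizer:
    """Hypothetical stand-in for a transformers tokenizer."""

    def encode(self, text: str) -> List[int]:
        # One "token" per character, using its Unicode code point.
        return [ord(c) for c in text]


print(get_token_ids_sketch("hi", FakeFastTokenizer()))  # [104, 105]
print(get_token_ids_sketch("hi", FakeHFTokenizer()))    # [104, 105]
```

With a real tokenizer, the returned ids depend on its vocabulary and on whether it adds special tokens during encoding.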