modelzoo.transformers.pytorch.gpt2.sparse_mask.create_fixed_sparse_attention_mask#
- modelzoo.transformers.pytorch.gpt2.sparse_mask.create_fixed_sparse_attention_mask(max_sequence_length, n_heads, dtype=None, local_attn_ctx=16, num_verts=64, vert_size=16, different_layout_per_head=False)[source]#
Create a GPT-3-style fixed sparse attention mask. Adapted from https://github.com/openai/sparse_attention/blob/master/attention.py#L135
- Parameters
max_sequence_length (int) – Maximum sequence length.
n_heads (int) – Number of attention heads.
dtype – Dtype of the resulting mask.
local_attn_ctx (int) – Size of each local attention block (the stride of the fixed pattern).
num_verts (int) – Number of distinct vertical (summary-column) layouts distributed across heads.
vert_size (int) – Number of summary columns per block that later positions may attend to.
different_layout_per_head (bool) – If True, heads are assigned different vertical layouts instead of sharing one.
- Returns
The autoregressive fixed sparse mask of shape [n_heads, max_sequence_length, max_sequence_length].
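The mask produced here follows the "fixed" sparse pattern of Child et al. (2019), which the referenced OpenAI code implements: each query attends within its own local block of `local_attn_ctx` positions, plus to a small set of summary columns at the end of every block, all intersected with the causal mask. A minimal single-head NumPy sketch of that pattern (ignoring the per-head layout variation controlled by `num_verts` and `different_layout_per_head`, and using hypothetical helper names) might look like:

```python
import numpy as np

def fixed_sparse_mask(seq_len, local_attn_ctx=16, vert_size=4):
    """Sketch of the 'fixed' sparse attention pattern (Child et al., 2019).

    A query at position i may attend to key position j when j <= i (causal)
    and either (a) i and j fall in the same block of ``local_attn_ctx``
    positions, or (b) j is one of the last ``vert_size`` "summary" columns
    of its block.
    """
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    causal = j <= i
    same_block = (i // local_attn_ctx) == (j // local_attn_ctx)
    summary_cols = (j % local_attn_ctx) >= (local_attn_ctx - vert_size)
    return (causal & (same_block | summary_cols)).astype(np.float32)

# Small example: 8 positions, blocks of 4, one summary column per block.
mask = fixed_sparse_mask(8, local_attn_ctx=4, vert_size=1)
```

In the full implementation, the resulting per-head masks are stacked into the `[n_heads, max_sequence_length, max_sequence_length]` tensor described above.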