# ilm.model_ilm

## BaseRotaryTransformerILMModel

Bases: `Module`, `Model`

Rotary-embedding-based transformer decoder.

### forward(x_t, attention_mask=None, positions=None, token_type_ids=None, cls_position=None)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens, of shape `(*batch, seq_len)`. | required |
| `t` | | The timesteps, of shape `(*batch)`. | required |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask, of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
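A minimal usage sketch. The constructor arguments (`vocab_size`, `dim`, `num_layers`) are assumed for illustration, since this reference documents only `forward`:

```python
import torch

from ilm.model_ilm import BaseRotaryTransformerILMModel

# Hypothetical constructor arguments -- the real signature is not part of
# this reference page.
model = BaseRotaryTransformerILMModel(vocab_size=32000, dim=512, num_layers=8)

batch, seq_len = 2, 16
x_t = torch.randint(0, 32000, (batch, seq_len))                # input tokens, (*batch, seq_len)
attention_mask = torch.ones(batch, seq_len, dtype=torch.bool)  # True for non-padding tokens
positions = torch.arange(seq_len).expand(batch, seq_len)       # explicit token positions

out = model(x_t, attention_mask=attention_mask, positions=positions)
```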
## RotaryTransformerILMModelWithClassification

Bases: `BaseRotaryTransformerILMModel`

Rotary-embedding-based transformer decoder.

### forward(x_t, attention_mask=None, positions=None, token_type_ids=None, cls_position=None)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens, of shape `(*batch, seq_len)`. | required |
| `t` | | The timesteps, of shape `(*batch)`. | required |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask, of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
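A sketch of calling the classification variant. The constructor arguments and the exact semantics of `cls_position` (assumed here to be the index of the classification token) are not specified on this page, so treat them as placeholders:

```python
import torch

from ilm.model_ilm import RotaryTransformerILMModelWithClassification

# Hypothetical constructor arguments, as above.
model = RotaryTransformerILMModelWithClassification(
    vocab_size=32000, dim=512, num_layers=8
)

batch, seq_len = 2, 16
x_t = torch.randint(0, 32000, (batch, seq_len))
attention_mask = torch.ones(batch, seq_len, dtype=torch.bool)

# cls_position: assumed to point at the token whose hidden state feeds the
# classification head; here, the first position of every sequence.
cls_position = torch.zeros(batch, dtype=torch.long)

out = model(x_t, attention_mask=attention_mask, cls_position=cls_position)
```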
## BaseGPT2ILMModel

Bases: `GPT`, `Model`

### forward(x_t, attention_mask=None, positions=None, token_type_ids=None)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens, of shape `(*batch, seq_len)`. | required |
| `t` | | The timesteps, of shape `(*batch)`. | required |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask, of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
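A usage sketch, again with assumed constructor arguments (the class wraps a `GPT` backbone, but its configuration is not documented here):

```python
import torch

from ilm.model_ilm import BaseGPT2ILMModel

# Hypothetical configuration -- adjust to the actual constructor.
model = BaseGPT2ILMModel(vocab_size=50257, n_layer=12, n_head=12, n_embd=768)

batch, seq_len = 2, 16
x_t = torch.randint(0, 50257, (batch, seq_len))                # input tokens
attention_mask = torch.ones(batch, seq_len, dtype=torch.bool)  # no padding here

out = model(x_t, attention_mask=attention_mask)
```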
## GPT2ILMModelWithClassification

Bases: `BaseGPT2ILMModel`

### forward(x_t, attention_mask=None, positions=None, token_type_ids=None)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens, of shape `(*batch, seq_len)`. | required |
| `t` | | The timesteps, of shape `(*batch)`. | required |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask, of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
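A sketch for the GPT-2 classification variant. `token_type_ids` appears in the signature but is undocumented above, so the segment-id interpretation below is an assumption:

```python
import torch

from ilm.model_ilm import GPT2ILMModelWithClassification

# Hypothetical configuration, as above.
model = GPT2ILMModelWithClassification(
    vocab_size=50257, n_layer=12, n_head=12, n_embd=768
)

batch, seq_len = 2, 16
x_t = torch.randint(0, 50257, (batch, seq_len))
attention_mask = torch.ones(batch, seq_len, dtype=torch.bool)

# Assumed segment-id semantics: 0 for the first half of each sequence,
# 1 for the second half.
token_type_ids = torch.zeros(batch, seq_len, dtype=torch.long)
token_type_ids[:, seq_len // 2:] = 1

out = model(x_t, attention_mask=attention_mask, token_type_ids=token_type_ids)
```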