ilm.model_ilm

BaseRotaryTransformerILMModel

Bases: Module, Model

Rotary-embedding-based transformer decoder.

forward(x_t, attention_mask=None, positions=None, token_type_ids=None, cls_position=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens of shape `(*batch, seq_len)`. | *required* |
| `t` | | The timesteps of shape `(*batch)`. | *required* |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
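A minimal sketch of how the documented inputs might be prepared. The batch size, sequence length, vocabulary size, padding lengths, and the `model` variable are all hypothetical placeholders; note that the parameter table above also lists a timestep argument `t`, so the commented call follows the table rather than the signature shown:

```python
import torch

# Hypothetical dimensions for illustration only.
batch, seq_len, vocab_size = 2, 16, 100

# Input tokens of shape (*batch, seq_len).
x_t = torch.randint(0, vocab_size, (batch, seq_len))

# Timesteps of shape (*batch).
t = torch.randint(0, 1000, (batch,))

# Attention mask: True for non-padding tokens. Here the second
# sequence is assumed to be padded after 10 real tokens.
lengths = torch.tensor([16, 10])
attention_mask = torch.arange(seq_len)[None, :] < lengths[:, None]

# Assuming a model instance constructed elsewhere, e.g.
# model = BaseRotaryTransformerILMModel(...):
# out = model(x_t, t, attention_mask=attention_mask)
```

The mask is built by broadcasting a position index against per-sequence lengths, which yields the documented `Bool[Tensor, ' *batch seq_len']` layout directly.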

RotaryTransformerILMModelWithClassification

Bases: BaseRotaryTransformerILMModel

Rotary-embedding-based transformer decoder.

forward(x_t, attention_mask=None, positions=None, token_type_ids=None, cls_position=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens of shape `(*batch, seq_len)`. | *required* |
| `t` | | The timesteps of shape `(*batch)`. | *required* |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |

BaseGPT2ILMModel

Bases: GPT, Model

forward(x_t, attention_mask=None, positions=None, token_type_ids=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens of shape `(*batch, seq_len)`. | *required* |
| `t` | | The timesteps of shape `(*batch)`. | *required* |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |

GPT2ILMModelWithClassification

Bases: BaseGPT2ILMModel

forward(x_t, attention_mask=None, positions=None, token_type_ids=None)

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x_t` | `Integer[Tensor, ' *batch seq_len']` | The input tokens of shape `(*batch, seq_len)`. | *required* |
| `t` | | The timesteps of shape `(*batch)`. | *required* |
| `attention_mask` | `Optional[Bool[Tensor, ' *batch seq_len']]` | The attention mask of shape `(*batch, seq_len)`; `True` for non-padding tokens. | `None` |
| `positions` | `Optional[Integer[Tensor, ' *batch seq_len']]` | The positions of the tokens, of shape `(*batch, seq_len)`. | `None` |
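The GPT-2 variants additionally accept `token_type_ids` in their signature. A hedged sketch of how the optional `positions` and `token_type_ids` inputs could be assembled; the dimensions, the GPT-2 vocabulary size, the prompt/continuation segment split, and the `model` variable are assumptions for illustration, not part of the documented API:

```python
import torch

# Hypothetical dimensions; 50257 is the standard GPT-2 vocabulary size.
batch, seq_len = 2, 8
x_t = torch.randint(0, 50257, (batch, seq_len))

# Explicit positions of shape (*batch, seq_len); here simply 0..seq_len-1.
positions = torch.arange(seq_len).expand(batch, seq_len)

# token_type_ids: an assumed segment layout, e.g. 0 = prompt, 1 = continuation.
token_type_ids = torch.cat(
    [
        torch.zeros(batch, 4, dtype=torch.long),
        torch.ones(batch, 4, dtype=torch.long),
    ],
    dim=-1,
)

# Assuming a model instance constructed elsewhere, e.g.
# model = GPT2ILMModelWithClassification(...):
# out = model(x_t, positions=positions, token_type_ids=token_type_ids)
```

Passing `positions=None` would presumably fall back to default sequential positions; supplying them explicitly, as above, is only needed for non-standard position layouts.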