xlm.modules.position
RotaryEmbedding
Bases: Module
__init__(dim, head_first=True, cache_size=1024)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dim` | `int` | The dimension of the input. | *required* |
| `head_first` | `bool` | If True, the input is assumed to have shape (batch_size, num_heads, seq_len, dim); if False, (batch_size, seq_len, num_heads, dim). | `True` |
| `cache_size` | `int` | The maximum sequence length for which the sine and cosine values are cached. | `1024` |
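The `cache_size` parameter implies that the sine and cosine tables are precomputed once up to that length rather than recomputed per call. A minimal sketch of such a cache (illustrative only; `build_rope_cache` and the `base` constant are assumptions, not the xlm implementation):

```python
import numpy as np

def build_rope_cache(dim, cache_size=1024, base=10000.0):
    """Precompute cos/sin tables for positions 0..cache_size-1.

    Returns two arrays of shape (cache_size, dim // 2). A forward pass
    can then index rows by position instead of recomputing the trig
    functions for every token.
    """
    # Per-pair rotation frequencies: base ** (-2i / dim) for i = 0..dim/2-1.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)            # (dim/2,)
    # Angle for each (position, frequency) pair.
    angles = np.arange(cache_size)[:, None] * inv_freq[None, :]  # (cache_size, dim/2)
    return np.cos(angles), np.sin(angles)
```

Positions beyond `cache_size` would fall outside the table, which is why the cache must cover the longest sequence the module will see.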
forward(x, positions)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | | Shape (batch_size, num_heads, seq_len, dim) if head_first is True; (batch_size, seq_len, num_heads, dim) if head_first is False. | *required* |
| `positions` | `Integer[Tensor, '*batch seq_len']` | Shape (batch_size, seq_len). | *required* |
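As a sketch of what the forward pass computes per head, under the standard rotary-embedding (RoPE) formulation: each channel pair (x₂ᵢ, x₂ᵢ₊₁) is rotated by the angle `position * base**(-2i/dim)`. The function below is an illustrative NumPy reimplementation of that idea, not the xlm API; the name `rotary_embed` and the `base` default are assumptions:

```python
import numpy as np

def rotary_embed(x, positions, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, dim).

    positions: integer array of shape (seq_len,).
    Each channel pair (2i, 2i+1) is rotated by pos * base**(-2i/dim).
    """
    seq_len, dim = x.shape
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)       # (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]        # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Standard 2-D rotation applied independently to each channel pair.
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Two properties follow directly from the rotation form: the token at position 0 is left unchanged (all angles are zero), and every token's norm is preserved, since rotations are orthogonal.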