xlm.utils.fsdp_grouping
Build FSDP grouping plans from structured Hydra / dict config.
make_layer_wrap_policy(*class_paths)
Resolve dotted class paths into a set of layer classes for Lightning FSDPStrategy.
Use in Hydra YAML::
auto_wrap_policy:
_target_: xlm.utils.fsdp_grouping.make_layer_wrap_policy
_args_:
- xlm.backbones.dream.modeling_dream.DreamDecoderLayer
This matches Lightning's recommended pattern of passing {nn.TransformerDecoderLayer}
style sets to auto_wrap_policy / activation_checkpointing_policy.
fsdp_bf16_mixed_precision()
Default FSDP mixed precision: bf16 params, fp32 reductions (matches DreamOn reference).
build_fsdp_grouping_plan_from_config(fsdp_cfg, *, context=None)
Build a list of (module_name_prefix, wrap_output_bool) from a config dict.
Example YAML::
fsdp_grouping:
embed: { path: model.embed_tokens, wrap: false }
layers:
path_template: "model.layers.{i}"
wrap: false
count: ${model.num_hidden_layers}
head: { path: lm_head, wrap: true }
context should include keys referenced by interpolation (e.g. model).