Skip to content

xlm.utils.fsdp_grouping

Build FSDP grouping plans from structured Hydra / dict config.

make_layer_wrap_policy(*class_paths)

Resolve dotted class paths into a set of layer classes for Lightning FSDPStrategy.

Use in Hydra YAML::

auto_wrap_policy:
  _target_: xlm.utils.fsdp_grouping.make_layer_wrap_policy
  _args_:
    - xlm.backbones.dream.modeling_dream.DreamDecoderLayer

This matches Lightning's recommended pattern of passing {nn.TransformerDecoderLayer} style sets to auto_wrap_policy / activation_checkpointing_policy.

fsdp_bf16_mixed_precision()

Default FSDP mixed precision: bf16 params, fp32 reductions (matches DreamOn reference).

build_fsdp_grouping_plan_from_config(fsdp_cfg, *, context=None)

Build a list of (module_name_prefix, wrap_output_bool) from a config dict.

Example YAML::

fsdp_grouping:
  embed: { path: model.embed_tokens, wrap: false }
  layers:
    path_template: "model.layers.{i}"
    wrap: false
    count: ${model.num_hidden_layers}
  head: { path: lm_head, wrap: true }

context should include keys referenced by interpolation (e.g. model).