xlm.utils.consolidate_model_checkpoint

Consolidate Lightning FSDP sharded checkpoints to model-only safetensors.

export_model_only_safetensors_from_consolidated_checkpoint(checkpoint, output, *, max_shard_size=None)

Write model-only weights from a consolidated Lightning checkpoint dict.

checkpoint must follow standard Lightning format with a top-level state_dict (e.g. after Lightning's _format_checkpoint on a loaded distributed checkpoint).
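The format requirement above can be illustrated with a minimal sketch. It is not the library's implementation: `extract_model_state_dict` is a hypothetical helper, and plain lists stand in for tensors. It only shows the idea of keeping the top-level state_dict and dropping the training state that a Lightning checkpoint usually carries.

```python
from typing import Any


def extract_model_state_dict(checkpoint: dict[str, Any]) -> dict[str, Any]:
    """Return only the model weights from a Lightning-style checkpoint dict.

    Hypothetical helper for illustration; not part of xlm.
    """
    if "state_dict" not in checkpoint:
        raise KeyError("expected a top-level 'state_dict' key in the checkpoint")
    # Everything else (optimizer_states, lr_schedulers, epoch, ...) is left out.
    return dict(checkpoint["state_dict"])


# Stand-in values: a real checkpoint maps parameter names to torch tensors.
example_checkpoint = {
    "state_dict": {"model.linear.weight": [0.1, 0.2]},
    "optimizer_states": [{"step": 10}],
    "epoch": 3,
}
weights = extract_model_state_dict(example_checkpoint)
```

A mapping without state_dict fails early with a KeyError instead of producing an empty export.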

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| checkpoint | dict[str, Any] | Loaded consolidated checkpoint mapping. | required |
| output | Path | Destination file (single-file mode) or directory (when max_shard_size is set). | required |
| max_shard_size | Union[str, int, None] | If set (e.g. "5GB", or an integer number of bytes such as 128, following the HF convention), write model.safetensors.index.json and shards under output. | None |

Returns:

| Type | Description |
| --- | --- |
| Path | Path to model.safetensors or to model.safetensors.index.json. |
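To make the max_shard_size values concrete, here is a hedged sketch of how a size such as "5GB" or a plain integer could be turned into a byte count. `shard_size_to_bytes` is a hypothetical helper, not the function this module calls. It assumes the convention commonly used in the Hugging Face ecosystem, where "GB" is decimal (10**9) and "GiB" is binary (2**30).

```python
import re
from typing import Union

# Assumed unit table: decimal units for KB/MB/GB, binary units for KiB/MiB/GiB.
_UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "KIB": 2**10,
    "MIB": 2**20,
    "GIB": 2**30,
}


def shard_size_to_bytes(size: Union[str, int]) -> int:
    """Convert a shard size like "5GB" (or an int of bytes) to bytes."""
    if isinstance(size, int):
        # Integers are taken to be a byte count already.
        return size
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", size)
    if match is None:
        raise ValueError(f"cannot parse shard size: {size!r}")
    number, unit = match.groups()
    unit = unit.upper()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in shard size: {size!r}")
    return int(float(number) * _UNITS[unit])


five_gb = shard_size_to_bytes("5GB")
raw_bytes = shard_size_to_bytes(128)
```

Under these assumptions an integer such as 128 is a very small limit, useful mainly for forcing a sharded layout in tests.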

write_model_only_hub_artifacts(cfg, out_dir)

Write config.json and full_config.yaml for a PyTorchModelHubMixin-style upload.

Mirrors the config serialization of xlm.harness.Harness._save_pretrained (not weights).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cfg | Any | Hydra DictConfig with at least a model subtree. | required |
| out_dir | Path | Directory to write into (created if missing). | required |

Returns:

| Type | Description |
| --- | --- |
| Path | Path to the written config.json. |
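The directory handling and return value described above can be sketched with the standard library alone. This is an illustration under loud assumptions: `write_config_json_sketch` is a hypothetical name, a plain dict stands in for the Hydra DictConfig, and the sketch assumes config.json holds the model subtree, which the documentation does not state. Writing full_config.yaml is left out because it would need a YAML serializer.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def write_config_json_sketch(cfg: dict[str, Any], out_dir: Path) -> Path:
    """Write cfg["model"] to out_dir/config.json and return that path.

    Assumption: config.json contains the model subtree; the real function
    may serialize different fields.
    """
    if "model" not in cfg:
        raise KeyError("cfg must contain a 'model' subtree")
    # Create the target directory if it is missing.
    out_dir.mkdir(parents=True, exist_ok=True)
    config_path = out_dir / "config.json"
    config_path.write_text(
        json.dumps(cfg["model"], indent=2, sort_keys=True), encoding="utf-8"
    )
    return config_path


with tempfile.TemporaryDirectory() as tmp:
    written = write_config_json_sketch(
        {"model": {"hidden_size": 64}}, Path(tmp) / "export"
    )
    written_name = written.name
    loaded = json.loads(written.read_text(encoding="utf-8"))
```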

push_model_only_folder_to_hub(folder, *, repo_id, commit_message, branch=None, private=False, create_pr=False, token=None, allow_patterns=None)

Upload a folder of model-only artifacts (safetensors + configs) to the Hub.

Uses huggingface_hub.HfApi.create_repo, optional branch creation, and huggingface_hub.HfApi.upload_folder. Does not instantiate an xlm.harness.Harness.
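The call sequence described above might look roughly like the sketch below. It is not the library's code: `push_folder_sketch` and `RecordingApi` are hypothetical names. The api object is passed in, so the flow can be dry-run without network access. With a real `huggingface_hub.HfApi` instance the same keyword arguments should apply, and using `create_branch` for the optional branch step is an assumption.

```python
from pathlib import Path
from typing import Any, Optional


def push_folder_sketch(
    api: Any,
    folder: Path,
    *,
    repo_id: str,
    commit_message: str,
    branch: Optional[str] = None,
    private: bool = False,
    create_pr: bool = False,
    allow_patterns: Optional[list[str]] = None,
) -> Any:
    """Create the repo, optionally a branch, then upload the folder."""
    api.create_repo(repo_id=repo_id, private=private, exist_ok=True)
    if branch is not None:
        # Assumed: the optional branch is created before the upload targets it.
        api.create_branch(repo_id=repo_id, branch=branch, exist_ok=True)
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=str(folder),
        commit_message=commit_message,
        revision=branch,
        create_pr=create_pr,
        allow_patterns=allow_patterns,
    )


class RecordingApi:
    """Stand-in for HfApi that records call order instead of uploading."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def create_repo(self, **kwargs: Any) -> None:
        self.calls.append("create_repo")

    def create_branch(self, **kwargs: Any) -> None:
        self.calls.append("create_branch")

    def upload_folder(self, **kwargs: Any) -> dict[str, Any]:
        self.calls.append("upload_folder")
        return kwargs


fake_api = RecordingApi()
result = push_folder_sketch(
    fake_api,
    Path("export"),
    repo_id="user/model",  # placeholder repo id
    commit_message="Add model-only weights",
    branch="dev",
    allow_patterns=["*.safetensors", "*.json", "*.yaml"],
)
```

Passing the client in keeps the sketch testable. In real use the token would typically be given when constructing the HfApi client.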

consolidate_model_checkpoint(sharded_checkpoint_dir, output, *, max_shard_size=None)

Consolidate a Lightning FSDP sharded directory to model-only safetensors.

Requires PyTorch >= 2.3 (Lightning uses torch.distributed.checkpoint).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| sharded_checkpoint_dir | str \| Path | Folder with *.distcp shards and meta.pt. | required |
| output | str \| Path | Target .safetensors path (single-file) or directory (sharded export). | required |
| max_shard_size | Union[str, int, None] | Optional HF shard size (e.g. "5GB") for multi-file layout. | None |

Returns:

| Type | Description |
| --- | --- |
| Path | Path suitable for model_only_checkpoint_path (weights file or index JSON). |
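Since the input must be a folder with *.distcp shards and meta.pt, a small pre-flight check can catch a wrong path before any weights are loaded. The sketch below uses only the standard library. `looks_like_sharded_checkpoint` is a hypothetical helper, and the shard filename in the demo is only a placeholder.

```python
import tempfile
from pathlib import Path


def looks_like_sharded_checkpoint(path: Path) -> bool:
    """True if path is a directory holding meta.pt and at least one *.distcp file."""
    return (
        path.is_dir()
        and (path / "meta.pt").is_file()
        and any(path.glob("*.distcp"))
    )


with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp) / "last.ckpt"
    ckpt_dir.mkdir()
    empty_ok = looks_like_sharded_checkpoint(ckpt_dir)  # nothing inside yet
    (ckpt_dir / "meta.pt").touch()
    (ckpt_dir / "__0_0.distcp").touch()  # placeholder shard name
    full_ok = looks_like_sharded_checkpoint(ckpt_dir)
```

A guard like this could run before calling consolidate_model_checkpoint, so a mistyped directory fails with a clear message instead of deep inside checkpoint loading.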