A Unified Framework for Non-Autoregressive Language Models

XLM is a modular, research-friendly framework for developing and comparing non-autoregressive (NAR) language models. Built on PyTorch and PyTorch Lightning, with Hydra for configuration management, XLM makes it easy to experiment with cutting-edge NAR architectures.

Key Features

| Feature | Description |
| --- | --- |
| Modular Design | Plug-and-play components—swap models, losses, predictors, and collators independently |
| Lightning-Powered | Uses PyTorch Lightning for distributed training, mixed precision, and logging out of the box |
| Hydra Configs | Hierarchical configuration with runtime overrides—no code changes needed |
| Multiple Architectures | Multiple model families ready to use as baselines |
| Research-First | Lightweight, type-annotated with jaxtyping, with debug options for quick testing and flexible code injection points for practically limitless customization |
| Hub Integration | Push trained models directly to Hugging Face Hub |

Available Models

| Model | Full Name | Description |
| --- | --- | --- |
| `mlm` | Masked Language Model | Classic BERT-style masked prediction |
| `ilm` | Insertion Language Model | Insertion-based generation |
| `arlm` | Autoregressive LM | Standard left-to-right baseline |
| `mdlm` | Masked Diffusion LM | Discrete diffusion with masking |

Installation

```shell
pip install xlm-core
```

For model implementations, also install:

```shell
pip install xlm-models
```

Quick Start

XLM uses a simple CLI with three main arguments:

```shell
xlm job_type=<JOB> job_name=<NAME> experiment=<CONFIG>
```

| Argument | Description |
| --- | --- |
| `job_type` | One of `prepare_data`, `train`, `eval`, or `generate` |
| `job_name` | A descriptive name for your run |
| `experiment` | Path to your Hydra experiment config |
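Putting the three arguments together, a minimal end-to-end workflow might look like the sketch below. The experiment config name `mlm/base` and the `trainer.max_steps` override key are illustrative assumptions, not part of XLM's documented configs; the `job_type`/`job_name`/`experiment` arguments are as documented above, and the `key=value` override style is standard Hydra syntax.

```shell
# Prepare data for a masked-language-model run
# (the experiment config "mlm/base" is a hypothetical example)
xlm job_type=prepare_data job_name=mlm-data experiment=mlm/base

# Train, overriding a config value from the command line with
# Hydra's dotted-key syntax (the key "trainer.max_steps" is hypothetical)
xlm job_type=train job_name=mlm-baseline experiment=mlm/base trainer.max_steps=10000

# Generate samples with the trained model
xlm job_type=generate job_name=mlm-samples experiment=mlm/base
```

Because overrides are resolved by Hydra at launch time, hyperparameter sweeps like this require no code changes, only different command lines.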

Next Steps

  • Quick Start – Installation, CLI usage, and example workflow
  • API Reference – xlm-core and xlm-models API documentation
  • Contributing – Guidelines for adding new models and features