xlm.tasks.composite_eval
Composite post-hoc evaluator that routes to task-specific evaluators.
Usage in Hydra config::

    post_hoc_evaluator:
      _target_: xlm.tasks.composite_eval.CompositePostHocEvaluator
      evaluators:
        math500_prediction:
          _target_: xlm.tasks.math500.Math500Eval
        denovo_prediction:
          _target_: xlm.tasks.molgen.DeNovoEval
          use_bracket_safe: true
CompositePostHocEvaluator
Routes eval() calls to a task-specific evaluator chosen by
dataloader name.
The evaluators dict maps a pattern (substring) to an evaluator
instance. When eval() is called with a dataloader_name, the first
evaluator whose key is a substring of the name is selected. If nothing
matches, the predictions are returned unchanged with empty metrics.
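
The routing rule described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `RoutingSketch` and `CountingEval` are hypothetical names, and both the `eval(predictions, dataloader_name=..., **kwargs)` signature and the `(predictions, metrics)` return shape are assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple


class CountingEval:
    """Hypothetical task evaluator that reports how many predictions it saw."""

    def __init__(self, tag: str) -> None:
        self.tag = tag

    def eval(self, predictions: List[Any], **kwargs: Any) -> Tuple[List[Any], Dict[str, float]]:
        # dataloader_name arrives in kwargs and is simply ignored here.
        return predictions, {f"{self.tag}/count": float(len(predictions))}


class RoutingSketch:
    """Hypothetical stand-in for the composite evaluator's routing logic."""

    def __init__(self, evaluators: Dict[str, Any]) -> None:
        # Dicts preserve insertion order, so "first match" follows config order.
        self.evaluators = evaluators

    def eval(
        self, predictions: List[Any], dataloader_name: str = "", **kwargs: Any
    ) -> Tuple[List[Any], Dict[str, float]]:
        for pattern, evaluator in self.evaluators.items():
            if pattern in dataloader_name:  # plain substring test
                return evaluator.eval(
                    predictions, dataloader_name=dataloader_name, **kwargs
                )
        # No key matched: pass predictions through with empty metrics.
        return predictions, {}


router = RoutingSketch(
    {
        "math500_prediction": CountingEval("math500"),
        "denovo_prediction": CountingEval("denovo"),
    }
)
matched = router.eval(["a", "b"], dataloader_name="val/math500_prediction")
unmatched = router.eval(["a", "b"], dataloader_name="val/other")
```

Because selection stops at the first matching key, the order of entries matters when one pattern is a substring of another: under this sketch, the more specific pattern should be listed first.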
This is a drop-in replacement for a single evaluator: the existing
Harness.compute_post_hoc_metrics passes dataloader_name through,
and evaluators that don't use it simply ignore the kwarg.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `evaluators` | `Dict[str, Any]` | Mapping from dataloader-name substring to evaluator. Each evaluator must implement | required |