Make adaptive_levels YAML-usable and reaction-consistent #897
Pull request overview
This PR updates ARC’s adaptive_levels feature to (1) accept a YAML-friendly list-of-entries schema (and correctly round-trip through restart serialization) and (2) enforce reaction-consistent adaptive level selection so barriers aren’t computed using mixed levels of theory across TS/wells.
Changes:
- Replace the legacy tuple-keyed adaptive_levels input format with a YAML-usable list-of-entries schema and update restart serialization accordingly.
- Add reaction-wide adaptive level selection via adaptive_lot_n_heavy, with a new per-species thermo_at_own_level flag controlling whether thermo is computed at the species’ own granular level.
- Add/extend unit tests covering restart round-trips and reaction-wide adaptive-level behavior.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| docs/source/advanced.rst | Updates user docs for the new YAML schema and reaction-consistent behavior. |
| arc/species/species.py | Adds thermo_at_own_level and adaptive_lot_n_heavy fields with restart serialization support. |
| arc/species/species_test.py | Adds a unit test verifying the new species fields round-trip through as_dict/from_dict. |
| arc/scheduler.py | Applies reaction-wide adaptive-level logic and uses adaptive_lot_n_heavy in adaptive LOT selection. |
| arc/scheduler_test.py | Adds tests validating reaction-wide consistency, copy creation, and heavy-atom override behavior. |
| arc/main.py | Implements new adaptive_levels list-of-entries parser and updates restart serialization to emit the list form. |
| arc/main_test.py | Updates tests for the new schema, restart round-trip, and legacy-form rejection. |
| for each such participant whose ``thermo_at_own_level`` is ``True`` (the default), an autonomous relabeled | ||
| copy of the species is created from the outset and used by the reaction (evaluated at the reaction-wide | ||
| level), while the original species is left to compute its own thermochemistry at its own granular level. If | ||
| ``thermo_at_own_level`` is ``False``, the participant itself is evaluated at the reaction-wide level (no copy). |
| if not spc.thermo_at_own_level: | ||
| spc.adaptive_lot_n_heavy = reaction_n_heavy | ||
| continue |
| if not isinstance(atom_range, (list, tuple)) or len(atom_range) != 2 \ | ||
| or not all(isinstance(a, int) or a in ('inf', float('inf')) for a in atom_range): | ||
| raise InputError(f'The "atom_range" of each adaptive levels entry must be a 2-length list of integers ' | ||
| f'with an optional "inf" upper bound, got {atom_range} in:\n{adaptive_levels}') | ||
| atom_range = (atom_range[0], 'inf' if atom_range[1] in ('inf', float('inf')) else atom_range[1]) |
| By default (the per-species ``thermo_at_own_level`` flag, ``False``) a well that lands on a | ||
| coarser grain than its reaction is evaluated directly at the reaction-wide level, and its | ||
| thermochemistry uses that same (coarser) level. Set ``thermo_at_own_level=True`` on a species |
d4d8938 to 2ff4acd
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #897 +/- ##
==========================================
- Coverage 63.05% 63.05% -0.01%
==========================================
Files 114 114
Lines 38069 38117 +48
Branches 9967 9978 +11
==========================================
+ Hits 24005 24035 +30
- Misses 11175 11181 +6
- Partials 2889 2901 +12
Tuple keys (1, 6)/(7, 'inf') can't be produced by yaml.safe_load, so adaptive_levels was unusable from an input file. Accept a list of {atom_range, levels} entries, build the tuple-keyed structure internally, and serialize the same form on restart.
Levels were chosen per species from its own heavy-atom count, so a reaction's large TS and its small wells could fall on different grains, mixing levels of theory across the barrier. Key the whole reaction by its (conserved) heavy-atom count, and for any well on a coarser grain make an autonomous relabeled copy that the reaction uses at the reaction-wide level. The new per-species thermo_at_own_level flag (default True) keeps each species' own granular level for thermo; set it False to evaluate the species at the reaction-wide level with no copy.
Reaction wells take the reaction-wide level directly by default (no relabeled copies, no duplicated jobs); set thermo_at_own_level to True per species to opt into its own granular level for thermo.
Lilachn91
left a comment
Thanks, I left a few comments (some are just minor wording comments).
| (defined separately e.g., via ``opt_level``, or using ARC's defaults. | ||
| To do so, pass the ``adaptive_levels`` attribute, which is a list of entries. | ||
| Each entry is a dictionary with an ``atom_range`` 2-list giving the heavy | ||
| (non-hydrogen) atom count range (the upper bound may be ``inf``), and a ``levels`` |
For clarity, I would rephrase to:
Each entry is a dictionary with two keys: atom_range, a two-element list [min, max] giving the species' size range as a count of heavy (non‑hydrogen) atoms (the upper bound of the last range must be inf); and levels, a mapping from job type(s) to the level of theory (a string or dictionary) to use for that size range.
| (coarser) level, and its thermochemistry uses that same level. Set ``thermo_at_own_level=True`` on a species | ||
| to instead compute its thermochemistry at its own size-appropriate (granular) adaptive level: | ||
| an autonomous, relabeled copy of the species is then created and used by the reaction (at the | ||
| reaction-wide level), while the original keeps its own level for thermochemistry. |
Also for clarity, I'd rephrase, especially the last part:
By default (the per-species thermo_at_own_level flag is False), a well whose own size falls on a finer grain than its reaction is evaluated directly at the reaction-wide (coarser) level, and its thermochemistry uses that same level.
To compute its thermochemistry at its own size-appropriate (granular) adaptive level, set thermo_at_own_level=True on a species. ARC will then create an autonomous, relabeled copy of the species to be computed at the reaction-wide level and used by the reaction, while the original species will keep its own level for thermochemistry.
| if i and atom_ranges[i-1][1] + 1 != atom_ranges[i][0]: | ||
| raise InputError(f'Atom ranges of adaptive levels must be consecutive. ' | ||
| f'Got:\n{list(adaptive_levels.keys())}') | ||
| raise InputError(f'Atom ranges of adaptive levels must be consecutive. Got:\n{atom_ranges}') |
What happens if the first range doesn't start at 0 or 1, or at the lowest heavy-atom count? Is it allowed (as opposed to not finishing with inf)?
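Read in isolation, the consecutiveness condition quoted above only compares each range's lower bound with the previous range's upper bound, so it says nothing about where the first range starts. The sketch below is a standalone reproduction of that one condition; the function names `check_consecutive` and `find_range` are hypothetical, and ARC may well validate the first bound elsewhere in code not shown in this thread. It illustrates how a first range starting at 3 would pass this check while leaving smaller species without a matching grain.

```python
def check_consecutive(atom_ranges):
    """Raise if any range does not start right after the previous one ends."""
    for i in range(len(atom_ranges)):
        # Same condition as in the quoted diff; the first range (i == 0) is never compared.
        if i and atom_ranges[i - 1][1] + 1 != atom_ranges[i][0]:
            raise ValueError(f'Atom ranges of adaptive levels must be consecutive. Got:\n{atom_ranges}')


def find_range(n_heavy, atom_ranges):
    """Return the range containing n_heavy, or None if no range covers it."""
    for low, high in atom_ranges:
        if n_heavy >= low and (high == 'inf' or n_heavy <= high):
            return (low, high)
    return None


ranges = [(3, 6), (7, 'inf')]
check_consecutive(ranges)            # passes: 6 + 1 == 7
uncovered = find_range(2, ranges)    # no range contains 2 heavy atoms
covered = find_range(10, ranges)     # falls in the open-ended last range
```

If a gap below the first range is not intended, one option would be an explicit check on the first lower bound alongside the existing one; whether that belongs in ARC is a design decision for the PR author.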
| f'the reaction internally consistent.') | ||
| copy_label = check_label(f'{label}_TS{rxn_index}')[0] | ||
| if copy_label not in self.species_dict: | ||
| copy_spc = spc.copy() |
The copied reaction participant should probably have thermo_at_own_level reset to False. Since spc.copy() preserves the original setting, a thermo_at_own_level=True source creates a copy that still opts into copy behavior. On restart, _apply_adaptive_reaction_levels() can see the copy as needing its own copy and relabel, duplicating species across restarts.
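A minimal sketch of the suggested reset, using a toy stand-in rather than ARC's real species class. `ToySpecies` and `make_reaction_copy` are hypothetical names, and the real code builds the label through `check_label`, which is skipped here. Only the two field names, `thermo_at_own_level` and `adaptive_lot_n_heavy`, come from the PR.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySpecies:
    """Toy stand-in holding only the fields relevant to this comment."""
    label: str
    thermo_at_own_level: bool = False
    adaptive_lot_n_heavy: Optional[int] = None


def make_reaction_copy(spc, rxn_index, reaction_n_heavy):
    """Create the relabeled copy that the reaction uses at the reaction-wide level."""
    copy_spc = copy.copy(spc)
    copy_spc.label = f'{spc.label}_TS{rxn_index}'
    copy_spc.adaptive_lot_n_heavy = reaction_n_heavy
    # Reset the flag so that, on restart, the copy is not treated as needing its own copy.
    copy_spc.thermo_at_own_level = False
    return copy_spc


original = ToySpecies(label='CH3OH', thermo_at_own_level=True)
reaction_copy = make_reaction_copy(original, rxn_index=0, reaction_n_heavy=8)
```

With the reset in place, the original keeps `thermo_at_own_level=True` while the copy carries `False`, so a second pass over the species would leave the copy alone.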
| logger.warning(f'Species {label} participates in reactions on different adaptive level ' | ||
| f'grains, creating a dedicated copy of it for reaction {rxn.label} to keep ' | ||
| f'the reaction internally consistent.') | ||
| copy_label = check_label(f'{label}_TS{rxn_index}')[0] |
This should guard against label collisions. If {label}_TS{rxn_index} already exists but is not the copy created for this reaction, the reaction will silently point to an unrelated species. Is there a way to validate that an existing label is the intended adaptive-level copy?
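One possible way to make such a check concrete is to record on each copy which species it was made from and to compare that record whenever the label already exists. The sketch below uses plain dictionaries in place of ARC species objects, and the `adaptive_copy_of` key and the `get_or_create_copy` helper are hypothetical names invented for illustration, not part of the PR.

```python
def get_or_create_copy(species_dict, source_label, rxn_index):
    """Return the copy's label, creating the copy only if the label is free."""
    copy_label = f'{source_label}_TS{rxn_index}'
    existing = species_dict.get(copy_label)
    if existing is None:
        # Record the provenance so that a later lookup can be validated.
        species_dict[copy_label] = {'label': copy_label, 'adaptive_copy_of': source_label}
        return copy_label
    if existing.get('adaptive_copy_of') != source_label:
        # The label is taken by a species that is not the copy made from this source.
        raise ValueError(f'Label {copy_label} already exists and is not an '
                         f'adaptive-level copy of {source_label}.')
    return copy_label


species_dict = {}
first = get_or_create_copy(species_dict, 'CH3OH', 0)    # creates the copy
second = get_or_create_copy(species_dict, 'CH3OH', 0)   # reuses the same copy
species_dict['C2H6_TS1'] = {'label': 'C2H6_TS1'}        # unrelated, user-defined species
```

A provenance field like this would also need to survive restart serialization to be useful, which is beyond what this sketch covers.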
Two fixes to adaptive_levels (adaptive level-of-theory), from the roadmap.
1. YAML-usable schema
adaptive_levels required Python tuple keys ((1,6), (7,'inf')) that yaml.safe_load can't produce, so it was unusable from an input file. It now takes a list of entries, and round-trips through restart:
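A minimal sketch of how a list of entries might be converted to the tuple-keyed structure and back, written as standalone Python rather than ARC's actual parser. The function names `parse_adaptive_levels` and `serialize_adaptive_levels` are hypothetical, the level strings are placeholders, and the exact shape of the `levels` mapping is an assumption. The entries are given as the Python objects a YAML loader would produce.

```python
import math


def parse_adaptive_levels(entries):
    """Convert a list of {atom_range, levels} entries into a tuple-keyed dict."""
    if not isinstance(entries, list):
        raise ValueError(f'adaptive_levels must be a list of entries, got {type(entries).__name__}')
    parsed = {}
    for entry in entries:
        atom_range, levels = entry['atom_range'], entry['levels']
        if not isinstance(atom_range, (list, tuple)) or len(atom_range) != 2:
            raise ValueError(f'"atom_range" must be a 2-length list, got {atom_range}')
        low, high = atom_range
        # An unbounded upper limit may arrive as the string 'inf' or as a float infinity.
        if high in ('inf', math.inf):
            high = 'inf'
        if not isinstance(low, int) or not (isinstance(high, int) or high == 'inf'):
            raise ValueError(f'"atom_range" bounds must be integers (or "inf" on top), got {atom_range}')
        parsed[(low, high)] = levels
    return parsed


def serialize_adaptive_levels(parsed):
    """Emit the list-of-entries form again, e.g. for a restart file."""
    return [{'atom_range': list(key), 'levels': levels} for key, levels in parsed.items()]


entries = [
    {'atom_range': [1, 6], 'levels': {'opt': 'placeholder-level-a'}},
    {'atom_range': [7, 'inf'], 'levels': {'opt': 'placeholder-level-b'}},
]
parsed = parse_adaptive_levels(entries)
round_tripped = serialize_adaptive_levels(parsed)
```

Because the serialized form contains only lists, strings, and integers, it can be written by a safe YAML dumper and read back without custom tags.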
2. Reaction-consistent levels
Levels were picked per species from its own heavy-atom count, so a reaction's
large TS and its small wells could land on different grains, mixing levels of
theory across the barrier. Each reaction is now keyed by its (conserved)
heavy-atom count, and all of its species are evaluated at that one level.
New per-species thermo_at_own_level flag (default False): set True to
compute a species' thermo at its own granular level — ARC then makes an
autonomous relabeled copy for the reaction so the barrier stays consistent.
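The difference between per-species and reaction-wide selection can be sketched as follows. This is an illustration under stated assumptions, not ARC's scheduler code: `select_levels` is a hypothetical helper, the level names are placeholders, and the example assumes a bimolecular reaction whose heavy-atom count is the sum of its two reactant wells.

```python
def select_levels(n_heavy, adaptive_levels):
    """Return the levels of the range that contains n_heavy."""
    for (low, high), levels in adaptive_levels.items():
        if n_heavy >= low and (high == 'inf' or n_heavy <= high):
            return levels
    raise ValueError(f'No adaptive level covers {n_heavy} heavy atoms.')


adaptive_levels = {
    (1, 6): {'sp': 'level-for-small'},
    (7, 'inf'): {'sp': 'level-for-large'},
}

# Two reactant wells; the TS contains all heavy atoms of both.
well_n_heavy = {'A': 3, 'B': 5}
reaction_n_heavy = sum(well_n_heavy.values())

# Old behavior: each species is keyed by its own heavy-atom count.
per_species = {label: select_levels(n, adaptive_levels)['sp'] for label, n in well_n_heavy.items()}

# New behavior: every participant is keyed by the reaction's heavy-atom count.
ts_level = select_levels(reaction_n_heavy, adaptive_levels)['sp']
reaction_wide = {label: ts_level for label in well_n_heavy}
```

In the per-species case both wells land on the small-species grain while the TS lands on the large-species grain, which is the mixed-level barrier described above. Keying everything by `reaction_n_heavy` puts wells and TS on the same grain.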