CREST Adapter #807

Open

calvinp0 wants to merge 91 commits into main from crest_adapter

Conversation

@calvinp0 calvinp0 (Member) commented Nov 27, 2025

Adds a CREST adapter that complements the heuristics adapter.

This pull request adds support for the CREST conformer and transition state (TS) search method to the ARC project, along with several related improvements and code cleanups. The most important changes include integrating CREST as a TS search adapter, updating configuration and constants, and enhancing the heuristics TS search logic for better provenance tracking and code clarity.

CREST Integration:

  • Added CREST as a supported TS search method: updated JobEnum (arc/job/adapter.py), included CREST in the list of adapters and the RMG family mapping, and registered it as a default incore adapter (arc/job/adapters/common.py, arc/job/adapters/ts/__init__.py).
  • Implemented a new test suite for CREST input generation (arc/job/adapters/ts/crest_test.py).
  • Added a Makefile target and installation script for CREST (Makefile).

Constants and Configuration:

  • Added the angstrom_to_bohr conversion constant to both the Cython and Python constants modules (arc/constants.pxd, arc/constants.py).

Heuristics TS Search Enhancements and Refactoring:

  • Refactored the heuristics TS search logic to track and combine method provenance for TS guesses, allowing more precise attribution when multiple methods contribute to a guess (arc/job/adapters/ts/heuristics.py).
  • Improved code readability and maintainability by reformatting imports and function calls, and clarifying data structures and comments in the heuristics TS search (arc/job/adapters/ts/heuristics.py).

@calvinp0 calvinp0 force-pushed the crest_adapter branch 2 times, most recently from 3e88c36 to 7674f5f on February 2, 2026 at 09:49
@calvinp0 calvinp0 requested a review from Copilot February 2, 2026 12:10

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@calvinp0 calvinp0 force-pushed the crest_adapter branch 2 times, most recently from 9be935e to eab7647 on February 4, 2026 at 20:14
@calvinp0 calvinp0 requested a review from alongd February 4, 2026 21:30
calvinp0 added 13 commits April 12, 2026 16:17
Pipe job IDs submitted during the scheduler label loop weren't in the server_job_ids list (which is fetched earlier in the same iteration), causing poll_pipes to conclude the scheduler job was gone and mark all tasks as FAILED TERMINAL.

Append the job ID to server_job_ids on both initial submission and resubmission so that _is_scheduler_job_alive sees it immediately.
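The fix above can be sketched minimally; submit_job, run_label_loop, and is_scheduler_job_alive are illustrative stand-ins, not ARC's actual API:

```python
def submit_job(counter=[0]):
    """Stand-in for a scheduler submission call; returns a new job ID."""
    counter[0] += 1
    return f'job-{counter[0]}'

def run_label_loop(labels):
    server_job_ids = []  # fetched once per iteration in the real code
    for _ in labels:
        job_id = submit_job()
        # The fix: append immediately on submission (and on resubmission),
        # so a liveness check between fetches does not miss fresh submissions.
        server_job_ids.append(job_id)
    return server_job_ids

def is_scheduler_job_alive(job_id, server_job_ids):
    return job_id in server_job_ids
```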
delete_all_species_jobs blanket-set all output job_types to False,
including rotors and bde which are initialised to True by
initialize_output_dict.  For species with no torsional modes (e.g.
cyclic TS from THF), no scan jobs are ever spawned, so rotors stays
False and check_all_done incorrectly marks the TS as unconverged —
even when opt, freq, sp, and IRC all passed.

Additionally, switch_ts did not reset rotors_dict, so
determine_rotors was never re-called for the new TS geometry and
stale scan results from the previous guess carried over.

Changes:
- Preserve the True default for rotors/bde in delete_all_species_jobs,
  matching initialize_output_dict.
- Reset rotors_dict and number_of_rotors in switch_ts when
  job_types['rotors'] is enabled, so the new geometry gets fresh
  rotor detection.
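The default-preserving reset can be sketched as follows; the key set and helper name are illustrative, not ARC's actual output-dict schema:

```python
# rotors/bde start True in initialize_output_dict; a blanket reset to False
# is what incorrectly marked torsion-free TSs as unconverged.
JOB_TYPE_DEFAULTS = {'opt': False, 'freq': False, 'sp': False,
                     'rotors': True, 'bde': True}

def reset_output_job_types(output):
    """Reset job_types to their initialized defaults rather than blanket False,
    so species with no torsions keep rotors=True."""
    for key in output['job_types']:
        output['job_types'][key] = JOB_TYPE_DEFAULTS.get(key, False)
    return output
```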
Previously self.level.cabs was stored on Level but never templated,
so F12 jobs silently ran with DimCABS = 0. Add a ${cabs} slot on the
! line after ${auxiliary_basis} and populate it from self.level.cabs.
Moved the separator space out of the template and into the cabs value
itself so the non-F12 input file (the 99% case) gets the exact same
output as before. Existing orca_test snapshots pass unchanged.
Previously, sp_level configs like DLPNO-CCSD(T)-F12 with no 'cabs:'
field would reach ORCA silently and run with DimCABS = 0, returning
non-F12 energies (and in some cases crashing ARC downstream with no
clear cause). Fail fast at input-file generation with a descriptive
error pointing at the missing 'cabs:' key.
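The two commits above (the ${cabs} template slot and the fail-fast check) can be sketched together; the template text and function name are illustrative, not ARC's actual ORCA input template:

```python
from string import Template

# A ${cabs} slot after ${auxiliary_basis}, with the separator space folded
# into the cabs value so a non-F12 input line is byte-identical to before.
ORCA_LINE = Template('! ${method} ${basis} ${auxiliary_basis}${cabs}')

def render_orca_line(method, basis, auxiliary_basis, cabs=None, is_f12=False):
    if is_f12 and not cabs:
        # Fail fast at input generation instead of letting ORCA run DimCABS=0.
        raise ValueError("F12 method requires a 'cabs:' entry in the sp_level.")
    cabs_value = f' {cabs}' if cabs else ''
    return ORCA_LINE.substitute(method=method, basis=basis,
                                auxiliary_basis=auxiliary_basis, cabs=cabs_value)
```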
When a monoatomic species hits the DLPNO-incompatibility branch in
run_sp, the level was rebuilt via Level(method=..., basis=..., software=
..., args=...), silently dropping auxiliary_basis, cabs, dispersion,
and all solvation_* fields. For an F12 sp_level on an atom this zeroed
out CABS and aux, making ORCA run with DimCABS = 0.

Rebuild from as_dict() and only override the method so the rest of
the Level spec survives.
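The as_dict() round-trip can be sketched as follows; MiniLevel is a stand-in for ARC's Level class with only the fields mentioned above:

```python
class MiniLevel:
    def __init__(self, method, basis=None, auxiliary_basis=None, cabs=None,
                 dispersion=None, software=None, args=None):
        self.method, self.basis = method, basis
        self.auxiliary_basis, self.cabs = auxiliary_basis, cabs
        self.dispersion, self.software, self.args = dispersion, software, args

    def as_dict(self):
        return dict(self.__dict__)

def rebuild_with_method(level, new_method):
    """Rebuild from the full spec and override only the method, so
    auxiliary_basis, cabs, dispersion, etc. all survive."""
    spec = level.as_dict()
    spec['method'] = new_method
    return MiniLevel(**spec)
```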
ORCA 5.x rejects non-RI F12-CC methods on open-shell species with
'F12-CC method not implemented for UHF references'. ORCA 6.x may
handle it natively, so a per-run global override would be wrong.

Add a reactive trsh path: detect the specific ORCA warning in
determine_ess_status (keyword 'F12_UHF'), and in trsh_ess_job rebuild
the level with '/RI' appended to the method (preserving aux, cabs,
solvation, etc. via as_dict). Scoped to the failing job only.
Previous logic blanket-stripped 'dlpno-' from the method for every
monoatomic species, but DLPNO works fine in ORCA for atoms with more
than one electron (e.g. [O], [N], [C]). Only single-electron species
lack pair correlations and need the HF fallback.

Also narrow the trsh guard at trsh.py:1027 to match (is_h instead of
is_monoatomic or is_h), so non-H monoatomics keep their DLPNO method
through the retry path.
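The narrowed guard boils down to an electron count; a minimal sketch with an illustrative lookup table (not ARC's species model):

```python
# Only single-electron species lack pair correlations, so only they need
# the HF fallback; other monoatomics ([O], [N], [C]) keep DLPNO.
ELECTRONS = {'H': 1, 'He': 2, 'C': 6, 'N': 7, 'O': 8}

def needs_hf_fallback(symbol, charge=0):
    return ELECTRONS[symbol] - charge == 1
```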
The old 'read user value OR default, then add_to_args' pattern always
appended to args['keyword']['general'], even when the user already had
the value under its named key. For a user who set dlpno_threshold:
TightPNO, 'tightpno' was emitted twice — once from their key and once
from 'general' — and ORCA rejected the input with UNRECOGNIZED OR
DUPLICATED KEYWORD.

Use setdefault() instead, so the default only lands when the user
didn't supply the key. The value rides through update_input_dict_with_
args on its canonical key, not under 'general', so each keyword is
emitted exactly once.
The DFT branch unconditionally appended defgrid2/defgrid3 via
add_to_args, so a user who set their own defgrid keyword for a DFT
sp_level got it twice in the ! line. Apply the same setdefault pattern
we just used for scf_convergence and dlpno_threshold under the key
'dft_grid', so the default only lands when the user hasn't already
supplied one.
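The setdefault() pattern used by the two commits above can be sketched as follows; the surrounding args structure is illustrative:

```python
def apply_keyword_defaults(args):
    """Defaults land only when the user has not supplied the key, so each
    keyword rides through on its canonical key and is emitted exactly once."""
    keyword = args.setdefault('keyword', {})
    keyword.setdefault('dlpno_threshold', 'NormalPNO')
    keyword.setdefault('dft_grid', 'defgrid2')
    return args
```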
global _diagnostics_logged
if _diagnostics_logged:
    return
_diagnostics_logged = True

Check notice: Code scanning / CodeQL

Unused global variable (Note): The global variable '_diagnostics_logged' is not used.

Copilot Autofix

AI 2 days ago

Use function-local persistent state instead of a module-level global flag.

Best fix (without changing behavior):

  • Remove _diagnostics_logged = False at module scope.
  • In _log_node_diagnostics(), replace global _diagnostics_logged and related checks with a function attribute (e.g., _log_node_diagnostics._logged), initialized lazily via getattr(..., False).
  • This preserves “log once per worker process” behavior and eliminates the flagged global variable.

Edits required in arc/scripts/pipe_worker.py:

  • Delete the module-level _diagnostics_logged definition.
  • Update lines inside _log_node_diagnostics() where the global is read/written.

No new imports or dependencies are needed.

Suggested changeset 1
arc/scripts/pipe_worker.py

Autofix patch

Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/arc/scripts/pipe_worker.py b/arc/scripts/pipe_worker.py
--- a/arc/scripts/pipe_worker.py
+++ b/arc/scripts/pipe_worker.py
@@ -43,9 +43,7 @@
 
 logger = logging.getLogger('pipe_worker')
 
-_diagnostics_logged = False
 
-
 def _log_node_diagnostics() -> None:
     """Dump hostname, PBS/Slurm array context, PATH, PYTHONPATH, TMPDIR, and
     group membership on first task failure. A dead compute node otherwise
@@ -54,10 +51,9 @@
     many sites. Logging once per worker process keeps the volume bounded
     when one bad node drains many tasks.
     """
-    global _diagnostics_logged
-    if _diagnostics_logged:
+    if getattr(_log_node_diagnostics, '_logged', False):
         return
-    _diagnostics_logged = True
+    _log_node_diagnostics._logged = True
     import socket
     import subprocess
     try:
EOF
When an ORCA job fails due to insufficient memory and hits a total memory limit, ARC now attempts to resolve the issue by reducing the number of CPU cores to increase the memory available per core.

- Added logic to calculate the maximum feasible number of CPU cores that fits within the total memory cap while meeting ORCA's per-core requirements.
- Ensures at least one core is used if viable, rather than failing the troubleshooting step.
- Prevents recalculating/inflating total memory when it is already constrained by a cap.
- Adjusted the conservative memory buffer added during total memory estimation from 5 GB to 3 GB.

- Make the general memory error check in the scheduler case-insensitive and restricted to cases where 'memory' is the sole error keyword.
- Add unit tests for ORCA memory troubleshooting to verify that CPU cores are reduced when the total memory limit is hit, while total memory is increased when no cap is present.
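The core-reduction calculation can be sketched as follows; the function name and units are illustrative, not ARC's exact trsh implementation:

```python
import math

def max_cores_under_cap(total_mem_gb_cap, per_core_gb_required, requested_cores):
    """Largest core count whose per-core memory requirement still fits under
    the total cap; at least 1 if any fits, else None (troubleshooting fails)."""
    feasible = math.floor(total_mem_gb_cap / per_core_gb_required)
    if feasible < 1:
        return None  # even a single core cannot fit under the cap
    return min(requested_cores, feasible)
```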

…ction

When a pipe task fails with an Electronic Structure System (ESS) error, the coordinator now attempts to read the `parser_summary` from the worker's result file. If found, the task is ejected to the scheduler via `troubleshoot_ess` rather than a blind resubmission. This allows the scheduler to apply intelligent troubleshooting logic—such as adjusting memory or CPU cores—based on the specific failure mode encountered during the pipe run.

Additionally, resource requirements (CPU and memory) are now explicitly passed when ejecting tasks to ensure consistency between the pipe environment and the scheduler.
- Refresh scheduler-side job logs before parsing ESS status to detect server-reported out-of-memory (OOM) errors even when output files are absent or incomplete.
- Tag jobs whose requested memory is clipped to a node's capacity with a `max_total_job_memory` keyword.
- Ensure the capped memory marker is preserved during status updates to allow the troubleshooter to distinguish between hitting a node limit versus a simple insufficient memory request.
- Enable status determination from additional job info (scheduler logs) when the primary output file cannot be found.
Transition state connectivity and bond orders are not uniquely defined
(partial forming/breaking bonds at the saddle point), so enforcing
isomorphism against a stored 2D adjacency list rejects valid TS
geometries.

- species.ARCSpecies.mol_from_xyz: short-circuit to accept the perceived
  molecule for TSs without running check_xyz_isomorphism.
- scheduler: add is_ts guards at the three remaining check_xyz_isomorphism
  call sites (check_directed_scan, check_directed_scan_job,
  troubleshoot_scan_job). Matches the existing is_ts guards in
  parse_composite_geo and parse_opt_geo.
- Fix secondary bug in the non-TS warning message: log perceived_mol
  in the "and:" branch (previously logged self.mol twice).
- Update TS_search.rst and advanced.rst to document the policy.
- Tests: TS mol_from_xyz acceptance, and two mock-based scheduler tests
  covering check_directed_scan_job and troubleshoot_scan_job.
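The TS short-circuit amounts to skipping the isomorphism check for saddle points; a minimal sketch with a stand-in check predicate:

```python
def accept_perceived_mol(perceived_mol, reference_mol, is_ts, check):
    """Accept the perceived molecule for a TS without running the
    isomorphism check, since saddle-point connectivity and bond orders
    are not uniquely defined. `check` stands in for check_xyz_isomorphism."""
    if is_ts:
        return perceived_mol
    return perceived_mol if check(perceived_mol, reference_mol) else None
```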
r_cut_p_cut_isomorphic matched purely on molecular-formula fingerprint OR
graph isomorphism, so constitutional isomers sharing a formula (e.g. the
alpha vs beta pentyl-ether radicals in H-abstraction between identical
donors) got paired as "isomorphic" and handed to map_two_species, which
then failed to superimpose them and returned None.

Add a strict flag that requires full graph isomorphism, and run
pairing_reactants_and_products_for_mapping in two passes: strict first,
then the loose fingerprint-or-isomorphic fallback for any unmatched
r-cuts so rearrangements whose cuts aren't strictly isomorphic (e.g.
Intra_Disproportionation, ring-openings) still work.

Also guard glue_maps against a None per-pair map so the caller sees a
clean None instead of a TypeError when a future mapping regression slips
through.
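The two-pass pairing can be sketched as follows; the match predicates stand in for ARC's fingerprint and isomorphism checks:

```python
def pair_two_pass(r_cuts, p_cuts, is_isomorphic, same_formula):
    """Pass 1: strict graph isomorphism only. Pass 2: the loose
    fingerprint-or-isomorphic fallback for any unmatched r-cuts."""
    pairs, unmatched_r = [], []
    remaining_p = list(p_cuts)
    for r in r_cuts:
        match = next((p for p in remaining_p if is_isomorphic(r, p)), None)
        if match is not None:
            pairs.append((r, match))
            remaining_p.remove(match)
        else:
            unmatched_r.append(r)
    for r in unmatched_r:
        match = next((p for p in remaining_p
                      if same_formula(r, p) or is_isomorphic(r, p)), None)
        if match is not None:
            pairs.append((r, match))
            remaining_p.remove(match)
    return pairs
```

Strict-first ensures constitutional isomers sharing a formula are not mispaired; the loose pass still catches rearrangements whose cuts aren't strictly isomorphic.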
map_rxn blindly handed fragment_maps to glue_maps. Any None entry (from
map_two_species giving up on a pair) would crash inside glue_maps trying
to subscript the None. Detect None entries up front, log which pairs
failed, and either advance to the next product_dict index (same recovery
path used by the existing template-order error branch) or return None so
upstream consumers can fall back cleanly.
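The up-front None detection can be sketched as follows; glue_fragment_maps is a simplified stand-in for glue_maps:

```python
def glue_fragment_maps(fragment_maps, log=print):
    """Detect None per-pair maps before gluing, log which pairs failed,
    and return None so the caller falls back cleanly instead of hitting
    a TypeError when subscripting a None entry."""
    bad = [i for i, m in enumerate(fragment_maps) if m is None]
    if bad:
        log(f'Atom mapping failed for fragment pair(s) {bad}; cannot glue maps.')
        return None
    glued = []
    for m in fragment_maps:
        glued.extend(m)
    return glued
```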
NMD (check_normal_mode_displacement) only needs the reactive bond set in
reactant index space - which is canonically given by the RMG family
recipe (BREAK_BOND / FORM_BOND / CHANGE_BOND actions) plus r_label_map.
Previously NMD routed through reaction.get_bonds, which hard-required
self.atom_map and raised otherwise, so any atom-mapping failure killed
the whole TS validation even though the reactive bonds were directly
computable.

Three changes:

- get_bonds: move the atom_map check below the r_bonds_only=True short-
  circuit. Reactant bonds never touch atom_map, so r_bonds_only should
  not require one.

- _get_reactive_bonds_from_family: new helper that reads the family's
  actions (via ReactionFamily(...).actions) and resolves labels through
  r_label_map into reactant-indexed bond tuples.

- get_formed_and_broken_bonds / get_changed_bonds: when self.atom_map is
  None but family + product_dicts are present, use the helper. Otherwise
  unchanged. Emits a warning so the fallback path is diagnosable.
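The label-resolution helper can be sketched as follows; the action tuples loosely mirror RMG's recipe format, and all names are illustrative:

```python
def reactive_bonds_from_actions(actions, r_label_map):
    """Resolve family recipe actions into reactant-indexed bond tuples,
    with no atom map required.

    actions: e.g. [('BREAK_BOND', '*1', '*2'), ('FORM_BOND', '*2', '*3')]
    r_label_map: template label -> reactant atom index.
    """
    bonds = set()
    for action, label_1, label_2 in actions:
        if action in ('BREAK_BOND', 'FORM_BOND', 'CHANGE_BOND'):
            i, j = r_label_map[label_1], r_label_map[label_2]
            bonds.add(tuple(sorted((i, j))))
    return sorted(bonds)
```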