
Added support for orca NEB. #833

Open

kfir4444 wants to merge 13 commits into main from orca_neb
Conversation

@kfir4444 (Collaborator)


PR Overview

This PR introduces Orca NEB support, improves queue-based TS search handling, enhances parsing efficiency, and adds comprehensive test coverage, opening the path to additional queue-based TS guess jobs.


1. Orca NEB Integration

New Adapter: OrcaNEBAdapter

  • Implemented OrcaNEBAdapter supporting both:

    • Incore execution
    • Queue-based execution
  • Added orca_neb to supported TS adapters across relevant RMG families (arc/job/adapters/common.py)

  • Included default settings and keyword configurations for Orca 6.x NEB-TS calculations
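As a hedged illustration of how the settings-driven defaults interact with the level-of-theory fallback discussed in the review below (the dictionary keys and values here are assumptions, not the PR's exact `arc/settings/settings.py` contents):

```python
# Hypothetical sketch of orca_neb defaults; key names and values are
# assumptions, not ARC's actual settings.
orca_neb_settings = {
    'level': 'wb97xd/def2tzvp',  # default level of theory (string cited in the review)
    'nimages': 8,                # number of NEB images (assumption)
}


def resolve_neb_level(level=None):
    """Fall back to the settings default when no level of theory is given."""
    level = level or orca_neb_settings.get('level', '')
    if not level:
        raise ValueError('A level of theory must be specified for Orca NEB jobs, '
                         'either in the job arguments or in the settings file.')
    return level
```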


OrcaNEBAdapter Inheritance Strategy

OrcaNEBAdapter inherits directly from OrcaAdapter rather than the base JobAdapter.

This design leverages polymorphism to ensure that all Orca-specific logic is reused consistently, including:

  • Memory calculation
  • CPU core allocation
  • Method formatting (_format_orca_method)
  • Basis set formatting (_format_orca_basis)
  • Submission script generation
  • Server-specific command formatting

Why inherit from OrcaAdapter?

By inheriting from OrcaAdapter, we:

  • Reuse existing logic
    Automatically inherit the complex submission and execution behavior of standard Orca jobs.

  • Apply targeted overrides only where necessary
    Override only:

    • write_input_file (for NEB-specific input formatting)
    • process_run (to extract the TS guess from the NEB log)
  • Maintain global consistency
    Any future updates to Orca settings automatically propagate to NEB jobs.

This approach minimizes duplication and improves long-term maintainability.
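A minimal sketch of this inheritance pattern follows; the class and method names track the PR's description, but the method bodies are illustrative placeholders, not ARC's actual implementations:

```python
class OrcaAdapter:
    """Stand-in for ARC's Orca adapter; the real bodies handle memory,
    CPU cores, method/basis formatting, and submission scripts."""

    def write_input_file(self):
        return 'standard Orca input'

    def process_run(self):
        return 'standard Orca post-processing'

    def write_submit_script(self):
        # Shared submission logic, reused unchanged by subclasses.
        return 'submit script'


class OrcaNEBAdapter(OrcaAdapter):
    """Overrides only the two NEB-specific hooks; everything else is inherited."""

    def write_input_file(self):
        return 'NEB-TS input'

    def process_run(self):
        return 'TS guess extracted from NEB log'
```

Because `write_submit_script` is untouched, any future change to the base Orca submission logic propagates to NEB jobs automatically.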


2. Queue-Based TSG Job Support

Extended Scheduler Support

Modified:

  • Scheduler.schedule_jobs
  • ARCSpecies.process_completed_tsg_queue_jobs

The scheduler now supports .log outputs for TSG jobs. Previously, the queue-based TSG pipeline primarily handled .yml outputs.

Impact

This enables any ESS-based TS search method (e.g., Orca NEB) to:

  • Run on HPC clusters
  • Be submitted through queue systems
  • Be parsed automatically upon completion
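The scheduler change amounts to dispatching on the output file type; a hedged sketch under that assumption (the function and return labels are illustrative, not ARC's API):

```python
import os


def classify_tsg_output(path):
    """Decide how a completed queue-based TSG job's output should be processed."""
    ext = os.path.splitext(path)[1]
    if ext == '.log':
        return 'parse_ess_log'   # ESS output, e.g. an Orca NEB log
    if ext == '.yml':
        return 'load_yaml_dump'  # the pre-existing TSG pipeline
    raise ValueError(f'Unsupported TSG output file: {path}')
```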

3. Parser Improvements

parse_orca_runtime

  • Added parse_orca_runtime to arc/parser/parser.py
  • Optimized to read only the end of log files
  • Avoids memory bottlenecks when parsing large Orca outputs

A corresponding parser test was added to verify correct runtime extraction.
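An illustrative sketch of the tail-reading approach: the seek offset mirrors the PR's 2000-byte window (visible in the review below), while the exact regular expression is an assumption based on Orca's "TOTAL RUN TIME" footer:

```python
import os
import re


def parse_runtime_tail(path, tail_bytes=2000):
    """Read only the last tail_bytes of a log file and return the runtime in seconds."""
    with open(path, 'rb') as f:
        try:
            f.seek(-tail_bytes, os.SEEK_END)
        except OSError:
            pass  # file is smaller than tail_bytes; read it whole
        tail = f.read().decode(errors='ignore')
    # Orca prints a footer like "TOTAL RUN TIME: 0 days 0 hours 5 minutes 30 seconds ..."
    match = re.search(r'TOTAL RUN TIME:\s*(\d+) days (\d+) hours (\d+) minutes (\d+) seconds', tail)
    if match is None:
        return None
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

Seeking from the end keeps memory usage constant regardless of log size, which matters for the multi-megabyte outputs NEB runs can produce.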


4. Robustness Improvements

OrcaNEBAdapter now automatically:

  • Creates dummy TS species

  • Generates required coordinate files:

    • reactant.xyz
    • product.xyz

This ensures compatibility with Orca’s NEB input requirements and reduces user-side setup complexity.
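A self-contained sketch of generating the endpoint files: the XYZ layout (atom count, comment line, element rows) is the standard format Orca reads, but these helper names are hypothetical, not ARC's actual functions:

```python
import os


def write_xyz(path, symbols, coords, comment=''):
    """Write a standard XYZ file: atom count, comment line, then element x y z rows."""
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, coords):
        lines.append(f'{symbol} {x:.8f} {y:.8f} {z:.8f}')
    with open(path, 'w') as f:
        f.write('\n'.join(lines) + '\n')


def write_neb_endpoints(folder, reactant, product):
    """Create the reactant.xyz and product.xyz files Orca NEB expects,
    given (symbols, coords) tuples for each endpoint."""
    write_xyz(os.path.join(folder, 'reactant.xyz'), *reactant)
    write_xyz(os.path.join(folder, 'product.xyz'), *product)
```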


Testing

Unit Tests

Added tests in:

  • arc/job/adapters/ts/orca_neb_test.py

    • Input file generation
    • NEB post-processing
  • arc/parser/parser_test.py

    • Orca runtime extraction

Why Is This Needed?

Orca NEB is a powerful method for generating high-quality TS guesses for complex reactions.

Previously, TS guess generation via NEB was limited to local (incore) execution.

This PR enables:

  • Submission of NEB jobs to HPC queues
  • Automated parsing of results
  • Scalable TS search workflows

This significantly improves ARC’s ability to handle computationally demanding TS searches in production environments.


for tsg in tsgs:
    if path.endswith('.log'):
        xyz = parse_geometry(path)
        time = parse_orca_runtime(path)
Member

So, I see you made a new function to parse the ORCA log file for the time it took. I know the submit.sh template files generate initial_time and final_time files to calculate the execution/run time of a job, IIRC. Should we then, going forward, create runtime parsers for Gaussian etc. and use this type of logic?

Also, I don't know if it's feasible, and it may require a lot of extra work and redoing functions/logic, but would it be better to build the runtime parser in the specific ESS adapter itself? That way, if someone writes a new adapter or updates one, they just need to change it in the adapter itself. But that's just a small thought I had. I see the parsing of log files etc. is done outside of the adapter anyway, so it's probably not feasible at all.

@kfir4444 (Collaborator, Author), Feb 24, 2026

Regarding timings, I don't know if there is another way from this entry point. Maybe @alongd has more info about this.

As for adapter specific timings, I think this is technically possible per adapter, the question is how to bubble the result up to the scheduler easily? There is probably some way to do that, @alongd can probably say something about this too 😄

Member

Right, in the submit files we have touch initial_time and touch final_time; the Job does set_initial_and_final_times and finally determines a .run_time, which you could use.

so in Scheduler you do JobObject.run_time. At the Species level, you could go to SpeciesObject.ts_guesses[i].execution_time

Copilot AI left a comment

Pull request overview

This pull request adds support for Orca's Nudged Elastic Band (NEB) method as a transition state guess adapter. The implementation enables both local (incore) and queue-based execution of NEB jobs on HPC clusters. The PR introduces a new OrcaNEBAdapter that inherits from OrcaAdapter to reuse Orca-specific logic, extends the scheduler to handle .log file outputs from TS guess jobs, adds a runtime parser for Orca output files, and includes comprehensive test coverage.

Changes:

  • Introduces OrcaNEBAdapter for executing Orca NEB calculations with support for both incore and queue execution
  • Extends queue-based TS guess job handling to support .log file outputs in addition to .yml files
  • Adds parse_orca_runtime parser function to extract job execution times from Orca log files

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 9 comments.

Show a summary per file:

  • arc/job/adapters/ts/orca_neb.py: New adapter implementing Orca NEB job execution, input file generation, and post-processing
  • arc/job/adapters/ts/__init__.py: Imports the new orca_neb module
  • arc/settings/settings.py: Adds orca_neb to global ESS settings, filenames mappings, and default NEB parameters
  • arc/scheduler.py: Extends schedule_jobs to handle .log outputs from TSG jobs; adds unused check_tsg_jobs method
  • arc/species/species.py: Modifies process_completed_tsg_queue_jobs to handle .log files from orca_neb jobs
  • arc/parser/parser.py: Adds parse_orca_runtime function to extract execution time from Orca output files
  • arc/parser/parser_test.py: Adds test for parse_orca_runtime function
  • arc/job/adapters/ts/orca_neb_test.py: Comprehensive unit tests for OrcaNEBAdapter
  • arc/job/adapters/common.py: Adds orca_neb to supported TS adapters for various RMG reaction families
  • arc/job/adapter.py: Adds orca_neb to JobEnum
  • arc/common.py: Adds orca_neb to ESS validation list
  • arc/main_test.py: Updates test expectations to include orca_neb server mapping


Comment on lines +2666 to +2694
def check_tsg_jobs(self,
                   label: str,
                   job: 'JobAdapter',
                   ):
    """
    Check that a transition state guess job converged successfully.

    Args:
        label (str): The species label.
        job (JobAdapter): The single point job object.
    """
    print("Checking TS guess job...")

    if job.job_status[1]['status'] == 'done':
        self.post_sp_actions(label,
                             sp_path=os.path.join(job.local_path_to_output_file),
                             level=job.level,
                             )
        # Update restart dictionary and save the yaml restart file:
        self.save_restart_dict()
        if self.species_dict[label].number_of_atoms == 1:
            # save the geometry from the sp job for monoatomic species for which no opt/freq jobs will be spawned
            self.output[label]['paths']['geo'] = job.local_path_to_output_file
    else:
        self.troubleshoot_ess(label=label,
                              job=job,
                              level_of_theory=job.level,
                              )

Copilot AI, Feb 25, 2026

The check_tsg_jobs method is defined but never called anywhere in the codebase. This appears to be dead code that should either be removed or integrated into the job checking workflow. Additionally, the method contains a print statement instead of using the logger, and duplicates logic from check_sp_job.

Suggested change: delete the check_tsg_jobs method (the suggestion removes the lines quoted above).

Comment on lines +1580 to +1587
xyz = parse_geometry(path)
time = parse_orca_runtime(path)
tsg = TSGuess(method='orca_neb',
              execution_time=time,
              success=True,
              family=None,
              xyz=xyz,
              )
Copilot AI, Feb 25, 2026

There is no error handling if parse_geometry returns None. If parsing fails, xyz will be None, and the TSGuess will be created with xyz=None. While there's a check for tsg.initial_xyz at line 1588, it would be clearer to check if xyz is None before creating the TSGuess or to handle the failure case more explicitly.

input_dict['cpus'] = self.cpu_cores
input_dict['charge'] = self.charge
input_dict['multiplicity'] = self.multiplicity
input_dict['abs_path'] = self.local_path
Copilot AI, Feb 25, 2026

The abs_path is set to self.local_path, which will be incorrect for remote (queue) execution. When the job runs on a remote server, the local path from the client machine won't exist on the server. The NEB input file should use either relative paths (just 'reactant.xyz' and 'product.xyz') or self.remote_path for remote execution. Since files are uploaded to the execution directory, relative paths would be the safest and most portable solution.

Suggested change
- input_dict['abs_path'] = self.local_path
+ input_dict['abs_path'] = '.'

try:
    f.seek(-2000, os.SEEK_END)
except OSError:
    pass  # File is smaller than 2kb
Copilot AI, Feb 25, 2026

The comment says "File is smaller than 2kb" but the code seeks -2000 bytes, which is approximately 2000 characters (bytes in ASCII), not 2KB. While this is a minor distinction (2KB = 2048 bytes), the comment could be more precise (e.g., "File is smaller than 2000 bytes").

Suggested change
- pass  # File is smaller than 2kb
+ pass  # File is smaller than 2000 bytes

Comment on lines +168 to +170
level = level or orca_neb_settings.get('level', '')
if not level:
    raise ValueError('A level of theory must be specified for Orca NEB jobs, either in the job arguments or in the settings file.')
Copilot AI, Feb 25, 2026

The level retrieved from settings is a string (e.g., 'wb97xd/def2tzvp'), but it needs to be converted to a Level object before passing to the parent class. If level is None and orca_neb_settings.get('level', '') returns a string, this string should be wrapped in Level(repr=...) to ensure type consistency.

Suggested change
- level = level or orca_neb_settings.get('level', '')
- if not level:
-     raise ValueError('A level of theory must be specified for Orca NEB jobs, either in the job arguments or in the settings file.')
+ raw_level = level or orca_neb_settings.get('level', '')
+ if not raw_level:
+     raise ValueError('A level of theory must be specified for Orca NEB jobs, either in the job arguments or in the settings file.')
+ # Ensure level is a Level object, even if provided as a string in settings
+ if isinstance(raw_level, str):
+     level = Level(repr=raw_level)
+ else:
+     level = raw_level

raise_msg=f'Please install {self.job_adapter}, see {self.url} for more information.',
)
self._log_job_execution()
execute_command([f'cd {self.local_path}'] + incore_commands[self.job_adapter], executable='/bin/bash')
Copilot AI, Feb 25, 2026

The execute_incore method references incore_commands[self.job_adapter] where self.job_adapter is 'orca_neb', but there is no 'orca_neb' entry in the incore_commands dictionary defined in arc/settings/submit.py. This will cause a KeyError at runtime when attempting to execute orca_neb jobs incore. An entry for 'orca_neb' should be added to incore_commands, likely using the same command as regular orca jobs (e.g., 'orca input.in > input.log').

Suggested change
- execute_command([f'cd {self.local_path}'] + incore_commands[self.job_adapter], executable='/bin/bash')
+ commands = incore_commands.get(self.job_adapter)
+ if commands is None:
+     # Fall back to the standard ORCA incore command if a specific orca_neb command is not defined
+     commands = incore_commands.get('orca', [])
+ execute_command([f'cd {self.local_path}'] + commands, executable='/bin/bash')

return None
tsgs = [TSGuess(ts_dict=tsg_dict) for tsg_dict in tsg_list]
for tsg in tsgs:
    from arc.parser.parser import parse_orca_runtime
Copilot AI, Feb 25, 2026

The import statement for parse_orca_runtime should be moved to the top of the file with other parser imports (around line 32-37) for consistency. Currently parse_geometry is imported at the module level while parse_orca_runtime is imported inside the function.



input_template = """
!${restricted}HF ${method} ${basis} NEB-TS
Copilot AI, Feb 25, 2026

The input template hardcodes 'HF' in the line '!${restricted}HF ${method} ${basis} NEB-TS', but this is incorrect for DFT methods. The template should use ${method_class} like the regular OrcaAdapter template does, which can be either 'HF' for wavefunction methods or 'KS' for DFT methods. The write_input_file method needs to determine and set input_dict['method_class'] similar to how OrcaAdapter does it.

Suggested change
- !${restricted}HF ${method} ${basis} NEB-TS
+ !${restricted}${method_class} ${method} ${basis} NEB-TS

Comment on lines +172 to +187
self.execution_type = execution_type or 'queue'
super().__init__(project=project,
                 project_directory=project_directory,
                 job_type=job_type,
                 args=args,
                 bath_gas=bath_gas,
                 checkfile=checkfile,
                 conformer=conformer,
                 constraints=constraints,
                 cpu_cores=cpu_cores,
                 dihedral_increment=dihedral_increment,
                 dihedrals=dihedrals,
                 directed_scan_type=directed_scan_type,
                 ess_settings=ess_settings,
                 ess_trsh_methods=ess_trsh_methods,
                 execution_type=execution_type,
Copilot AI, Feb 25, 2026

Setting self.execution_type at line 172 before calling super().__init__() is problematic because the parent class initialization will override this value based on the execution_type parameter passed at line 187. If you want to default to 'queue', you should either modify the execution_type variable before passing it to super() (e.g., execution_type = execution_type or 'queue'), or set self.execution_type after the super() call. The current approach may not have the intended effect.

@alongd (Member) left a comment

Thanks @kfir4444 for this awesome addition to ARC!
Please see a few comments

if reactions is None:
    raise ValueError('Cannot execute Orca NEB without an ARCReaction object.')

# For NEB calculations, the 'species' argument to the OrcaAdapter should be the TS species from the reaction.
Member

this comment might not be relevant; we're not looking at the species arg anymore here

is_ts=True,
charge=reactions[0].charge,
multiplicity=reactions[0].multiplicity,
xyz=reactions[0].r_species[0].get_xyz())
Member

why give the reactant xyz to initialize the TS object?

charge=reactions[0].charge,
multiplicity=reactions[0].multiplicity,
xyz=reactions[0].r_species[0].get_xyz())
else: # Fallback if no reactant species either, though this shouldn't happen for a valid reaction
Member

can delete this comment (correct, but trivial - the err mssg is self-explanatory)

if not level:
    raise ValueError('A level of theory must be specified for Orca NEB jobs, either in the job arguments or in the settings file.')
species_for_super = [reactions[0].ts_species]
self.execution_type = execution_type or 'queue'
Member

should this be done after super().__init__, so it's not overwritten?

# Render and write the NEB input file
with open(os.path.join(self.local_path, input_filenames[self.job_adapter]), 'w') as f:
    f.write(Template(input_template).render(**input_dict))

Member

remove 1 extra line break

level_of_theory=job.level,
)

def check_tsg_jobs(self,
Member

where is this being called from?

    self.ts_guesses = cluster_tsgs

- def process_completed_tsg_queue_jobs(self, yml_path: str):
+ def process_completed_tsg_queue_jobs(self, path: str):
Member

are there any other usages of this function where we need to change the arg name?

return None
tsgs = [TSGuess(ts_dict=tsg_dict) for tsg_dict in tsg_list]
for tsg in tsgs:
    from arc.parser.parser import parse_orca_runtime
Member

can you move the import to the top of the module?

for tsg in tsgs:
    if path.endswith('.log'):
        xyz = parse_geometry(path)
        time = parse_orca_runtime(path)
Member

Right, in the submit files we have touch initial_time and touch final_time; the Job does set_initial_and_final_times and finally determines a .run_time, which you could use.

so in Scheduler you do JobObject.run_time. At the Species level, you could go to SpeciesObject.ts_guesses[i].execution_time

@@ -0,0 +1,18797 @@

Member

do you think you could safely prune sections of this file? Maybe show an LLM the file and the parsing function, and see what to keep?


4 participants