Engine: Allow `CalcJob` monitors to return outputs #6191

sphuber · 2023-11-24T10:10:30Z

The `CalcJobMonitorResult` dataclass gains an `outputs` attribute. It takes a dictionary of nodes that the engine attaches to the calculation node being monitored, just as it would attach the outputs of a `Parser`.

sphuber · 2023-11-24T12:34:08Z

@edan-bainglass this might be a feature of interest

edan-bainglass · 2023-11-24T16:56:23Z

Thanks @sphuber. I'll take a look at it as soon as I'm back from vacay 🏝️

sphuber · 2024-01-31T17:19:39Z

@edan-bainglass did you want to have a look at this? Otherwise I think I will go ahead and merge this

edan-bainglass · 2024-01-31T17:45:48Z

Yes. I'll give it a look by end of Friday. Thanks

edan-bainglass

Overall looks good 👍

As mentioned in a comment, a use-case for persistence of partial data would be interesting. I wonder if this is a necessary step towards (or sheds light on) monitor-triggered calcjobs, for which we DO have a requirement!

edan-bainglass · 2024-02-02T10:51:05Z

docs/source/howto/run_codes.rst

+.. versionadded:: 2.5.0
+
+Monitors can also attach outputs to the calculation that it is monitoring.
+This can be useful to report outputs while the calculation is running that should still be stored permanently in the provenance graph.


At least in the use case of Aurora, there is no desire to store snapshots permanently. In the current implementation, I continuously overwrite extras.snapshot with the latest. The field is discarded on job completion. But perhaps partial data can be important in some cases, say, in a structure optimization calculation, where intermediates are of value down the road (though many if not all codes take care of this internally).

edan-bainglass · 2024-02-02T10:56:35Z

src/aiida/engine/processes/calcjobs/calcjob.py

+                    retrieved = retrieve_calculation(self.node, transport, retrieved_temporary_folder.abspath)
+                    if retrieved is not None:
+                        self.out(self.node.link_label_retrieved, retrieved)
+                        self.update_outputs()


At the end of those tasks, the process will change to a new state and the outputs will be "flushed" to the database by update_outputs.

Is _perform_import where this happens?

No, this is the method that is called when a `CalcJobNode` is run in "import" mode. That is when a `remote_folder` is passed as an input, in which case the actual code is not run, but a `CalcJobNode` is created taking the outputs from the already existing `remote_folder`. The comment you quoted refers to the lifetime of an actual `CalcJob` process that is being run, which will call `update_outputs` on state changes.

Where is it then actually calling update_outputs?

aiida-core/src/aiida/engine/processes/process.py

Line 417 in 35d7ca6

self.update_outputs()

😅 Hopefully one day I'll have time to find out how this line connects to the rest of AiiDA's mechanism

The outputs of a process are handled as follows. A new output is registered by calling `Process.out`. This validates the output against the process specification and, if it passes, adds the output to the `Process._outputs` mapping. Whenever a new state is entered, `Process.on_entered` is triggered, which calls `Process.update_outputs`. This goes through all the outputs in memory and adds a link to the process node for each one that has not been linked yet. The `CalcJob` implementation went around this mechanism, though, for the `remote_folder` and `retrieved` outputs. These outputs are created in the `upload_calculation` and `retrieve_calculation` methods of the `aiida.engine.daemon.execmanager` module, respectively, and those same methods also created the link to the process node. To restore consistency, the `upload_calculation`/`retrieve_calculation` methods now return the node they created and it is the caller, `task_upload_job` and `task_retrieve_job`, respectively, that now calls `Process.out` for the returned node. At the end of those tasks, the process will change to a new state and the outputs will be "flushed" to the database by `update_outputs`. This change also allows removing the "manual" handling of outputs in the `CalcJob.parse` call.
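For readers following the thread, the register-then-flush flow described above can be sketched roughly as follows. This is a toy model, not the aiida-core code: `ToyProcess`, `node_links`, `valid_labels` and `toy_retrieve_calculation` are hypothetical stand-ins, and only the method names `out`, `update_outputs` and `on_entered` are borrowed from the description.

```python
class ToyProcess:
    """Simplified stand-in for the output bookkeeping (not the aiida-core class)."""

    def __init__(self, valid_labels):
        self.valid_labels = set(valid_labels)  # stands in for the process specification
        self._outputs = {}                     # outputs registered in memory
        self.node_links = {}                   # stands in for links stored on the process node
        self.state = 'created'

    def out(self, label, value):
        # Validate against the (toy) specification, then register in memory only.
        if label not in self.valid_labels:
            raise ValueError(f'unexpected output: {label}')
        self._outputs[label] = value

    def update_outputs(self):
        # Link every registered output that has not been linked yet.
        for label, value in self._outputs.items():
            if label not in self.node_links:
                self.node_links[label] = value

    def on_entered(self, state):
        # Entering a new state flushes the registered outputs.
        self.state = state
        self.update_outputs()


def toy_retrieve_calculation():
    # Hypothetical helper: returns the created "node" instead of linking it itself.
    return 'retrieved-folder'


process = ToyProcess(valid_labels={'remote_folder', 'retrieved'})
process.out('retrieved', toy_retrieve_calculation())  # the caller registers the output
pending = dict(process.node_links)                    # nothing is linked yet
process.on_entered('finished')                        # the state change flushes the outputs
```

In this sketch the output only appears in `node_links` after the state change, which mirrors the "flush on state change" behaviour described in the paragraph above.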

The `CalcJobMonitorResult` dataclass adds the attribute `outputs`. It takes a dictionary of nodes that the engine will attach to the calculation node to which the monitor is attached just as the outputs of a `Parser` would be.
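As a rough illustration of the idea, a monitor can hand back a result object that carries a dictionary of outputs, and the engine attaches them to the calculation. The sketch below is self-contained and deliberately does not import aiida-core: `ToyMonitorResult`, `snapshot_monitor` and `run_monitors` are hypothetical names, and plain dictionaries stand in for nodes. Only the `outputs` attribute mirrors what this PR describes.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Optional


@dataclass
class ToyMonitorResult:
    """Hypothetical stand-in for a monitor result with an `outputs` attribute."""

    message: str = ''
    outputs: Optional[Dict[str, Any]] = None


def snapshot_monitor(step: int) -> Optional[ToyMonitorResult]:
    """Toy monitor: report a snapshot on odd steps, nothing otherwise."""
    if step % 2 == 0:
        return None
    return ToyMonitorResult(
        message=f'snapshot at step {step}',
        outputs={f'snapshot_{step}': {'step': step}},
    )


def run_monitors(
    steps: Iterable[int],
    monitor: Callable[[int], Optional[ToyMonitorResult]],
    attached: Dict[str, Any],
) -> Dict[str, Any]:
    """Toy engine loop: attach whatever outputs each monitor call returns."""
    for step in steps:
        result = monitor(step)
        if result is not None and result.outputs:
            attached.update(result.outputs)
    return attached


attached = run_monitors(range(4), snapshot_monitor, {})
```

In aiida-core itself the monitor signature and the way outputs are validated and linked differ from this toy loop; the sketch only shows the shape of "monitor returns outputs, engine attaches them".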

sphuber force-pushed the feature/6158/monitor-store-outputs branch from 4c283ba to d0c5c02 Compare November 24, 2023 12:05

sphuber force-pushed the feature/6158/monitor-store-outputs branch from d0c5c02 to 9211f43 Compare December 21, 2023 15:37

edan-bainglass approved these changes Feb 2, 2024


sphuber added 2 commits February 2, 2024 13:09

Engine: Allow CalcJob monitors to return outputs

8c6e16a


sphuber force-pushed the feature/6158/monitor-store-outputs branch from 9211f43 to 8c6e16a Compare February 2, 2024 12:09

sphuber merged commit b7e59a0 into aiidateam:main Feb 2, 2024
20 checks passed

sphuber deleted the feature/6158/monitor-store-outputs branch February 2, 2024 12:39
