Reproducibility tests for determinism from restart (as well as cold start) #277
Comments
I've decided to expedite the overhaul of the ACCESS-OM3 CI repro tests because, despite what I said in the meeting, the current test isn't actually fit for purpose for restart repro. I'll try to add your test 2 as part of this.
Thanks @dougiesquire, the overhaul might also be an opportunity to see whether md5 hashes of the restarts can be used (#278).
There is already a test that compares the output from two consecutive 1-day runs with the output from a single 2-day run. That test will never pass if this option 2 test fails.
I think that's a slightly different situation. If case 1 passes but case 2 fails, that indicates non-deterministic code in initialisation from restart (but not from a cold start). In that situation you'd also expect the restart repro test (2x1 day vs 1x2 day) to fail. However, having case 1 pass and case 2 fail also eliminates the possibility that the restart repro failure was due to the restarts storing an incomplete or incorrect model state at the end of day 1, and suggests it could instead be a lack of determinism in initialising from restarts at the start of day 2. [Note: this statement is only correct if "passing case 1" means we check md5 hashes of the restarts to confirm an identical model state at the end of day 1.]

(Also note that cases 1 and 2 would not both be covered by running two identical experiments from rest, each consisting of two 1-day runs, and checking reproducibility at the end of day 1 (case 1) and day 2 (case 2). If case 1 fails, the restarts would differ, so to test case 2 independently of case 1, the case 2 runs need to start from identical restarts.)
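As a rough illustration of the md5 check mentioned in the note above, here is a minimal Python sketch that hashes every file in two restart directories and reports which ones differ. The function names and directory layout are hypothetical and not taken from the ACCESS-OM3 CI code. It also assumes that an identical model state yields byte-identical restart files, which may not hold if the files embed timestamps or other run-specific metadata.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restart_hashes(restart_dir):
    """Map each file's path (relative to restart_dir) to its md5 digest."""
    root = Path(restart_dir)
    return {
        str(p.relative_to(root)): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_restarts(dir_a, dir_b):
    """List files whose hashes differ, or that exist in only one directory."""
    a, b = restart_hashes(dir_a), restart_hashes(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_restarts(run_a_restarts, run_b_restarts)` would correspond to a pass under these assumptions. A non-empty list names the restart files to inspect.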
I'm not sure 2 is something we need or want to check with every CI test since, as @anton-seaice points out, the 2x1d-vs-1x2d test should fail if we don't have 2. It is definitely something we need to check if the 2x1d-vs-1x2d test fails.
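The reasoning in this thread about how to read the different test outcomes can be summarised as a small decision function. This is only a sketch of the logic as discussed here. The function name, argument names, and message strings are hypothetical, and the last failure branch is an inference from the comments above, not an established diagnosis.

```python
def interpret_repro_results(case1_passes, case2_passes, restart_repro_passes):
    """Suggest where to look, given three test outcomes.

    case1_passes: two runs from the same cold start give identical results
        (including identical restart hashes at the end of day 1).
    case2_passes: two runs from identical restarts give identical results.
    restart_repro_passes: 2x1-day runs match a single 2-day run.
    """
    if not case1_passes:
        # Without determinism from a cold start, nothing else is interpretable.
        return "non-deterministic from cold start"
    if not case2_passes:
        # Cold start is deterministic, so suspect initialisation from restart.
        return "non-deterministic initialisation from restart"
    if not restart_repro_passes:
        # Both determinism checks pass, so the restarts may be storing an
        # incomplete or incorrect model state (an inference, not a certainty).
        return "deterministic, but restarts may not capture the full model state"
    return "all checks pass"
```

For example, `interpret_repro_results(True, False, False)` corresponds to the situation described above where case 1 passes, case 2 fails, and the 2x1d-vs-1x2d test fails as a consequence.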
Before we can make sense of any other reproducibility checks we need to establish that the model is deterministic, i.e. two runs of the same config from the same initial condition will produce the same result. This is not guaranteed: #40
As discussed in today's OSIT catchup, I think we actually need two tests here:
I think these are distinct, because cold starts and restarts exercise different parts of the code, so passing 1 (determinism from a cold start) does not guarantee passing 2 (determinism from a restart).
I understand we're already testing 1 (#41), but we're not testing 2. I think we should also test 2 so we have a solid basis for interpreting other types of reproducibility tests (e.g. reproducibility across restarts #266).