Interchange hang or SIGABRT on kill_event driven exit #3697
To recreate, run the test in #3698 with stderr/streams enabled:
In some situations in my replicator test, the interchange will exit with this jumbled pair of stack traces, but unix exit code 0, not -6:
github-merge-queue bot pushed a commit that referenced this issue Jan 16, 2025
On certain bad registration messages, the interchange should exit immediately. This tests that.

See #3697 for some bad (cosmetic?) behaviour here - the interchange SIGABRTs on this code path rather than exiting cleanly, and this test includes a commented out assert that could check for clean exit (in addition to checking that the interchange process exits at all).

## Type of change

- Code maintenance/cleanup
The test introduced in #3698 appears to fail (with a hung interchange process and different ZMQ errors on stderr), so although I labelled this issue as only cosmetic, it appears not to be.
github-merge-queue bot pushed a commit that referenced this issue Jan 21, 2025
This removes one of two non-main threads in the interchange - the task puller thread - and moves the behaviour there (receive a message and put it in an in-process queue) into the main thread (where that in-process queue is ultimately dequeued, anyway).

This is aimed at helping with ZMQ-vs-threads issues within the interchange -- most immediately, clean shutdown #3697

performance notes: parsl-perf -t 30, my laptop, no logging
before this PR, 2320 tasks/second
post this PR, 2344 tasks/second

cc @rjmello who expressed especial interest in this

# Changed Behaviour

Some performance difference, although the brief measurements above are not concerning.

## Type of change

- New feature

---------

Co-authored-by: Kevin Hunter Kesling <[email protected]>
Describe the bug
There are a few paths through which the interchange exits. The regular shutdown path, driven by the DFK, is to send a SIGTERM which immediately kills the process.
Another rare path is using kill_event, which is polled every 10ms and is set when a particular form of incorrect worker registration is received.
When that kill_event path is taken, the interchange exits with a SIGABRT, placing this (or a variant) on stderr:
The interchange then exits (as desired) but with unix exit code -6, SIGABRT.
This is probably mostly cosmetic: the interchange still exits.
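For readers unfamiliar with the pattern, here is a minimal sketch of an event-polled main loop of the kind described above. It is not the interchange's actual code: `run_main_loop`, `handle_once`, and the timer that sets the event are hypothetical stand-ins, and only the 10ms polling period is taken from the description.

```python
import threading

POLL_PERIOD_S = 0.01  # 10 ms, the polling period mentioned in the description


def run_main_loop(kill_event: threading.Event, handle_once) -> int:
    """Run handle_once repeatedly until kill_event is set.

    Returns the number of iterations executed (for illustration only).
    """
    iterations = 0
    while not kill_event.is_set():
        handle_once()
        iterations += 1
        # Event.wait returns as soon as the event is set, so the loop
        # notices a shutdown request within roughly one poll period.
        kill_event.wait(timeout=POLL_PERIOD_S)
    return iterations


def demo() -> int:
    kill_event = threading.Event()
    # Hypothetical stand-in for "a bad registration message arrived":
    # another thread sets the event after 50 ms.
    timer = threading.Timer(0.05, kill_event.set)
    timer.start()
    count = run_main_loop(kill_event, handle_once=lambda: None)
    timer.join()
    return count
```

In this sketch the loop returns normally once the event is set, which is the "clean exit" shape; whatever resources the real process holds (sockets, threads) would still need to be released before the interpreter exits.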
To Reproduce
I will make a pull request with a demonstrator test.
Expected behavior
clean exit
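As an illustration of how a test might tell a clean exit from a SIGABRT, the sketch below uses only the Python standard library on a POSIX system. The helper names `run_child` and `describe_exit` are hypothetical, and the child here is a throwaway Python process, not the interchange. It relies on the documented `subprocess` behaviour that a child killed by signal N reports a return code of -N.

```python
import signal
import subprocess
import sys


def run_child(code: str) -> int:
    """Run a short Python snippet in a child process and return its exit code."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stderr=subprocess.DEVNULL,  # keep abort noise out of the test output
    )
    return proc.returncode


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "clean exit"
    if returncode < 0:
        # Negative values mean the child was terminated by that signal.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```

A test in this style could assert `returncode == 0` for the clean case, and would see `-signal.SIGABRT` (that is, -6 on Linux) for the behaviour reported here.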
Environment
my laptop, branched from Parsl 2024.11.11