Jobs Hung When Jenkins Temporarily In Quietdown Mode #113

Open
David-Villeneuve opened this issue Nov 4, 2021 · 0 comments
Labels
bug Something isn't working


David-Villeneuve commented Nov 4, 2021

Version report

Jenkins and plugins versions report:

This has been happening for a long time, across several versions of Jenkins and several combinations of plugins. In this case: Jenkins 2.303.1 and version 1.11 of the swarm plugin.

  • What Operating System are you using (both controller, and any agents involved in the problem)?

The controller runs Ubuntu 16, and the agents run a variety of Ubuntu versions (16/18/20).

Reproduction steps

  • Set up a job that uses a swarm template.
  • Put Jenkins in shutdown (quiet-down) mode.
  • Trigger the job. The job is held, but the agent starts.
  • After 2-3 minutes, Jenkins takes the node down because it is idle.
  • Cancel shutdown mode.
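
One way to script the quiet-down and cancel steps, rather than clicking through the UI, is to POST to the controller's `quietDown` and `cancelQuietDown` endpoints. The sketch below is only an illustration under assumptions: the class name `QuietDownToggle`, the base URL, the user, and the API token are placeholders that do not come from this report, and it assumes the controller accepts basic authentication with an API token.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for toggling quiet-down mode; not part of the plugin. */
final class QuietDownToggle {

    /** Builds a POST to a controller endpoint such as "quietDown" or "cancelQuietDown". */
    static HttpRequest buildRequest(String baseUrl, String endpoint, String user, String apiToken) {
        // Basic auth with user:apiToken (assumed to be enabled on the controller).
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + apiToken).getBytes(StandardCharsets.UTF_8));
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return HttpRequest.newBuilder(URI.create(base + endpoint))
                .header("Authorization", "Basic " + credentials)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 4) {
            System.err.println(
                    "usage: QuietDownToggle <baseUrl> <user> <apiToken> <quietDown|cancelQuietDown>");
            return;
        }
        HttpRequest request = buildRequest(args[0], args[3], args[1], args[2]);
        // Send the request and report only the status code; the body is discarded.
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println(request.uri() + " -> HTTP " + response.statusCode());
    }
}
```

Running it once with `quietDown` before triggering the job, and once with `cancelQuietDown` after the idle agent has been taken down, should mirror the manual steps above.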

Results

Expected result:

Build runs on agent.

Actual result:

The build never runs, and new builds can't run until the hung build is manually cancelled.

Possible Solution

I made the following change to prevent done(c) from being called while Jenkins is quieting down. It results in the agent being taken down and a new one being created. This repeats until the Jenkins shutdown is cancelled:

```diff
index 31e4878..41cc23d 100644
--- a/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java
+++ b/src/main/java/org/jenkinsci/plugins/docker/swarm/DockerSwarmAgentRetentionStrategy.java
@@ -19,6 +19,7 @@ import hudson.model.Executor;
 import hudson.model.ExecutorListener;
 import hudson.model.Queue;
 import hudson.slaves.RetentionStrategy;
+import jenkins.model.Jenkins;
 
 public class DockerSwarmAgentRetentionStrategy extends RetentionStrategy<DockerSwarmComputer>
         implements ExecutorListener {
@@ -45,7 +46,7 @@ public class DockerSwarmAgentRetentionStrategy extends RetentionStrategy<DockerS
             final long connectTime = System.currentTimeMillis() - c.getConnectTime();
             final long idleTime = System.currentTimeMillis() - c.getIdleStartMilliseconds();
             final boolean isTimeout = connectTime > timeout && idleTime > timeout;
-            if (isTimeout && (!isTaskAccepted || isTaskCompleted)) {
+            if (isTimeout && (!isTaskAccepted || isTaskCompleted ) && !Jenkins.getInstance().isQuietingDown()) {
                 LOGGER.log(Level.INFO, "Disconnecting due to idle {0}", c.getName());
                 done(c);
             }
```

I don't know enough about the interactions with the caller, so this may not be the best solution.
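
The condition in the diff can be checked in isolation. The following is a minimal sketch, not the plugin's actual code: `IdleRetentionSketch` and `shouldDisconnect` are hypothetical names, the 60-second timeout in `main` is an assumed value, and the quiet-down state is passed in as a plain boolean standing in for `Jenkins.getInstance().isQuietingDown()`. It models only the boolean guard, so it runs without a Jenkins instance.

```java
/**
 * Hypothetical stand-alone model of the idle check in the diff above.
 * It is not the plugin's class; names and parameters are illustrative.
 */
final class IdleRetentionSketch {

    /**
     * Returns true when the idle agent should be disconnected.
     *
     * @param connectTimeMs milliseconds since the agent connected
     * @param idleTimeMs    milliseconds since the agent became idle
     * @param timeoutMs     idle timeout in milliseconds
     * @param taskAccepted  whether a task was accepted on the agent
     * @param taskCompleted whether the accepted task has completed
     * @param quietingDown  stand-in for Jenkins.getInstance().isQuietingDown()
     */
    static boolean shouldDisconnect(long connectTimeMs, long idleTimeMs, long timeoutMs,
                                    boolean taskAccepted, boolean taskCompleted,
                                    boolean quietingDown) {
        boolean isTimeout = connectTimeMs > timeoutMs && idleTimeMs > timeoutMs;
        // Same shape as the patched condition: never disconnect while quieting down.
        return isTimeout && (!taskAccepted || taskCompleted) && !quietingDown;
    }

    public static void main(String[] args) {
        long timeout = 60_000L; // assumed timeout, chosen only for this demo

        // Idle past the timeout, no task accepted, Jenkins quieting down: agent is kept.
        System.out.println(shouldDisconnect(180_000L, 180_000L, timeout, false, false, true));  // false

        // Same agent once quiet-down is cancelled: agent is disconnected.
        System.out.println(shouldDisconnect(180_000L, 180_000L, timeout, false, false, false)); // true
    }
}
```

The sketch covers only the guard itself. How `done(c)` and the caller behave around it is outside what it models.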