Notebooks 2.0 // Controller // Extract All Container Statuses to State #212

Open
thesuperzapper opened this issue Feb 14, 2025 · 0 comments


As raised by @harshad16 in #210, there is an issue with the generateWorkspaceState() method in the Workspace controller.

The issue is that we only look at the main container's status when checking for problems like ImagePullBackOff, but there could be multiple containers or initContainers with problems that prevent the Pod from becoming ready.

We need to update the following code so that it checks every entry in the Pod's status.containerStatuses and status.initContainerStatuses, finds any that are in a "waiting" state with a CrashLoopBackOff or ImagePullBackOff reason, and aggregates those errors into the stateMessage.

// get container status
// https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-states
var containerStatus corev1.ContainerStatus
for _, container := range pod.Status.ContainerStatuses {
	if container.Name == workspacePodTemplateContainerName {
		containerStatus = container
		break
	}
}
// get the container state
containerState := containerStatus.State
// STATUS: Error (container state)
if containerState.Waiting != nil {
	if containerState.Waiting.Reason == "CrashLoopBackOff" {
		state = kubefloworgv1beta1.WorkspaceStateError
		stateMessage = stateMsgErrorContainerCrashLoopBackOff
		return state, stateMessage, nil
	}
	if containerState.Waiting.Reason == "ImagePullBackOff" {
		state = kubefloworgv1beta1.WorkspaceStateError
		stateMessage = stateMsgErrorContainerImagePullBackOff
		return state, stateMessage, nil
	}
}

NOTE: because containers may be in a waiting state for different reasons, we should FIRST look for any ImagePullBackOff reasons and, if there are any, return only those as the state. Only when there are none should we SECOND check for CrashLoopBackOff. This way the most pressing issues are presented to the user first.
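A rough sketch of the two-pass check described above might look like the following. To keep it runnable on its own, it uses small stand-in types that mirror only the fields of corev1.ContainerStatus needed here. The helper names (waitingContainers, containerErrorMessage) and the message format are hypothetical, not the controller's actual code, and a real change would probably reuse the existing stateMsgError... constants.

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-ins for the corev1 types (assumption: only the fields used below).
type ContainerStateWaiting struct{ Reason string }
type ContainerState struct{ Waiting *ContainerStateWaiting }
type ContainerStatus struct {
	Name  string
	State ContainerState
}

// waitingContainers returns the names of all containers that are in a
// "waiting" state with the given reason.
func waitingContainers(statuses []ContainerStatus, reason string) []string {
	var names []string
	for _, s := range statuses {
		if s.State.Waiting != nil && s.State.Waiting.Reason == reason {
			names = append(names, s.Name)
		}
	}
	return names
}

// containerErrorMessage checks init and regular containers together.
// It reports ImagePullBackOff first and only falls back to CrashLoopBackOff
// when no container has an image pull problem.
func containerErrorMessage(initStatuses, statuses []ContainerStatus) (string, bool) {
	all := append(append([]ContainerStatus{}, initStatuses...), statuses...)
	for _, reason := range []string{"ImagePullBackOff", "CrashLoopBackOff"} {
		if names := waitingContainers(all, reason); len(names) > 0 {
			// Hypothetical message format: "<reason>: <container>, <container>"
			return fmt.Sprintf("%s: %s", reason, strings.Join(names, ", ")), true
		}
	}
	return "", false
}

func main() {
	initStatuses := []ContainerStatus{
		{Name: "init-setup", State: ContainerState{Waiting: &ContainerStateWaiting{Reason: "ImagePullBackOff"}}},
	}
	statuses := []ContainerStatus{
		{Name: "main", State: ContainerState{Waiting: &ContainerStateWaiting{Reason: "CrashLoopBackOff"}}},
		{Name: "sidecar", State: ContainerState{Waiting: &ContainerStateWaiting{Reason: "ImagePullBackOff"}}},
	}
	msg, found := containerErrorMessage(initStatuses, statuses)
	fmt.Println(found, msg)
}
```

In the controller, the two slices would come from pod.Status.InitContainerStatuses and pod.Status.ContainerStatuses, and a true result would map to WorkspaceStateError with the returned text as the stateMessage.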

Projects
Status: Needs Triage