[EKS] [request]: Ability to configure pod-eviction-timeout #159
Comments
Thanks for submitting this Chris. At present, the 5 minute timeout is the default for Kubernetes. We're evaluating adding additional configuration parameters onto the control plane and have added this to our list of parameters to research exposing for customization on a per-cluster basis.
Hi @tabern, thanks for the response. Yes, I'm aware of the Kubernetes default. A large portion of those running K8s in production have actively tweaked these values, and I worry this would be a barrier to EKS supporting some of our more critical applications. Glad to hear this is being evaluated and look forward to seeing where it goes.
@ChrisCooney sounds good. We're going to look into this. I've updated the title of your request to specifically address this ask so we can track it.
To add another use case: Thanks!
Ideally we should also be able to tune:
I would also very much like to have control over HPA scaling delays, since there's no other way to do it:
@BrianChristie BTW, if you like, you can monitor for spot node termination notices and evict the pods cleanly before termination.
also |
Are there plans to allow passing in any amount of parameters from something like https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ (specifically
Could also use the ability to modify
Since I'm having problems with HPA and metrics-server and can't view or configure it.
Looks like more and more people adopting k8s on EKS are in urgent need of these customizations, specifically the ones already mentioned. We're unable to meet worker node patching requirements (draining helps a little, but not enough to comply).
Actually, 5 minutes is sometimes too long to wait before deleting pods on failed nodes.
I really need to set the one below!
Any updates? We're also looking for the ability to configure these values. |
As an interim workaround, instead of using . There's an example in this issue: kubernetes/kubernetes#74651
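The workaround in that issue relies on taint-based evictions (on by default since Kubernetes 1.13): the control plane adds `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` tolerations with a 300-second default, and declaring them explicitly per pod overrides the effective eviction delay. A minimal sketch, with the pod name, image, and 10-second value as illustrative assumptions:

```yaml
# Pod spec fragment: per-pod override of the effective eviction delay.
apiVersion: v1
kind: Pod
metadata:
  name: payments-api        # placeholder name
spec:
  containers:
    - name: app
      image: example.com/payments-api:latest   # placeholder image
  tolerations:
    # Evict this pod 10s after the node is tainted, instead of the
    # default 300s (5 minutes) the admission controller would add.
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
```

This is per-workload rather than cluster-wide, but it works without any access to the kube-controller-manager flags.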
Not sure if this works for everybody or everything, but I recently noticed this in the AWS EKS node AMI: https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L14 Notice the use of
Yeah, this means you can configure the Kubelet on the node. Alas, it doesn't allow us to configure the kubernetes control plane. |
Can you allow the ability to modify the flags below for the kube-controller-manager, so we can manage the cooldown delay aside from the default 5 minutes:
You could use this instead: https://blog.postmates.com/configurable-horizontal-pod-autoscaler-81f48779abfc
Add:
Jumping in to request that
I'm not sure why kube chose to have all of these HPA-related configs go on the controller manager instead of being configured on the HPA resource itself, but that's another story.
Note that 1.18 adds support for configurable scaling behavior, so this will be possible once EKS supports 1.18.
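For reference, the 1.18 feature moves the scale-down delay onto the HPA object itself via the `behavior` field of `autoscaling/v2beta2`. A minimal sketch, with the resource names and thresholds as placeholder assumptions:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa          # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example            # placeholder target
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Replaces the old cluster-wide downscale-delay flag:
      # wait 10 minutes before acting on a lower recommendation.
      stabilizationWindowSeconds: 600
      policies:
        - type: Pods
          value: 1           # remove at most 1 pod per period
          periodSeconds: 60
```

This makes the delay per-HPA rather than a single kube-controller-manager flag, which sidesteps the managed control plane entirely.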
Still, with 1.18 it doesn't seem to take effect:
@danijelk try |
@toricls Ah, didn't see I was on beta1; k8s accepted it now, thanks.
Is there a way to set the --terminated-pod-gc-threshold on the kube-controller-manager with EKS? A solution was suggested earlier about specifying the parameters in the AMI. Is that the recommended way to do it for now? Although, that would mean having a custom AMI that needs to be updated every time there is a new AMI version for EKS.
Closing this as setting these flags is supported in K8s v1.18 and higher. |
@tabern, |
I need the horizontal-pod-autoscaler-initial-readiness-delay flag to be configurable in EKS, but that's not possible as of now. Any info on how to configure it for EKS?
Not sure why this ticket is closed and marked "Shipped"? How do we set "pod-eviction-timeout"?
I too require horizontal-pod-autoscaler-initial-readiness-delay on EKS, and the configurable scaling behavior does not support this.
It doesn't look like I can modify |
also need to customize |
Needing this urgently :) |
No status on this?? |
For everyone who's following this, see #1544 |
+1 to allow setting of the |
Is there an update on this? I really need the ability to set terminated-pod-gc-threshold to use EKS. |
I'd like to set terminated-pod-gc-threshold to use EKS |
FYI, we thought we needed to increase horizontal-pod-autoscaler-initial-readiness-delay to solve an issue with autoscaling being too aggressive after rolling out new pods, causing scaling to max out. Our issue was actually the custom metrics we were scaling on. We were doing something like this. We changed the rate from 1m to 2m, and that smoothed things out quite a bit and fixed our issue with aggressively scaling up. This SO post has some good information about rate in Prometheus: https://stackoverflow.com/questions/38915018/prometheus-rate-functions-and-interval-selections
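If the custom metric is exposed through prometheus-adapter, the rate window change described above would live in the adapter's rules config. A sketch under that assumption; the metric name and labels here are purely illustrative, not the commenter's actual setup:

```yaml
# prometheus-adapter rules fragment (kubernetes-sigs/prometheus-adapter)
rules:
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    # Widening the rate window from [1m] to [2m] smooths the signal
    # the HPA sees, reducing spiky scale-ups after rollouts.
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```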
The default is 10% but I have use cases where the value should be less, making it more sensitive and responsive to changes. |
Does the kube-controller-manager still support a |
The default pod-eviction-timeout of 5m doesn't provide an opportunity for graceful shutdown of pods on spot nodes: when a spot node goes down, the pod stays running and ready until the healthcheck interval elapses, which causes us to get 502 errors from the ALB.
Hi team, 5 minutes is too long for a node issue; we hope the service team can allow users to change the settings below.
Really gonna need to set |
BTW, many of these command line arguments are deprecated. Kubernetes recommends configuring the kubelet through its configuration file - see https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ |
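As a concrete illustration of that recommendation, node-side flags map onto fields of the `KubeletConfiguration` object. A minimal sketch with assumed example values (none of this configures the managed control plane, only the kubelet on each node):

```yaml
# Kubelet config file, passed via --config instead of individual flags
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Node-pressure eviction thresholds, replacing --eviction-hard
evictionHard:
  memory.available: "200Mi"
  nodefs.available: "10%"
# How often the kubelet posts node status to the control plane,
# replacing --node-status-update-frequency
nodeStatusUpdateFrequency: "10s"
```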
Kubernetes v1.31 does not include a |
Is there any update on this request? We'd be interested in adjusting HPA-specific flags such as
Tell us about your request
I would like to be able to make changes to configuration values for things like kube-controller. This enables greater customisation of the cluster to specific, bespoke needs. It will also go a long way in making the cluster more resilient and self-healing.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
At present, we have a cluster managed by EKS. The default pod-eviction-timeout is five minutes, meaning that we can derail an instance and the control plane won't reschedule its pods for five minutes. Five-minute outages for things like our payment systems are simply unacceptable; the cost impact would be severe. At present, to the best of my knowledge, the control plane is not configurable at all. What we would like to be able to do is provide configuration parameters via the AWS API or within a Kubernetes resource like a ConfigMap. Either would mean that, when we bring up new EKS clusters, we can automate the configuration of values like pod-eviction-timeout.

Are you currently working around this issue?
No, to the best of my knowledge, it isn't something that EKS presently supports.