[EKS] [request]: Ability to configure pod-eviction-timeout #159

Open
ChrisCooney opened this issue Feb 10, 2019 · 48 comments
Labels
EKS (Amazon Elastic Kubernetes Service), Proposed (Community submitted issue)

Comments

@ChrisCooney

ChrisCooney commented Feb 10, 2019

Tell us about your request
I would like to be able to change configuration values for control plane components such as kube-controller-manager. This would allow greater customisation of the cluster to specific, bespoke needs. It would also go a long way towards making the cluster more resilient and self-healing.

Which service(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

At present, we have a cluster managed by EKS. The default pod-eviction-timeout is five minutes, meaning that we can derail an instance and the control plane won't reschedule its pods for five minutes. Five-minute outages for things like our payment systems are simply unacceptable; the cost impact would be severe. At present, to the best of my knowledge, the control plane is not configurable at all.

What we would like to be able to do is provide configuration parameters via the AWS API or within a Kubernetes resource such as a ConfigMap. Either would mean that, when we bring up new EKS clusters, we can automate the configuration of values like pod-eviction-timeout.

Are you currently working around this issue?
No, to the best of my knowledge, it isn't something that EKS presently supports.

@ChrisCooney ChrisCooney added the Proposed Community submitted issue label Feb 10, 2019
@abby-fuller abby-fuller added the EKS Amazon Elastic Kubernetes Service label Feb 12, 2019
@tabern
Contributor

tabern commented Feb 15, 2019

Thanks for submitting this Chris. At present, the 5 minute timeout is the default for Kubernetes. We’re evaluating adding additional configuration parameters onto the control plane and have added this to our list of parameters to research exposing for customization on a per-cluster basis.

@ChrisCooney
Author

Hi @tabern, thanks for the response. Yes, I'm aware of the Kubernetes default. A large portion of those running K8s in production have actively tweaked these values, and I worry this would be a barrier to EKS supporting some of our more critical applications.

Glad to hear this is being evaluated and look forward to seeing where it goes.

@tabern changed the title from "[EKS] [request]: Configure Control Plane Values" to "[EKS] [request]: Ability to configure pod-eviction-timeout" on Feb 15, 2019
@tabern
Contributor

tabern commented Feb 15, 2019

@ChrisCooney sounds good. We're going to look into this. I've updated the title of your request to specifically address this ask so we can track it.

@BrianChristie

To add another use case:
We also wish to be able to adjust pod-eviction-timeout, specifically to facilitate the use of Spot Instances. In the case that an instance is terminated without the running Pods being properly evicted, we want a short timeout before those Pods are rescheduled elsewhere.

Thanks!

@dawidmalina

Ideally we should also be able to tune:

--node-monitor-period
--node-monitor-grace-period

@geerlingguy

I would also very much like to have control over HPA scaling delays since there's no other way to do it:

--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay

@whereisaaron

@savar

savar commented Apr 17, 2019

Also --horizontal-pod-autoscaler-cpu-initialization-period and --horizontal-pod-autoscaler-downscale-stabilization. If one of our HPAs is failing miserably, a second one only scales on CPU utilization. Because that utilization is limited and can only go up to almost twice the "wished" target, we can only scale up by a factor of 2 each run (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details). That means with 16 pods running we only grow to 32, then it takes 5 minutes before it scales to 64, and another 5 minutes to reach 128. If the other HPA, the one failing at that time, had 800 pods running and drops to 300, it takes ages to cover the missing 500 pods.
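
The growth pattern described in this comment can be sketched with the replica formula from the linked algorithm details, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). This is a rough illustration, not the controller's code: the 50% target, the assumption that measured utilization stays pinned at 100% (so the ratio is exactly 2), and the function names are all made up for the example.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Replica count from the documented HPA formula (tolerance and readiness checks omitted)."""
    ratio = current_utilization / target_utilization
    return math.ceil(current_replicas * ratio)


def scale_up_steps(start, needed, current_utilization=100.0, target_utilization=50.0):
    """Replica counts after each scaling run until at least `needed` replicas exist.

    Assumption for illustration: utilization stays at the cap on every run.
    """
    counts = [start]
    while counts[-1] < needed:
        counts.append(
            desired_replicas(counts[-1], current_utilization, target_utilization)
        )
    return counts


steps = scale_up_steps(start=16, needed=500)
print(steps)
```

Under these assumptions each run can at most double the replica count, so covering a gap of roughly 500 pods from 16 takes several runs, each separated by the wait described in the comment.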

@echoboomer

Are there plans to allow passing in arbitrary parameters from something like https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ (specifically --terminated-pod-gc-threshold), or is the plan to only allow customizing certain parameters?

@eladitzhakian

Could also use the ability to modify

--horizontal-pod-autoscaler-use-rest-clients

Since I'm having problems with HPA and metrics-server and can't view or configure it

@mebuzz

mebuzz commented Sep 5, 2019

Looks like more and more people adopting Kubernetes on EKS are in urgent need of these customizations, specifically the ones already mentioned:
--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay
and
--pod-eviction-timeout

We are unable to meet our worker node patching requirements (draining helps a little, but not enough to comply).

@ghost

ghost commented Sep 9, 2019

Actually, 5 minutes is sometimes too long to wait before pods on failed nodes are deleted.
Setting --pod-eviction-timeout should be possible on EKS too.

@chillybug

I really need to set the flag below!
--horizontal-pod-autoscaler-upscale-delay

@gillbee

gillbee commented Nov 12, 2019

Any updates? We're also looking for the ability to configure these values.

@PaulMaddox

PaulMaddox commented Nov 26, 2019

As an interim workaround, instead of using --pod-eviction-timeout, can you use Taint Based Evictions to set this on a per-pod basis? This is supported in EKS clusters running 1.13+.

There's an example in this issue: kubernetes/kubernetes#74651
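
A minimal sketch of that per-pod workaround, written as a Python script that prints a manifest which `kubectl apply -f -` can read as JSON. The `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taint keys and the `tolerationSeconds` field are standard Kubernetes; the pod name, image, and the 30-second value are placeholders chosen for illustration, not recommendations.

```python
import json


def eviction_tolerations(seconds):
    """Tolerations that limit how long a pod stays bound to a not-ready or unreachable node."""
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in (
            "node.kubernetes.io/not-ready",
            "node.kubernetes.io/unreachable",
        )
    ]


def pod_manifest(name, image, seconds):
    """Pod manifest with per-pod eviction timing (name and image are hypothetical)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            "tolerations": eviction_tolerations(seconds),
        },
    }


manifest = pod_manifest("payments-example", "nginx:1.25", 30)
print(json.dumps(manifest, indent=2))
```

When a pod declares no such tolerations, Kubernetes adds these two itself with `tolerationSeconds: 300`, which corresponds to the five-minute delay discussed in this thread. In practice the same `tolerations` list would go into a Deployment's pod template rather than a bare Pod.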

@echoboomer

Not sure if this works for everybody or everything but I recently noticed this in the AWS EKS node AMI:

https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L14

Notice the use of $KUBELET_ARGS $KUBELET_EXTRA_ARGS here - we were able to pass in my original requirement of --terminated-pod-gc-threshold this way, but I'm not entirely certain that a) AWS honors things placed here or b) these work with master-node abstraction.

@ChrisCooney
Author

> Not sure if this works for everybody or everything but I recently noticed this in the AWS EKS node AMI:
>
> https://github.com/awslabs/amazon-eks-ami/blob/master/files/kubelet.service#L14
>
> Notice the use of $KUBELET_ARGS $KUBELET_EXTRA_ARGS here - we were able to pass in my original requirement of --terminated-pod-gc-threshold this way, but I'm not entirely certain that a) AWS honors things placed here or b) these work with master-node abstraction.

Yeah, this means you can configure the kubelet on the node. Alas, it doesn't allow us to configure the Kubernetes control plane.

@shivarajai

Can you allow us to modify the flags below for the kube-controller-manager, so that we can set the cool-down delay to something other than the default 5 minutes?
--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay

@jicowan

jicowan commented Mar 17, 2020

@starchx

starchx commented Apr 13, 2020

Add:

--terminated-pod-gc-threshold

@calebwoofenden

calebwoofenden commented May 14, 2020

Jumping in to request that --horizontal-pod-autoscaler-initial-readiness-delay also be added. We are running an HPA in our EKS clusters and are unable to fully configure it how we would like.

I'm not sure why kube chose to have all of these HPA-related configs go on the controller manager instead of being configured on the HPA resource itself, but that's another story.

@mikestef9
Contributor

Note that 1.18 adds support for configurable scaling behavior:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-configurable-scaling-behavior

So this will be possible once EKS supports 1.18

@danijelk

Even with 1.18 it doesn't seem to work:

error validating data: ValidationError(HorizontalPodAutoscaler.spec): unknown field "behavior" in io.k8s.api.autoscaling.v2beta1.HorizontalPodAutoscalerSpec;

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.8-eks-7c9bda", GitCommit:"7c9bda52c425d0d56d7b93f1377a826b4132c05c", GitTreeState:"clean", BuildDate:"2020-08-28T23:04:33Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

@toricls
Contributor

toricls commented Nov 23, 2020

@danijelk try v2beta2 for it.

@danijelk

@toricls Ah, didn't see I was on beta1, k8s accepted it now thanks.
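
For reference, a sketch of an HPA manifest that uses the `behavior` field discussed in the comments above, generated from Python as JSON. The `behavior.scaleDown.stabilizationWindowSeconds` and `policies` fields come from the configurable scaling behavior documentation linked in this thread; the HPA and Deployment names, replica bounds, and window values are hypothetical. `autoscaling/v2beta2` is the API version mentioned here; later Kubernetes releases serve the same fields under `autoscaling/v2`.

```python
import json


def hpa_manifest(name, target_deployment, api_version="autoscaling/v2beta2"):
    """HPA manifest with per-resource scaling behavior (all values are illustrative)."""
    return {
        "apiVersion": api_version,
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            "minReplicas": 2,
            "maxReplicas": 20,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {"type": "Utilization", "averageUtilization": 50},
                    },
                }
            ],
            "behavior": {
                # Per-HPA scale-down window instead of the cluster-wide
                # --horizontal-pod-autoscaler-downscale-stabilization flag.
                "scaleDown": {"stabilizationWindowSeconds": 60},
                # Allow at most a 100% increase in replicas per 15-second period.
                "scaleUp": {
                    "stabilizationWindowSeconds": 0,
                    "policies": [
                        {"type": "Percent", "value": 100, "periodSeconds": 15}
                    ],
                },
            },
        },
    }


hpa = hpa_manifest("web-example", "web-example")
print(json.dumps(hpa, indent=2))
```

This covers the scaling delays per HPA, but not flags such as --pod-eviction-timeout or --horizontal-pod-autoscaler-initial-readiness-delay, which have no equivalent in the `behavior` field.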

@aniruddhch

Is there a way to set the --terminated-pod-gc-threshold on the Kube-controller-manager with EKS? A solution was suggested earlier about specifying the parameters in the AMI. Is that a recommended way to do it for now? Although, that would mean having a custom AMI that needs to be updated every time there is a new AMI version for EKS.

@tabern
Contributor

tabern commented Mar 23, 2021

Closing this as setting these flags is supported in K8s v1.18 and higher.

@tabern tabern closed this as completed Mar 23, 2021
@jerry123je

@tabern,
I understand that hpa.v2beta2 adds the ability to configure scaling behavior, which resolves part of these requests.
However, I'm curious how we can set pod-eviction-timeout on K8s v1.18 and later without modifying kube-controller-manager?

@EdwinPhilip

We need the horizontal-pod-autoscaler-initial-readiness-delay flag to be configurable in EKS, but so far that isn't possible. Any info on how to configure it for EKS?

@lmgnid

lmgnid commented May 3, 2021

Not sure why this ticket is closed and marked "Shipped". How do we set pod-eviction-timeout?

@mikestef9 mikestef9 reopened this May 4, 2021
@mibaboo

mibaboo commented May 25, 2021

I too require horizontal-pod-autoscaler-initial-readiness-delay on EKS, and the configurable scaling behavior does not support this.

@emmeowzing

It doesn't look like I can modify --horizontal-pod-autoscaler-sync-period either.

@yongzhang

We also need to customize pod-eviction-timeout.

@sjortiz

sjortiz commented Dec 30, 2021

Needing this urgently :)

@marcusthelin

No status on this??

@TaiSHiNet

For everyone who's following this, see #1544

@dwgillies-bluescape

dwgillies-bluescape commented May 16, 2022

+1 to allow setting of the --terminated-pod-gc-threshold setting. Evicted pods are piling up in our dev clusters and the default limit of 12,500 evicted pods before garbage collection begins is way too high! We would like to reduce it to 100 !

@michaelmohamed

Is there an update on this? I really need the ability to set terminated-pod-gc-threshold to use EKS.

@vasylherman

vasylherman commented Jun 6, 2022

I'd like to set terminated-pod-gc-threshold to use EKS

@aaronmell

FYI, we thought we needed to increase horizontal-pod-autoscaler-initial-readiness-delay, to solve an issue with autoscaling being too aggressive after rolling out new pods, and causing scaling to max out.

Our issue was actually the custom metrics we were scaling on. We were doing something like this:

sum(rate(container_cpu_cfs_throttled_seconds_total[1m]))

The issue here is that we collect metrics every 30s, and container_cpu_cfs_throttled_seconds_total doesn't increase in a linear fashion; it tends to increase in spurts.

We changed the rate from 1m to 2m, and that smoothed things out quite a bit and fixed our issue with aggressively scaling up.

This SO post has some good information about rate in Prometheus

https://stackoverflow.com/questions/38915018/prometheus-rate-functions-and-interval-selections
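
The smoothing effect described in this comment can be illustrated with a small simulation. It is a deliberate simplification of Prometheus `rate()`: it divides the counter increase between the first and last sample in the window by the time between them, with no extrapolation and no counter-reset handling. The counter values are invented; only the 30s scrape interval and the 1m and 2m windows come from the comment.

```python
SCRAPE_INTERVAL_S = 30


def simplified_rate(samples, window_s, at):
    """Per-second increase between the first and last sample in (at - window_s, at]."""
    in_window = [(ts, value) for ts, value in samples if at - window_s < ts <= at]
    if len(in_window) < 2:
        return None
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical throttled-seconds counter that grows in spurts, one value per scrape.
increments = [0, 6, 0, 0, 0, 9, 0, 3, 0, 0, 6, 0]
samples = []
total = 0.0
for i, inc in enumerate(increments):
    total += inc
    samples.append((i * SCRAPE_INTERVAL_S, total))


def spread(window_s, first_eval_s=120):
    """Max minus min of the rate, evaluated at each scrape once both windows are full."""
    rates = [
        simplified_rate(samples, window_s, ts)
        for ts, _ in samples
        if ts >= first_eval_s
    ]
    return max(rates) - min(rates)


print("1m window spread:", spread(60))
print("2m window spread:", spread(120))
```

With a 30s scrape interval, a 1m window here only ever sees two samples, so each spurt shows up as an isolated spike. The 2m window averages over three scrape intervals and swings less, which is consistent with what the comment reports.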

@mtcode

mtcode commented Nov 1, 2022

--horizontal-pod-autoscaler-tolerance is another flag that is only customizable via controller manager flags. The v2beta2 API does not allow configuring this.

The default is 10% but I have use cases where the value should be less, making it more sensitive and responsive to changes.
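
How the tolerance enters the scaling decision can be sketched as follows. The documented behaviour is that the controller skips scaling when the ratio of the current to the desired metric value is within the tolerance of 1.0. The function below is an illustrative reduction of that rule, not the controller's code, and the sample numbers are made up.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count from the HPA formula, skipping changes inside the tolerance band."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance: leave the replica count unchanged.
        return current_replicas
    return math.ceil(current_replicas * ratio)


# A metric 8% above target is ignored at the default 10% tolerance...
print(desired_replicas(10, 54, 50))
# ...but triggers a scale-up with a hypothetical 5% tolerance.
print(desired_replicas(10, 54, 50, tolerance=0.05))
```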

@sftim

sftim commented Feb 27, 2023

Does the kube-controller-manager still support a --pod-eviction-timeout argument? The docs imply it was removed in v1.24.0 and the changelog implies it'll be removed in v1.27

@daynekoroman

daynekoroman commented Dec 4, 2023

The default pod-eviction-timeout of 5m gives pods on Spot nodes no opportunity to shut down gracefully: when a Spot node goes down, its pods still appear running and ready until the health check interval elapses, which leads to 502 errors from the ALB.

@xzp1990

xzp1990 commented Jan 3, 2024

Hi team, 5 minutes is too long when a node has an issue. We hope the service team can allow users to change the settings below:
--node-status-update-frequency
--node-monitor-period
--node-monitor-grace-period
--pod-eviction-timeout

@des1redState

Really gonna need to set --horizontal-pod-autoscaler-initial-readiness-delay, pretty please.

@sftim

sftim commented Sep 24, 2024

BTW, many of these command line arguments are deprecated. Kubernetes recommends configuring the kubelet through its configuration file - see https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
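
As a rough illustration of the configuration-file approach, the sketch below prints a kubelet configuration as JSON (the kubelet accepts JSON or YAML for this file). `nodeStatusUpdateFrequency` is the config-file counterpart of the --node-status-update-frequency flag mentioned earlier in the thread; the 4s value is a placeholder. This only affects the kubelet on worker nodes, not control plane components such as kube-controller-manager.

```python
import json


def kubelet_config(node_status_update_frequency):
    """Minimal KubeletConfiguration document (the chosen value is illustrative)."""
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "nodeStatusUpdateFrequency": node_status_update_frequency,
    }


config = kubelet_config("4s")
print(json.dumps(config, indent=2))
```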

@sftim

sftim commented Sep 24, 2024

Kubernetes v1.31 does not include a --pod-eviction-timeout command line argument for any component.

@ardikabs

ardikabs commented Jan 7, 2025

Is there any update on this request? We'd be interested in adjusting HPA-specific flags such as --horizontal-pod-autoscaler-tolerance on EKS. Even though there is an ongoing KEP for configuring the tolerance at the resource level, we would still like to change this behavior globally.
