[KEP-2941]: Supporting Dynamic Resource Allocations in Kueue #3071
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: kannon92
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
I brought this KEP up with wg-device-management. It seemed to go over well, and they pointed out a few improvements. I added this KEP to the wg-batch agenda for next week. Maybe we can continue the discussion from there.
Yeah, I'm open to discussion. However, I will start reviewing this PR after the 0.9 release because of its tight deadline.
  nominalQuota: 9
- name: "memory"
  nominalQuota: "200Mi"
- name: "gpu.example.com"
Forgive the naive question, still getting familiar with DRA. How does this work exactly? Does the resulting workload end up with 2 GPUs from this claim? Do we need to differentiate between the "time-slice" vs "space-partition" GPUs?
Claims correspond to device classes. Since we can only limit device classes, I think you would set up a device with time-slicing enabled, and your claim would request that device.
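To make the claim-to-device-class relationship concrete, here is a minimal sketch of a DeviceClass and a ResourceClaimTemplate that requests it. It assumes the `resource.k8s.io/v1beta1` API; field layout differs in other DRA API versions. The object name `single-gpu`, the request name `gpu`, and the CEL selector are hypothetical. Time-slicing itself would be configured on the device side in a driver-specific way and is not shown. Whether Kueue charges such a claim against the `gpu.example.com` quota entry is what this KEP proposes, not confirmed behavior.

```yaml
# Assumption: resource.k8s.io/v1beta1; adjust for the DRA API version in your cluster.
apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: gpu.example.com          # same name as the quota entry in the snippet above
spec:
  selectors:
  - cel:
      # Hypothetical selector: match devices published by this driver.
      expression: device.driver == "gpu.example.com"
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu               # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: gpu                # hypothetical request name
        deviceClassName: gpu.example.com
```

A pod would reference the template through `spec.resourceClaims`, and under the approach discussed here the quota check would key on the device class name rather than on individual devices.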
What type of PR is this?
/kind feature
What this PR does / why we need it:
DRA support and Kueue
Which issue(s) this PR fixes:
KEP for #2941
Special notes for your reviewer:
Does this PR introduce a user-facing change?