KEP-127 (UserNS): allow customizing subids length #5020

Open: wants to merge 1 commit into base: master
2 changes: 2 additions & 0 deletions keps/sig-node/127-user-namespaces/README.md
@@ -335,6 +335,8 @@ bool `pod.spec.hostUsers`.
The mapping length will be 65536, mapping the range 0-65535 to the pod. This wide
range makes sure most workloads will work fine. Additionally, we don't need to
worry about fragmentation of IDs, as all pods will use the same length.
The mapping length (multiple of 65536) will be customizable via a new
`KubeletConfiguration` property `subidsPerPod`.
Comment on lines +338 to +339
Contributor

I'd got the impression we might want to make the mapping size configurable on a per-Pod basis.

Contributor

(what if you have a particular Pod that assigns a (POSIX) ID to each user, and you have 42000000 users, but all your other Pods only need 65000 UIDs?)
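
To put that case in numbers, here is a rough calculation, assuming the mapping length must be a multiple of 65536 as the proposed text requires. The helper name `blocksNeeded` is hypothetical and only illustrates the arithmetic.

```go
package main

import "fmt"

// blockSize is the mapping granularity from the KEP text (65536 IDs).
const blockSize = 65536

// blocksNeeded returns how many 65536-ID blocks are needed to cover n IDs,
// using ceiling division. Hypothetical helper, not part of the kubelet.
func blocksNeeded(n uint64) uint64 {
	return (n + blockSize - 1) / blockSize
}

func main() {
	users := uint64(42000000)
	blocks := blocksNeeded(users)
	fmt.Printf("%d IDs need %d blocks, i.e. a mapping length of %d\n",
		users, blocks, blocks*blockSize)
}
```

Under these assumptions the large Pod needs 641 blocks, while a Pod that only needs 65000 UIDs fits in a single block.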

Contributor

It's possible, but not a common case IMO, and adding a pod API field would be much more complex to implement than adding a kubelet configuration field. I'm not sure the maintenance burden is worth it.

Contributor

So long as we're not accidentally tying ourselves into not being able to extend the Pod API in the future. If we are tying ourselves, let's make sure we'd never want the option.

Member Author

What about introducing a Pod security context property like `securityContext.userNS.staticMappingWithUsername: "foo"`?
This would run `getsubids foo` to obtain the subID range and assign the entire range to the Pod.
(This differs from `getsubids kubelet`, which returns the total range for the 110 pods.)

Multiple pods may use the same range at their own risk.
This allows assigning an extremely large subID range, up to $2^{32}-65536$ IDs.
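
A minimal sketch of the per-user lookup, under stated assumptions: the comment talks about running `getsubids`, whereas this sketch parses text in the `name:start:count` format of subuid(5) directly so that it stays self-contained. The type `subIDRange`, the function `lookupSubIDs`, and the sample entries are all hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// subIDRange is one "name:start:count" entry in the subuid(5) format.
type subIDRange struct {
	Start, Count uint64
}

// lookupSubIDs scans subuid-formatted text and returns the first entry
// owned by user. Hypothetical helper, not the kubelet's implementation.
func lookupSubIDs(content, user string) (subIDRange, error) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Split(strings.TrimSpace(sc.Text()), ":")
		if len(fields) != 3 || fields[0] != user {
			continue // malformed line or a different user
		}
		start, err := strconv.ParseUint(fields[1], 10, 32)
		if err != nil {
			return subIDRange{}, fmt.Errorf("bad start for %q: %w", user, err)
		}
		count, err := strconv.ParseUint(fields[2], 10, 32)
		if err != nil {
			return subIDRange{}, fmt.Errorf("bad count for %q: %w", user, err)
		}
		return subIDRange{Start: start, Count: count}, nil
	}
	return subIDRange{}, fmt.Errorf("no subID range for user %q", user)
}

func main() {
	// Sample data, made up for illustration: a kubelet range sized for
	// 110 pods of 65536 IDs each, and a separate range for user "foo".
	const subuid = "kubelet:65536:7208960\nfoo:7340032:131072\n"

	r, err := lookupSubIDs(subuid, "foo")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("foo: start=%d count=%d\n", r.Start, r.Count)

	// Upper bound mentioned in the comment: 2^32 - 65536.
	const maxLen = uint64(1)<<32 - 65536
	fmt.Println("maximum mapping length:", maxLen)
}
```

In this sketch the whole range found for the named user would go to the Pod, independently of the range the kubelet divides among its other pods.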

Contributor

Is the max ID range inside a container flexible? As in: could we have a kubelet field that toggles a dynamic range, with the runtime interpreting the range in the image?

Member Author

Flexible. A container may use a UID that is not present in `/etc/passwd` in the image, so a runtime cannot "interpret the range in the image".

It should still be possible to have OCI Image annotations declare the range of UIDs needed.

Member

How do we prevent such a field from being abused? An image could claim all the available IDs and prevent other pods from being created.

Contributor

Admission-time checks are where I'd start, along with ResourceQuota and LimitRange specifically.

Member Author

> prevents that other pods can be created

No, not with `securityContext.userNS.staticMappingWithUsername: "foo"`, which allows ID conflicts and requires explicit configuration of the securityContext.

This should probably still be prohibited under the Restricted Pod Security Standard.


The mapping will be chosen by the kubelet, using a simple algorithm to give
different pods in this category ("without" volumes) a non-overlapping mapping.
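
A minimal sketch of what such a "simple algorithm" might look like with a configurable length, assuming fixed-size slots so that ranges never overlap or fragment. The `allocator` type, its methods, and the sample numbers are hypothetical and are not taken from the kubelet.

```go
package main

import (
	"errors"
	"fmt"
)

// blockSize is the mapping granularity from the KEP text (65536 IDs).
const blockSize = 65536

// allocator hands out non-overlapping host ID ranges of one fixed length.
// Hypothetical sketch; names do not match the kubelet's code.
type allocator struct {
	first  uint64 // first host ID available to the kubelet
	length uint64 // IDs per pod (subidsPerPod), a multiple of 65536
	used   []bool // used[i] is true while slot i is assigned to a pod
}

// newAllocator splits total IDs starting at first into slots of subidsPerPod.
func newAllocator(first, total, subidsPerPod uint64) (*allocator, error) {
	if subidsPerPod == 0 || subidsPerPod%blockSize != 0 {
		return nil, errors.New("subidsPerPod must be a positive multiple of 65536")
	}
	slots := total / subidsPerPod
	if slots == 0 {
		return nil, errors.New("ID range too small for even one pod")
	}
	return &allocator{first: first, length: subidsPerPod, used: make([]bool, slots)}, nil
}

// allocate marks the first free slot as used and returns its starting host ID.
func (a *allocator) allocate() (uint64, error) {
	for i, taken := range a.used {
		if !taken {
			a.used[i] = true
			return a.first + uint64(i)*a.length, nil
		}
	}
	return 0, errors.New("no free ID range left")
}

// release frees the slot whose range begins at start.
func (a *allocator) release(start uint64) {
	a.used[(start-a.first)/a.length] = false
}

func main() {
	// Assumed numbers: host IDs from 65536, enough for 110 pods of 65536 IDs.
	a, err := newAllocator(65536, 110*65536, 65536)
	if err != nil {
		fmt.Println(err)
		return
	}
	p1, _ := a.allocate()
	p2, _ := a.allocate()
	fmt.Println("pod 1 starts at", p1, "and pod 2 starts at", p2)
}
```

Because every slot has the same length, a released range can always be reused by the next pod, which is the no-fragmentation property the paragraph above relies on.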