diff --git a/assets/scss/_custom.scss b/assets/scss/_custom.scss index a06da2448c3f8..6ec7f28f17b34 100644 --- a/assets/scss/_custom.scss +++ b/assets/scss/_custom.scss @@ -392,52 +392,63 @@ footer { } main { - .td-content table code, - .td-content>table td { - word-break: break-word; - } -/* SCSS Related to the Metrics Table */ +/* SCSS Related to the Metrics list */ + + div.metric:nth-of-type(odd) { // Look & Feel , Aesthetics + background-color: $light-grey; + } - @media (max-width: 767px) { // for mobile devices, Display the names, Stability levels & types + div.metrics { - table.metrics { - th:nth-child(n + 4), - td:nth-child(n + 4) { + .metric { + div:empty{ display: none; } - td.metric_type{ - min-width: 7em; + display: flex; + flex-direction: column; + flex-wrap: wrap; + gap: .75em; + padding:.75em .75em .75em .75em; + + .metric_name{ + font-size: large; + font-weight: bold; + word-break: break-word; } - td.metric_stability_level{ - min-width: 6em; + + label{ + font-weight: bold; + margin-right: .5em; } - } - } - - table.metrics tbody{ // Tested dimensions to improve overall aesthetic of the table - tr { - td { - font-size: smaller; - } - td.metric_labels_varying{ - min-width: 9em; - } - td.metric_type{ - min-width: 9em; + ul { + li:empty{ + display: none; } - td.metric_description{ - min-width: 10em; + display: flex; + flex-direction: column; + gap: .75em; + flex-wrap: wrap; + li.metric_labels_varying{ + span{ + display: inline-block; + background-color: rgb(240, 239, 239); + padding: 0 0.5em; + margin-right: .35em; + font-family: monospace; + border: 1px solid rgb(230 , 230 , 230); + border-radius: 5%; + margin-bottom: .35em; + } } - + } + } - table.no-word-break td, - table.no-word-break code { - word-break: normal; -} + + } } // blockquotes and callouts diff --git a/content/en/docs/concepts/architecture/garbage-collection.md b/content/en/docs/concepts/architecture/garbage-collection.md index b7173405535fc..4b36d850b55bb 100644 --- a/content/en/docs/concepts/architecture/garbage-collection.md +++ b/content/en/docs/concepts/architecture/garbage-collection.md @@ -137,6 +137,20 @@ collection, which deletes images in order based on the last time they were used, starting with the oldest first. The kubelet deletes images until disk usage reaches the `LowThresholdPercent` value. +#### Garbage collection for unused container images {#image-maximum-age-gc} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +As an alpha feature, you can specify the maximum time a local image can be unused for, +regardless of disk usage. This is a kubelet setting that you configure for each node. + +To configure the setting, enable the `ImageMaximumGCAge` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) for the kubelet, +and also set a value for the `ImageMaximumGCAge` field in the kubelet configuration file. + +The value is specified as a Kubernetes _duration_; for example, you can set the configuration +field to `3d12h`, which means 3 days and 12 hours. + ### Container garbage collection {#container-image-garbage-collection} The kubelet garbage collects unused containers based on the following variables, @@ -178,4 +192,4 @@ configure garbage collection: * Learn more about [ownership of Kubernetes objects](/docs/concepts/overview/working-with-objects/owners-dependents/). * Learn more about Kubernetes [finalizers](/docs/concepts/overview/working-with-objects/finalizers/). 
-* Learn about the [TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) that cleans up finished Jobs. +* Learn about the [TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) that cleans up finished Jobs. \ No newline at end of file diff --git a/content/en/docs/concepts/cluster-administration/flow-control.md index a00c146d5144a..859e841d0b73c 100644 --- a/content/en/docs/concepts/cluster-administration/flow-control.md +++ b/content/en/docs/concepts/cluster-administration/flow-control.md @@ -7,7 +7,7 @@ weight: 110 -{{< feature-state state="beta" for_k8s_version="v1.20" >}} +{{< feature-state state="stable" for_k8s_version="v1.29" >}} Controlling the behavior of the Kubernetes API server in an overload situation is a key task for cluster administrators. The {{< glossary_tooltip @@ -45,30 +45,27 @@ are not subject to the `--max-requests-inflight` limit. ## Enabling/Disabling API Priority and Fairness -The API Priority and Fairness feature is controlled by a feature gate -and is enabled by default. See [Feature -Gates](/docs/reference/command-line-tools-reference/feature-gates/) -for a general explanation of feature gates and how to enable and -disable them. The name of the feature gate for APF is -"APIPriorityAndFairness". This feature also involves an {{< -glossary_tooltip term_id="api-group" text="API Group" >}} with: (a) a -`v1alpha1` version and a `v1beta1` version, disabled by default, and -(b) `v1beta2` and `v1beta3` versions, enabled by default. You can -disable the feature gate and API group beta versions by adding the +The API Priority and Fairness feature is controlled by a command-line flag +and is enabled by default. See +[Options](/docs/reference/command-line-tools-reference/kube-apiserver/options/) +for a general explanation of the available kube-apiserver command-line +options and how to enable and disable them. The name of the +command-line option for APF is "--enable-priority-and-fairness". This feature +also involves an {{< glossary_tooltip term_id="api-group" text="API Group" >}} +with: (a) a stable `v1` version, introduced in 1.29, and +enabled by default, and (b) a `v1beta3` version, enabled by default, and +deprecated in v1.29. You can +disable the API group beta version `v1beta3` by adding the following command-line flags to your `kube-apiserver` invocation: ```shell kube-apiserver \ ---feature-gates=APIPriorityAndFairness=false \ ---runtime-config=flowcontrol.apiserver.k8s.io/v1beta2=false,flowcontrol.apiserver.k8s.io/v1beta3=false \ +--runtime-config=flowcontrol.apiserver.k8s.io/v1beta3=false \ # …and other flags as usual ``` -Alternatively, you can enable the v1alpha1 and v1beta1 versions of the API group -with `--runtime-config=flowcontrol.apiserver.k8s.io/v1alpha1=true,flowcontrol.apiserver.k8s.io/v1beta1=true`. - The command-line flag `--enable-priority-and-fairness=false` will disable the -API Priority and Fairness feature, even if other flags have enabled it. +API Priority and Fairness feature. ## Concepts @@ -178,14 +175,12 @@ server. ## Resources The flow control API involves two kinds of resources.
-[PriorityLevelConfigurations](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#prioritylevelconfiguration-v1beta2-flowcontrol-apiserver-k8s-io) +[PriorityLevelConfigurations](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#prioritylevelconfiguration-v1-flowcontrol-apiserver-k8s-io) define the available priority levels, the share of the available concurrency budget that each can handle, and allow for fine-tuning queuing behavior. -[FlowSchemas](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#flowschema-v1beta2-flowcontrol-apiserver-k8s-io) +[FlowSchemas](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#flowschema-v1-flowcontrol-apiserver-k8s-io) are used to classify individual inbound requests, matching each to a -single PriorityLevelConfiguration. There is also a `v1alpha1` version -of the same API group, and it has the same Kinds with the same syntax and -semantics. +single PriorityLevelConfiguration. ### PriorityLevelConfiguration diff --git a/content/en/docs/concepts/cluster-administration/system-metrics.md index 6f8b7227743ab..ff6b41bbcd0ef 100644 --- a/content/en/docs/concepts/cluster-administration/system-metrics.md +++ b/content/en/docs/concepts/cluster-administration/system-metrics.md @@ -202,10 +202,23 @@ Here is an example: --allow-label-value number_count_metric,odd_number='1,3,5', number_count_metric,even_number='2,4,6', date_gauge_metric,weekend='Saturday,Sunday' ``` +In addition to specifying this from the CLI, this can also be done within a configuration file. You +can specify the path to that configuration file using the `--allow-metric-labels-manifest` command +line argument to a component. Here's an example of the contents of that configuration file: + +```yaml +allow-list: +- "metric1,label2": "v1,v2,v3" +- "metric2,label1": "v1,v2,v3" +``` + +Additionally, the `cardinality_enforcement_unexpected_categorizations_total` meta-metric records the +count of unexpected categorizations during cardinality enforcement, that is, whenever a label value +is encountered that is not allowed with respect to the allow-list constraints. + ## {{% heading "whatsnext" %}} * Read about the [Prometheus text format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) for metrics * See the list of [stable Kubernetes metrics](https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml) -* Read about the [Kubernetes deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior) - +* Read about the [Kubernetes deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior) \ No newline at end of file diff --git a/content/en/docs/concepts/containers/container-lifecycle-hooks.md index 8e1cd2eb59690..aec5433a75cc2 100644 --- a/content/en/docs/concepts/containers/container-lifecycle-hooks.md +++ b/content/en/docs/concepts/containers/container-lifecycle-hooks.md @@ -55,12 +55,15 @@ There are two types of hook handlers that can be implemented for Containers: * Exec - Executes a specific command, such as `pre-stop.sh`, inside the cgroups and namespaces of the Container. Resources consumed by the command are counted against the Container. * HTTP - Executes an HTTP request against a specific endpoint on the Container.
+* Sleep - Pauses the container for a specified duration. + The "Sleep" action is available when the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) + `PodLifecycleSleepAction` is enabled. ### Hook handler execution When a Container lifecycle management hook is called, the Kubernetes management system executes the handler according to the hook action, -`httpGet` and `tcpSocket` are executed by the kubelet process, and `exec` is executed in the container. +`httpGet`, `tcpSocket`, and `sleep` are executed by the kubelet process, and `exec` is executed in the container. Hook handler calls are synchronous within the context of the Pod containing the Container. This means that for a `PostStart` hook, diff --git a/content/en/docs/concepts/containers/images.md index b4b837ae32fc6..9b36a6b72803d 100644 --- a/content/en/docs/concepts/containers/images.md +++ b/content/en/docs/concepts/containers/images.md @@ -159,6 +159,17 @@ that Kubernetes will keep trying to pull the image, with an increasing back-off Kubernetes raises the delay between each attempt until it reaches a compiled-in limit, which is 300 seconds (5 minutes). +### Image pull per runtime class + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} +Kubernetes includes alpha support for performing image pulls based on the RuntimeClass of a Pod. + +If you enable the `RuntimeClassInImageCriApi` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/), +the kubelet references container images by a tuple of (image name, runtime handler) rather than just the +image name or digest. Your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}} +may adapt its behavior based on the selected runtime handler. +Pulling images based on runtime class is helpful for VM-based containers, such as Windows Hyper-V containers. + ## Serial and parallel image pulls By default, kubelet pulls images serially. In other words, kubelet sends only diff --git a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md index 4770e2c3be156..8dd955cdad965 100644 --- a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md +++ b/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md @@ -159,8 +159,8 @@ The general workflow of a device plugin includes the following steps: {{< note >}} The processing of the fully-qualified CDI device names by the Device Manager requires that the `DevicePluginCDIDevices` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) - is enabled for the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes - v1.28. + is enabled for both the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes + v1.28 and graduated to beta in v1.29.
{{< /note >}} ### Handling kubelet restarts diff --git a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md index c33184b83e08c..3e8a6a14073a9 100644 --- a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md +++ b/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md @@ -358,6 +358,108 @@ The affinity term is applied to namespaces selected by both `namespaceSelector` Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and null `namespaceSelector` matches the namespace of the Pod where the rule is defined. +#### matchLabelKeys + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +{{< note >}} + +The `matchLabelKeys` field is an alpha-level field and is disabled by default in +Kubernetes {{< skew currentVersion >}}. +When you want to use it, you have to enable it via the +`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +{{< /note >}} + +Kubernetes includes an optional `matchLabelKeys` field for Pod affinity +or anti-affinity. The field specifies keys for the labels that should match with the incoming Pod's labels, +when satisfying the Pod (anti)affinity. + +The keys are used to look up values from the pod labels; those key-value labels are combined +(using `AND`) with the match restrictions defined using the `labelSelector` field. The combined +filtering selects the set of existing pods that will be taken into Pod (anti)affinity calculation. + +A common use case is to use `matchLabelKeys` with `pod-template-hash` (set on Pods +managed as part of a Deployment, where the value is unique for each revision). +Using `pod-template-hash` in `matchLabelKeys` allows you to target the Pods that belong +to the same revision as the incoming Pod, so that a rolling upgrade won't break affinity. + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: application-server +... +spec: + template: + affinity: + podAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - labelSelector: + matchExpressions: + - key: app + operator: In + values: + - database + topologyKey: topology.kubernetes.io/zone + # Only Pods from a given rollout are taken into consideration when calculating pod affinity. + # If you update the Deployment, the replacement Pods follow their own affinity rules + # (if there are any defined in the new Pod template) + matchLabelKeys: + - pod-template-hash +``` + +#### mismatchLabelKeys + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +{{< note >}} + +The `mismatchLabelKeys` field is an alpha-level field and is disabled by default in +Kubernetes {{< skew currentVersion >}}. +When you want to use it, you have to enable it via the +`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +{{< /note >}} + +Kubernetes includes an optional `mismatchLabelKeys` field for Pod affinity +or anti-affinity. The field specifies keys for the labels that should **not** match with the incoming Pod's labels, +when satisfying the Pod (anti)affinity. + +One example use case is to ensure Pods go to the topology domain (node, zone, etc.) where only Pods from the same tenant or team are scheduled. +In other words, you want to avoid running Pods from two different tenants on the same topology domain at the same time.
+ +```yaml +apiVersion: v1 +kind: Pod +metadata: + labels: + # Assume that all relevant Pods have a "tenant" label set + tenant: tenant-a +... +spec: + affinity: + podAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + # ensure that pods associated with this tenant land on the correct node pool + - matchLabelKeys: + - tenant + topologyKey: node-pool + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + # ensure that pods associated with this tenant can't schedule to nodes used for another tenant + - mismatchLabelKeys: + - tenant # whatever the value of the "tenant" label for this Pod, prevent + # scheduling to nodes in any pool where any Pod from a different + # tenant is running. + labelSelector: + # We have to have the labelSelector which selects only Pods with the tenant label, + # otherwise this Pod would hate Pods from daemonsets as well, for example, + # which aren't supposed to have the tenant label. + matchExpressions: + - key: tenant + operator: Exists + topologyKey: node-pool +``` + #### More practical use-cases Inter-pod affinity and anti-affinity can be even more useful when they are used with higher diff --git a/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md b/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md index f34f7a2c5adbc..47420240d94df 100644 --- a/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md +++ b/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md @@ -162,6 +162,17 @@ gets scheduled onto one node and then cannot run there, which is bad because such a pending Pod also blocks all other resources like RAM or CPU that were set aside for it. +{{< note >}} + +Scheduling of pods which use ResourceClaims is going to be slower because of +the additional communication that is required. Beware that this may also impact +pods that don't use ResourceClaims because only one pod at a time gets +scheduled, blocking API calls are made while handling a pod with +ResourceClaims, and thus scheduling the next pod gets delayed. + +{{< /note >}} + + ## Monitoring resources The kubelet provides a gRPC service to enable discovery of dynamic resources of diff --git a/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md b/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md index 0e4bcb22f52da..18cfece94c760 100644 --- a/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md +++ b/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md @@ -84,8 +84,12 @@ the Pod is put into the active queue or the backoff queue so that the scheduler will retry the scheduling of the Pod. {{< note >}} -QueueingHint evaluation during scheduling is a beta-level feature and is enabled by default in 1.28. -You can disable it via the +QueueingHint evaluation during scheduling is a beta-level feature. +The v1.28 release series initially enabled the associated feature gate; however, after the +discovery of an excessive memory footprint, the Kubernetes project set that feature gate +to be disabled by default. In Kubernetes {{< skew currentVersion >}}, this feature gate is +disabled and you need to enable it manually. +You can enable it via the `SchedulerQueueingHints` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). 
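For example (a minimal sketch; the exact mechanism depends on how your control plane launches the scheduler), you can turn the hint evaluation back on by passing the feature gate to kube-scheduler:

```shell
# Illustrative only: re-enable QueueingHint evaluation in the scheduler.
kube-scheduler \
  --feature-gates=SchedulerQueueingHints=true
  # ...plus the other kube-scheduler flags you normally use
```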
{{< /note >}} diff --git a/content/en/docs/concepts/security/pod-security-standards.md b/content/en/docs/concepts/security/pod-security-standards.md index 04bdc048727ac..15b64e001a0fb 100644 --- a/content/en/docs/concepts/security/pod-security-standards.md +++ b/content/en/docs/concepts/security/pod-security-standards.md @@ -486,6 +486,12 @@ Restrictions on the following controls are only required if `.spec.os.name` is n - Seccomp - Linux Capabilities +## User namespaces + +User Namespaces are a Linux-only feature to run workloads with increased +isolation. How they work together with Pod Security Standards is described in +the [documentation](/docs/concepts/workloads/pods/user-namespaces#integration-with-pod-security-admission-checks) for Pods that use user namespaces. + ## FAQ ### Why isn't there a profile between privileged and baseline? diff --git a/content/en/docs/concepts/security/service-accounts.md b/content/en/docs/concepts/security/service-accounts.md index 1c921fe219cf4..a7b3d54d76d33 100644 --- a/content/en/docs/concepts/security/service-accounts.md +++ b/content/en/docs/concepts/security/service-accounts.md @@ -247,7 +247,8 @@ request. The API server checks the validity of that bearer token as follows: The TokenRequest API produces _bound tokens_ for a ServiceAccount. This binding is linked to the lifetime of the client, such as a Pod, that is acting -as that ServiceAccount. +as that ServiceAccount. See [Token Volume Projection](/docs/tasks/configure-pod-container/configure-service-account/#serviceaccount-token-volume-projection) +for an example of a bound pod service account token's JWT schema and payload. For tokens issued using the `TokenRequest` API, the API server also checks that the specific object reference that is using the ServiceAccount still exists, @@ -269,7 +270,7 @@ account credentials, you can use the following methods: The Kubernetes project recommends that you use the TokenReview API, because this method invalidates tokens that are bound to API objects such as Secrets, -ServiceAccounts, and Pods when those objects are deleted. For example, if you +ServiceAccounts, Pods or Nodes when those objects are deleted. For example, if you delete the Pod that contains a projected ServiceAccount token, the cluster invalidates that token immediately and a TokenReview immediately fails. If you use OIDC validation instead, your clients continue to treat the token diff --git a/content/en/docs/concepts/services-networking/dual-stack.md b/content/en/docs/concepts/services-networking/dual-stack.md index bf3ccbe83207c..292204b9b53dc 100644 --- a/content/en/docs/concepts/services-networking/dual-stack.md +++ b/content/en/docs/concepts/services-networking/dual-stack.md @@ -65,12 +65,12 @@ To configure IPv4/IPv6 dual-stack, set dual-stack cluster network assignments: * kube-proxy: * `--cluster-cidr=,` * kubelet: - * when there is no `--cloud-provider` the administrator can pass a comma-separated pair of IP - addresses via `--node-ip` to manually configure dual-stack `.status.addresses` for that Node. - If a Pod runs on that node in HostNetwork mode, the Pod reports these IP addresses in its - `.status.podIPs` field. - All `podIPs` in a node match the IP family preference defined by the `.status.addresses` - field for that Node. + * `--node-ip=,` + * This option is required for bare metal dual-stack nodes (nodes that do not define a + cloud provider with the `--cloud-provider` flag). 
If you are using a cloud provider + and choose to override the node IPs chosen by the cloud provider, set the + `--node-ip` option. + * (The legacy built-in cloud providers do not support dual-stack `--node-ip`.) {{< note >}} An example of an IPv4 CIDR: `10.244.0.0/16` (though you would supply your own address range) @@ -79,13 +79,6 @@ An example of an IPv6 CIDR: `fdXY:IJKL:MNOP:15::/64` (this shows the format but address - see [RFC 4193](https://tools.ietf.org/html/rfc4193)) {{< /note >}} -{{< feature-state for_k8s_version="v1.27" state="alpha" >}} - -When using an external cloud provider, you can pass a dual-stack `--node-ip` value to -kubelet if you enable the `CloudDualStackNodeIPs` feature gate in both kubelet and the -external cloud provider. This is only supported for cloud providers that support dual -stack clusters. - ## Services You can create {{< glossary_tooltip text="Services" term_id="service" >}} which can use IPv4, IPv6, or both. diff --git a/content/en/docs/concepts/services-networking/service.md b/content/en/docs/concepts/services-networking/service.md index d7fb1c365f0cc..fd992995288da 100644 --- a/content/en/docs/concepts/services-networking/service.md +++ b/content/en/docs/concepts/services-networking/service.md @@ -520,16 +520,15 @@ spec: #### Reserve Nodeport ranges to avoid collisions {#avoid-nodeport-collisions} -{{< feature-state for_k8s_version="v1.28" state="beta" >}} +{{< feature-state for_k8s_version="v1.29" state="stable" >}} The policy for assigning ports to NodePort services applies to both the auto-assignment and the manual assignment scenarios. When a user wants to create a NodePort service that uses a specific port, the target port may conflict with another port that has already been assigned. -In this case, you can enable the feature gate `ServiceNodePortStaticSubrange`, which allows you -to use a different port allocation strategy for NodePort Services. The port range for NodePort services -is divided into two bands. Dynamic port assignment uses the upper band by default, and it may use -the lower band once the upper band has been exhausted. Users can then allocate from the lower band -with a lower risk of port collision. + +To avoid this problem, the port range for NodePort services is divided into two bands. +Dynamic port assignment uses the upper band by default, and it may use the lower band once the +upper band has been exhausted. Users can then allocate from the lower band with a lower risk of port collision. #### Custom IP address configuration for `type: NodePort` Services {#service-nodeport-custom-listen-address} @@ -669,6 +668,28 @@ The value of `spec.loadBalancerClass` must be a label-style identifier, with an optional prefix such as "`internal-vip`" or "`example.com/internal-vip`". Unprefixed names are reserved for end-users. +#### Specifying IPMode of load balancer status {#load-balancer-ip-mode} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +Starting as Alpha in Kubernetes 1.29, +a [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +named `LoadBalancerIPMode` allows you to set the `.status.loadBalancer.ingress.ipMode` +for a Service with `type` set to `LoadBalancer`. +The `.status.loadBalancer.ingress.ipMode` specifies how the load-balancer IP behaves. +It may be specified only when the `.status.loadBalancer.ingress.ip` field is also specified. + +There are two possible values for `.status.loadBalancer.ingress.ipMode`: "VIP" and "Proxy". 
+The default value is "VIP" meaning that traffic is delivered to the node +with the destination set to the load-balancer's IP and port. +There are two cases when setting this to "Proxy", depending on how the load-balancer +from the cloud provider delivers the traffics: + +- If the traffic is delivered to the node then DNATed to the pod, the destination would be set to the node's IP and node port; +- If the traffic is delivered directly to the pod, the destination would be set to the pod's IP and port. + +Service implementations may use this information to adjust traffic routing. + #### Internal load balancer In a mixed environment it is sometimes necessary to route traffic from Services inside the same diff --git a/content/en/docs/concepts/storage/persistent-volumes.md b/content/en/docs/concepts/storage/persistent-volumes.md index a8338062368eb..0d5caa4ef2f13 100644 --- a/content/en/docs/concepts/storage/persistent-volumes.md +++ b/content/en/docs/concepts/storage/persistent-volumes.md @@ -17,7 +17,8 @@ weight: 20 This document describes _persistent volumes_ in Kubernetes. Familiarity with -[volumes](/docs/concepts/storage/volumes/) is suggested. +[volumes](/docs/concepts/storage/volumes/), [StorageClasses](/docs/concepts/storage/storage-classes/) +and [VolumeAttributesClasses](/docs/concepts/storage/volume-attributes-classes/) is suggested. @@ -39,8 +40,8 @@ NFS, iSCSI, or a cloud-provider-specific storage system. A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific -size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or -ReadWriteMany, see [AccessModes](#access-modes)). +size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, +ReadWriteMany, or ReadWriteOncePod, see [AccessModes](#access-modes)). While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as @@ -618,7 +619,8 @@ The access modes are: `ReadWriteOnce` : the volume can be mounted as read-write by a single node. ReadWriteOnce access - mode still can allow multiple pods to access the volume when the pods are running on the same node. + mode still can allow multiple pods to access the volume when the pods are + running on the same node. For single pod access, please see ReadWriteOncePod. `ReadOnlyMany` : the volume can be mounted as read-only by many nodes. @@ -627,15 +629,22 @@ The access modes are: : the volume can be mounted as read-write by many nodes. `ReadWriteOncePod` -: {{< feature-state for_k8s_version="v1.27" state="beta" >}} +: {{< feature-state for_k8s_version="v1.29" state="stable" >}} the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across the whole cluster can - read that PVC or write to it. This is only supported for CSI volumes and - Kubernetes version 1.22+. + read that PVC or write to it. -The blog article -[Introducing Single Pod Access Mode for PersistentVolumes](/blog/2021/09/13/read-write-once-pod-access-mode-alpha/) -covers this in more detail. +{{< note >}} +The `ReadWriteOncePod` access mode is only supported for +{{< glossary_tooltip text="CSI" term_id="csi" >}} volumes and Kubernetes version +1.22+. 
To use this feature you will need to update the following +[CSI sidecars](https://kubernetes-csi.github.io/docs/sidecar-containers.html) +to these versions or greater: + +* [csi-provisioner:v3.0.0+](https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.0.0) +* [csi-attacher:v3.3.0+](https://github.com/kubernetes-csi/external-attacher/releases/tag/v3.3.0) +* [csi-resizer:v1.3.0+](https://github.com/kubernetes-csi/external-resizer/releases/tag/v1.3.0) +{{< /note >}} In the CLI, the access modes are abbreviated to: @@ -753,7 +762,7 @@ You can see the name of the PVC bound to the PV using `kubectl describe persiste #### Phase transition timestamp -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} The `.status` field for a PersistentVolume can include an alpha `lastPhaseTransitionTime` field. This field records the timestamp of when the volume last transitioned its phase. For newly created diff --git a/content/en/docs/concepts/storage/projected-volumes.md b/content/en/docs/concepts/storage/projected-volumes.md index ac64fa4d7daf8..8d59b8026482f 100644 --- a/content/en/docs/concepts/storage/projected-volumes.md +++ b/content/en/docs/concepts/storage/projected-volumes.md @@ -24,6 +24,7 @@ Currently, the following types of volume sources can be projected: * [`downwardAPI`](/docs/concepts/storage/volumes/#downwardapi) * [`configMap`](/docs/concepts/storage/volumes/#configmap) * [`serviceAccountToken`](#serviceaccounttoken) +* [`clusterTrustBundle`](#clustertrustbundle) All sources are required to be in the same namespace as the Pod. For more details, see the [all-in-one volume](https://git.k8s.io/design-proposals-archive/node/all-in-one-volume.md) design document. @@ -70,6 +71,31 @@ A container using a projected volume source as a [`subPath`](/docs/concepts/stor volume mount will not receive updates for those volume sources. {{< /note >}} +## clusterTrustBundle projected volumes {#clustertrustbundle} + +{{}} + +{{< note >}} +To use this feature in Kubernetes {{< skew currentVersion >}}, you must enable support for ClusterTrustBundle objects with the `ClusterTrustBundle` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and `--runtime-config=certificates.k8s.io/v1alpha1/clustertrustbundles=true` kube-apiserver flag, then enable the `ClusterTrustBundleProjection` feature gate. +{{< /note >}} + +The `clusterTrustBundle` projected volume source injects the contents of one or more [ClusterTrustBundle](/docs/reference/access-authn-authz/certificate-signing-requests#cluster-trust-bundles) objects as an automatically-updating file in the container filesystem. + +ClusterTrustBundles can be selected either by [name](/docs/reference/access-authn-authz/certificate-signing-requests#ctb-signer-unlinked) or by [signer name](/docs/reference/access-authn-authz/certificate-signing-requests#ctb-signer-linked). + +To select by name, use the `name` field to designate a single ClusterTrustBundle object. + +To select by signer name, use the `signerName` field (and optionally the +`labelSelector` field) to designate a set of ClusterTrustBundle objects that use +the given signer name. If `labelSelector` is not present, then all +ClusterTrustBundles for that signer are selected. + +The kubelet deduplicates the certificates in the selected ClusterTrustBundle objects, normalizes the PEM representations (discarding comments and headers), reorders the certificates, and writes them into the file named by `path`. 
As the set of selected ClusterTrustBundles or their content changes, kubelet keeps the file up-to-date. + +By default, the kubelet will prevent the pod from starting if the named ClusterTrustBundle is not found, or if `signerName` / `labelSelector` do not match any ClusterTrustBundles. If this behavior is not what you want, then set the `optional` field to `true`, and the pod will start up with an empty file at `path`. + +{{% code_sample file="pods/storage/projected-clustertrustbundle.yaml" %}} + ## SecurityContext interactions The [proposal](https://git.k8s.io/enhancements/keps/sig-storage/2451-service-account-token-volumes#proposal) for file permission handling in projected service account volume enhancement introduced the projected files having the correct owner permissions set. diff --git a/content/en/docs/concepts/storage/storage-classes.md b/content/en/docs/concepts/storage/storage-classes.md index bea656be35400..bfc42810b843c 100644 --- a/content/en/docs/concepts/storage/storage-classes.md +++ b/content/en/docs/concepts/storage/storage-classes.md @@ -17,8 +17,6 @@ with [volumes](/docs/concepts/storage/volumes/) and -## Introduction - A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster @@ -26,7 +24,7 @@ administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called "profiles" in other storage systems. -## The StorageClass Resource +## The StorageClass API Each StorageClass contains the fields `provisioner`, `parameters`, and `reclaimPolicy`, which are used when a PersistentVolume belonging to the diff --git a/content/en/docs/concepts/storage/volume-attributes-classes.md b/content/en/docs/concepts/storage/volume-attributes-classes.md new file mode 100644 index 0000000000000..69b4e41289237 --- /dev/null +++ b/content/en/docs/concepts/storage/volume-attributes-classes.md @@ -0,0 +1,131 @@ +--- +reviewers: +- msau42 +- xing-yang +title: Volume Attributes Classes +content_type: concept +weight: 40 +--- + + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +This page assumes that you are familiar with [StorageClasses](/docs/concepts/storage/storage-classes/), +[volumes](/docs/concepts/storage/volumes/) and [PersistentVolumes](/docs/concepts/storage/persistent-volumes/) +in Kubernetes. + + + +A VolumeAttributesClass provides a way for administrators to describe the mutable +"classes" of storage they offer. Different classes might map to different quality-of-service levels. +Kubernetes itself is unopinionated about what these classes represent. + +This is an alpha feature and disabled by default. + +If you want to test the feature whilst it's alpha, you need to enable the `VolumeAttributesClass` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) for the kube-controller-manager and the kube-apiserver. You use the `--feature-gates` command line argument: + +``` +--feature-gates="...,VolumeAttributesClass=true" +``` + +You can also only use VolumeAttributesClasses with storage backed by +{{< glossary_tooltip text="Container Storage Interface" term_id="csi" >}}, and only where the +relevant CSI driver implements the `ModifyVolume` API. 
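Once the feature gate is enabled and the `storage.k8s.io/v1alpha1` API that serves VolumeAttributesClass objects is available in your cluster, a quick sanity check that the new resource type is being served looks like this (a sketch; it assumes `kubectl` is already pointed at the cluster you reconfigured):

```shell
# Lists the VolumeAttributesClass resource type if the API is being served
kubectl api-resources | grep -i volumeattributesclass
```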
+ +## The VolumeAttributesClass API + +Each VolumeAttributesClass contains the `driverName` and `parameters`, which are +used when a PersistentVolume (PV) belonging to the class needs to be dynamically provisioned +or modified. + +The name of a VolumeAttributesClass object is significant and is how users can request a particular class. +Administrators set the name and other parameters of a class when first creating VolumeAttributesClass objects. +While the name of a VolumeAttributesClass object in a `PersistentVolumeClaim` is mutable, the parameters in an existing class are immutable. + + +```yaml +apiVersion: storage.k8s.io/v1alpha1 +kind: VolumeAttributesClass +metadata: + name: silver +driverName: pd.csi.storage.gke.io +parameters: + provisioned-iops: "3000" + provisioned-throughput: "50" +``` + + +### Provisioner + +Each VolumeAttributesClass has a provisioner that determines what volume plugin is used for provisioning PVs. The field `driverName` must be specified. + +The feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). + +You are not restricted to specifying the [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). You can also run and specify external provisioners, +which are independent programs that follow a specification defined by Kubernetes. +Authors of external provisioners have full discretion over where their code lives, how +the provisioner is shipped, how it needs to be run, what volume plugin it uses, etc. + + +### Resizer + +Each VolumeAttributesClass has a resizer that determines what volume plugin is used for modifying PVs. The field `driverName` must be specified. + +The modifying volume feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-resizer](https://github.com/kubernetes-csi/external-resizer). + +For example, an existing PersistentVolumeClaim is using a VolumeAttributesClass named silver: + +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: test-pv-claim +spec: + … + volumeAttributesClassName: silver + … +``` + +A new VolumeAttributesClass gold is available in the cluster: + + +```yaml +apiVersion: storage.k8s.io/v1alpha1 +kind: VolumeAttributesClass +metadata: + name: gold +driverName: pd.csi.storage.gke.io +parameters: + iops: "4000" + throughput: "60" +``` + + +The end user can update the PVC to use the new VolumeAttributesClass gold and apply it: + + +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: test-pv-claim +spec: + … + volumeAttributesClassName: gold + … +``` + + +## Parameters + +VolumeAttributesClasses have parameters that describe volumes belonging to them. Different parameters may be accepted +depending on the provisioner or the resizer. For example, the value `4000` for the parameter `iops` +and the parameter `throughput` are specific to GCE PD. +When a parameter is omitted, the default is used at volume provisioning. +If a user applies the PVC with a different VolumeAttributesClass that omits some parameters, the default value of +the parameters may be used, depending on the CSI driver implementation. +Please refer to the related CSI driver documentation for more details. + +There can be at most 512 parameters defined for a VolumeAttributesClass. +The total length of the parameters object including its keys and values cannot exceed 256 KiB.
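As a quick way to confirm the outcome of an update like the silver-to-gold example above, you can read the requested class name back from the claim; this sketch assumes the `test-pv-claim` PersistentVolumeClaim used in the examples on this page:

```shell
# Prints the VolumeAttributesClass currently requested by the claim
kubectl get pvc test-pv-claim -o jsonpath='{.spec.volumeAttributesClassName}'
```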
\ No newline at end of file diff --git a/content/en/docs/concepts/workloads/controllers/cron-jobs.md b/content/en/docs/concepts/workloads/controllers/cron-jobs.md index 33f914716467d..d2ce56bece06e 100644 --- a/content/en/docs/concepts/workloads/controllers/cron-jobs.md +++ b/content/en/docs/concepts/workloads/controllers/cron-jobs.md @@ -181,15 +181,14 @@ A time zone database from the Go standard library is included in the binaries an ### Unsupported TimeZone specification -The implementation of the CronJob API in Kubernetes {{< skew currentVersion >}} lets you set -the `.spec.schedule` field to include a timezone; for example: `CRON_TZ=UTC * * * * *` -or `TZ=UTC * * * * *`. - -Specifying a timezone that way is **not officially supported** (and never has been). - -If you try to set a schedule that includes `TZ` or `CRON_TZ` timezone specification, -Kubernetes reports a [warning](/blog/2020/09/03/warnings/) to the client. -Future versions of Kubernetes will prevent setting the unofficial timezone mechanism entirely. +Specifying a timezone using `CRON_TZ` or `TZ` variables inside `.spec.schedule` +is **not officially supported** (and never has been). + +Starting with Kubernetes 1.29 if you try to set a schedule that includes `TZ` or `CRON_TZ` +timezone specification, Kubernetes will fail to create the resource with a validation +error. +Updates to CronJobs already using `TZ` or `CRON_TZ` will continue to report a +[warning](/blog/2020/09/03/warnings/) to the client. ### Modifying a CronJob diff --git a/content/en/docs/concepts/workloads/controllers/job.md b/content/en/docs/concepts/workloads/controllers/job.md index 014050e1931bc..cc51f6f84a7a0 100644 --- a/content/en/docs/concepts/workloads/controllers/job.md +++ b/content/en/docs/concepts/workloads/controllers/job.md @@ -382,7 +382,7 @@ from failed Jobs is not lost inadvertently. ### Backoff limit per index {#backoff-limit-per-index} -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} You can only configure the backoff limit per index for an [Indexed](#completion-mode) Job, if you @@ -958,11 +958,12 @@ scaling an indexed Job, such as MPI, Horovord, Ray, and PyTorch training jobs. ### Delayed creation of replacement pods {#pod-replacement-policy} -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} You can only set `podReplacementPolicy` on Jobs if you enable the `JobPodReplacementPolicy` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +(enabled by default). {{< /note >}} By default, the Job controller recreates Pods as soon they either fail or are terminating (have a deletion timestamp). diff --git a/content/en/docs/concepts/workloads/controllers/statefulset.md b/content/en/docs/concepts/workloads/controllers/statefulset.md index 927b2e53f3ea3..b0176108cea75 100644 --- a/content/en/docs/concepts/workloads/controllers/statefulset.md +++ b/content/en/docs/concepts/workloads/controllers/statefulset.md @@ -116,6 +116,12 @@ spec: storage: 1Gi ``` +{{< note >}} +This example uses the `ReadWriteOnce` access mode, for simplicity. For +production use, the Kubernetes project recommends using the `ReadWriteOncePod` +access mode instead. +{{< /note >}} + In the above example: * A Headless Service, named `nginx`, is used to control the network domain. 
diff --git a/content/en/docs/concepts/workloads/pods/_index.md b/content/en/docs/concepts/workloads/pods/_index.md index febf062c2ebdc..1132c38793c5a 100644 --- a/content/en/docs/concepts/workloads/pods/_index.md +++ b/content/en/docs/concepts/workloads/pods/_index.md @@ -111,9 +111,9 @@ Some Pods have {{< glossary_tooltip text="init containers" term_id="init-contain as well as {{< glossary_tooltip text="app containers" term_id="app-container" >}}. By default, init containers run and complete before the app containers are started. -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} -Enabling the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +Enabled by default, the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) allows you to specify `restartPolicy: Always` for init containers. Setting the `Always` restart policy ensures that the init containers where you set it are kept running during the entire lifetime of the Pod. diff --git a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md index 6453bfcb23353..ff73d7bc2310a 100644 --- a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md +++ b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md @@ -175,7 +175,7 @@ through which the Pod has or has not passed. Kubelet manages the following PodConditions: * `PodScheduled`: the Pod has been scheduled to a node. -* `PodReadyToStartContainers`: (alpha feature; must be [enabled explicitly](#pod-has-network)) the +* `PodReadyToStartContainers`: (beta feature; enabled by [default](#pod-has-network)) the Pod sandbox has been successfully created and networking configured. * `ContainersReady`: all containers in the Pod are ready. * `Initialized`: all [init containers](/docs/concepts/workloads/pods/init-containers/) @@ -253,19 +253,21 @@ When a Pod's containers are Ready but at least one custom condition is missing o ### Pod network readiness {#pod-has-network} -{{< feature-state for_k8s_version="v1.25" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} -This condition was renamed from PodHasNetwork to PodReadyToStartContainers. +During its early development, this condition was named `PodHasNetwork`. {{< /note >}} -After a Pod gets scheduled on a node, it needs to be admitted by the Kubelet and -have any volumes mounted. Once these phases are complete, the Kubelet works with +After a Pod gets scheduled on a node, it needs to be admitted by the kubelet and +to have any required storage volumes mounted. Once these phases are complete, +the kubelet works with a container runtime (using {{< glossary_tooltip term_id="cri" >}}) to set up a runtime sandbox and configure networking for the Pod. If the -`PodReadyToStartContainersCondition` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled, -Kubelet reports whether a pod has reached this initialization milestone through -the `PodReadyToStartContainers` condition in the `status.conditions` field of a Pod. +`PodReadyToStartContainersCondition` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled +(it is enabled by default for Kubernetes {{< skew currentVersion >}}), the +`PodReadyToStartContainers` condition will be added to the `status.conditions` field of a Pod. 
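If you want to see this condition for a Pod in your own cluster, one way is to read it straight from the Pod's status; the Pod name below is just a placeholder:

```shell
# Shows the PodReadyToStartContainers entry from the Pod's status.conditions
kubectl get pod example-pod -o jsonpath='{.status.conditions[?(@.type=="PodReadyToStartContainers")]}'
```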
The `PodReadyToStartContainers` condition is set to `False` by the Kubelet when it detects a Pod does not have a runtime sandbox with networking configured. This occurs in @@ -515,6 +517,22 @@ termination grace period _begins_. The behavior above is described when the feature gate `EndpointSliceTerminatingCondition` is enabled. {{}} +{{}} +Beginning with Kubernetes 1.29, if your Pod includes one or more sidecar containers +(init containers with an Always restart policy), the kubelet will delay sending +the TERM signal to these sidecar containers until the last main container has fully terminated. +The sidecar containers will be terminated in the reverse order they are defined in the Pod spec. +This ensures that sidecar containers continue serving the other containers in the Pod until they are no longer needed. + +Note that slow termination of a main container will also delay the termination of the sidecar containers. +If the grace period expires before the termination process is complete, the Pod may enter emergency termination. +In this case, all remaining containers in the Pod will be terminated simultaneously with a short grace period. + +Similarly, if the Pod has a preStop hook that exceeds the termination grace period, emergency termination may occur. +In general, if you have used preStop hooks to control the termination order without sidecar containers, you can now +remove them and allow the kubelet to manage sidecar termination automatically. +{{}} + 1. When the grace period expires, the kubelet triggers forcible shutdown. The container runtime sends `SIGKILL` to any processes still running in any container in the Pod. The kubelet also cleans up a hidden `pause` container if that container runtime uses one. @@ -597,4 +615,4 @@ for more details. * For detailed information about Pod and container status in the API, see the API reference documentation covering - [`status`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodStatus) for Pod. + [`status`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodStatus) for Pod. \ No newline at end of file diff --git a/content/en/docs/concepts/workloads/pods/sidecar-containers.md b/content/en/docs/concepts/workloads/pods/sidecar-containers.md index a1f5a9fbb8fc9..db12852da1466 100644 --- a/content/en/docs/concepts/workloads/pods/sidecar-containers.md +++ b/content/en/docs/concepts/workloads/pods/sidecar-containers.md @@ -5,7 +5,7 @@ weight: 50 --- -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} Sidecar containers are the secondary containers that run along with the main application container within the same {{< glossary_tooltip text="Pod" term_id="pod" >}}. diff --git a/content/en/docs/concepts/workloads/pods/user-namespaces.md b/content/en/docs/concepts/workloads/pods/user-namespaces.md index fa51a47d305e8..410b3c90524d2 100644 --- a/content/en/docs/concepts/workloads/pods/user-namespaces.md +++ b/content/en/docs/concepts/workloads/pods/user-namespaces.md @@ -152,6 +152,35 @@ host's file owner/group. [CVE-2021-25741]: https://github.com/kubernetes/kubernetes/issues/104980 +## Integration with Pod security admission checks + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +For Linux Pods that enable user namespaces, Kubernetes relaxes the application of +[Pod Security Standards](/docs/concepts/security/pod-security-standards) in a controlled way. 
+This behavior can be controlled by the [feature +gate](/docs/reference/command-line-tools-reference/feature-gates/) +`UserNamespacesPodSecurityStandards`, which allows an early opt-in for end +users. Admins have to ensure that user namespaces are enabled by all nodes +within the cluster if using the feature gate. + +If you enable the associated feature gate and create a Pod that uses user +namespaces, the following fields won't be constrained even in contexts that enforce the +_Baseline_ or _Restricted_ pod security standard. This behavior does not +present a security concern because `root` inside a Pod with user namespaces +actually refers to the user inside the container, that is never mapped to a +privileged user on the host. Here's the list of fields that are **not** checks for Pods in those +circumstances: + +- `spec.securityContext.runAsNonRoot` +- `spec.containers[*].securityContext.runAsNonRoot` +- `spec.initContainers[*].securityContext.runAsNonRoot` +- `spec.ephemeralContainers[*].securityContext.runAsNonRoot` +- `spec.securityContext.runAsUser` +- `spec.containers[*].securityContext.runAsUser` +- `spec.initContainers[*].securityContext.runAsUser` +- `spec.ephemeralContainers[*].securityContext.runAsUser` + ## Limitations When using a user namespace for the pod, it is disallowed to use other host diff --git a/content/en/docs/reference/access-authn-authz/authentication.md b/content/en/docs/reference/access-authn-authz/authentication.md index a2ddea62a1635..3e152180ce1ce 100644 --- a/content/en/docs/reference/access-authn-authz/authentication.md +++ b/content/en/docs/reference/access-authn-authz/authentication.md @@ -242,7 +242,7 @@ and are assigned to the groups `system:serviceaccounts` and `system:serviceaccou {{< warning >}} Because service account tokens can also be stored in Secret API objects, any user with -write access to Secrets can request a token, and any user with read access to those +write access to Secrets can request a token, and any user with read access to those Secrets can authenticate as the service account. Be cautious when granting permissions to service accounts and read or write capabilities for Secrets. {{< /warning >}} @@ -293,8 +293,9 @@ sequenceDiagram 1. Your identity provider will provide you with an `access_token`, `id_token` and a `refresh_token` 1. When using `kubectl`, use your `id_token` with the `--token` flag or add it directly to your `kubeconfig` 1. `kubectl` sends your `id_token` in a header called Authorization to the API server -1. The API server will make sure the JWT signature is valid by checking against the certificate named in the configuration +1. The API server will make sure the JWT signature is valid 1. Check to make sure the `id_token` hasn't expired + 1. Perform claim and/or user validation if CEL expressions are configured with `AuthenticationConfiguration`. 1. Make sure the user is authorized 1. Once authorized the API server returns a response to `kubectl` 1. `kubectl` provides feedback to the user @@ -312,6 +313,8 @@ very scalable solution for authentication. It does offer a few challenges: #### Configuring the API Server +##### Using flags + To enable the plugin, configure the following flags on the API server: | Parameter | Description | Example | Required | @@ -326,6 +329,291 @@ To enable the plugin, configure the following flags on the API server: | `--oidc-ca-file` | The path to the certificate for the CA that signed your identity provider's web certificate. Defaults to the host's root CAs. 
| `/etc/kubernetes/ssl/kc-ca.pem` | No | | `--oidc-signing-algs` | The signing algorithms accepted. Default is "RS256". | `RS512` | No | +##### Using Authentication Configuration + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +JWT Authenticator is an authenticator to authenticate Kubernetes users using JWT compliant tokens. The authenticator will attempt to +parse a raw ID token, verify it's been signed by the configured issuer. The public key to verify the signature is discovered from the issuer's public endpoint using OIDC discovery. + +The API server can be configured to use a JWT authenticator via the `--authentication-config` flag. This flag takes a path to a file containing the `AuthenticationConfiguration`. An example configuration is provided below. +To use this config, the `StructuredAuthenticationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +has to be enabled. + +{{< note >}} +When the feature is enabled, setting both `--authentication-config` and any of the `--oidc-*` flags will result in an error. If you want to use the feature, you have to remove the `--oidc-*` flags and use the configuration file instead. +{{< /note >}} + +```yaml +--- +# +# CAUTION: this is an example configuration. +# Do not use this for your own cluster! +# +apiVersion: apiserver.config.k8s.io/v1alpha1 +kind: AuthenticationConfiguration +# list of authenticators to authenticate Kubernetes users using JWT compliant tokens. +jwt: +- issuer: + url: https://example.com # Same as --oidc-issuer-url. + audiences: + - my-app # Same as --oidc-client-id. + # rules applied to validate token claims to authenticate users. + claimValidationRules: + # Same as --oidc-required-claim key=value. + - claim: hd + requiredValue: example.com + # Instead of claim and requiredValue, you can use expression to validate the claim. + # expression is a CEL expression that evaluates to a boolean. + # all the expressions must evaluate to true for validation to succeed. + - expression: 'claims.hd == "example.com"' + # Message customizes the error message seen in the API server logs when the validation fails. + message: the hd claim must be set to example.com + - expression: 'claims.exp - claims.nbf <= 86400' + message: total token lifetime must not exceed 24 hours + claimMappings: + # username represents an option for the username attribute. + # This is the only required attribute. + username: + # Same as --oidc-username-claim. Mutually exclusive with username.expression. + claim: "sub" + # Same as --oidc-username-prefix. Mutually exclusive with username.expression. + # if username.claim is set, username.prefix is required. + # Explicitly set it to "" if no prefix is desired. + prefix: "" + # Mutually exclusive with username.claim and username.prefix. + # expression is a CEL expression that evaluates to a string. + expression: 'claims.username + ":external-user"' + # groups represents an option for the groups attribute. + groups: + # Same as --oidc-groups-claim. Mutually exclusive with groups.expression. + claim: "sub" + # Same as --oidc-groups-prefix. Mutually exclusive with groups.expression. + # if groups.claim is set, groups.prefix is required. + # Explicitly set it to "" if no prefix is desired. + prefix: "" + # Mutually exclusive with groups.claim and groups.prefix. + # expression is a CEL expression that evaluates to a string or a list of strings. + expression: 'claims.roles.split(",")' + # uid represents an option for the uid attribute. 
+    uid:
+      # Mutually exclusive with uid.expression.
+      claim: 'sub'
+      # Mutually exclusive with uid.claim
+      # expression is a CEL expression that evaluates to a string.
+      expression: 'claims.sub'
+    # extra attributes to be added to the UserInfo object. Keys must be domain-prefixed paths and must be unique.
+    extra:
+    - key: 'example.com/tenant'
+      # valueExpression is a CEL expression that evaluates to a string or a list of strings.
+      valueExpression: 'claims.tenant'
+  # validation rules applied to the final user object.
+  userValidationRules:
+  # expression is a CEL expression that evaluates to a boolean.
+  # all the expressions must evaluate to true for the user to be valid.
+  - expression: "!user.username.startsWith('system:')"
+    # Message customizes the error message seen in the API server logs when the validation fails.
+    message: 'username cannot use reserved system: prefix'
+  - expression: "user.groups.all(group, !group.startsWith('system:'))"
+    message: 'groups cannot use reserved system: prefix'
+```
+
+* Claim validation rule expression
+
+  `jwt.claimValidationRules[i].expression` represents the expression which will be evaluated by CEL.
+  CEL expressions have access to the contents of the token payload, organized into the `claims` CEL variable.
+  `claims` is a map of claim names (as strings) to claim values (of any type).
+* User validation rule expression
+
+  `jwt.userValidationRules[i].expression` represents the expression which will be evaluated by CEL.
+  CEL expressions have access to the contents of `userInfo`, organized into the `user` CEL variable.
+  Refer to the [UserInfo](/docs/reference/generated/kubernetes-api/v{{< skew currentVersion >}}/#userinfo-v1-authentication-k8s-io) API documentation for the schema of `user`.
+* Claim mapping expression
+
+  `jwt.claimMappings.username.expression`, `jwt.claimMappings.groups.expression`, `jwt.claimMappings.uid.expression`,
+  and `jwt.claimMappings.extra[i].valueExpression` represent the expressions which will be evaluated by CEL.
+  CEL expressions have access to the contents of the token payload, organized into the `claims` CEL variable.
+  `claims` is a map of claim names (as strings) to claim values (of any type).
+
+  To learn more, see the [Documentation on CEL](/docs/reference/using-api/cel/).
+
+  Here are examples of the `AuthenticationConfiguration` with different token payloads.
+
+  {{< tabs name="example_configuration" >}}
+  {{% tab name="Valid token" %}}
+  ```yaml
+  apiVersion: apiserver.config.k8s.io/v1alpha1
+  kind: AuthenticationConfiguration
+  jwt:
+  - issuer:
+      url: https://example.com
+      audiences:
+      - my-app
+    claimMappings:
+      username:
+        expression: 'claims.username + ":external-user"'
+      groups:
+        expression: 'claims.roles.split(",")'
+      uid:
+        expression: 'claims.sub'
+      extra:
+      - key: 'example.com/tenant'
+        valueExpression: 'claims.tenant'
+    userValidationRules:
+    - expression: "!user.username.startsWith('system:')" # the expression will evaluate to true, so validation will succeed.
+ message: 'username cannot used reserved system: prefix' + ``` + + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJpYXQiOjE3MDExMDcyMzMsImlzcyI6Imh0dHBzOi8vZXhhbXBsZS5jb20iLCJqdGkiOiI3YzMzNzk0MjgwN2U3M2NhYTJjMzBjODY4YWMwY2U5MTBiY2UwMmRkY2JmZWJlOGMyM2I4YjVmMjdhZDYyODczIiwibmJmIjoxNzAxMTA3MjMzLCJyb2xlcyI6InVzZXIsYWRtaW4iLCJzdWIiOiJhdXRoIiwidGVuYW50IjoiNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjRhIiwidXNlcm5hbWUiOiJmb28ifQ.TBWF2RkQHm4QQz85AYPcwLxSk-VLvQW-mNDHx7SEOSv9LVwcPYPuPajJpuQn9C_gKq1R94QKSQ5F6UgHMILz8OfmPKmX_00wpwwNVGeevJ79ieX2V-__W56iNR5gJ-i9nn6FYk5pwfVREB0l4HSlpTOmu80gbPWAXY5hLW0ZtcE1JTEEmefORHV2ge8e3jp1xGafNy6LdJWabYuKiw8d7Qga__HxtKB-t0kRMNzLRS7rka_SfQg0dSYektuxhLbiDkqhmRffGlQKXGVzUsuvFw7IGM5ZWnZgEMDzCI357obHeM3tRqpn5WRjtB8oM7JgnCymaJi-P3iCd88iu1xnzA + ``` + where the token payload is: + + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "iat": 1701107233, + "iss": "https://example.com", + "jti": "7c337942807e73caa2c30c868ac0ce910bce02ddcbfebe8c23b8b5f27ad62873", + "nbf": 1701107233, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will produce the following `UserInfo` object and successfully authenticate the user. + + ```json + { + "username": "foo:external-user", + "uid": "auth", + "groups": [ + "user", + "admin" + ], + "extra": { + "example.com/tenant": "tenant1" + } + } + ``` + {{% /tab %}} + {{% tab name="Fails claim validation" %}} + ```yaml + apiVersion: apiserver.config.k8s.io/v1alpha1 + kind: AuthenticationConfiguration + jwt: + - issuer: + url: https://example.com + audiences: + - my-app + claimValidationRules: + - expression: 'claims.hd == "example.com"' # the token below does not have this claim, so validation will fail. + message: the hd claim must be set to example.com + claimMappings: + username: + expression: 'claims.username + ":external-user"' + groups: + expression: 'claims.roles.split(",")' + uid: + expression: 'claims.sub' + extra: + - key: 'example.com/tenant' + valueExpression: 'claims.tenant' + userValidationRules: + - expression: "!user.username.startsWith('system:')" # the expression will evaluate to true, so validation will succeed. 
+ message: 'username cannot used reserved system: prefix' + ``` + + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJpYXQiOjE3MDExMDcyMzMsImlzcyI6Imh0dHBzOi8vZXhhbXBsZS5jb20iLCJqdGkiOiI3YzMzNzk0MjgwN2U3M2NhYTJjMzBjODY4YWMwY2U5MTBiY2UwMmRkY2JmZWJlOGMyM2I4YjVmMjdhZDYyODczIiwibmJmIjoxNzAxMTA3MjMzLCJyb2xlcyI6InVzZXIsYWRtaW4iLCJzdWIiOiJhdXRoIiwidGVuYW50IjoiNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjRhIiwidXNlcm5hbWUiOiJmb28ifQ.TBWF2RkQHm4QQz85AYPcwLxSk-VLvQW-mNDHx7SEOSv9LVwcPYPuPajJpuQn9C_gKq1R94QKSQ5F6UgHMILz8OfmPKmX_00wpwwNVGeevJ79ieX2V-__W56iNR5gJ-i9nn6FYk5pwfVREB0l4HSlpTOmu80gbPWAXY5hLW0ZtcE1JTEEmefORHV2ge8e3jp1xGafNy6LdJWabYuKiw8d7Qga__HxtKB-t0kRMNzLRS7rka_SfQg0dSYektuxhLbiDkqhmRffGlQKXGVzUsuvFw7IGM5ZWnZgEMDzCI357obHeM3tRqpn5WRjtB8oM7JgnCymaJi-P3iCd88iu1xnzA + ``` + where the token payload is: + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "iat": 1701107233, + "iss": "https://example.com", + "jti": "7c337942807e73caa2c30c868ac0ce910bce02ddcbfebe8c23b8b5f27ad62873", + "nbf": 1701107233, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will fail to authenticate because the `hd` claim is not set to `example.com`. The API server will return `401 Unauthorized` error. + {{% /tab %}} + {{% tab name="Fails user validation" %}} + ```yaml + apiVersion: apiserver.config.k8s.io/v1alpha1 + kind: AuthenticationConfiguration + jwt: + - issuer: + url: https://example.com + audiences: + - my-app + claimValidationRules: + - expression: 'claims.hd == "example.com"' + message: the hd claim must be set to example.com + claimMappings: + username: + expression: '"system:" + claims.username' # this will prefix the username with "system:" and will fail user validation. + groups: + expression: 'claims.roles.split(",")' + uid: + expression: 'claims.sub' + extra: + - key: 'example.com/tenant' + valueExpression: 'claims.tenant' + userValidationRules: + - expression: "!user.username.startsWith('system:')" # the username will be system:foo and expression will evaluate to false, so validation will fail. 
+ message: 'username cannot used reserved system: prefix' + ``` + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJoZCI6ImV4YW1wbGUuY29tIiwiaWF0IjoxNzAxMTEzMTAxLCJpc3MiOiJodHRwczovL2V4YW1wbGUuY29tIiwianRpIjoiYjViMDY1MjM3MmNkMjBlMzQ1YjZmZGZmY2RjMjE4MWY0YWZkNmYyNTlhYWI0YjdlMzU4ODEyMzdkMjkyMjBiYyIsIm5iZiI6MTcwMTExMzEwMSwicm9sZXMiOiJ1c2VyLGFkbWluIiwic3ViIjoiYXV0aCIsInRlbmFudCI6IjcyZjk4OGJmLTg2ZjEtNDFhZi05MWFiLTJkN2NkMDExZGI0YSIsInVzZXJuYW1lIjoiZm9vIn0.FgPJBYLobo9jnbHreooBlvpgEcSPWnKfX6dc0IvdlRB-F0dCcgy91oCJeK_aBk-8zH5AKUXoFTlInfLCkPivMOJqMECA1YTrMUwt_IVqwb116AqihfByUYIIqzMjvUbthtbpIeHQm2fF0HbrUqa_Q0uaYwgy8mD807h7sBcUMjNd215ff_nFIHss-9zegH8GI1d9fiBf-g6zjkR1j987EP748khpQh9IxPjMJbSgG_uH5x80YFuqgEWwq-aYJPQxXX6FatP96a2EAn7wfPpGlPRt0HcBOvq5pCnudgCgfVgiOJiLr_7robQu4T1bis0W75VPEvwWtgFcLnvcQx0JWg + ``` + where the token payload is: + + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "hd": "example.com", + "iat": 1701113101, + "iss": "https://example.com", + "jti": "b5b0652372cd20e345b6fdffcdc2181f4afd6f259aab4b7e35881237d29220bc", + "nbf": 1701113101, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will produce the following `UserInfo` object: + + ```json + { + "username": "system:foo", + "uid": "auth", + "groups": [ + "user", + "admin" + ], + "extra": { + "example.com/tenant": "tenant1" + } + } + ``` + which will fail user validation because the username starts with `system:`. The API server will return `401 Unauthorized` error. + {{% /tab %}} + {{< /tabs >}} + Importantly, the API server is not an OAuth2 client, rather it can only be configured to trust a single issuer. This allows the use of public providers, such as Google, without trusting credentials issued to third parties. Admins who @@ -432,7 +720,7 @@ Webhook authentication is a hook for verifying bearer tokens. * `--authentication-token-webhook-config-file` a configuration file describing how to access the remote webhook service. * `--authentication-token-webhook-cache-ttl` how long to cache authentication decisions. Defaults to two minutes. -* `--authentication-token-webhook-version` determines whether to use `authentication.k8s.io/v1beta1` or `authentication.k8s.io/v1` +* `--authentication-token-webhook-version` determines whether to use `authentication.k8s.io/v1beta1` or `authentication.k8s.io/v1` `TokenReview` objects to send/receive information from the webhook. Defaults to `v1beta1`. The configuration file uses the [kubeconfig](/docs/concepts/configuration/organize-cluster-access-kubeconfig/) @@ -489,9 +777,9 @@ To opt into receiving `authentication.k8s.io/v1` token reviews, the API server m "spec": { # Opaque bearer token sent to the API server "token": "014fbff9a07c...", - + # Optional list of the audience identifiers for the server the token was presented to. - # Audience-aware token authenticators (for example, OIDC token authenticators) + # Audience-aware token authenticators (for example, OIDC token authenticators) # should verify the token was intended for at least one of the audiences in this list, # and return the intersection of this list and the valid audiences for the token in the response status. # This ensures the token is valid to authenticate to the server it was presented to. 
@@ -509,9 +797,9 @@ To opt into receiving `authentication.k8s.io/v1` token reviews, the API server m "spec": { # Opaque bearer token sent to the API server "token": "014fbff9a07c...", - + # Optional list of the audience identifiers for the server the token was presented to. - # Audience-aware token authenticators (for example, OIDC token authenticators) + # Audience-aware token authenticators (for example, OIDC token authenticators) # should verify the token was intended for at least one of the audiences in this list, # and return the intersection of this list and the valid audiences for the token in the response status. # This ensures the token is valid to authenticate to the server it was presented to. @@ -870,7 +1158,7 @@ rules: {{< note >}} Impersonating a user or group allows you to perform any action as if you were that user or group; for that reason, impersonation is not namespace scoped. -If you want to allow impersonation using Kubernetes RBAC, +If you want to allow impersonation using Kubernetes RBAC, this requires using a `ClusterRole` and a `ClusterRoleBinding`, not a `Role` and `RoleBinding`. {{< /note >}} @@ -1378,7 +1666,7 @@ status: {{% /tab %}} {{< /tabs >}} -This feature is extremely useful when a complicated authentication flow is used in a Kubernetes cluster, +This feature is extremely useful when a complicated authentication flow is used in a Kubernetes cluster, for example, if you use [webhook token authentication](/docs/reference/access-authn-authz/authentication/#webhook-token-authentication) or [authenticating proxy](/docs/reference/access-authn-authz/authentication/#authenticating-proxy). @@ -1390,7 +1678,7 @@ you see the user details and properties for the user that was impersonated. {{< /note >}} By default, all authenticated users can create `SelfSubjectReview` objects when the `APISelfSubjectReview` -feature is enabled. It is allowed by the `system:basic-user` cluster role. +feature is enabled. It is allowed by the `system:basic-user` cluster role. {{< note >}} You can only make `SelfSubjectReview` requests if: diff --git a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index 9a77c7ac21124..621cc9773b474 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -209,6 +209,143 @@ The following flags can be used: You can choose more than one authorization module. Modules are checked in order so an earlier module has higher priority to allow or deny a request. +## Configuring the API Server using an Authorization Config File + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +The Kubernetes API server's authorizer chain can be configured using a +configuration file. + +You specify the path to that authorization configuration using the +`--authorization-config` command line argument. This feature enables +creation of authorization chains with multiple webhooks with well-defined +parameters that validate requests in a certain order and enables fine grained +control - such as explicit Deny on failures. An example configuration with +all possible values is provided below. + +In order to customise the authorizer chain, you need to enable the +`StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). 
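+
+For illustration only, a kube-apiserver invocation that enables the alpha feature gate and
+points at an authorization configuration file might look like the following sketch; the file
+path shown here is an example, not a required location:
+
+```shell
+kube-apiserver \
+--feature-gates=StructuredAuthorizationConfiguration=true \
+--authorization-config=/etc/kubernetes/authorization-config.yaml \  # example path only
+# …and other flags as usual
+```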
+
+{{< note >}}
+When the feature is enabled, setting both `--authorization-config` and
+configuring an authorization webhook using the `--authorization-mode` and
+`--authorization-webhook-*` command line flags is not allowed. If both are set,
+the API server reports an error and exits right away.
+{{< /note >}}
+
+{{< caution >}}
+While the feature is in Alpha/Beta, there is no change if you keep on
+using the command line flags. When the feature goes Beta, the feature flag will
+be turned on by default. The feature flag will be removed when the feature goes GA.
+
+When configuring the authorizer chain using a config file, make sure all the
+apiserver nodes have the file. Also, take note of the apiserver configuration
+when upgrading/downgrading the clusters. For example, if upgrading to v1.29+
+clusters and using the config file, you would need to make sure the config file
+exists before upgrading the cluster. When downgrading to v1.28, you would need
+to add the flags back to their bootstrap mechanism.
+{{< /caution >}}
+
+```yaml
+#
+# DO NOT USE THE CONFIG AS IS. THIS IS AN EXAMPLE.
+#
+apiVersion: apiserver.config.k8s.io/v1alpha1
+kind: AuthorizationConfiguration
+# authorizers are defined in order of precedence
+authorizers:
+  - type: Webhook
+    # Name used to describe the authorizer
+    # This is explicitly used in monitoring machinery for metrics
+    # Note:
+    #   - Validation for this field is similar to how K8s labels are validated today.
+    # Required, with no default
+    name: webhook
+    webhook:
+      # The duration to cache 'authorized' responses from the webhook
+      # authorizer.
+      # Same as setting `--authorization-webhook-cache-authorized-ttl` flag
+      # Default: 5m0s
+      authorizedTTL: 30s
+      # The duration to cache 'unauthorized' responses from the webhook
+      # authorizer.
+      # Same as setting `--authorization-webhook-cache-unauthorized-ttl` flag
+      # Default: 30s
+      unauthorizedTTL: 30s
+      # Timeout for the webhook request
+      # Maximum allowed is 30s.
+      # Required, with no default.
+      timeout: 3s
+      # The API version of the authorization.k8s.io SubjectAccessReview to
+      # send to and expect from the webhook.
+      # Same as setting `--authorization-webhook-version` flag
+      # Required, with no default
+      # Valid values: v1beta1, v1
+      subjectAccessReviewVersion: v1
+      # MatchConditionSubjectAccessReviewVersion specifies the SubjectAccessReview
+      # version the CEL expressions are evaluated against
+      # Valid values: v1
+      # Required only if matchConditions are specified, no default value
+      matchConditionSubjectAccessReviewVersion: v1
+      # Controls the authorization decision when a webhook request fails to
+      # complete or returns a malformed response or errors evaluating
+      # matchConditions.
+      # Valid values:
+      #   - NoOpinion: continue to subsequent authorizers to see if one of
+      #     them allows the request
+      #   - Deny: reject the request without consulting subsequent authorizers
+      # Required, with no default.
+      failurePolicy: Deny
+      connectionInfo:
+        # Controls how the webhook should communicate with the server.
+        # Valid values:
+        # - KubeConfig: use the file specified in kubeConfigFile to locate the
+        #   server.
+        # - InClusterConfig: use the in-cluster configuration to call the
+        #   SubjectAccessReview API hosted by kube-apiserver. This mode is not
+        #   allowed for kube-apiserver.
+ type: KubeConfig + # Path to KubeConfigFile for connection info + # Required, if connectionInfo.Type is KubeConfig + kubeConfigFile: /kube-system-authz-webhook.yaml + # matchConditions is a list of conditions that must be met for a request to be sent to this + # webhook. An empty list of matchConditions matches all requests. + # There are a maximum of 64 match conditions allowed. + # + # The exact matching logic is (in order): + # 1. If at least one matchCondition evaluates to FALSE, then the webhook is skipped. + # 2. If ALL matchConditions evaluate to TRUE, then the webhook is called. + # 3. If at least one matchCondition evaluates to an error (but none are FALSE): + # - If failurePolicy=Deny, then the webhook rejects the request + # - If failurePolicy=NoOpinion, then the error is ignored and the webhook is skipped + matchConditions: + # expression represents the expression which will be evaluated by CEL. Must evaluate to bool. + # CEL expressions have access to the contents of the SubjectAccessReview in v1 version. + # If version specified by subjectAccessReviewVersion in the request variable is v1beta1, + # the contents would be converted to the v1 version before evaluating the CEL expression. + # + # Documentation on CEL: https://kubernetes.io/docs/reference/using-api/cel/ + # + # only send resource requests to the webhook + - expression: has(request.resourceAttributes) + # only intercept requests to kube-system + - expression: request.resourceAttributes.namespace == 'kube-system' + # don't intercept requests from kube-system service accounts + - expression: !('system:serviceaccounts:kube-system' in request.user.groups) + - type: Node + name: node + - type: RBAC + name: rbac + - type: Webhook + name: in-cluster-authorizer + webhook: + authorizedTTL: 5m + unauthorizedTTL: 30s + timeout: 3s + subjectAccessReviewVersion: v1 + failurePolicy: NoOpinion + connectionInfo: + type: InClusterConfig +``` + ## Privilege escalation via workload creation or edits {#privilege-escalation-via-pod-creation} Users who can create/edit pods in a namespace, either directly or through a [controller](/docs/concepts/architecture/controller/) @@ -241,4 +378,3 @@ This should be considered when deciding on your RBAC controls. * To learn more about Authentication, see **Authentication** in [Controlling Access to the Kubernetes API](/docs/concepts/security/controlling-access/). * To learn more about Admission Control, see [Using Admission Controllers](/docs/reference/access-authn-authz/admission-controllers/). - diff --git a/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md b/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md index 9da1dd6c1af9a..ec13b0badefca 100644 --- a/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md +++ b/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md @@ -371,7 +371,7 @@ you like. If you want to add a note for human consumption, use the {{< feature-state for_k8s_version="v1.27" state="alpha" >}} {{< note >}} -In Kubernetes {{< skew currentVersion >}}, you must enable the `ClusterTrustBundles` +In Kubernetes {{< skew currentVersion >}}, you must enable the `ClusterTrustBundle` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) _and_ the `certificates.k8s.io/v1alpha1` {{< glossary_tooltip text="API group" term_id="api-group" >}} in order to use @@ -472,6 +472,12 @@ such as role-based access control. 
To distinguish them from signer-linked ClusterTrustBundles, the names of signer-unlinked ClusterTrustBundles **must not** contain a colon (`:`). +### Accessing ClusterTrustBundles from pods {#ctb-projection} + +{{}} + +The contents of ClusterTrustBundles can be injected into the container filesystem, similar to ConfigMaps and Secrets. See the [clusterTrustBundle projected volume source](/docs/concepts/storage/projected-volumes#clustertrustbundle) for more details. + ## How to issue a certificate for a user {#normal-user} diff --git a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md index ca6f831da835f..f2f1025701bf8 100644 --- a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md +++ b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md @@ -1,9 +1,7 @@ --- reviewers: - - bprashanth - - davidopp - - lavalamp - liggitt + - enj title: Managing Service Accounts content_type: concept weight: 50 @@ -140,6 +138,62 @@ using [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/tok to obtain short-lived API access tokens is recommended instead. {{< /note >}} +## Auto-generated legacy ServiceAccount token clean up {#auto-generated-legacy-serviceaccount-token-clean-up} + +Before version 1.24, Kubernetes automatically generated Secret-based tokens for +ServiceAccounts. To distinguish between automatically generated tokens and +manually created ones, Kubernetes checks for a reference from the +ServiceAccount's secrets field. If the Secret is referenced in the `secrets` +field, it is considered an auto-generated legacy token. Otherwise, it is +considered a manually created legacy token. For example: + +```yaml +apiVersion: v1 +kind: ServiceAccount +metadata: + name: build-robot + namespace: default +secrets: + - name: build-robot-secret # usually NOT present for a manually generated token +``` + +Beginning from version 1.29, legacy ServiceAccount tokens that were generated +automatically will be marked as invalid if they remain unused for a certain +period of time (set to default at one year). Tokens that continue to be unused +for this defined period (again, by default, one year) will subsequently be +purged by the control plane. + +If users use an invalidated auto-generated token, the token validator will + +1. add an audit annotation for the key-value pair + `authentication.k8s.io/legacy-token-invalidated: /`, +1. increment the `invalid_legacy_auto_token_uses_total` metric count, +1. update the Secret label `kubernetes.io/legacy-token-last-used` with the new + date, +1. return an error indicating that the token has been invalidated. + +When receiving this validation error, users can update the Secret to remove the +`kubernetes.io/legacy-token-invalid-since` label to temporarily allow use of +this token. + +Here's an example of an auto-generated legacy token that has been marked with the +`kubernetes.io/legacy-token-last-used` and `kubernetes.io/legacy-token-invalid-since` +labels: + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: build-robot-secret + namespace: default + labels: + kubernetes.io/legacy-token-last-used: 2022-10-24 + kubernetes.io/legacy-token-invalid-since: 2023-10-25 + annotations: + kubernetes.io/service-account.name: build-robot +type: kubernetes.io/service-account-token +``` + ## Control plane details ### ServiceAccount controller @@ -193,6 +247,51 @@ it does the following when a Pod is created: 1. 
If the spec of the incoming Pod doesn't already contain any `imagePullSecrets`, then the
   admission controller adds `imagePullSecrets`, copying them from the `ServiceAccount`.
 
+### Legacy ServiceAccount token tracking controller
+
+{{< feature-state for_k8s_version="v1.28" state="stable" >}}
+
+This controller generates a ConfigMap called
+`kube-system/kube-apiserver-legacy-service-account-token-tracking` in the
+`kube-system` namespace. The ConfigMap records the timestamp when legacy service
+account tokens began to be monitored by the system.
+
+### Legacy ServiceAccount token cleaner
+
+{{< feature-state for_k8s_version="v1.29" state="beta" >}}
+
+The legacy ServiceAccount token cleaner runs as part of the
+`kube-controller-manager` and checks every 24 hours to see if any auto-generated
+legacy ServiceAccount token has not been used in a *specified amount of time*.
+If so, the cleaner marks those tokens as invalid.
+
+The cleaner works by first checking the ConfigMap created by the control plane
+(provided that `LegacyServiceAccountTokenTracking` is enabled). If the current
+time is a *specified amount of time* after the date in the ConfigMap, the
+cleaner then loops through the list of Secrets in the cluster and evaluates each
+Secret that has the type `kubernetes.io/service-account-token`.
+
+If a Secret meets all of the following conditions, the cleaner marks it as
+invalid:
+
+- The Secret is auto-generated, meaning that it is bi-directionally referenced
+  by a ServiceAccount.
+- The Secret is not currently mounted by any pods.
+- The Secret has not been used in a *specified amount of time* since it was
+  created or since it was last used.
+
+The cleaner marks a Secret invalid by adding a label called
+`kubernetes.io/legacy-token-invalid-since` to the Secret, with the current date
+as the value. If an invalid Secret is not used in a *specified amount of time*,
+the cleaner will delete it.
+
+{{< note >}}
+Each *specified amount of time* above defaults to one year. The cluster
+administrator can configure this value through the
+`--legacy-service-account-token-clean-up-period` command line argument for the
+`kube-controller-manager` component.
+{{< /note >}}
+
 ### TokenRequest API
 
 {{< feature-state for_k8s_version="v1.22" state="stable" >}}
 
@@ -300,6 +399,12 @@ token: ...
 
 If you launch a new Pod into the `examplens` namespace, it can use the `myserviceaccount`
 service-account-token Secret that you just created.
 
+{{< caution >}}
+Do not reference manually created Secrets in the `secrets` field of a
+ServiceAccount. Otherwise, the manually created Secrets will be cleaned up if they are not
+used for a long time. Refer to
+[auto-generated legacy ServiceAccount token clean up](#auto-generated-legacy-serviceaccount-token-clean-up).
+{{< /caution >}} + ## Delete/invalidate a ServiceAccount token {#delete-token} If you know the name of the Secret that contains the token you want to remove: diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 75cf88333d0b5..6d3c129d28f54 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -120,6 +120,9 @@ In the following table: | `CronJobControllerV2` | `false` | Alpha | 1.20 | 1.20 | | `CronJobControllerV2` | `true` | Beta | 1.21 | 1.21 | | `CronJobControllerV2` | `true` | GA | 1.22 | 1.23 | +| `CronJobTimeZone` | `false` | Alpha | 1.24 | 1.24 | +| `CronJobTimeZone` | `true` | Beta | 1.25 | 1.26 | +| `CronJobTimeZone` | `true` | GA | 1.27 | 1.28 | | `CustomPodDNS` | `false` | Alpha | 1.9 | 1.9 | | `CustomPodDNS` | `true` | Beta| 1.10 | 1.13 | | `CustomPodDNS` | `true` | GA | 1.14 | 1.16 | @@ -153,6 +156,10 @@ In the following table: | `DisableAcceleratorUsageMetrics` | `false` | Alpha | 1.19 | 1.19 | | `DisableAcceleratorUsageMetrics` | `true` | Beta | 1.20 | 1.24 | | `DisableAcceleratorUsageMetrics` | `true` | GA | 1.25 | 1.27 | +| `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 | +| `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 | +| `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 | +| `DownwardAPIHugePages` | `true` | GA | 1.27 | 1.28 | | `DryRun` | `false` | Alpha | 1.12 | 1.12 | | `DryRun` | `true` | Beta | 1.13 | 1.18 | | `DryRun` | `true` | GA | 1.19 | 1.27 | @@ -200,6 +207,9 @@ In the following table: | `ExternalPolicyForExternalIP` | `true` | GA | 1.18 | 1.22 | | `GCERegionalPersistentDisk` | `true` | Beta | 1.10 | 1.12 | | `GCERegionalPersistentDisk` | `true` | GA | 1.13 | 1.16 | +| `GRPCContainerProbe` | `false` | Alpha | 1.23 | 1.23 | +| `GRPCContainerProbe` | `true` | Beta | 1.24 | 1.26 | +| `GRPCContainerProbe` | `true` | GA | 1.27 | 1.28 | | `GenericEphemeralVolume` | `false` | Alpha | 1.19 | 1.20 | | `GenericEphemeralVolume` | `true` | Beta | 1.21 | 1.22 | | `GenericEphemeralVolume` | `true` | GA | 1.23 | 1.24 | @@ -228,6 +238,8 @@ In the following table: | `IngressClassNamespacedParams` | `true` | GA | 1.23 | 1.24 | | `Initializers` | `false` | Alpha | 1.7 | 1.13 | | `Initializers` | - | Deprecated | 1.14 | 1.14 | +| `JobMutableNodeSchedulingDirectives` | `true` | Beta | 1.23 | 1.26 | +| `JobMutableNodeSchedulingDirectives` | `true` | GA | 1.27 | 1.28 | | `KMSv1` | `true` | Deprecated | 1.28 | | | `KubeletConfigFile` | `false` | Alpha | 1.8 | 1.9 | | `KubeletConfigFile` | - | Deprecated | 1.10 | 1.10 | @@ -240,6 +252,8 @@ In the following table: | `LegacyNodeRoleBehavior` | `false` | Alpha | 1.16 | 1.18 | | `LegacyNodeRoleBehavior` | `true` | Beta | 1.19 | 1.20 | | `LegacyNodeRoleBehavior` | `false` | GA | 1.21 | 1.22 | +| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | Beta | 1.24 | 1.25 | +| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | 1.28 | | `LocalStorageCapacityIsolation` | `false` | Alpha | 1.7 | 1.9 | | `LocalStorageCapacityIsolation` | `true` | Beta | 1.10 | 1.24 | | `LocalStorageCapacityIsolation` | `true` | GA | 1.25 | 1.26 | @@ -303,6 +317,9 @@ In the following table: | `ResourceQuotaScopeSelectors` | `false` | Alpha | 1.11 | 1.11 | | `ResourceQuotaScopeSelectors` | `true` | Beta | 1.12 | 1.16 | | `ResourceQuotaScopeSelectors` | `true` | GA | 1.17 | 1.18 | 
+| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 | +| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | 1.27 | +| `RetroactiveDefaultStorageClass` | `true` | GA | 1.28 | 1.28 | | `RootCAConfigMap` | `false` | Alpha | 1.13 | 1.19 | | `RootCAConfigMap` | `true` | Beta | 1.20 | 1.20 | | `RootCAConfigMap` | `true` | GA | 1.21 | 1.22 | @@ -393,6 +410,9 @@ In the following table: | `TokenRequestProjection` | `false` | Alpha | 1.11 | 1.11 | | `TokenRequestProjection` | `true` | Beta | 1.12 | 1.19 | | `TokenRequestProjection` | `true` | GA | 1.20 | 1.21 | +| `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | +| `TopologyManager` | `true` | Beta | 1.18 | 1.26 | +| `TopologyManager` | `true` | GA | 1.27 | 1.28 | | `UserNamespacesStatelessPodsSupport` | `false` | Alpha | 1.25 | 1.27 | | `ValidateProxyRedirects` | `false` | Alpha | 1.12 | 1.13 | | `ValidateProxyRedirects` | `true` | Beta | 1.14 | 1.21 | @@ -591,10 +611,6 @@ In the following table: [Configure volume permission and ownership change policy for Pods](/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods) for more details. -- `CronJobControllerV2`: Use an alternative implementation of the - {{< glossary_tooltip text="CronJob" term_id="cronjob" >}} controller. Otherwise, - version 1 of the same controller is selected. - - `ControllerManagerLeaderMigration`: Enables Leader Migration for [kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and [cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) @@ -602,6 +618,12 @@ In the following table: controllers from the kube-controller-manager into an external controller-manager (e.g. the cloud-controller-manager) in an HA cluster without downtime. +- `CronJobControllerV2`: Use an alternative implementation of the + {{< glossary_tooltip text="CronJob" term_id="cronjob" >}} controller. Otherwise, + version 1 of the same controller is selected. + +- `CronJobTimeZone`: Allow the use of the `timeZone` optional field in [CronJobs](/docs/concepts/workloads/controllers/cron-jobs/) + - `CustomPodDNS`: Enable customizing the DNS settings for a Pod using its `dnsConfig` property. Check [Pod's DNS Config](/docs/concepts/services-networking/dns-pod-service/#pods-dns-config) for more details. @@ -636,6 +658,9 @@ In the following table: - `DisableAcceleratorUsageMetrics`: [Disable accelerator metrics collected by the kubelet](/docs/concepts/cluster-administration/system-metrics/#disable-accelerator-metrics). +- `DownwardAPIHugePages`: Enables usage of hugepages in + [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information). + - `DryRun`: Enable server-side [dry run](/docs/reference/using-api/api-concepts/#dry-run) requests so that validation, merging, and mutation can be tested without committing. @@ -695,6 +720,9 @@ In the following table: - `GCERegionalPersistentDisk`: Enable the regional PD feature on GCE. +- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. + See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe). 
+ - `GenericEphemeralVolume`: Enables ephemeral, inline volumes that support all features of normal volumes (can be provided by third-party storage vendors, storage capacity tracking, restore from snapshot, etc.). @@ -731,6 +759,9 @@ In the following table: - `Initializers`: Allow asynchronous coordination of object creation using the Initializers admission plugin. +- `JobMutableNodeSchedulingDirectives`: Allows updating node scheduling directives in + the pod template of [Job](/docs/concepts/workloads/controllers/job). + - `KubeletConfigFile`: Enable loading kubelet configuration from a file specified using a config file. See [setting kubelet parameters via a config file](/docs/tasks/administer-cluster/kubelet-config-file/) @@ -746,6 +777,9 @@ In the following table: node disruption will ignore the `node-role.kubernetes.io/master` label in favor of the feature-specific labels provided by `NodeDisruptionExclusion` and `ServiceNodeExclusion`. +- `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based + [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). + - `LocalStorageCapacityIsolation`: Enable the consumption of [local ephemeral storage](/docs/concepts/configuration/manage-resources-containers/) and also the `sizeLimit` property of an @@ -818,6 +852,8 @@ In the following table: - `ResourceQuotaScopeSelectors`: Enable resource quota scope selectors. +- `RetroactiveDefaultStorageClass`: Allow assigning StorageClass to unbound PVCs retroactively. + - `RootCAConfigMap`: Configure the `kube-controller-manager` to publish a {{< glossary_tooltip text="ConfigMap" term_id="configmap" >}} named `kube-root-ca.crt` to every namespace. This ConfigMap contains a CA bundle used for verifying connections @@ -920,6 +956,10 @@ In the following table: - `TokenRequestProjection`: Enable the injection of service account tokens into a Pod through a [`projected` volume](/docs/concepts/storage/volumes/#projected). +- `TopologyManager`: Enable a mechanism to coordinate fine-grained hardware resource + assignments for different components in Kubernetes. See + [Control Topology Management Policies on a node](/docs/tasks/administer-cluster/topology-manager/). + - `UserNamespacesStatelessPodsSupport`: Enable user namespace support for stateless Pods. This flag was renamed on newer releases to `UserNamespacesSupport`. 
- `ValidateProxyRedirects`: This flag controls whether the API server should validate that redirects diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 1c25dcae011fb..3f118e64a8799 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -55,8 +55,6 @@ For a reference to old feature gates that are removed, please refer to | Feature | Default | Stage | Since | Until | |---------|---------|-------|-------|-------| -| `APIListChunking` | `false` | Alpha | 1.8 | 1.8 | -| `APIListChunking` | `true` | Beta | 1.9 | | | `APIPriorityAndFairness` | `false` | Alpha | 1.18 | 1.19 | | `APIPriorityAndFairness` | `true` | Beta | 1.20 | | | `APIResponseCompression` | `false` | Alpha | 1.7 | 1.15 | @@ -79,12 +77,12 @@ For a reference to old feature gates that are removed, please refer to | `CRDValidationRatcheting` | `false` | Alpha | 1.28 | | | `CSIMigrationPortworx` | `false` | Alpha | 1.23 | 1.24 | | `CSIMigrationPortworx` | `false` | Beta | 1.25 | | -| `CSINodeExpandSecret` | `false` | Alpha | 1.25 | 1.26 | -| `CSINodeExpandSecret` | `true` | Beta | 1.27 | | | `CSIVolumeHealth` | `false` | Alpha | 1.21 | | | `CloudControllerManagerWebhook` | `false` | Alpha | 1.27 | | -| `CloudDualStackNodeIPs` | `false` | Alpha | 1.27 | | -| `ClusterTrustBundle` | `false` | Alpha | 1.27 | | +| `CloudDualStackNodeIPs` | `false` | Alpha | 1.27 | 1.28 | +| `CloudDualStackNodeIPs` | `true` | Beta | 1.29 | | +| `ClusterTrustBundle` | false | Alpha | 1.27 | | +| `ClusterTrustBundleProjection` | `false` | Alpha | 1.29 | | | `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | | `ComponentSLIs` | `true` | Beta | 1.27 | | | `ConsistentListFromCache` | `false` | Alpha | 1.28 | | @@ -93,11 +91,10 @@ For a reference to old feature gates that are removed, please refer to | `CronJobsScheduledAnnotation` | `true` | Beta | 1.28 | | | `CrossNamespaceVolumeDataSource` | `false` | Alpha| 1.26 | | | `CustomCPUCFSQuotaPeriod` | `false` | Alpha | 1.12 | | -| `CustomResourceValidationExpressions` | `false` | Alpha | 1.23 | 1.24 | -| `CustomResourceValidationExpressions` | `true` | Beta | 1.25 | | | `DevicePluginCDIDevices` | `false` | Alpha | 1.28 | | | `DisableCloudProviders` | `false` | Alpha | 1.22 | | | `DisableKubeletCloudCredentialProviders` | `false` | Alpha | 1.23 | | +| `DisableNodeKubeProxyVersion` | `false` | Alpha | 1.29 | | | `DynamicResourceAllocation` | `false` | Alpha | 1.26 | | | `ElasticIndexedJob` | `true` | Beta | 1.27 | | | `EventedPLEG` | `false` | Alpha | 1.26 | 1.26 | @@ -118,15 +115,12 @@ For a reference to old feature gates that are removed, please refer to | `InTreePluginOpenStackUnregister` | `false` | Alpha | 1.21 | | | `InTreePluginPortworxUnregister` | `false` | Alpha | 1.23 | | | `InTreePluginvSphereUnregister` | `false` | Alpha | 1.21 | | -| `JobBackoffLimitPerIndex` | `false` | Alpha | 1.28 | | +| `JobBackoffLimitPerIndex` | `false` | Alpha | 1.28 | 1.28 | +| `JobBackoffLimitPerIndex` | `true` | Beta | 1.29 | | | `JobPodFailurePolicy` | `false` | Alpha | 1.25 | 1.25 | | `JobPodFailurePolicy` | `true` | Beta | 1.26 | | -| `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | | -| `JobReadyPods` | `false` | Alpha | 1.23 | 1.23 | -| `JobReadyPods` | `true` | Beta | 1.24 | | -| `KMSv2` | `false` | Alpha | 1.25 | 1.26 | -| `KMSv2` | `true` | Beta | 1.27 | | -| `KMSv2KDF` | `false` | Beta | 1.28 | 
| +| `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | 1.28 | +| `JobPodReplacementPolicy` | `true` | Beta | 1.29 | | | `KubeProxyDrainingTerminatingNodes` | `false` | Alpha | 1.28 | | | `KubeletCgroupDriverFromCRI` | `false` | Alpha | 1.28 | | | `KubeletInUserNamespace` | `false` | Alpha | 1.22 | | @@ -134,12 +128,15 @@ For a reference to old feature gates that are removed, please refer to | `KubeletPodResourcesGet` | `false` | Alpha | 1.27 | | | `KubeletTracing` | `false` | Alpha | 1.25 | 1.26 | | `KubeletTracing` | `true` | Beta | 1.27 | | -| `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | | +| `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | 1.28 | +| `LegacyServiceAccountTokenCleanUp` | `true` | Beta | 1.29 | | +| `LoadBalancerIPMode` | `false` | Alpha | 1.29 | | | `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | - | | `LogarithmicScaleDown` | `false` | Alpha | 1.21 | 1.21 | | `LogarithmicScaleDown` | `true` | Beta | 1.22 | | | `LoggingAlphaOptions` | `false` | Alpha | 1.24 | - | | `LoggingBetaOptions` | `true` | Beta | 1.24 | - | +| `MatchLabelKeysInPodAffinity` | `false` | Alpha | 1.29 | - | | `MatchLabelKeysInPodTopologySpread` | `false` | Alpha | 1.25 | 1.26 | | `MatchLabelKeysInPodTopologySpread` | `true` | Beta | 1.27 | - | | `MaxUnavailableStatefulSet` | `false` | Alpha | 1.24 | | @@ -149,7 +146,6 @@ For a reference to old feature gates that are removed, please refer to | `MinDomainsInPodTopologySpread` | `false` | Alpha | 1.24 | 1.24 | | `MinDomainsInPodTopologySpread` | `false` | Beta | 1.25 | 1.26 | | `MinDomainsInPodTopologySpread` | `true` | Beta | 1.27 | | -| `MultiCIDRRangeAllocator` | `false` | Alpha | 1.25 | | | `MultiCIDRServiceAllocator` | `false` | Alpha | 1.27 | | | `NewVolumeManagerReconstruction` | `false` | Beta | 1.27 | 1.27 | | `NewVolumeManagerReconstruction` | `true` | Beta | 1.28 | | @@ -162,37 +158,44 @@ For a reference to old feature gates that are removed, please refer to | `OpenAPIEnums` | `true` | Beta | 1.24 | | | `PDBUnhealthyPodEvictionPolicy` | `false` | Alpha | 1.26 | 1.26 | | `PDBUnhealthyPodEvictionPolicy` | `true` | Beta | 1.27 | | -| `PersistentVolumeLastPhaseTransistionTime` | `false` | Alpha | 1.28 | | +| `PersistentVolumeLastPhaseTransistionTime` | `false` | Alpha | 1.28 | 1.28 | +| `PersistentVolumeLastPhaseTransistionTime` | `true` | Beta | 1.29 | | | `PodAndContainerStatsFromCRI` | `false` | Alpha | 1.23 | | | `PodDeletionCost` | `false` | Alpha | 1.21 | 1.21 | | `PodDeletionCost` | `true` | Beta | 1.22 | | | `PodDisruptionConditions` | `false` | Alpha | 1.25 | 1.25 | | `PodDisruptionConditions` | `true` | Beta | 1.26 | | -| `PodHostIPs` | `false` | Alpha | 1.28 | | +| `PodHostIPs` | `false` | Alpha | 1.28 | 1.28 | +| `PodHostIPs` | `true` | Beta | 1.29 | | | `PodIndexLabel` | `true` | Beta | 1.28 | | -| `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | | +| `PodLifecycleSleepAction` | `false` | Alpha | 1.29 | | +| `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | 1.28 | +| `PodReadyToStartContainersCondition` | `true` | Beta | 1.29 | | | `PodSchedulingReadiness` | `false` | Alpha | 1.26 | 1.26 | | `PodSchedulingReadiness` | `true` | Beta | 1.27 | | | `ProcMountType` | `false` | Alpha | 1.12 | | | `QOSReserved` | `false` | Alpha | 1.11 | | -| `ReadWriteOncePod` | `false` | Alpha | 1.22 | 1.26 | -| `ReadWriteOncePod` | `true` | Beta | 1.27 | | | `RecoverVolumeExpansionFailure` | `false` | Alpha | 1.23 | | | `RemainingItemCount` | `false` | Alpha | 1.15 | 
1.15 | | `RemainingItemCount` | `true` | Beta | 1.16 | | | `RotateKubeletServerCertificate` | `false` | Alpha | 1.7 | 1.11 | | `RotateKubeletServerCertificate` | `true` | Beta | 1.12 | | +| `RuntimeClassInImageCriApi` | `false` | Alpha | 1.29 | | | `SELinuxMountReadWriteOncePod` | `false` | Alpha | 1.25 | 1.26 | | `SELinuxMountReadWriteOncePod` | `false` | Beta | 1.27 | 1.27 | | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | | -| `SchedulerQueueingHints` | `true` | Beta | 1.28 | | +| `SchedulerQueueingHints` | `true` | Beta | 1.28 | 1.28 | +| `SchedulerQueueingHints` | `false` | Beta | 1.29 | | | `SecurityContextDeny` | `false` | Alpha | 1.27 | | -| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 | -| `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | | -| `SidecarContainers` | `false` | Alpha | 1.28 | | +| `SeparateTaintEvictionController` | `true` | Beta | 1.29 | | +| `ServiceAccountTokenJTI` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenNodeBinding` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenNodeBindingValidation` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenPodNodeInfo` | `false` | Alpha | 1.29 | | +| `SidecarContainers` | `false` | Alpha | 1.28 | 1.28 | +| `SidecarContainers` | `true` | Beta | 1.29 | | | `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 | | `SizeMemoryBackedVolumes` | `true` | Beta | 1.22 | | -| `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | | | `StableLoadBalancerNodeSet` | `true` | Beta | 1.27 | | | `StatefulSetAutoDeletePVC` | `false` | Alpha | 1.23 | 1.26 | | `StatefulSetAutoDeletePVC` | `false` | Beta | 1.27 | | @@ -209,12 +212,16 @@ For a reference to old feature gates that are removed, please refer to | `TopologyManagerPolicyBetaOptions` | `true` | Beta | 1.28 | | | `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | 1.27 | | `TopologyManagerPolicyOptions` | `true` | Beta | 1.28 | | +| `TranslateStreamCloseWebsocketRequests` | `false` | Alpha | 1.29 | | | `UnauthenticatedHTTP2DOSMitigation` | `false` | Beta | 1.28 | | +| `UnauthenticatedHTTP2DOSMitigation` | `true` | Beta | 1.29 | | | `UnknownVersionInteroperabilityProxy` | `false` | Alpha | 1.28 | | +| `UserNamespacesPodSecurityStandards` | `false` | Alpha | 1.29 | | | `UserNamespacesSupport` | `false` | Alpha | 1.28 | | | `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | 1.27 | | `ValidatingAdmissionPolicy` | `false` | Beta | 1.28 | | | `VolumeCapacityPriority` | `false` | Alpha | 1.21 | | +| `VolumeAttributesClass` | `false` | Alpha | 1.29 | | | `WatchList` | `false` | Alpha | 1.27 | | | `WinDSR` | `false` | Alpha | 1.14 | | | `WinOverlay` | `false` | Alpha | 1.14 | 1.19 | @@ -228,6 +235,9 @@ For a reference to old feature gates that are removed, please refer to | Feature | Default | Stage | Since | Until | |---------|---------|-------|-------|-------| +| `APIListChunking` | `false` | Alpha | 1.8 | 1.8 | +| `APIListChunking` | `true` | Beta | 1.9 | 1.28 | +| `APIListChunking` | `true` | GA | 1.29 | - | | `APISelfSubjectReview` | `false` | Alpha | 1.26 | 1.26 | | `APISelfSubjectReview` | `true` | Beta | 1.27 | 1.27 | | `APISelfSubjectReview` | `true` | GA | 1.28 | - | @@ -244,18 +254,20 @@ For a reference to old feature gates that are removed, please refer to | `CSIMigrationvSphere` | `false` | Beta | 1.19 | 1.24 | | `CSIMigrationvSphere` | `true` | Beta | 1.25 | 1.25 | | `CSIMigrationvSphere` | `true` | GA | 1.26 | - | +| `CSINodeExpandSecret` | `false` | Alpha | 1.25 | 1.26 | +| `CSINodeExpandSecret` | `true` | Beta | 1.27 | 
1.28 | +| `CSINodeExpandSecret` | `true` | GA | 1.29 | | +| `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | +| `ComponentSLIs` | `true` | Beta | 1.27 | 1.28| +| `ComponentSLIs` | `true` | GA | 1.29 | - | | `ConsistentHTTPGetHandlers` | `true` | GA | 1.25 | - | -| `CronJobTimeZone` | `false` | Alpha | 1.24 | 1.24 | -| `CronJobTimeZone` | `true` | Beta | 1.25 | 1.26 | -| `CronJobTimeZone` | `true` | GA | 1.27 | - | +| `CustomResourceValidationExpressions` | `false` | Alpha | 1.23 | 1.24 | +| `CustomResourceValidationExpressions` | `true` | Beta | 1.25 | 1.28 | +| `CustomResourceValidationExpressions` | `true` | GA | 1.29 | - | | `DaemonSetUpdateSurge` | `false` | Alpha | 1.21 | 1.21 | | `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 | | `DaemonSetUpdateSurge` | `true` | GA | 1.25 | | | `DefaultHostNetworkHostPortsInPodTemplates` | `false` | Deprecated | 1.28 | | -| `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 | -| `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 | -| `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 | -| `DownwardAPIHugePages` | `true` | GA | 1.27 | | | `EfficientWatchResumption` | `false` | Alpha | 1.20 | 1.20 | | `EfficientWatchResumption` | `true` | Beta | 1.21 | 1.23 | | `EfficientWatchResumption` | `true` | GA | 1.24 | | @@ -265,29 +277,31 @@ For a reference to old feature gates that are removed, please refer to | `ExpandedDNSConfig` | `true` | GA | 1.28 | | | `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | 1.27 | | `ExperimentalHostUserNamespaceDefaulting` | `false` | Deprecated | 1.28 | | -| `GRPCContainerProbe` | `false` | Alpha | 1.23 | 1.23 | -| `GRPCContainerProbe` | `true` | Beta | 1.24 | 1.26 | -| `GRPCContainerProbe` | `true` | GA | 1.27 | | | `IPTablesOwnershipCleanup` | `false` | Alpha | 1.25 | 1.26 | | `IPTablesOwnershipCleanup` | `true` | Beta | 1.27 | 1.27 | | `IPTablesOwnershipCleanup` | `true` | GA | 1.28 | | | `InTreePluginRBDUnregister` | `false` | Alpha | 1.23 | 1.27 | | `InTreePluginRBDUnregister` | `false` | Deprecated | 1.28 | | -| `JobMutableNodeSchedulingDirectives` | `true` | Beta | 1.23 | 1.26 | -| `JobMutableNodeSchedulingDirectives` | `true` | GA | 1.27 | | +| `JobReadyPods` | `false` | Alpha | 1.23 | 1.23 | +| `JobReadyPods` | `true` | Beta | 1.24 | 1.28 | +| `JobReadyPods` | `true` | GA | 1.29 | | | `JobTrackingWithFinalizers` | `false` | Alpha | 1.22 | 1.22 | | `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 | | `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 | | `JobTrackingWithFinalizers` | `true` | GA | 1.26 | | -| `KMSv1` | `true` | Deprecated | 1.28 | | +| `KMSv1` | `true` | Deprecated | 1.28 | 1.28 | +| `KMSv1` | `false` | Deprecated | 1.29 | | +| `KMSv2` | `false` | Alpha | 1.25 | 1.26 | +| `KMSv2` | `true` | Beta | 1.27 | 1.28 | +| `KMSv2` | `true` | GA | 1.29 | | +| `KMSv2KDF` | `false` | Beta | 1.28 | 1.28 | +| `KMSv2KDF` | `true` | GA | 1.29 | | | `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 | | `KubeletPodResources` | `true` | Beta | 1.15 | 1.27 | | `KubeletPodResources` | `true` | GA | 1.28 | | | `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 | | `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | 1.27 | | `KubeletPodResourcesGetAllocatable` | `true` | GA | 1.28 | | -| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | Beta | 1.24 | 1.25 | -| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | | | `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.26 | 1.26 | | `LegacyServiceAccountTokenTracking` | 
`true` | Beta | 1.27 | 1.27 | | `LegacyServiceAccountTokenTracking` | `true` | GA | 1.28 | | @@ -307,12 +321,12 @@ For a reference to old feature gates that are removed, please refer to | `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | 1.25 | | `ProxyTerminatingEndpoints` | `true` | Beta | 1.26 | 1.27 | | `ProxyTerminatingEndpoints` | `true` | GA | 1.28 | | +| `ReadWriteOncePod` | `false` | Alpha | 1.22 | 1.26 | +| `ReadWriteOncePod` | `true` | Beta | 1.27 | 1.28 | +| `ReadWriteOncePod` | `true` | GA | 1.29 | | | `RemoveSelfLink` | `false` | Alpha | 1.16 | 1.19 | | `RemoveSelfLink` | `true` | Beta | 1.20 | 1.23 | | `RemoveSelfLink` | `true` | GA | 1.24 | | -| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 | -| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | 1.27 | -| `RetroactiveDefaultStorageClass` | `true` | GA | 1.28 | | | `SeccompDefault` | `false` | Alpha | 1.22 | 1.24 | | `SeccompDefault` | `true` | Beta | 1.25 | 1.26 | | `SeccompDefault` | `true` | GA | 1.27 | - | @@ -322,9 +336,17 @@ For a reference to old feature gates that are removed, please refer to | `ServerSideFieldValidation` | `false` | Alpha | 1.23 | 1.24 | | `ServerSideFieldValidation` | `true` | Beta | 1.25 | 1.26 | | `ServerSideFieldValidation` | `true` | GA | 1.27 | - | -| `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | -| `TopologyManager` | `true` | Beta | 1.18 | 1.26 | -| `TopologyManager` | `true` | GA | 1.27 | - | +| `ServiceIPStaticSubrange` | `false` | Alpha | 1.24 | 1.24 | +| `ServiceIPStaticSubrange` | `true` | Beta | 1.25 | 1.25 | +| `ServiceIPStaticSubrange` | `true` | GA | 1.26 | - | +| `ServiceInternalTrafficPolicy` | `false` | Alpha | 1.21 | 1.21 | +| `ServiceInternalTrafficPolicy` | `true` | Beta | 1.22 | 1.25 | +| `ServiceInternalTrafficPolicy` | `true` | GA | 1.26 | - | +| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 | +| `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | 1.28 | +| `ServiceNodePortStaticSubrange` | `true` | GA | 1.29 | - | +| `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | 1.28 | +| `SkipReadOnlyValidationGCE` | `true` | Deprecated | 1.29 | | | `WatchBookmark` | `false` | Alpha | 1.15 | 1.15 | | `WatchBookmark` | `true` | Beta | 1.16 | 1.16 | | `WatchBookmark` | `true` | GA | 1.17 | - | @@ -435,7 +457,8 @@ Each feature gate is designed for enabling/disabling a specific feature: - `CloudDualStackNodeIPs`: Enables dual-stack `kubelet --node-ip` with external cloud providers. See [Configure IPv4/IPv6 dual-stack](/docs/concepts/services-networking/dual-stack/#configure-ipv4-ipv6-dual-stack) for more details. -- `ClusterTrustBundle`: Enable ClusterTrustBundle objects and kubelet integration. +- `ClusterTrustBundle`: Enable ClusterTrustBundle objects. +- `ClusterTrustBundleProjection`: [`clusterTrustBundle` projected volume sources](/docs/concepts/storage/projected-volumes#clustertrustbundle). - `ComponentSLIs`: Enable the `/metrics/slis` endpoint on Kubernetes components like kubelet, kube-scheduler, kube-proxy, kube-controller-manager, cloud-controller-manager allowing you to scrape health check metrics. @@ -477,8 +500,7 @@ Each feature gate is designed for enabling/disabling a specific feature: component flag. - `DisableKubeletCloudCredentialProviders`: Disable the in-tree functionality in kubelet to authenticate to a cloud provider container registry for image pull credentials. 
-- `DownwardAPIHugePages`: Enables usage of hugepages in - [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information). +- `DisableNodeKubeProxyVersion`: Disable setting the `kubeProxyVersion` field of the Node. - `DynamicResourceAllocation`: Enables support for resources with custom parameters and a lifecycle that is independent of a Pod. - `ElasticIndexedJob`: Enables Indexed Jobs to be scaled up or down by mutating both @@ -602,15 +624,19 @@ Each feature gate is designed for enabling/disabling a specific feature: - `KubeletTracing`: Add support for distributed tracing in the kubelet. When enabled, kubelet CRI interface and authenticated http servers are instrumented to generate OpenTelemetry trace spans. - See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces/) - for more details. + See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details. - `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). -- `LegacyServiceAccountTokenCleanUp`: Enable cleaning up Secret-based +- `LegacyServiceAccountTokenCleanUp`: Enable invalidating auto-generated Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token) - when they are not used in a specified time (default to be one year). + when they have not been used in a specified time (defaults to one year). Clean up + the auto-generated Secret-based tokens if they have been invalidated for a specified time + (defaults to one year). - `LegacyServiceAccountTokenTracking`: Track usage of Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). +- `LoadBalancerIPMode`: Allows setting `ipMode` for Services where `type` is set to `LoadBalancer`. + See [Specifying IPMode of load balancer status](/docs/concepts/services-networking/service/#load-balancer-ip-mode) + for more information. - `LocalStorageCapacityIsolationFSQuotaMonitoring`: When `LocalStorageCapacityIsolation` is enabled for [local ephemeral storage](/docs/concepts/configuration/manage-resources-containers/) @@ -622,6 +648,8 @@ Each feature gate is designed for enabling/disabling a specific feature: based on logarithmic bucketing of pod timestamps. - `LoggingAlphaOptions`: Allow fine-tuing of experimental, alpha-quality logging options. - `LoggingBetaOptions`: Allow fine-tuing of experimental, beta-quality logging options. +- `MatchLabelKeysInPodAffinity`: Enable the `matchLabelKeys` and `mismatchLabelKeys` field for + [pod (anti)affinity](/docs/concepts/scheduling-eviction/assign-pod-node/). - `MatchLabelKeysInPodTopologySpread`: Enable the `matchLabelKeys` field for [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/). - `MaxUnavailableStatefulSet`: Enables setting the `maxUnavailable` field for the @@ -636,8 +664,8 @@ Each feature gate is designed for enabling/disabling a specific feature: [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/). - `MinimizeIPTablesRestore`: Enables new performance improvement logics in the kube-proxy iptables mode. -- `MultiCIDRRangeAllocator`: Enables the MultiCIDR range allocator. -- `MultiCIDRServiceAllocator`: Track IP address allocations for Service cluster IPs using IPAddress objects. 
+- `MultiCIDRServiceAllocator`: Allows dynamically configuring the cluster Service IP ranges using
+  ServiceCIDR objects and tracking IP address allocations for Service cluster IPs using IPAddress objects.
- `NewVolumeManagerReconstruction`: Enables improved discovery of mounted volumes during kubelet startup. Since this code has been significantly refactored, we allow opting out in case the kubelet gets stuck at startup or is not unmounting volumes from terminated Pods. Note that this
@@ -678,10 +706,8 @@ Each feature gate is designed for enabling/disabling a specific feature:
  the pod is being deleted due to a disruption.
- `PodHostIPs`: Enable the `status.hostIPs` field for pods and the {{< glossary_tooltip term_id="downward-api" text="downward API" >}}. The field lets you expose host IP addresses to workloads.
-- `PodIndexLabel`: Enables the Job controller and StatefulSet controller to add the pod index as a label
-  when creating new pods. See [Job completion mode docs](/docs/concepts/workloads/controllers/job/#completion-mode)
-  and [StatefulSet pod index label docs](/docs/concepts/workloads/controllers/statefulset/#pod-index-label)
-  for more details.
+- `PodIndexLabel`: Enables the Job controller and StatefulSet controller to add the pod index as a label when creating new pods. See [Job completion mode docs](/docs/concepts/workloads/controllers/job#completion-mode) and [StatefulSet pod index label docs](/docs/concepts/workloads/controllers/statefulset/#pod-index-label) for more details.
+- `PodLifecycleSleepAction`: Enables the `sleep` action in Container lifecycle hooks.
- `PodReadyToStartContainersCondition`: Enable the kubelet to mark the [PodReadyToStartContainers](/docs/concepts/workloads/pods/pod-lifecycle/#pod-has-network) condition on pods. This was previously (1.25-1.27) known as `PodHasNetworkCondition`.
- `PodSchedulingReadiness`: Enable setting `schedulingGates` field to control a Pod's
@@ -710,10 +736,11 @@ Each feature gate is designed for enabling/disabling a specific feature:
  objects and collections. This field has been deprecated since the Kubernetes v1.16 release. When this feature is enabled, the `.metadata.selfLink` field remains part of the Kubernetes API, but is always unset.
-- `RetroactiveDefaultStorageClass`: Allow assigning StorageClass to unbound PVCs retroactively.
- `RotateKubeletServerCertificate`: Enable the rotation of the server TLS certificate on the kubelet. See [kubelet configuration](/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#kubelet-configuration) for more details.
+- `RuntimeClassInImageCriApi`: Enables images to be pulled based on the
+  [runtime class](/docs/concepts/containers/runtime-class/) of the pods that reference them.
- `SELinuxMountReadWriteOncePod`: Speeds up container startup by allowing kubelet to mount volumes for a Pod directly with the correct SELinux label instead of changing each file on the volumes recursively. The initial implementation focused on ReadWriteOncePod volumes.
@@ -726,11 +753,22 @@ Each feature gate is designed for enabling/disabling a specific feature:
  for all workloads. The seccomp profile is specified in the `securityContext` of a Pod and/or a Container.
- `SecurityContextDeny`: This gate signals that the `SecurityContextDeny` admission controller is deprecated.
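To make the `PodLifecycleSleepAction` entry above concrete, here is a minimal sketch of a Pod that uses the `sleep` hook; it assumes a v1.29 cluster with the `PodLifecycleSleepAction` feature gate enabled, and the name, image, and duration are placeholders:

```shell
# Hypothetical example: delay shutdown by sleeping 5 seconds in a preStop hook.
# Requires the PodLifecycleSleepAction feature gate to be enabled.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: sleep-hook-demo
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    lifecycle:
      preStop:
        sleep:
          seconds: 5
EOF
```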
+- `SeparateTaintEvictionController`: Enables running `TaintEvictionController`,
+  which performs [Taint-based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions),
+  in a controller separate from `NodeLifecycleController`. When this feature is
+  enabled, users can optionally disable Taint-based Eviction by setting the
+  `--controllers=-taint-eviction-controller` flag on the `kube-controller-manager`.
- `ServerSideApply`: Enables the [Server Side Apply (SSA)](/docs/reference/using-api/server-side-apply/) feature on the API Server.
- `ServerSideFieldValidation`: Enables server-side field validation. This means the validation of resource schema is performed at the API server side rather than the client side (for example, the `kubectl create` or `kubectl apply` command line).
+- `ServiceAccountTokenJTI`: Controls whether JTIs (UUIDs) are embedded into generated service account tokens,
+  and whether these JTIs are recorded into the Kubernetes audit log for future requests made by these tokens.
+- `ServiceAccountTokenNodeBinding`: Controls whether the apiserver allows binding service account tokens to Node objects.
+- `ServiceAccountTokenNodeBindingValidation`: Controls whether the apiserver will validate a Node reference in service account tokens.
+- `ServiceAccountTokenPodNodeInfo`: Controls whether the apiserver embeds the node name and uid
+  for the associated node when issuing service account tokens bound to Pod objects.
- `ServiceNodePortStaticSubrange`: Enables the use of different port allocation strategies for NodePort Services. For more details, see [reserve NodePort ranges to avoid collisions](/docs/concepts/services-networking/service/#avoid-nodeport-collisions).
@@ -767,18 +805,28 @@ Each feature gate is designed for enabling/disabling a specific feature:
  This feature gate guards *a group* of topology manager options whose quality level is beta. This feature gate will never graduate to stable.
- `TopologyManagerPolicyOptions`: Allow fine-tuning of topology manager policies.
+- `TranslateStreamCloseWebsocketRequests`: Allow WebSocket streaming of the
+  remote command sub-protocol (`exec`, `cp`, `attach`) from clients requesting
+  version 5 (v5) of the sub-protocol.
- `UnauthenticatedHTTP2DOSMitigation`: Enables HTTP/2 Denial of Service (DoS)
-  mitigations for unauthenticated clients.
-  Kubernetes v1.28.0 through v1.28.2 do not include this feature gate.
+  mitigations for unauthenticated clients.
+  Kubernetes v1.28.0 through v1.28.2 do not include this feature gate.
- `UnknownVersionInteroperabilityProxy`: Proxy resource requests to the correct peer kube-apiserver when multiple kube-apiservers exist at varied versions. See [Mixed version proxy](/docs/concepts/architecture/mixed-version-proxy/) for more information.
+- `UserNamespacesPodSecurityStandards`: Enable relaxation of Pod Security Standards policies for pods
+  that run with user namespaces. You must set the value of this feature gate consistently across all nodes in
+  your cluster, and you must also enable `UserNamespacesSupport` to use this feature.
+  See [User Namespaces](/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-admission-checks) for more details.
- `UserNamespacesSupport`: Enable user namespace support for Pods. Before Kubernetes v1.28, this feature gate was named `UserNamespacesStatelessPodsSupport`.
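The `SeparateTaintEvictionController` entry above points at the `--controllers` flag; as a minimal sketch of disabling taint-based eviction on the controller manager (the `*` keeps every other default controller enabled, and the rest of the command line is omitted):

```shell
# Illustrative sketch only: keep all default controllers except taint eviction.
kube-controller-manager --controllers='*,-taint-eviction-controller'
```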
- `ValidatingAdmissionPolicy`: Enable [ValidatingAdmissionPolicy](/docs/reference/access-authn-authz/validating-admission-policy/) support for CEL validations to be used in Admission Control.
- `VolumeCapacityPriority`: Enable support for prioritizing nodes in different topologies based on available PV capacity.
+- `VolumeAttributesClass`: Enable support for VolumeAttributesClasses.
+  See [Volume Attributes Classes](/docs/concepts/storage/volume-attributes-classes/)
+  for more information.
- `WatchBookmark`: Enable support for watch bookmark events.
- `WatchList`: Enable support for [streaming initial state of objects in watch requests](/docs/reference/using-api/api-concepts/#streaming-lists).
- `WinDSR`: Allows kube-proxy to create DSR loadbalancers for Windows.
diff --git a/content/en/docs/reference/instrumentation/metrics.md b/content/en/docs/reference/instrumentation/metrics.md
index 643191e5fdab0..0bc4338b60431 100644
--- a/content/en/docs/reference/instrumentation/metrics.md
+++ b/content/en/docs/reference/instrumentation/metrics.md
@@ -6,10 +6,10 @@ description: >-
  Details of the metric data that Kubernetes components export.
---
-## Metrics (v1.28)
+## Metrics (v1.29)
This page details the metrics that different Kubernetes components export. You can query the metrics endpoint for these components using an HTTP scrape, and fetch the current metrics data in Prometheus format.
@@ -17,3029 +17,3029 @@ components using an HTTP scrape, and fetch the current metrics data in Prometheu
Stable metrics observe strict API contracts and no labels can be added or removed from stable metrics during their lifetime.
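As the introduction above notes, these metrics are fetched with a plain HTTP scrape in Prometheus exposition format; one minimal way to look at the API server's endpoint, assuming your kubeconfig user is authorized to read the `/metrics` path, is:

```shell
# Sketch: scrape the kube-apiserver metrics endpoint and pick out one of the
# stable metrics documented below.
kubectl get --raw /metrics | grep '^apiserver_request_total' | head
```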
NameStability LevelTypeHelpLabelsConst LabelsDeprecated Version
apiserver_admission_controller_admission_duration_secondsSTABLEHistogramAdmission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
name
operation
rejected
type
apiserver_admission_step_admission_duration_secondsSTABLEHistogramAdmission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit).
operation
rejected
type
apiserver_admission_webhook_admission_duration_secondsSTABLEHistogramAdmission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
name
operation
rejected
type
apiserver_current_inflight_requestsSTABLEGaugeMaximal number of currently used inflight request limit of this apiserver per request kind in last second.
request_kind
apiserver_longrunning_requestsSTABLEGaugeGauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way.
component
group
resource
scope
subresource
verb
version
apiserver_request_duration_secondsSTABLEHistogramResponse latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.
component
dry_run
group
resource
scope
subresource
verb
version
apiserver_request_totalSTABLECounterCounter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
code
component
dry_run
group
resource
scope
subresource
verb
version
apiserver_requested_deprecated_apisSTABLEGaugeGauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release.
group
removed_release
resource
subresource
version
apiserver_response_sizesSTABLEHistogramResponse size distribution in bytes for each group, version, verb, resource, subresource, scope and component.
component
group
resource
scope
subresource
verb
version
apiserver_storage_objectsSTABLEGaugeNumber of stored objects at the time of last check split by kind.
resource
cronjob_controller_job_creation_skew_duration_secondsSTABLEHistogramTime between when a cronjob is scheduled to be run, and when the corresponding job is created
job_controller_job_pods_finished_totalSTABLECounterThe number of finished Pods that are fully tracked
completion_mode
result
job_controller_job_sync_duration_secondsSTABLEHistogramThe time it took to sync a job
action
completion_mode
result
job_controller_job_syncs_totalSTABLECounterThe number of job syncs
action
completion_mode
result
job_controller_jobs_finished_totalSTABLECounterThe number of finished jobs
completion_mode
reason
result
kube_pod_resource_limitSTABLECustomResources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
namespace
pod
node
scheduler
priority
resource
unit
kube_pod_resource_requestSTABLECustomResources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
namespace
pod
node
scheduler
priority
resource
unit
node_collector_evictions_totalSTABLECounterNumber of Node evictions that happened since current instance of NodeController started.
zone
scheduler_framework_extension_point_duration_secondsSTABLEHistogramLatency for running all plugins of a specific extension point.
extension_point
profile
status
scheduler_pending_podsSTABLEGaugeNumber of pending pods, by the queue type. 'active' means number of pods in activeQ; 'backoff' means number of pods in backoffQ; 'unschedulable' means number of pods in unschedulablePods that the scheduler attempted to schedule and failed; 'gated' is the number of unschedulable pods that the scheduler never attempted to schedule because they are gated.
queue
scheduler_pod_scheduling_attemptsSTABLEHistogramNumber of attempts to successfully schedule a pod.
scheduler_pod_scheduling_duration_secondsSTABLEHistogramE2e latency for a pod being scheduled which may include multiple scheduling attempts.
attempts
scheduler_preemption_attempts_totalSTABLECounterTotal preemption attempts in the cluster till now
scheduler_preemption_victimsSTABLEHistogramNumber of selected preemption victims
scheduler_queue_incoming_pods_totalSTABLECounterNumber of pods added to scheduling queues by event and queue type.
event
queue
scheduler_schedule_attempts_totalSTABLECounterNumber of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem.
profile
result
scheduler_scheduling_attempt_duration_secondsSTABLEHistogramScheduling attempt latency in seconds (scheduling algorithm + binding)
profile
result
+
+
apiserver_admission_controller_admission_duration_seconds
+
Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • nameoperationrejectedtype
+
+
apiserver_admission_step_admission_duration_seconds
+
Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit).
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • operationrejectedtype
+
+
apiserver_admission_webhook_admission_duration_seconds
+
Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • nameoperationrejectedtype
+
+
apiserver_current_inflight_requests
+
Maximal number of currently used inflight request limit of this apiserver per request kind in last second.
+
    +
  • STABLE
  • +
  • Gauge
  • +
  • request_kind
+
+
apiserver_longrunning_requests
+
Gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope and component. Not all requests are tracked this way.
+
    +
  • STABLE
  • +
  • Gauge
  • +
  • componentgroupresourcescopesubresourceverbversion
+
+
apiserver_request_duration_seconds
+
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • componentdry_rungroupresourcescopesubresourceverbversion
+
+
apiserver_request_total
+
Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
+
    +
  • STABLE
  • +
  • Counter
  • +
  • codecomponentdry_rungroupresourcescopesubresourceverbversion
+
+
apiserver_requested_deprecated_apis
+
Gauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release.
+
    +
  • STABLE
  • +
  • Gauge
  • +
  • groupremoved_releaseresourcesubresourceversion
+
+
apiserver_response_sizes
+
Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • componentgroupresourcescopesubresourceverbversion
+
+
apiserver_storage_objects
+
Number of stored objects at the time of last check split by kind.
+
    +
  • STABLE
  • +
  • Gauge
  • +
  • resource
+
+
container_cpu_usage_seconds_total
+
Cumulative cpu time consumed by the container in core-seconds
+
    +
  • STABLE
  • +
  • Custom
  • +
  • containerpodnamespace
+
+
container_memory_working_set_bytes
+
Current working set of the container in bytes
+
    +
  • STABLE
  • +
  • Custom
  • +
  • containerpodnamespace
+
+
container_start_time_seconds
+
Start time of the container since unix epoch in seconds
+
    +
  • STABLE
  • +
  • Custom
  • +
  • containerpodnamespace
+
+
cronjob_controller_job_creation_skew_duration_seconds
+
Time between when a cronjob is scheduled to be run, and when the corresponding job is created
+
    +
  • STABLE
  • +
  • Histogram
  • +
+
+
job_controller_job_pods_finished_total
+
The number of finished Pods that are fully tracked
+
    +
  • STABLE
  • +
  • Counter
  • +
  • completion_moderesult
+
+
job_controller_job_sync_duration_seconds
+
The time it took to sync a job
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • actioncompletion_moderesult
+
+
job_controller_job_syncs_total
+
The number of job syncs
+
    +
  • STABLE
  • +
  • Counter
  • +
  • actioncompletion_moderesult
+
+
job_controller_jobs_finished_total
+
The number of finished jobs
+
    +
  • STABLE
  • +
  • Counter
  • +
  • completion_modereasonresult
+
+
kube_pod_resource_limit
+
Resources limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
+
    +
  • STABLE
  • +
  • Custom
  • +
  • namespacepodnodeschedulerpriorityresourceunit
+
+
kube_pod_resource_request
+
Resources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resources along with the unit for the resource if any.
+
    +
  • STABLE
  • +
  • Custom
  • +
  • namespacepodnodeschedulerpriorityresourceunit
+
+
node_collector_evictions_total
+
Number of Node evictions that happened since current instance of NodeController started.
+
    +
  • STABLE
  • +
  • Counter
  • +
  • zone
+
+
node_cpu_usage_seconds_total
+
Cumulative cpu time consumed by the node in core-seconds
+
    +
  • STABLE
  • +
  • Custom
  • +
+
+
node_memory_working_set_bytes
+
Current working set of the node in bytes
+
    +
  • STABLE
  • +
  • Custom
  • +
+
+
pod_cpu_usage_seconds_total
+
Cumulative cpu time consumed by the pod in core-seconds
+
    +
  • STABLE
  • +
  • Custom
  • +
  • podnamespace
+
+
pod_memory_working_set_bytes
+
Current working set of the pod in bytes
+
    +
  • STABLE
  • +
  • Custom
  • +
  • podnamespace
+
+
resource_scrape_error
+
1 if there was an error while getting container metrics, 0 otherwise
+
    +
  • STABLE
  • +
  • Custom
  • +
+
+
scheduler_framework_extension_point_duration_seconds
+
Latency for running all plugins of a specific extension point.
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • extension_pointprofilestatus
+
+
scheduler_pending_pods
+
Number of pending pods, by the queue type. 'active' means number of pods in activeQ; 'backoff' means number of pods in backoffQ; 'unschedulable' means number of pods in unschedulablePods that the scheduler attempted to schedule and failed; 'gated' is the number of unschedulable pods that the scheduler never attempted to schedule because they are gated.
+
    +
  • STABLE
  • +
  • Gauge
  • +
  • queue
+
+
scheduler_pod_scheduling_attempts
+
Number of attempts to successfully schedule a pod.
+
    +
  • STABLE
  • +
  • Histogram
  • +
+
+
scheduler_pod_scheduling_duration_seconds
+
E2e latency for a pod being scheduled which may include multiple scheduling attempts.
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • attempts
  • 1.28.0
+
+
scheduler_preemption_attempts_total
+
Total preemption attempts in the cluster till now
+
    +
  • STABLE
  • +
  • Counter
  • +
+
+
scheduler_preemption_victims
+
Number of selected preemption victims
+
    +
  • STABLE
  • +
  • Histogram
  • +
+
+
scheduler_queue_incoming_pods_total
+
Number of pods added to scheduling queues by event and queue type.
+
    +
  • STABLE
  • +
  • Counter
  • +
  • eventqueue
+
+
scheduler_schedule_attempts_total
+
Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem.
+
    +
  • STABLE
  • +
  • Counter
  • +
  • profileresult
+
+
scheduler_scheduling_attempt_duration_seconds
+
Scheduling attempt latency in seconds (scheduling algorithm + binding)
+
    +
  • STABLE
  • +
  • Histogram
  • +
  • profileresult
+
+
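Several of the stable metrics listed above (`container_cpu_usage_seconds_total`, `pod_memory_working_set_bytes`, `node_cpu_usage_seconds_total`, and the other resource metrics) are exported by the kubelet rather than the API server. A hedged sketch of reading them through the API server's node proxy, assuming RBAC allows the `nodes/proxy` subresource:

```shell
# Sketch: fetch the kubelet resource metrics for the first node in the cluster.
NODE_NAME=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/metrics/resource" | head
```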
### List of Beta Kubernetes Metrics

-Beta metrics observe a looser API contract than its stable counterparts. No labels can be removed from beta metrics during their lifetime, however, labels can be added while the metric is in the beta stage. This offers the assurance that beta metrics will honor existing dashboards and alerts, while allowing for amendments in the future.
+Beta metrics observe a looser API contract than their stable counterparts. No labels can be removed from beta metrics during their lifetime; however, labels can be added while the metric is in the beta stage. This offers the assurance that beta metrics will honor existing dashboards and alerts, while allowing for amendments in the future.
NameStability LevelTypeHelpLabelsConst LabelsDeprecated Version
apiserver_flowcontrol_current_executing_requestsBETAGaugeNumber of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem
flow_schema
priority_level
apiserver_flowcontrol_current_executing_seatsBETAGaugeConcurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
flow_schema
priority_level
apiserver_flowcontrol_current_inqueue_requestsBETAGaugeNumber of requests currently pending in queues of the API Priority and Fairness subsystem
flow_schema
priority_level
apiserver_flowcontrol_dispatched_requests_totalBETACounterNumber of requests executed by API Priority and Fairness subsystem
flow_schema
priority_level
apiserver_flowcontrol_nominal_limit_seatsBETAGaugeNominal number of execution seats configured for each priority level
priority_level
apiserver_flowcontrol_rejected_requests_totalBETACounterNumber of requests rejected by API Priority and Fairness subsystem
flow_schema
priority_level
reason
apiserver_flowcontrol_request_wait_duration_secondsBETAHistogramLength of time a request spent waiting in its queue
execute
flow_schema
priority_level
disabled_metrics_totalBETACounterThe count of disabled metrics.
hidden_metrics_totalBETACounterThe count of hidden metrics.
kubernetes_feature_enabledBETAGaugeThis metric records the data about the stage and enablement of a k8s feature.
name
stage
kubernetes_healthcheckBETAGaugeThis metric records the result of a single healthcheck.
name
type
kubernetes_healthchecks_totalBETACounterThis metric records the results of all healthchecks.
name
status
type
registered_metrics_totalBETACounterThe count of registered metrics broken by stability level and deprecation version.
deprecated_version
stability_level
+
+
apiserver_flowcontrol_current_executing_requests
+
Number of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem
+
    +
  • BETA
  • +
  • Gauge
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_current_executing_seats
+
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
+
    +
  • BETA
  • +
  • Gauge
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_current_inqueue_requests
+
Number of requests currently pending in queues of the API Priority and Fairness subsystem
+
    +
  • BETA
  • +
  • Gauge
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_dispatched_requests_total
+
Number of requests executed by API Priority and Fairness subsystem
+
    +
  • BETA
  • +
  • Counter
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_nominal_limit_seats
+
Nominal number of execution seats configured for each priority level
+
    +
  • BETA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_rejected_requests_total
+
Number of requests rejected by API Priority and Fairness subsystem
+
    +
  • BETA
  • +
  • Counter
  • +
  • flow_schemapriority_levelreason
+
+
apiserver_flowcontrol_request_wait_duration_seconds
+
Length of time a request spent waiting in its queue
+
    +
  • BETA
  • +
  • Histogram
  • +
  • executeflow_schemapriority_level
+
+
disabled_metrics_total
+
The count of disabled metrics.
+
    +
  • BETA
  • +
  • Counter
  • +
+
+
hidden_metrics_total
+
The count of hidden metrics.
+
    +
  • BETA
  • +
  • Counter
  • +
+
+
kubernetes_feature_enabled
+
This metric records the data about the stage and enablement of a k8s feature.
+
    +
  • BETA
  • +
  • Gauge
  • +
  • namestage
+
+
kubernetes_healthcheck
+
This metric records the result of a single healthcheck.
+
    +
  • BETA
  • +
  • Gauge
  • +
  • nametype
+
+
kubernetes_healthchecks_total
+
This metric records the results of all healthchecks.
+
    +
  • BETA
  • +
  • Counter
  • +
  • namestatustype
+
+
registered_metrics_total
+
The count of registered metrics broken by stability level and deprecation version.
+
    +
  • BETA
  • +
  • Counter
  • +
  • deprecated_versionstability_level
+
+
scheduler_pod_scheduling_sli_duration_seconds
+
E2e latency for a pod being scheduled, from the time the pod enters the scheduling queue and might involve multiple scheduling attempts.
+
    +
  • BETA
  • +
  • Histogram
  • +
  • attempts
+
+
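The `registered_metrics_total`, `hidden_metrics_total`, and `disabled_metrics_total` entries above relate to the metrics lifecycle reflected in the "Deprecated Version" column: after a metric's deprecated version passes, components hide it by default, and they expose a flag to show it again for one more release. A minimal sketch (the version value is illustrative; it must name the previous minor release):

```shell
# Sketch: temporarily re-expose metrics hidden in the current release.
kube-apiserver --show-hidden-metrics-for-version=1.28
```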
### List of Alpha Kubernetes Metrics

-Alpha metrics do not have any API guarantees. These metrics must be used at your own risk, subsequent versions of Kubernetes may remove these metrics altogether, or mutate the API in such a way that breaks existing dashboards and alerts.
+Alpha metrics do not have any API guarantees. These metrics must be used at your own risk; subsequent versions of Kubernetes may remove these metrics altogether, or mutate the API in such a way that breaks existing dashboards and alerts.
NameStability LevelTypeHelpLabelsConst LabelsDeprecated Version
aggregator_discovery_aggregation_count_totalALPHACounterCounter of number of times discovery was aggregated
aggregator_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
apiservice
reason
aggregator_openapi_v2_regeneration_durationALPHAGaugeGauge of OpenAPI v2 spec regeneration duration in seconds.
reason
aggregator_unavailable_apiserviceALPHACustomGauge of APIServices which are marked as unavailable broken down by APIService name.
name
aggregator_unavailable_apiservice_totalALPHACounterCounter of APIServices which are marked as unavailable broken down by APIService name and reason.
name
reason
apiextensions_openapi_v2_regeneration_countALPHACounterCounter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
crd
reason
apiextensions_openapi_v3_regeneration_countALPHACounterCounter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
crd
group
reason
version
apiserver_admission_match_condition_evaluation_errors_totalALPHACounterAdmission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
kind
name
operation
type
apiserver_admission_match_condition_evaluation_secondsALPHAHistogramAdmission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).
kind
name
operation
type
apiserver_admission_match_condition_exclusions_totalALPHACounterAdmission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
kind
name
operation
type
apiserver_admission_step_admission_duration_seconds_summaryALPHASummaryAdmission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
operation
rejected
type
apiserver_admission_webhook_fail_open_countALPHACounterAdmission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).
name
type
apiserver_admission_webhook_rejection_countALPHACounterAdmission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
error_type
name
operation
rejection_code
type
apiserver_admission_webhook_request_totalALPHACounterAdmission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
code
name
operation
rejected
type
apiserver_audit_error_totalALPHACounterCounter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
plugin
apiserver_audit_event_totalALPHACounterCounter of audit events generated and sent to the audit backend.
apiserver_audit_level_totalALPHACounterCounter of policy levels for audit events (1 per request).
level
apiserver_audit_requests_rejected_totalALPHACounterCounter of apiserver requests rejected due to an error in audit logging backend.
apiserver_cache_list_fetched_objects_totalALPHACounterNumber of objects read from watch cache in the course of serving a LIST request
index
resource_prefix
apiserver_cache_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from watch cache
resource_prefix
apiserver_cache_list_totalALPHACounterNumber of LIST requests served from watch cache
index
resource_prefix
apiserver_cel_compilation_duration_secondsALPHAHistogramCEL compilation time in seconds.
apiserver_cel_evaluation_duration_secondsALPHAHistogramCEL evaluation time in seconds.
apiserver_certificates_registry_csr_honored_duration_totalALPHACounterTotal number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
signerName
apiserver_certificates_registry_csr_requested_duration_totalALPHACounterTotal number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
signerName
apiserver_client_certificate_expiration_secondsALPHAHistogramDistribution of the remaining lifetime on the certificate used to authenticate a request.
apiserver_conversion_webhook_duration_secondsALPHAHistogramConversion webhook request latency
failure_type
result
apiserver_conversion_webhook_request_totalALPHACounterCounter for conversion webhook requests with success/failure and failure error type
failure_type
result
apiserver_crd_conversion_webhook_duration_secondsALPHAHistogramCRD webhook conversion duration in seconds
crd_name
from_version
succeeded
to_version
apiserver_current_inqueue_requestsALPHAGaugeMaximal number of queued requests in this apiserver per request kind in last second.
request_kind
apiserver_delegated_authn_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
apiserver_delegated_authn_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
apiserver_delegated_authz_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
apiserver_delegated_authz_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
apiserver_egress_dialer_dial_duration_secondsALPHAHistogramDial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
protocol
transport
apiserver_egress_dialer_dial_failure_countALPHACounterDial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
protocol
stage
transport
apiserver_egress_dialer_dial_start_totalALPHACounterDial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).
protocol
transport
apiserver_encryption_config_controller_automatic_reload_failures_totalALPHACounterTotal number of failed automatic reloads of encryption configuration.
apiserver_encryption_config_controller_automatic_reload_last_timestamp_secondsALPHAGaugeTimestamp of the last successful or failed automatic reload of encryption configuration.
status
apiserver_encryption_config_controller_automatic_reload_success_totalALPHACounterTotal number of successful automatic reloads of encryption configuration.
apiserver_envelope_encryption_dek_cache_fill_percentALPHAGaugePercent of the cache slots currently occupied by cached DEKs.
apiserver_envelope_encryption_dek_cache_inter_arrival_time_secondsALPHAHistogramTime (in seconds) of inter arrival of transformation requests.
transformation_type
apiserver_envelope_encryption_invalid_key_id_from_status_totalALPHACounterNumber of times an invalid keyID is returned by the Status RPC call split by error.
error
provider_name
apiserver_envelope_encryption_key_id_hash_last_timestamp_secondsALPHAGaugeThe last time in seconds when a keyID was used.
key_id_hash
provider_name
transformation_type
apiserver_envelope_encryption_key_id_hash_status_last_timestamp_secondsALPHAGaugeThe last time in seconds when a keyID was returned by the Status RPC call.
key_id_hash
provider_name
apiserver_envelope_encryption_key_id_hash_totalALPHACounterNumber of times a keyID is used split by transformation type and provider.
key_id_hash
provider_name
transformation_type
apiserver_envelope_encryption_kms_operations_latency_secondsALPHAHistogramKMS operation duration with gRPC error code status total.
grpc_status_code
method_name
provider_name
apiserver_flowcontrol_current_limit_seatsALPHAGaugecurrent derived number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_current_rALPHAGaugeR(time of last change)
priority_level
apiserver_flowcontrol_demand_seatsALPHATimingRatioHistogramObservations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
priority_level
apiserver_flowcontrol_demand_seats_averageALPHAGaugeTime-weighted average, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_demand_seats_high_watermarkALPHAGaugeHigh watermark, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_demand_seats_smoothedALPHAGaugeSmoothed seat demands
priority_level
apiserver_flowcontrol_demand_seats_stdevALPHAGaugeTime-weighted standard deviation, over last adjustment period, of demand_seats
priority_level
apiserver_flowcontrol_dispatch_rALPHAGaugeR(time of last dispatch)
priority_level
apiserver_flowcontrol_epoch_advance_totalALPHACounterNumber of times the queueset's progress meter jumped backward
priority_level
success
apiserver_flowcontrol_latest_sALPHAGaugeS(most recently dispatched request)
priority_level
apiserver_flowcontrol_lower_limit_seatsALPHAGaugeConfigured lower bound on number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_next_discounted_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
bound
priority_level
apiserver_flowcontrol_next_s_boundsALPHAGaugemin and max, over queues, of S(oldest waiting request in queue)
bound
priority_level
apiserver_flowcontrol_priority_level_request_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
phase
priority_level
apiserver_flowcontrol_priority_level_seat_utilizationALPHATimingRatioHistogramObservations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
priority_level
phase:executing
apiserver_flowcontrol_read_vs_write_current_requestsALPHATimingRatioHistogramObservations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
phase
request_kind
apiserver_flowcontrol_request_concurrency_in_useALPHAGaugeConcurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
flow_schema
priority_level
1.31.0
apiserver_flowcontrol_request_concurrency_limitALPHAGaugeNominal number of execution seats configured for each priority level
priority_level
1.30.0
apiserver_flowcontrol_request_dispatch_no_accommodation_totalALPHACounterNumber of times a dispatch attempt resulted in a non accommodation due to lack of available seats
flow_schema
priority_level
apiserver_flowcontrol_request_execution_secondsALPHAHistogramDuration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
flow_schema
priority_level
type
apiserver_flowcontrol_request_queue_length_after_enqueueALPHAHistogramLength of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
flow_schema
priority_level
apiserver_flowcontrol_seat_fair_fracALPHAGaugeFair fraction of server's concurrency to allocate to each priority level that can use it
apiserver_flowcontrol_target_seatsALPHAGaugeSeat allocation targets
priority_level
apiserver_flowcontrol_upper_limit_seatsALPHAGaugeConfigured upper bound on number of execution seats available to each priority level
priority_level
apiserver_flowcontrol_watch_count_samplesALPHAHistogramcount of watchers for mutating requests in API Priority and Fairness
flow_schema
priority_level
apiserver_flowcontrol_work_estimated_seatsALPHAHistogramNumber of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
flow_schema
priority_level
apiserver_init_events_totalALPHACounterCounter of init events processed in watch cache broken by resource type.
resource
apiserver_kube_aggregator_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
apiserver_kube_aggregator_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
apiserver_request_aborts_totalALPHACounterNumber of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
group
resource
scope
subresource
verb
version
apiserver_request_body_sizesALPHAHistogramApiserver request body sizes broken out by size.
resource
verb
apiserver_request_filter_duration_secondsALPHAHistogramRequest filter latency distribution in seconds, for each filter type
filter
apiserver_request_post_timeout_totalALPHACounterTracks the activity of the request handlers after the associated requests have been timed out by the apiserver
source
status
apiserver_request_sli_duration_secondsALPHAHistogramResponse latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
component
group
resource
scope
subresource
verb
version
apiserver_request_slo_duration_secondsALPHAHistogramResponse latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
component
group
resource
scope
subresource
verb
version
1.27.0
apiserver_request_terminations_totalALPHACounterNumber of requests which apiserver terminated in self-defense.
code
component
group
resource
scope
subresource
verb
version
apiserver_request_timestamp_comparison_timeALPHAHistogramTime taken for comparison of old vs new objects in UPDATE or PATCH requests
code_path
apiserver_rerouted_request_totalALPHACounterTotal number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it
code
apiserver_selfrequest_totalALPHACounterCounter of apiserver self-requests broken out for each verb, API resource and subresource.
resource
subresource
verb
apiserver_storage_data_key_generation_duration_secondsALPHAHistogramLatencies in seconds of data encryption key(DEK) generation operations.
apiserver_storage_data_key_generation_failures_totalALPHACounterTotal number of failed data encryption key(DEK) generation operations.
apiserver_storage_db_total_size_in_bytesALPHAGaugeTotal size of the storage database file physically allocated in bytes.
endpoint
1.28.0
apiserver_storage_decode_errors_totalALPHACounterNumber of stored object decode errors split by object type
resource
apiserver_storage_envelope_transformation_cache_misses_totalALPHACounterTotal number of cache misses while accessing key decryption key(KEK).
apiserver_storage_events_received_totalALPHACounterNumber of etcd events received split by kind.
resource
apiserver_storage_list_evaluated_objects_totalALPHACounterNumber of objects tested in the course of serving a LIST request from storage
resource
apiserver_storage_list_fetched_objects_totalALPHACounterNumber of objects read from storage in the course of serving a LIST request
resource
apiserver_storage_list_returned_objects_totalALPHACounterNumber of objects returned for a LIST request from storage
resource
apiserver_storage_list_totalALPHACounterNumber of LIST requests served from storage
resource
apiserver_storage_size_bytesALPHACustomSize of the storage database file physically allocated in bytes.
cluster
apiserver_storage_transformation_duration_secondsALPHAHistogramLatencies in seconds of value transformation operations.
transformation_type
transformer_prefix
apiserver_storage_transformation_operations_totalALPHACounterTotal number of transformations. Successful transformation will have a status 'OK' and a varied status string when the transformation fails. This status and transformation_type fields may be used for alerting on encryption/decryption failure using transformation_type from_storage for decryption and to_storage for encryption
status
transformation_type
transformer_prefix
apiserver_terminated_watchers_totalALPHACounterCounter of watchers closed due to unresponsiveness broken by resource type.
resource
apiserver_tls_handshake_errors_totalALPHACounterNumber of requests dropped with 'TLS handshake error from' error
apiserver_validating_admission_policy_check_duration_secondsALPHAHistogramValidation admission latency for individual validation expressions in seconds, labeled by policy and further including binding, state and enforcement action taken.
enforcement_action
policy
policy_binding
state
apiserver_validating_admission_policy_check_totalALPHACounterValidation admission policy check total, labeled by policy and further identified by binding, enforcement action taken, and state.
enforcement_action
policy
policy_binding
state
apiserver_validating_admission_policy_definition_totalALPHACounterValidation admission policy count total, labeled by state and enforcement action.
enforcement_action
state
apiserver_watch_cache_events_dispatched_totalALPHACounterCounter of events dispatched in watch cache broken by resource type.
resource
apiserver_watch_cache_events_received_totalALPHACounterCounter of events received in watch cache broken by resource type.
resource
apiserver_watch_cache_initializations_totalALPHACounterCounter of watch cache initializations broken by resource type.
resource
apiserver_watch_events_sizesALPHAHistogramWatch event size distribution in bytes
group
kind
version
apiserver_watch_events_totalALPHACounterNumber of events sent in watch clients
group
kind
version
apiserver_webhooks_x509_insecure_sha1_totalALPHACounterCounts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
apiserver_webhooks_x509_missing_san_totalALPHACounterCounts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
attach_detach_controller_attachdetach_controller_forced_detachesALPHACounterNumber of times the A/D Controller performed a forced detach
reason
attachdetach_controller_total_volumesALPHACustomNumber of volumes in A/D Controller
plugin_name
state
authenticated_user_requestsALPHACounterCounter of authenticated requests broken out by username.
username
authentication_attemptsALPHACounterCounter of authenticated attempts.
result
authentication_duration_secondsALPHAHistogramAuthentication duration in seconds broken out by result.
result
authentication_token_cache_active_fetch_countALPHAGauge
status
authentication_token_cache_fetch_totalALPHACounter
status
authentication_token_cache_request_duration_secondsALPHAHistogram
status
authentication_token_cache_request_totalALPHACounter
status
authorization_attempts_totalALPHACounterCounter of authorization attempts broken down by result. It can be either 'allowed', 'denied', 'no-opinion' or 'error'.
result
authorization_duration_secondsALPHAHistogramAuthorization duration in seconds broken out by result.
result
cloud_provider_webhook_request_duration_secondsALPHAHistogramRequest latency in seconds. Broken down by status code.
code
webhook
cloud_provider_webhook_request_totalALPHACounterNumber of HTTP requests partitioned by status code.
code
webhook
cloudprovider_azure_api_request_duration_secondsALPHAHistogramLatency of an Azure API call
request
resource_group
source
subscription_id
cloudprovider_azure_api_request_errorsALPHACounterNumber of errors for an Azure API call
request
resource_group
source
subscription_id
cloudprovider_azure_api_request_ratelimited_countALPHACounterNumber of rate limited Azure API calls
request
resource_group
source
subscription_id
[Metrics reference listing, first rendering (table layout): one row per metric giving the metric name, stability level (ALPHA), type (Counter, Gauge, Histogram, or Custom), description, label names, and a deprecation version where applicable. This portion covers the remainder of the alphabetical list, from `cloudprovider_azure_api_request_throttled_count` through `workqueue_work_duration_seconds`, including the cloudprovider_azure_* / cloudprovider_gce_* / cloudprovider_vsphere_*, container_*, csi_*, endpoint_slice_*, ephemeral_volume_controller_*, etcd_*, horizontal_pod_autoscaler_controller_*, job_controller_*, kube_apiserver_*, kubelet_*, kubeproxy_*, kubernetes_build_info, leader_election_master_status, node_*, pod_*, prober_*, pv_collector_*, rest_client_*, scheduler_*, service_controller_*, serviceaccount_*, storage_*, ttl_after_finished_controller_*, volume_manager_*, volume_operation_*, watch_cache_*, and workqueue_* metric families.]
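As an aside, any metric name from this listing can be checked against a live control plane. A minimal sketch, assuming kubectl access to the kube-apiserver's metrics endpoint and taking `etcd_request_duration_seconds` from the listing above as the example:

```shell
# Illustrative only: look up one metric from the reference listing on a live cluster.
# Assumes a kubeconfig with permission to read the kube-apiserver /metrics endpoint.
kubectl get --raw /metrics | grep '^etcd_request_duration_seconds'
```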
+
+
aggregator_discovery_aggregation_count_total
+
Counter of number of times discovery was aggregated
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
aggregator_openapi_v2_regeneration_count
+
Counter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • apiservicereason
+
+
aggregator_openapi_v2_regeneration_duration
+
Gauge of OpenAPI v2 spec regeneration duration in seconds.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • reason
+
+
aggregator_unavailable_apiservice
+
Gauge of APIServices which are marked as unavailable broken down by APIService name.
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • name
+
+
aggregator_unavailable_apiservice_total
+
Counter of APIServices which are marked as unavailable broken down by APIService name and reason.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • namereason
+
+
apiextensions_openapi_v2_regeneration_count
+
Counter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • crdreason
+
+
apiextensions_openapi_v3_regeneration_count
+
Counter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • crdgroupreasonversion
+
+
apiserver_admission_match_condition_evaluation_errors_total
+
Admission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • kindnameoperationtype
+
+
apiserver_admission_match_condition_evaluation_seconds
+
Admission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • kindnameoperationtype
+
+
apiserver_admission_match_condition_exclusions_total
+
Admission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • kindnameoperationtype
+
+
apiserver_admission_step_admission_duration_seconds_summary
+
Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).
+
    +
  • ALPHA
  • +
  • Summary
  • +
  • operationrejectedtype
+
+
apiserver_admission_webhook_fail_open_count
+
Admission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • nametype
+
+
apiserver_admission_webhook_rejection_count
+
Admission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • error_typenameoperationrejection_codetype
+
+
apiserver_admission_webhook_request_total
+
Admission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • codenameoperationrejectedtype
+
+
apiserver_audit_error_total
+
Counter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • plugin
+
+
apiserver_audit_event_total
+
Counter of audit events generated and sent to the audit backend.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_audit_level_total
+
Counter of policy levels for audit events (1 per request).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • level
+
+
apiserver_audit_requests_rejected_total
+
Counter of apiserver requests rejected due to an error in audit logging backend.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_cache_list_fetched_objects_total
+
Number of objects read from watch cache in the course of serving a LIST request
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • indexresource_prefix
+
+
apiserver_cache_list_returned_objects_total
+
Number of objects returned for a LIST request from watch cache
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource_prefix
+
+
apiserver_cache_list_total
+
Number of LIST requests served from watch cache
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • indexresource_prefix
+
+
apiserver_cel_compilation_duration_seconds
+
CEL compilation time in seconds.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
apiserver_cel_evaluation_duration_seconds
+
CEL evaluation time in seconds.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
apiserver_certificates_registry_csr_honored_duration_total
+
Total number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • signerName
+
+
apiserver_certificates_registry_csr_requested_duration_total
+
Total number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • signerName
+
+
apiserver_client_certificate_expiration_seconds
+
Distribution of the remaining lifetime on the certificate used to authenticate a request.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
apiserver_conversion_webhook_duration_seconds
+
Conversion webhook request latency
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • failure_typeresult
+
+
apiserver_conversion_webhook_request_total
+
Counter for conversion webhook requests with success/failure and failure error type
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • failure_typeresult
+
+
apiserver_crd_conversion_webhook_duration_seconds
+
CRD webhook conversion duration in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • crd_namefrom_versionsucceededto_version
+
+
apiserver_current_inqueue_requests
+
Maximal number of queued requests in this apiserver per request kind in last second.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • request_kind
+
+
apiserver_delegated_authn_request_duration_seconds
+
Request latency in seconds. Broken down by status code.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • code
+
+
apiserver_delegated_authn_request_total
+
Number of HTTP requests partitioned by status code.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
apiserver_delegated_authz_request_duration_seconds
+
Request latency in seconds. Broken down by status code.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • code
+
+
apiserver_delegated_authz_request_total
+
Number of HTTP requests partitioned by status code.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
apiserver_egress_dialer_dial_duration_seconds
+
Dial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • protocoltransport
+
+
apiserver_egress_dialer_dial_failure_count
+
Dial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • protocolstagetransport
+
+
apiserver_egress_dialer_dial_start_total
+
Dial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • protocoltransport
+
+
apiserver_encryption_config_controller_automatic_reload_failures_total
+
Total number of failed automatic reloads of encryption configuration split by apiserver identity.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • apiserver_id_hash
+
+
apiserver_encryption_config_controller_automatic_reload_last_timestamp_seconds
+
Timestamp of the last successful or failed automatic reload of encryption configuration split by apiserver identity.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • apiserver_id_hashstatus
+
+
apiserver_encryption_config_controller_automatic_reload_success_total
+
Total number of successful automatic reloads of encryption configuration split by apiserver identity.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • apiserver_id_hash
+
+
apiserver_envelope_encryption_dek_cache_fill_percent
+
Percent of the cache slots currently occupied by cached DEKs.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
apiserver_envelope_encryption_dek_cache_inter_arrival_time_seconds
+
Time (in seconds) of inter arrival of transformation requests.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • transformation_type
+
+
apiserver_envelope_encryption_dek_source_cache_size
+
Number of records in data encryption key (DEK) source cache. On a restart, this value is an approximation of the number of decrypt RPC calls the server will make to the KMS plugin.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • provider_name
+
+
apiserver_envelope_encryption_invalid_key_id_from_status_total
+
Number of times an invalid keyID is returned by the Status RPC call split by error.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • errorprovider_name
+
+
apiserver_envelope_encryption_key_id_hash_last_timestamp_seconds
+
The last time in seconds when a keyID was used.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • apiserver_id_hashkey_id_hashprovider_nametransformation_type
+
+
apiserver_envelope_encryption_key_id_hash_status_last_timestamp_seconds
+
The last time in seconds when a keyID was returned by the Status RPC call.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • apiserver_id_hashkey_id_hashprovider_name
+
+
apiserver_envelope_encryption_key_id_hash_total
+
Number of times a keyID is used split by transformation type, provider, and apiserver identity.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • apiserver_id_hashkey_id_hashprovider_nametransformation_type
+
+
apiserver_envelope_encryption_kms_operations_latency_seconds
+
KMS operation duration with gRPC error code status total.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • grpc_status_codemethod_nameprovider_name
+
+
apiserver_flowcontrol_current_inqueue_seats
+
Number of seats currently pending in queues of the API Priority and Fairness subsystem
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_current_limit_seats
+
current derived number of execution seats available to each priority level
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_current_r
+
R(time of last change)
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_demand_seats
+
Observations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)
+
    +
  • ALPHA
  • +
  • TimingRatioHistogram
  • +
  • priority_level
+
+
apiserver_flowcontrol_demand_seats_average
+
Time-weighted average, over last adjustment period, of demand_seats
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_demand_seats_high_watermark
+
High watermark, over last adjustment period, of demand_seats
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_demand_seats_smoothed
+
Smoothed seat demands
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_demand_seats_stdev
+
Time-weighted standard deviation, over last adjustment period, of demand_seats
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_dispatch_r
+
R(time of last dispatch)
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_epoch_advance_total
+
Number of times the queueset's progress meter jumped backward
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • priority_levelsuccess
+
+
apiserver_flowcontrol_latest_s
+
S(most recently dispatched request)
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_lower_limit_seats
+
Configured lower bound on number of execution seats available to each priority level
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_next_discounted_s_bounds
+
min and max, over queues, of S(oldest waiting request in queue) - estimated work in progress
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • boundpriority_level
+
+
apiserver_flowcontrol_next_s_bounds
+
min and max, over queues, of S(oldest waiting request in queue)
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • boundpriority_level
+
+
apiserver_flowcontrol_priority_level_request_utilization
+
Observations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)
+
    +
  • ALPHA
  • +
  • TimingRatioHistogram
  • +
  • phasepriority_level
+
+
apiserver_flowcontrol_priority_level_seat_utilization
+
Observations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)
+
    +
  • ALPHA
  • +
  • TimingRatioHistogram
  • +
  • priority_level
  • phase:executing
+
+
apiserver_flowcontrol_read_vs_write_current_requests
+
Observations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution
+
    +
  • ALPHA
  • +
  • TimingRatioHistogram
  • +
  • phaserequest_kind
+
+
apiserver_flowcontrol_request_concurrency_in_use
+
Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • flow_schemapriority_level
  • 1.31.0
+
+
apiserver_flowcontrol_request_concurrency_limit
+
Nominal number of execution seats configured for each priority level
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
  • 1.30.0
+
+
apiserver_flowcontrol_request_dispatch_no_accommodation_total
+
Number of times a dispatch attempt resulted in a non accommodation due to lack of available seats
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_request_execution_seconds
+
Duration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • flow_schemapriority_leveltype
+
+
apiserver_flowcontrol_request_queue_length_after_enqueue
+
Length of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_seat_fair_frac
+
Fair fraction of server's concurrency to allocate to each priority level that can use it
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
apiserver_flowcontrol_target_seats
+
Seat allocation targets
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_upper_limit_seats
+
Configured upper bound on number of execution seats available to each priority level
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • priority_level
+
+
apiserver_flowcontrol_watch_count_samples
+
count of watchers for mutating requests in API Priority and Fairness
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • flow_schemapriority_level
+
+
apiserver_flowcontrol_work_estimated_seats
+
Number of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • flow_schemapriority_level
+
+
apiserver_init_events_total
+
Counter of init events processed in watch cache broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_kube_aggregator_x509_insecure_sha1_total
+
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_kube_aggregator_x509_missing_san_total
+
Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the lack of x509 certificate SAN extension missing (either/or, based on the runtime environment)
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_request_aborts_total
+
Number of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • groupresourcescopesubresourceverbversion
+
+
apiserver_request_body_sizes
+
Apiserver request body sizes broken out by size.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • resourceverb
+
+
apiserver_request_filter_duration_seconds
+
Request filter latency distribution in seconds, for each filter type
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • filter
+
+
apiserver_request_post_timeout_total
+
Tracks the activity of the request handlers after the associated requests have been timed out by the apiserver
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • sourcestatus
+
+
apiserver_request_sli_duration_seconds
+
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • componentgroupresourcescopesubresourceverbversion
+
+
apiserver_request_slo_duration_seconds
+
Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • componentgroupresourcescopesubresourceverbversion
  • 1.27.0
+
+
apiserver_request_terminations_total
+
Number of requests which apiserver terminated in self-defense.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • codecomponentgroupresourcescopesubresourceverbversion
+
+
apiserver_request_timestamp_comparison_time
+
Time taken for comparison of old vs new objects in UPDATE or PATCH requests
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • code_path
+
+
apiserver_rerouted_request_total
+
Total number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
apiserver_selfrequest_total
+
Counter of apiserver self-requests broken out for each verb, API resource and subresource.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resourcesubresourceverb
+
+
apiserver_storage_data_key_generation_duration_seconds
+
Latencies in seconds of data encryption key(DEK) generation operations.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
apiserver_storage_data_key_generation_failures_total
+
Total number of failed data encryption key(DEK) generation operations.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_storage_db_total_size_in_bytes
+
Total size of the storage database file physically allocated in bytes.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • endpoint
  • 1.28.0
+
+
apiserver_storage_decode_errors_total
+
Number of stored object decode errors split by object type
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_envelope_transformation_cache_misses_total
+
Total number of cache misses while accessing key decryption key(KEK).
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_storage_events_received_total
+
Number of etcd events received split by kind.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_list_evaluated_objects_total
+
Number of objects tested in the course of serving a LIST request from storage
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_list_fetched_objects_total
+
Number of objects read from storage in the course of serving a LIST request
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_list_returned_objects_total
+
Number of objects returned for a LIST request from storage
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_list_total
+
Number of LIST requests served from storage
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_storage_size_bytes
+
Size of the storage database file physically allocated in bytes.
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • cluster
+
+
apiserver_storage_transformation_duration_seconds
+
Latencies in seconds of value transformation operations.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • transformation_typetransformer_prefix
+
+
apiserver_storage_transformation_operations_total
+
Total number of transformations. Successful transformation will have a status 'OK' and a varied status string when the transformation fails. This status and transformation_type fields may be used for alerting on encryption/decryption failure using transformation_type from_storage for decryption and to_storage for encryption
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • statustransformation_typetransformer_prefix
+
+
apiserver_terminated_watchers_total
+
Counter of watchers closed due to unresponsiveness broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_tls_handshake_errors_total
+
Number of requests dropped with 'TLS handshake error from' error
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_validating_admission_policy_check_duration_seconds
+
Validation admission latency for individual validation expressions in seconds, labeled by policy and further including binding, state and enforcement action taken.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • enforcement_actionpolicypolicy_bindingstate
+
+
apiserver_validating_admission_policy_check_total
+
Validation admission policy check total, labeled by policy and further identified by binding, enforcement action taken, and state.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • enforcement_actionpolicypolicy_bindingstate
+
+
apiserver_validating_admission_policy_definition_total
+
Validation admission policy count total, labeled by state and enforcement action.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • enforcement_actionstate
+
+
apiserver_watch_cache_events_dispatched_total
+
Counter of events dispatched in watch cache broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_watch_cache_events_received_total
+
Counter of events received in watch cache broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_watch_cache_initializations_total
+
Counter of watch cache initializations broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
apiserver_watch_events_sizes
+
Watch event size distribution in bytes
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • groupkindversion
+
+
apiserver_watch_events_total
+
Number of events sent in watch clients
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • groupkindversion
+
+
apiserver_webhooks_x509_insecure_sha1_total
+
Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
apiserver_webhooks_x509_missing_san_total
+
Counts the number of requests to servers missing the SAN extension in their serving certificate OR the number of connection failures due to the missing x509 certificate SAN extension (either/or, based on the runtime environment)
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
attach_detach_controller_attachdetach_controller_forced_detaches
+
Number of times the A/D Controller performed a forced detach
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • reason
+
+
attachdetach_controller_total_volumes
+
Number of volumes in A/D Controller
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • plugin_name, state
+
+
authenticated_user_requests
+
Counter of authenticated requests broken out by username.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • username
+
+
authentication_attempts
+
Counter of authentication attempts.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • result
+
+
authentication_duration_seconds
+
Authentication duration in seconds broken out by result.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • result
+
+
authentication_token_cache_active_fetch_count
+
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • status
+
+
authentication_token_cache_fetch_total
+
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • status
+
+
authentication_token_cache_request_duration_seconds
+
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • status
+
+
authentication_token_cache_request_total
+
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • status
+
+
authorization_attempts_total
+
Counter of authorization attempts broken down by result. It can be either 'allowed', 'denied', 'no-opinion' or 'error'.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • result
+
+
authorization_duration_seconds
+
Authorization duration in seconds broken out by result.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • result
+
+
cloud_provider_webhook_request_duration_seconds
+
Request latency in seconds. Broken down by status code.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • code, webhook
+
+
cloud_provider_webhook_request_total
+
Number of HTTP requests partitioned by status code.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code, webhook
+
+
cloudprovider_azure_api_request_duration_seconds
+
Latency of an Azure API call
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_azure_api_request_errors
+
Number of errors for an Azure API call
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_azure_api_request_ratelimited_count
+
Number of rate limited Azure API calls
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_azure_api_request_throttled_count
+
Number of throttled Azure API calls
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_azure_op_duration_seconds
+
Latency of an Azure service operation
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_azure_op_failure_count
+
Number of failed Azure service operations
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request, resource_group, source, subscription_id
+
+
cloudprovider_gce_api_request_duration_seconds
+
Latency of a GCE API call
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • region, request, version, zone
+
+
cloudprovider_gce_api_request_errors
+
Number of errors for an API call
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • region, request, version, zone
+
+
cloudprovider_vsphere_api_request_duration_seconds
+
Latency of a vSphere API call
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • request
+
+
cloudprovider_vsphere_api_request_errors
+
vSphere API errors
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request
+
+
cloudprovider_vsphere_operation_duration_seconds
+
Latency of a vSphere operation call
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation
+
+
cloudprovider_vsphere_operation_errors
+
vSphere operation errors
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation
+
+
cloudprovider_vsphere_vcenter_versions
+
Versions for connected vSphere vCenters
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • hostname, version, build
+
+
container_swap_usage_bytes
+
Current amount of the container swap usage in bytes. Reported only on non-windows systems
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • container, pod, namespace
+
+
csi_operations_seconds
+
Container Storage Interface operation duration with gRPC error code status total
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • driver_name, grpc_status_code, method_name, migrated
+
+
endpoint_slice_controller_changes
+
Number of EndpointSlice changes
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation
+
+
endpoint_slice_controller_desired_endpoint_slices
+
Number of EndpointSlices that would exist with perfect endpoint allocation
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
endpoint_slice_controller_endpoints_added_per_sync
+
Number of endpoints added on each Service sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_controller_endpoints_desired
+
Number of endpoints desired
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
endpoint_slice_controller_endpoints_removed_per_sync
+
Number of endpoints removed on each Service sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_controller_endpointslices_changed_per_sync
+
Number of EndpointSlices changed on each Service sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • topology
+
+
endpoint_slice_controller_num_endpoint_slices
+
Number of EndpointSlices
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
endpoint_slice_controller_syncs
+
Number of EndpointSlice syncs
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • result
+
+
endpoint_slice_mirroring_controller_addresses_skipped_per_sync
+
Number of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubset
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_mirroring_controller_changes
+
Number of EndpointSlice changes
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation
+
+
endpoint_slice_mirroring_controller_desired_endpoint_slices
+
Number of EndpointSlices that would exist with perfect endpoint allocation
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
endpoint_slice_mirroring_controller_endpoints_added_per_sync
+
Number of endpoints added on each Endpoints sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_mirroring_controller_endpoints_desired
+
Number of endpoints desired
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
endpoint_slice_mirroring_controller_endpoints_removed_per_sync
+
Number of endpoints removed on each Endpoints sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_mirroring_controller_endpoints_sync_duration
+
Duration of syncEndpoints() in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_mirroring_controller_endpoints_updated_per_sync
+
Number of endpoints updated on each Endpoints sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
endpoint_slice_mirroring_controller_num_endpoint_slices
+
Number of EndpointSlices
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
ephemeral_volume_controller_create_failures_total
+
Number of PersistentVolumeClaim creation requests that failed
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
ephemeral_volume_controller_create_total
+
Number of PersistentVolumeClaim creation requests
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
etcd_bookmark_counts
+
Number of etcd bookmarks (progress notify events) split by kind.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • resource
+
+
etcd_lease_object_counts
+
Number of objects attached to a single etcd lease.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
etcd_request_duration_seconds
+
Etcd request latency in seconds for each operation and object type.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation, type
+
+
etcd_request_errors_total
+
Etcd failed request counts for each operation and object type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation, type
+
+
etcd_requests_total
+
Etcd request counts for each operation and object type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation, type
+
+
etcd_version_info
+
Etcd server's binary version
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • binary_version
+
+
field_validation_request_duration_seconds
+
Response latency distribution in seconds for each field validation value
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • field_validation
+
+
force_cleaned_failed_volume_operation_errors_total
+
The number of volumes that failed force cleanup after their reconstruction failed during kubelet startup.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
force_cleaned_failed_volume_operations_total
+
The number of volumes that were force cleaned after their reconstruction failed during kubelet startup. This includes both successful and failed cleanups.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
garbagecollector_controller_resources_sync_error_total
+
Number of garbage collector resources sync errors
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
get_token_count
+
Counter of total Token() requests to the alternate token source
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
get_token_fail_count
+
Counter of failed Token() requests to the alternate token source
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
horizontal_pod_autoscaler_controller_metric_computation_duration_seconds
+
The time(seconds) that the HPA controller takes to calculate one metric. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. The label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • action, error, metric_type
+
+
horizontal_pod_autoscaler_controller_metric_computation_total
+
Number of metric computations. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • action, error, metric_type
+
+
horizontal_pod_autoscaler_controller_reconciliation_duration_seconds
+
The time(seconds) that the HPA controller takes to reconcile once. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • action, error
+
+
horizontal_pod_autoscaler_controller_reconciliations_total
+
Number of reconciliations of HPA controller. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in `error` label.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • action, error
+
+
job_controller_pod_failures_handled_by_failure_policy_total
+
The number of failed Pods handled by the failure policy, with respect to the failure policy action applied based on the matched rule. Possible values of the action label correspond to the possible values for the failure policy rule action, which are: "FailJob", "Ignore" and "Count".
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • action
+
+
job_controller_terminated_pods_tracking_finalizer_total
+
The number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be "add" or "delete".
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • event
+
+
kube_apiserver_clusterip_allocator_allocated_ips
+
Gauge measuring the number of allocated IPs for Services
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • cidr
+
+
kube_apiserver_clusterip_allocator_allocation_errors_total
+
Number of errors trying to allocate Cluster IPs
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • cidr, scope
+
+
kube_apiserver_clusterip_allocator_allocation_total
+
Number of Cluster IPs allocations
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • cidr, scope
+
+
kube_apiserver_clusterip_allocator_available_ips
+
Gauge measuring the number of available IPs for Services
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • cidr
+
+
kube_apiserver_nodeport_allocator_allocated_ports
+
Gauge measuring the number of allocated NodePorts for Services
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kube_apiserver_nodeport_allocator_available_ports
+
Gauge measuring the number of available NodePorts for Services
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kube_apiserver_pod_logs_backend_tls_failure_total
+
Total number of requests for pods/logs that failed due to kubelet server TLS verification
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kube_apiserver_pod_logs_insecure_backend_total
+
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • usage
+
+
kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total
+
Total number of requests for pods/logs that failed due to kubelet server TLS verification
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • Deprecated in 1.27.0
+
+
kube_apiserver_pod_logs_pods_logs_insecure_backend_total
+
Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • usage
  • Deprecated in 1.27.0
+
+
kubelet_active_pods
+
The number of pods the kubelet considers active and which are being considered when admitting new pods. static is true if the pod is not from the apiserver.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • static
+
+
kubelet_certificate_manager_client_expiration_renew_errors
+
Counter of certificate renewal errors.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_certificate_manager_client_ttl_seconds
+
Gauge of the TTL (time-to-live) of the Kubelet's client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_certificate_manager_server_rotation_seconds
+
Histogram of the number of seconds the previous certificate lived before being rotated.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_certificate_manager_server_ttl_seconds
+
Gauge of the shortest TTL (time-to-live) of the Kubelet's serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_cgroup_manager_duration_seconds
+
Duration in seconds for cgroup manager operations. Broken down by method.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation_type
+
+
kubelet_container_log_filesystem_used_bytes
+
Bytes used by the container's logs on the filesystem.
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • uid, namespace, pod, container
+
+
kubelet_containers_per_pod_count
+
The number of containers per pod.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_cpu_manager_pinning_errors_total
+
The number of CPU core allocations that required pinning and failed.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_cpu_manager_pinning_requests_total
+
The number of CPU core allocations that required pinning.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_credential_provider_plugin_duration
+
Duration of execution in seconds for credential provider plugin
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • plugin_name
+
+
kubelet_credential_provider_plugin_errors
+
Number of errors from credential provider plugin
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • plugin_name
+
+
kubelet_desired_pods
+
The number of pods the kubelet is being instructed to run. static is true if the pod is not from the apiserver.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • static
+
+
kubelet_device_plugin_alloc_duration_seconds
+
Duration in seconds to serve a device plugin Allocation request. Broken down by resource name.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • resource_name
+
+
kubelet_device_plugin_registration_total
+
Cumulative number of device plugin registrations. Broken down by resource name.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource_name
+
+
kubelet_evented_pleg_connection_error_count
+
The number of errors encountered during the establishment of streaming connection with the CRI runtime.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_evented_pleg_connection_latency_seconds
+
The latency of streaming connection with the CRI runtime, measured in seconds.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_evented_pleg_connection_success_count
+
The number of times a streaming client was obtained to receive CRI Events.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_eviction_stats_age_seconds
+
Time between when stats are collected, and when pod is evicted based on those stats by eviction signal
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • eviction_signal
+
+
kubelet_evictions
+
Cumulative number of pod evictions by eviction signal
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • eviction_signal
+
+
kubelet_graceful_shutdown_end_time_seconds
+
Last graceful shutdown end time since unix epoch in seconds
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_graceful_shutdown_start_time_seconds
+
Last graceful shutdown start time since unix epoch in seconds
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_http_inflight_requests
+
Number of the inflight http requests
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • long_running, method, path, server_type
+
+
kubelet_http_requests_duration_seconds
+
Duration in seconds to serve http requests
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • long_running, method, path, server_type
+
+
kubelet_http_requests_total
+
Number of the http requests received since the server started
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • long_running, method, path, server_type
+
+
kubelet_lifecycle_handler_http_fallbacks_total
+
The number of times lifecycle handlers successfully fell back to http from https.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_managed_ephemeral_containers
+
Current number of ephemeral containers in pods managed by this kubelet.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_mirror_pods
+
The number of mirror pods the kubelet will try to create (one per admitted static pod)
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_node_name
+
The node's name. The count is always 1.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • node
+
+
kubelet_orphan_pod_cleaned_volumes
+
The total number of orphaned Pods whose volumes were cleaned in the last periodic sweep.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_orphan_pod_cleaned_volumes_errors
+
The number of orphaned Pods whose volumes failed to be cleaned in the last periodic sweep.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_orphaned_runtime_pods_total
+
Number of pods that have been detected in the container runtime without being already known to the pod worker. This typically indicates the kubelet was restarted while a pod was force deleted in the API or in the local configuration, which is unusual.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_pleg_discard_events
+
The number of discard events in PLEG.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_pleg_last_seen_seconds
+
Timestamp in seconds when PLEG was last seen active.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_pleg_relist_duration_seconds
+
Duration in seconds for relisting pods in PLEG.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_pleg_relist_interval_seconds
+
Interval in seconds between relisting in PLEG.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_pod_resources_endpoint_errors_get
+
Number of requests to the PodResource Get endpoint which returned error. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_errors_get_allocatable
+
Number of requests to the PodResource GetAllocatableResources endpoint which returned error. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_errors_list
+
Number of requests to the PodResource List endpoint which returned error. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_requests_get
+
Number of requests to the PodResource Get endpoint. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_requests_get_allocatable
+
Number of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_requests_list
+
Number of requests to the PodResource List endpoint. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_resources_endpoint_requests_total
+
Cumulative number of requests to the PodResource endpoint. Broken down by server api version.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • server_api_version
+
+
kubelet_pod_start_duration_seconds
+
Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_pod_start_sli_duration_seconds
+
Duration in seconds to start a pod, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_pod_status_sync_duration_seconds
+
Duration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_pod_worker_duration_seconds
+
Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation_type
+
+
kubelet_pod_worker_start_duration_seconds
+
Duration in seconds from kubelet seeing a pod to starting a worker.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_preemptions
+
Cumulative number of pod preemptions by preemption resource
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • preemption_signal
+
+
kubelet_restarted_pods_total
+
Number of pods that have been restarted because they were deleted and recreated with the same UID while the kubelet was watching them (common for static pods, extremely uncommon for API pods)
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • static
+
+
kubelet_run_podsandbox_duration_seconds
+
Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • runtime_handler
+
+
kubelet_run_podsandbox_errors_total
+
Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • runtime_handler
+
+
kubelet_running_containers
+
Number of containers currently running
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • container_state
+
+
kubelet_running_pods
+
Number of pods that have a running pod sandbox
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubelet_runtime_operations_duration_seconds
+
Duration in seconds of runtime operations. Broken down by operation type.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation_type
+
+
kubelet_runtime_operations_errors_total
+
Cumulative number of runtime operation errors by operation type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation_type
+
+
kubelet_runtime_operations_total
+
Cumulative number of runtime operations by operation type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation_type
+
+
kubelet_server_expiration_renew_errors
+
Counter of certificate renewal errors.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_started_containers_errors_total
+
Cumulative number of errors when starting containers
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code, container_type
+
+
kubelet_started_containers_total
+
Cumulative number of containers started
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • container_type
+
+
kubelet_started_host_process_containers_errors_total
+
Cumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code, container_type
+
+
kubelet_started_host_process_containers_total
+
Cumulative number of hostprocess containers started. This metric will only be collected on Windows.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • container_type
+
+
kubelet_started_pods_errors_total
+
Cumulative number of errors when starting pods
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_started_pods_total
+
Cumulative number of pods started
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_topology_manager_admission_duration_ms
+
Duration in milliseconds to serve a pod admission request.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubelet_topology_manager_admission_errors_total
+
The number of admission request failures where resources could not be aligned.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_topology_manager_admission_requests_total
+
The number of admission requests where resources have to be aligned.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubelet_volume_metric_collection_duration_seconds
+
Duration in seconds to calculate volume stats
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • metric_source
+
+
kubelet_volume_stats_available_bytes
+
Number of available bytes in the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_volume_stats_capacity_bytes
+
Capacity in bytes of the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_volume_stats_health_status_abnormal
+
Abnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates the volume is healthy.
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
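For example, you could list the claims that a particular kubelet currently reports as unhealthy through the node proxy subresource. This is a sketch only: `<node-name>` is a placeholder for one of your node names, and reading the node proxy requires suitable RBAC permissions.

```shell
# Show volume health samples reported as 1 (unhealthy) by one kubelet.
# Replace <node-name> with an actual node from `kubectl get nodes`.
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" \
  | grep '^kubelet_volume_stats_health_status_abnormal' \
  | grep ' 1$'
```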
kubelet_volume_stats_inodes
+
Maximum number of inodes in the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_volume_stats_inodes_free
+
Number of free inodes in the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_volume_stats_inodes_used
+
Number of used inodes in the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_volume_stats_used_bytes
+
Number of used bytes in the volume
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace, persistentvolumeclaim
+
+
kubelet_working_pods
+
Number of pods the kubelet is actually running, broken down by lifecycle phase, whether the pod is desired, orphaned, or runtime only (also orphaned), and whether the pod is static. An orphaned pod has been removed from local configuration or force deleted in the API and consumes resources that are not otherwise visible.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • config, lifecycle, static
+
+
kubeproxy_network_programming_duration_seconds
+
In-cluster network programming latency in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubeproxy_proxy_healthz_total
+
Cumulative proxy healthz HTTP status
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
kubeproxy_proxy_livez_total
+
Cumulative proxy livez HTTP status
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
kubeproxy_sync_full_proxy_rules_duration_seconds
+
SyncProxyRules latency in seconds for full resyncs
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubeproxy_sync_partial_proxy_rules_duration_seconds
+
SyncProxyRules latency in seconds for partial resyncs
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubeproxy_sync_proxy_rules_duration_seconds
+
SyncProxyRules latency in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
kubeproxy_sync_proxy_rules_endpoint_changes_pending
+
Pending proxy rules Endpoint changes
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubeproxy_sync_proxy_rules_endpoint_changes_total
+
Cumulative proxy rules Endpoint changes
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubeproxy_sync_proxy_rules_iptables_last
+
Number of iptables rules written by kube-proxy in last sync
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • table
+
+
kubeproxy_sync_proxy_rules_iptables_partial_restore_failures_total
+
Cumulative proxy iptables partial restore failures
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubeproxy_sync_proxy_rules_iptables_restore_failures_total
+
Cumulative proxy iptables restore failures
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubeproxy_sync_proxy_rules_iptables_total
+
Total number of iptables rules owned by kube-proxy
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • table
+
+
kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds
+
The last time a sync of proxy rules was queued
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubeproxy_sync_proxy_rules_last_timestamp_seconds
+
The last time proxy rules were successfully synced
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubeproxy_sync_proxy_rules_no_local_endpoints_total
+
Number of services with a Local traffic policy and no endpoints
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • traffic_policy
+
+
kubeproxy_sync_proxy_rules_service_changes_pending
+
Pending proxy rules Service changes
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
kubeproxy_sync_proxy_rules_service_changes_total
+
Cumulative proxy rules Service changes
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
kubernetes_build_info
+
A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • build_date, compiler, git_commit, git_tree_state, git_version, go_version, major, minor, platform
+
+
leader_election_master_status
+
Gauge of if the reporting system is master of the relevant lease, 0 indicates backup, 1 indicates master. 'name' is the string used to identify the lease. Please make sure to group by name.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • name
+
+
node_authorizer_graph_actions_duration_seconds
+
Histogram of duration of graph actions in node authorizer.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation
+
+
node_collector_unhealthy_nodes_in_zone
+
Gauge measuring the number of not-Ready Nodes per zone.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • zone
+
+
node_collector_update_all_nodes_health_duration_seconds
+
Duration in seconds for NodeController to update the health of all nodes.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
node_collector_update_node_health_duration_seconds
+
Duration in seconds for NodeController to update the health of a single node.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
node_collector_zone_health
+
Gauge measuring percentage of healthy nodes per zone.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • zone
+
+
node_collector_zone_size
+
Gauge measuring the number of registered Nodes per zone.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • zone
+
+
node_controller_cloud_provider_taint_removal_delay_seconds
+
Number of seconds after node creation when NodeController removed the cloud-provider taint of a single node.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
node_controller_initial_node_sync_delay_seconds
+
Number of seconds after node creation when NodeController finished the initial synchronization of a single node.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
node_ipam_controller_cidrset_allocation_tries_per_request
+
Histogram measuring CIDR allocation tries per request.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • clusterCIDR
+
+
node_ipam_controller_cidrset_cidrs_allocations_total
+
Counter measuring total number of CIDR allocations.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • clusterCIDR
+
+
node_ipam_controller_cidrset_cidrs_releases_total
+
Counter measuring total number of CIDR releases.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • clusterCIDR
+
+
node_ipam_controller_cidrset_usage_cidrs
+
Gauge measuring percentage of allocated CIDRs.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • clusterCIDR
+
+
node_ipam_controller_cirdset_max_cidrs
+
Maximum number of CIDRs that can be allocated.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • clusterCIDR
+
+
node_ipam_controller_multicidrset_allocation_tries_per_request
+
Histogram measuring CIDR allocation tries per request.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • clusterCIDR
+
+
node_ipam_controller_multicidrset_cidrs_allocations_total
+
Counter measuring total number of CIDR allocations.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • clusterCIDR
+
+
node_ipam_controller_multicidrset_cidrs_releases_total
+
Counter measuring total number of CIDR releases.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • clusterCIDR
+
+
node_ipam_controller_multicidrset_usage_cidrs
+
Gauge measuring percentage of allocated CIDRs.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • clusterCIDR
+
+
node_ipam_controller_multicirdset_max_cidrs
+
Maximum number of CIDRs that can be allocated.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • clusterCIDR
+
+
node_swap_usage_bytes
+
Current swap usage of the node in bytes. Reported only on non-windows systems
+
    +
  • ALPHA
  • +
  • Custom
  • +
+
+
number_of_l4_ilbs
+
Number of L4 ILBs
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • feature
+
+
plugin_manager_total_plugins
+
Number of plugins in Plugin Manager
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • socket_path, state
+
+
pod_gc_collector_force_delete_pod_errors_total
+
Number of errors encountered when forcefully deleting the pods since the Pod GC Controller started.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • namespace, reason
+
+
pod_gc_collector_force_delete_pods_total
+
Number of pods that are being forcefully deleted since the Pod GC Controller started.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • namespace, reason
+
+
pod_security_errors_total
+
Number of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for evaluation.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • fatal, request_operation, resource, subresource
+
+
pod_security_evaluations_total
+
Number of policy evaluations that occurred, not counting ignored or exempt requests.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • decision, mode, policy_level, policy_version, request_operation, resource, subresource
+
+
pod_security_exemptions_total
+
Number of exempt requests, not counting ignored or out of scope requests.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • request_operation, resource, subresource
+
+
pod_swap_usage_bytes
+
Current amount of the pod swap usage in bytes. Reported only on non-windows systems
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • pod, namespace
+
+
prober_probe_duration_seconds
+
Duration in seconds for a probe response.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • container, namespace, pod, probe_type
+
+
prober_probe_total
+
Cumulative number of a liveness, readiness or startup probe for a container by result.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • container, namespace, pod, pod_uid, probe_type, result
+
+
pv_collector_bound_pv_count
+
Gauge measuring the number of persistent volumes currently bound
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • storage_class
+
+
pv_collector_bound_pvc_count
+
Gauge measuring the number of persistent volume claims currently bound
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace
+
+
pv_collector_total_pv_count
+
Gauge measuring total number of persistent volumes
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • plugin_name, volume_mode
+
+
pv_collector_unbound_pv_count
+
Gauge measuring the number of persistent volumes currently unbound
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • storage_class
+
+
pv_collector_unbound_pvc_count
+
Gauge measuring the number of persistent volume claims currently unbound
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • namespace
+
+
reconstruct_volume_operations_errors_total
+
The number of volumes that failed reconstruction from the operating system during kubelet startup.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
reconstruct_volume_operations_total
+
The number of volumes that were attempted to be reconstructed from the operating system during kubelet startup. This includes both successful and failed reconstruction.
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
replicaset_controller_sorting_deletion_age_ratio
+
The ratio of chosen deleted pod's ages to the current youngest pod's age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate's effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
resourceclaim_controller_create_attempts_total
+
Number of ResourceClaims creation requests
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
resourceclaim_controller_create_failures_total
+
Number of ResourceClaims creation request failures
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
rest_client_dns_resolution_duration_seconds
+
DNS resolver latency in seconds. Broken down by host.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • host
+
+
rest_client_exec_plugin_call_total
+
Number of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • call_status, code
+
+
rest_client_exec_plugin_certificate_rotation_age
+
Histogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
rest_client_exec_plugin_ttl_seconds
+
Gauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
rest_client_rate_limiter_duration_seconds
+
Client side rate limiter latency in seconds. Broken down by verb, and host.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • host, verb
+
+
rest_client_request_duration_seconds
+
Request latency in seconds. Broken down by verb, and host.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • host, verb
+
+
rest_client_request_retries_total
+
Number of request retries, partitioned by status code, verb, and host.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code, host, verb
+
+
rest_client_request_size_bytes
+
Request size in bytes. Broken down by verb and host.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • host, verb
+
+
rest_client_requests_total
+
Number of HTTP requests, partitioned by status code, method, and host.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code, host, method
+
+
rest_client_response_size_bytes
+
Response size in bytes. Broken down by verb and host.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • host, verb
+
+
rest_client_transport_cache_entries
+
Number of transport entries in the internal cache.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
rest_client_transport_create_calls_total
+
Number of calls to get a new transport, partitioned by the result of the operation hit: obtained from the cache, miss: created and added to the cache, uncacheable: created and not cached
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • result
+
+
retroactive_storageclass_errors_total
+
Total number of failed retroactive StorageClass assignments to persistent volume claim
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
retroactive_storageclass_total
+
Total number of retroactive StorageClass assignments to persistent volume claim
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
root_ca_cert_publisher_sync_duration_seconds
+
Duration in seconds of namespace syncs in the root CA certificate publisher.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • code
+
+
root_ca_cert_publisher_sync_total
+
Number of namespace syncs that happened in the root CA certificate publisher.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • code
+
+
running_managed_controllers
+
Indicates where instances of a controller are currently running
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • manager, name
+
+
scheduler_goroutines
+
Number of running goroutines split by the work they do such as binding.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • operation
+
+
scheduler_permit_wait_duration_seconds
+
Duration of waiting on permit.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • result
+
+
scheduler_plugin_evaluation_total
+
Number of attempts to schedule pods by each plugin and the extension point (available only in PreFilter and Filter.).
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • extension_point, plugin, profile
+
+
scheduler_plugin_execution_duration_seconds
+
Duration for running a plugin at a specific extension point.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • extension_point, plugin, status
+
+
scheduler_scheduler_cache_size
+
Number of nodes, pods, and assumed (bound) pods in the scheduler cache.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • type
+
+
scheduler_scheduling_algorithm_duration_seconds
+
Scheduling algorithm latency in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
scheduler_unschedulable_pods
+
The number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule, so this metric has meaning only when broken down by plugin.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • plugin, profile
+
+
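Because a single Pod can increment this gauge for several plugins at once, aggregate it per plugin when querying. The sketch below uses the Prometheus HTTP API; the Prometheus URL is a placeholder for whatever your monitoring setup exposes.

```shell
# Hypothetical Prometheus endpoint; adjust for your environment.
PROM_URL=http://prometheus.example:9090
# Unschedulable pods broken down by the plugin that rejected them.
curl -s "${PROM_URL}/api/v1/query" \
  --data-urlencode 'query=sum by (plugin) (scheduler_unschedulable_pods)'
```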
scheduler_volume_binder_cache_requests_total
+
Total number of requests to the volume binding cache
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation
+
+
scheduler_volume_scheduling_stage_error_total
+
Volume scheduling stage error count
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation
+
+
scrape_error
+
1 if there was an error while getting container metrics, 0 otherwise
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • Deprecated in 1.29.0
+
+
service_controller_loadbalancer_sync_total
+
A metric counting the number of times any load balancer has been configured, as an effect of service/node changes on the cluster
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
service_controller_nodesync_error_total
+
A metric counting the number of times any load balancer has been configured and errored, as an effect of node changes on the cluster
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
service_controller_nodesync_latency_seconds
+
A metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
service_controller_update_loadbalancer_host_latency_seconds
+
A metric measuring the latency for updating each load balancer's hosts.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
serviceaccount_legacy_auto_token_uses_total
+
Cumulative auto-generated legacy tokens used
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
serviceaccount_legacy_manual_token_uses_total
+
Cumulative manually created legacy tokens used
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
serviceaccount_legacy_tokens_total
+
Cumulative legacy service account tokens used
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
serviceaccount_stale_tokens_total
+
Cumulative stale projected service account tokens used
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
serviceaccount_valid_tokens_total
+
Cumulative valid projected service account tokens used
+
    +
  • ALPHA
  • +
  • Counter
  • +
+
+
storage_count_attachable_volumes_in_use
+
Measures the number of volumes in use
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • node, volume_plugin
+
+
storage_operation_duration_seconds
+
Storage operation duration
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • migrated, operation_name, status, volume_plugin
+
+
ttl_after_finished_controller_job_deletion_duration_seconds
+
The time it took to delete the job since it became eligible for deletion
+
    +
  • ALPHA
  • +
  • Histogram
  • +
+
+
volume_manager_selinux_container_errors_total
+
Number of errors when the kubelet cannot compute the SELinux context for a container. The kubelet can't start such a Pod and will retry, therefore the value of this metric may not represent the actual number of containers.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_container_warnings_total
+
Number of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_pod_context_mismatch_errors_total
+
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. The kubelet can't start such a Pod and will retry, therefore the value of this metric may not represent the actual number of Pods.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_pod_context_mismatch_warnings_total
+
Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_volume_context_mismatch_errors_total
+
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. The kubelet can't start such a Pod and will retry, therefore the value of this metric may not represent the actual number of Pods.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_volume_context_mismatch_warnings_total
+
Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_selinux_volumes_admitted_total
+
Number of volumes whose SELinux context was fine and will be mounted with mount -o context option.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
+
+
volume_manager_total_volumes
+
Number of volumes in Volume Manager
+
    +
  • ALPHA
  • +
  • Custom
  • +
  • plugin_name, state
+
+
volume_operation_total_errors
+
Total volume operation errors
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • operation_name, plugin_name
+
+
volume_operation_total_seconds
+
Storage operation end to end duration in seconds
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • operation_name, plugin_name
+
+
watch_cache_capacity
+
Total capacity of watch cache broken by resource type.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • resource
+
+
watch_cache_capacity_decrease_total
+
Total number of watch cache capacity decrease events broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
watch_cache_capacity_increase_total
+
Total number of watch cache capacity increase events broken by resource type.
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • resource
+
+
workqueue_adds_total
+
Total number of adds handled by workqueue
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • name
+
+
workqueue_depth
+
Current depth of workqueue
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • name
+
+
workqueue_longest_running_processor_seconds
+
How many seconds has the longest running processor for workqueue been running.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • name
+
+
workqueue_queue_duration_seconds
+
How long in seconds an item stays in workqueue before being requested.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • name
+
+
workqueue_retries_total
+
Total number of retries handled by workqueue
+
    +
  • ALPHA
  • +
  • Counter
  • +
  • name
+
+
workqueue_unfinished_work_seconds
+
How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
+
    +
  • ALPHA
  • +
  • Gauge
  • +
  • name
+
+
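To apply the rule of thumb from the description, look at the per-queue rate of increase: a rate that stays close to N suggests roughly N stuck worker threads. A sketch against a hypothetical Prometheus server:

```shell
# Hypothetical Prometheus endpoint; adjust for your environment.
PROM_URL=http://prometheus.example:9090
# Per workqueue, the rate at which unfinished work is accumulating.
curl -s "${PROM_URL}/api/v1/query" \
  --data-urlencode 'query=sum by (name) (rate(workqueue_unfinished_work_seconds[5m]))'
```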
workqueue_work_duration_seconds
+
How long in seconds processing an item from workqueue takes.
+
    +
  • ALPHA
  • +
  • Histogram
  • +
  • name
+
+
diff --git a/content/en/docs/reference/instrumentation/slis.md b/content/en/docs/reference/instrumentation/slis.md index 3b559a398c914..e520d0a9344b8 100644 --- a/content/en/docs/reference/instrumentation/slis.md +++ b/content/en/docs/reference/instrumentation/slis.md @@ -9,7 +9,7 @@ weight: 20 -{{< feature-state for_k8s_version="v1.27" state="beta" >}} +{{< feature-state for_k8s_version="v1.29" state="stable" >}} By default, Kubernetes {{< skew currentVersion >}} publishes Service Level Indicator (SLI) metrics for each Kubernetes component binary. This metric endpoint is exposed on the serving diff --git a/content/en/docs/reference/kubectl/kubectl.md b/content/en/docs/reference/kubectl/kubectl.md index 8d6e8aae0c3c0..80377dce33e70 100644 --- a/content/en/docs/reference/kubectl/kubectl.md +++ b/content/en/docs/reference/kubectl/kubectl.md @@ -370,10 +370,10 @@ kubectl [flags] -KUBECTL_INTERACTIVE_DELETE +KUBECTL_REMOTE_COMMAND_WEBSOCKETS -When set to true, the --interactive flag in the kubectl delete command will be activated, allowing users to preview and confirm resources before proceeding to delete by passing this flag. +When set to true, the kubectl exec, cp, and attach commands will attempt to stream using the websockets protocol. If the upgrade to websockets fails, the commands will fallback to use the current SPDY protocol. diff --git a/content/en/docs/reference/labels-annotations-taints/_index.md b/content/en/docs/reference/labels-annotations-taints/_index.md index 500c0f5a03a0d..cfecb8133d448 100644 --- a/content/en/docs/reference/labels-annotations-taints/_index.md +++ b/content/en/docs/reference/labels-annotations-taints/_index.md @@ -1043,6 +1043,23 @@ last saw a request where the client authenticated using the service account toke If a legacy token was last used before the cluster gained the feature (added in Kubernetes v1.26), then the label isn't set. +### kubernetes.io/legacy-token-invalid-since + +Type: Label + +Example: `kubernetes.io/legacy-token-invalid-since: 2023-10-27` + +Used on: Secret + +The control plane automatically adds this label to auto-generated Secrets that +have the type `kubernetes.io/service-account-token`, provided that you have the +`LegacyServiceAccountTokenCleanUp` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +enabled. Kubernetes {{< skew currentVersion >}} enables that behavior by default. +This label marks the Secret-based token as invalid for authentication. The value +of this label records the date (ISO 8601 format, UTC time zone) when the control +plane detects that the auto-generated Secret has not been used for a specified +duration (defaults to one year). + ### endpointslice.kubernetes.io/managed-by {#endpointslicekubernetesiomanaged-by} Type: Label diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 1595834ee5cc3..c574a8e1aafb7 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -14,6 +14,18 @@ The `kube-proxy` component is responsible for implementing a _virtual IP_ mechanism for {{< glossary_tooltip term_id="service" text="Services">}} of `type` other than [`ExternalName`](/docs/concepts/services-networking/service/#externalname). 
+Each instance of kube-proxy watches the Kubernetes {{< glossary_tooltip +term_id="control-plane" text="control plane" >}} for the addition and +removal of Service and EndpointSlice {{< glossary_tooltip +term_id="object" text="objects" >}}. For each Service, kube-proxy +calls appropriate APIs (depending on the kube-proxy mode) to configure +the node to capture traffic to the Service's `clusterIP` and `port`, +and redirect that traffic to one of the Service's endpoints +(usually a Pod, but possibly an arbitrary user-provided IP address). A control +loop ensures that the rules on each node are reliably synchronized with +the Service and EndpointSlice state as indicated by the API server. + +{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}} A question that pops up every now and then is why Kubernetes relies on proxying to forward inbound traffic to backends. What about other @@ -57,11 +69,14 @@ The kube-proxy starts up in different modes, which are determined by its configu On Linux nodes, the available modes for kube-proxy are: [`iptables`](#proxy-mode-iptables) -: A mode where the kube-proxy configures packet forwarding rules using iptables, on Linux. +: A mode where the kube-proxy configures packet forwarding rules using iptables. [`ipvs`](#proxy-mode-ipvs) : a mode where the kube-proxy configures packet forwarding rules using ipvs. +[`nftables`](#proxy-mode-nftables) +: a mode where the kube-proxy configures packet forwarding rules using nftables. + There is only one mode available for kube-proxy on Windows: [`kernelspace`](#proxy-mode-kernelspace) @@ -71,32 +86,10 @@ There is only one mode available for kube-proxy on Windows: _This proxy mode is only available on Linux nodes._ -In this mode, kube-proxy watches the Kubernetes -{{< glossary_tooltip term_id="control-plane" text="control plane" >}} for the addition and -removal of Service and EndpointSlice {{< glossary_tooltip term_id="object" text="objects." >}} -For each Service, it installs -iptables rules, which capture traffic to the Service's `clusterIP` and `port`, -and redirect that traffic to one of the Service's -backend sets. For each endpoint, it installs iptables rules which -select a backend Pod. - -By default, kube-proxy in iptables mode chooses a backend at random. - -Using iptables to handle traffic has a lower system overhead, because traffic -is handled by Linux netfilter without the need to switch between userspace and the -kernel space. This approach is also likely to be more reliable. - -If kube-proxy is running in iptables mode and the first Pod that's selected -does not respond, the connection fails. This is different from the old `userspace` -mode: in that scenario, kube-proxy would detect that the connection to the first -Pod had failed and would automatically retry with a different backend Pod. - -You can use Pod [readiness probes](/docs/concepts/workloads/pods/pod-lifecycle/#container-probes) -to verify that backend Pods are working OK, so that kube-proxy in iptables mode -only sees backends that test out as healthy. Doing this means you avoid -having traffic sent via kube-proxy to a Pod that's known to have failed. - -{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}} +In this mode, kube-proxy configures packet forwarding rules using the +iptables API of the kernel netfilter subsystem. 
For each endpoint, it +installs iptables rules which, by default, select a backend Pod at +random. #### Example {#packet-processing-iptables} @@ -122,8 +115,10 @@ through a load-balancer, though in those cases the client IP address does get al #### Optimizing iptables mode performance -In large clusters (with tens of thousands of Pods and Services), the -iptables mode of kube-proxy may take a long time to update the rules +In iptables mode, kube-proxy creates a few iptables rules for every +Service, and a few iptables rules for each endpoint IP address. In +clusters with tens of thousands of Pods and Services, this means tens +of thousands of iptables rules, and kube-proxy may take a long time to update the rules in the kernel when Services (or their EndpointSlices) change. You can adjust the syncing behavior of kube-proxy via options in the [`iptables` section](/docs/reference/config-api/kube-proxy-config.v1alpha1/#kubeproxy-config-k8s-io-v1alpha1-KubeProxyIPTablesConfiguration) of the @@ -204,18 +199,15 @@ and is likely to hurt functionality more than it improves performance. _This proxy mode is only available on Linux nodes._ -In `ipvs` mode, kube-proxy watches Kubernetes Services and EndpointSlices, -calls `netlink` interface to create IPVS rules accordingly and synchronizes -IPVS rules with Kubernetes Services and EndpointSlices periodically. -This control loop ensures that IPVS status matches the desired state. -When accessing a Service, IPVS directs traffic to one of the backend Pods. +In `ipvs` mode, kube-proxy uses the kernel IPVS and iptables APIs to +create rules to redirect traffic from Service IPs to endpoint IPs. The IPVS proxy mode is based on netfilter hook function that is similar to iptables mode, but uses a hash table as the underlying data structure and works in the kernel space. That means kube-proxy in IPVS mode redirects traffic with lower latency than kube-proxy in iptables mode, with much better performance when synchronizing -proxy rules. Compared to the other proxy modes, IPVS mode also supports a +proxy rules. Compared to the iptables proxy mode, IPVS mode also supports a higher throughput of network traffic. IPVS provides more options for balancing traffic to backend Pods; @@ -263,11 +255,28 @@ the node before starting kube-proxy. When kube-proxy starts in IPVS proxy mode, it verifies whether IPVS kernel modules are available. If the IPVS kernel modules are not detected, then kube-proxy -falls back to running in iptables proxy mode. +exits with an error. {{< /note >}} {{< figure src="/images/docs/services-ipvs-overview.svg" title="Virtual IP address mechanism for Services, using IPVS mode" class="diagram-medium" >}} +### `nftables` proxy mode {#proxy-mode-nftables} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +_This proxy mode is only available on Linux nodes._ + +In this mode, kube-proxy configures packet forwarding rules using the +nftables API of the kernel netfilter subsystem. For each endpoint, it +installs nftables rules which, by default, select a backend Pod at +random. + +The nftables API is the successor to the iptables API, and although it +is designed to provide better performance and scalability than +iptables, the kube-proxy nftables mode is still under heavy +development as of {{< skew currentVersion >}} and is not necessarily +expected to outperform the other Linux modes at this time. 
+ ### `kernelspace` proxy mode {#proxy-mode-kernelspace} _This proxy mode is only available on Windows nodes._ @@ -344,9 +353,9 @@ ensure that no two Services can collide. Kubernetes does that by allocating each Service its own IP address from within the `service-cluster-ip-range` CIDR range that is configured for the {{< glossary_tooltip term_id="kube-apiserver" text="API Server" >}}. -#### IP address allocation tracking +### IP address allocation tracking -To ensure each Service receives a unique IP, an internal allocator atomically +To ensure each Service receives a unique IP address, an internal allocator atomically updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}} prior to creating each Service. The map object must exist in the registry for Services to get IP address assignments, otherwise creations will @@ -355,28 +364,37 @@ fail with a message indicating an IP address could not be allocated. In the control plane, a background controller is responsible for creating that map (needed to support migrating from older versions of Kubernetes that used in-memory locking). Kubernetes also uses controllers to check for invalid -assignments (e.g. due to administrator intervention) and for cleaning up allocated +assignments (for example: due to administrator intervention) and for cleaning up allocated IP addresses that are no longer used by any Services. +#### IP address allocation tracking using the Kubernetes API {#ip-address-objects} + {{< feature-state for_k8s_version="v1.27" state="alpha" >}} + If you enable the `MultiCIDRServiceAllocator` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the [`networking.k8s.io/v1alpha1` API group](/docs/tasks/administer-cluster/enable-disable-api/), -the control plane replaces the existing etcd allocator with a new one, using IPAddress -objects instead of an internal global allocation map. The ClusterIP address -associated to each Service will have a referenced IPAddress object. +the control plane replaces the existing etcd allocator with a revised implementation +that uses IPAddress and ServiceCIDR objects instead of an internal global allocation map. +Each cluster IP address associated to a Service then references an IPAddress object. + +Enabling the feature gate also replaces a background controller with an alternative +that handles the IPAddress objects and supports migration from the old allocator model. +Kubernetes {{< skew currentVersion >}} does not support migrating from IPAddress +objects to the internal allocation map. -The background controller is also replaced by a new one to handle the new IPAddress -objects and the migration from the old allocator model. +One of the main benefits of the revised allocator is that it removes the size limitations +for the IP address range that can be used for the cluster IP address of Services. +With `MultiCIDRServiceAllocator` enabled, there are no limitations for IPv4, and for IPv6 +you can use IP address netmasks that are a /64 or smaller (as opposed to /108 with the +legacy implementation). -One of the main benefits of the new allocator is that it removes the size limitations -for the `service-cluster-ip-range`, there is no limitations for IPv4 and for IPv6 -users can use masks equal or larger than /64 (previously it was /108). +Making IP address allocations available via the API means that you as a cluster administrator +can allow users to inspect the IP addresses assigned to their Services. 
+Kubernetes extensions, such as the [Gateway API](/docs/concepts/services-networking/gateway/), +can use the IPAddress API to extend Kubernetes' inherent networking capabilities. -Users now will be able to inspect the IP addresses assigned to their Services, and -Kubernetes extensions such as the [Gateway](https://gateway-api.sigs.k8s.io/) API, can use this new -IPAddress object kind to enhance the Kubernetes networking capabilities, going beyond the limitations of -the built-in Service API. +Here is a brief example of a user querying for IP addresses: ```shell kubectl get services @@ -394,7 +412,45 @@ NAME PARENTREF 2001:db8:1:2::a services/kube-system/kube-dns ``` -#### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range} +Kubernetes also allow users to dynamically define the available IP ranges for Services using +ServiceCIDR objects. During bootstrap, a default ServiceCIDR object named `kubernetes` is created +from the value of the `--service-cluster-ip-range` command line argument to kube-apiserver: + +```shell +kubectl get servicecidrs +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17m +``` + +Users can create or delete new ServiceCIDR objects to manage the available IP ranges for Services: + +```shell +cat <<'EOF' | kubectl apply -f - +apiVersion: networking.k8s.io/v1alpha1 +kind: ServiceCIDR +metadata: + name: newservicecidr +spec: + cidrs: + - 10.96.0.0/24 +EOF +``` +``` +servicecidr.networking.k8s.io/newcidr1 created +``` + +```shell +kubectl get servicecidrs +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17m +newservicecidr 10.96.0.0/24 7m +``` + +### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range} {{< feature-state for_k8s_version="v1.26" state="stable" >}} diff --git a/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md new file mode 100644 index 0000000000000..db00d62bb5750 --- /dev/null +++ b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md @@ -0,0 +1,92 @@ + + + +Renew the certificate embedded in the kubeconfig file for the super-admin + +### Synopsis + + +Renew the certificate embedded in the kubeconfig file for the super-admin. + +Renewals run unconditionally, regardless of certificate expiration date; extra attributes such as SANs will be based on the existing file/certificates, there is no need to resupply them. + +Renewal by default tries to use the certificate authority in the local PKI managed by kubeadm; as alternative it is possible to use K8s certificate API for certificate renewal, or as a last option, to generate a CSR request. + +After renewal, in order to make changes effective, is required to restart control-plane components and eventually re-distribute the renewed certificate in case the file is used elsewhere. + +``` +kubeadm certs renew super-admin.conf [flags] +``` + +### Options + + ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
--cert-dir string     Default: "/etc/kubernetes/pki"

The path where to save the certificates

--config string

Path to a kubeadm configuration file.

-h, --help

help for super-admin.conf

--kubeconfig string     Default: "/etc/kubernetes/admin.conf"

The kubeconfig file to use when talking to the cluster. If the flag is not set, a set of standard locations can be searched for an existing kubeconfig file.

+ + + +### Options inherited from parent commands + + ++++ + + + + + + + + + + +
--rootfs string

[EXPERIMENTAL] The path to the 'real' host root filesystem.

+ + + diff --git a/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md new file mode 100644 index 0000000000000..14de2fdbfb5a5 --- /dev/null +++ b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md @@ -0,0 +1,121 @@ + + + +Generate a kubeconfig file for the super-admin + +### Synopsis + + +Generate a kubeconfig file for the super-admin. + +``` +kubeadm init phase kubeconfig super-admin [flags] +``` + +### Options + + ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
--apiserver-advertise-address string

The IP address the API Server will advertise it's listening on. If not set the default network interface will be used.

--apiserver-bind-port int32     Default: 6443

Port for the API Server to bind to.

--cert-dir string     Default: "/etc/kubernetes/pki"

The path where to save and store the certificates.

--config string

Path to a kubeadm configuration file.

--control-plane-endpoint string

Specify a stable IP address or DNS name for the control plane.

--dry-run

Don't apply any changes; just output what would be done.

-h, --help

help for super-admin

--kubeconfig-dir string     Default: "/etc/kubernetes"

The path where to save the kubeconfig file.

--kubernetes-version string     Default: "stable-1"

Choose a specific Kubernetes version for the control plane.

+ + + +### Options inherited from parent commands + + ++++ + + + + + + + + + + +
--rootfs string

[EXPERIMENTAL] The path to the 'real' host root filesystem.

+ + + diff --git a/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md b/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md index 7a0d5b3bf11a6..33463afb8cad8 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md +++ b/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md @@ -64,6 +64,7 @@ in a majority of cases, and the most intuitive location; other constants paths a - `controller-manager.conf` - `scheduler.conf` - `admin.conf` for the cluster admin and kubeadm itself + - `super-admin.conf` for the cluster super-admin that can bypass RBAC - Names of certificates and key files : @@ -209,12 +210,21 @@ Kubeadm generates kubeconfig files with identities for control plane components: This client cert should have the CN `system:kube-scheduler`, as defined by default [RBAC core components roles](/docs/reference/access-authn-authz/rbac/#core-component-roles) -Additionally, a kubeconfig file for kubeadm itself and the admin is generated and saved into the -`/etc/kubernetes/admin.conf` file. The "admin" here is defined as the actual person(s) that is -administering the cluster and wants to have full control (**root**) over the cluster. The -embedded client certificate for admin should be in the `system:masters` organization, as defined -by default [RBAC user facing role bindings](/docs/reference/access-authn-authz/rbac/#user-facing-roles). -It should also include a CN. Kubeadm uses the `kubernetes-admin` CN. +Additionally, a kubeconfig file for kubeadm as an administrative entity is generated and stored +in `/etc/kubernetes/admin.conf`. This file includes a certificate with +`Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. `kubeadm:cluster-admins` +is a group managed by kubeadm. It is bound to the `cluster-admin` ClusterRole during `kubeadm init`, +by using the `super-admin.conf` file, which does not require RBAC. +This `admin.conf` file must remain on control plane nodes and not be shared with additional users. + +During `kubeadm init` another kubeconfig file is generated and stored in `/etc/kubernetes/super-admin.conf`. +This file includes a certificate with `Subject: O = system:masters, CN = kubernetes-super-admin`. +`system:masters` is a super user group that bypasses RBAC and makes `super-admin.conf` useful in case +of an emergency where a cluster is locked due to RBAC misconfiguration. +The `super-admin.conf` file can be stored in a safe location and not shared with additional users. + +See [RBAC user facing role bindings](/docs/reference/access-authn-authz/rbac/#user-facing-roles) +for additional information RBAC and built-in ClusterRoles and groups. 
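
If you want to confirm which identity each file carries on an existing control plane node, you can decode the embedded client certificate. This is only a quick check, assuming the default kubeadm file locations; note that `super-admin.conf` exists only on the node where `kubeadm init` was run:

```shell
# Print the Subject of the client certificate embedded in each kubeadm kubeconfig.
# Assumes the default /etc/kubernetes locations used by kubeadm.
for f in /etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf; do
  echo "== $f"
  kubectl config view --kubeconfig "$f" --raw \
    -o jsonpath='{.users[0].user.client-certificate-data}' \
    | base64 -d | openssl x509 -noout -subject
done
```
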
Please note that: diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md index f4951290cf804..e05ea06b8707f 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md @@ -34,6 +34,7 @@ For more details see [Manual certificate renewal](/docs/tasks/administer-cluster {{< tab name="etcd-server" include="generated/kubeadm_certs_renew_etcd-server.md" />}} {{< tab name="front-proxy-client" include="generated/kubeadm_certs_renew_front-proxy-client.md" />}} {{< tab name="scheduler.conf" include="generated/kubeadm_certs_renew_scheduler.conf.md" />}} +{{< tab name="super-admin.conf" include="generated/kubeadm_certs_renew_super-admin.conf.md" />}} {{< /tabs >}} ## kubeadm certs certificate-key {#cmd-certs-certificate-key} diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md index 2bab24f74d7d7..c08427d4b675e 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md @@ -58,6 +58,7 @@ You can create all required kubeconfig files by calling the `all` subcommand or {{< tab name="kubelet" include="generated/kubeadm_init_phase_kubeconfig_kubelet.md" />}} {{< tab name="controller-manager" include="generated/kubeadm_init_phase_kubeconfig_controller-manager.md" />}} {{< tab name="scheduler" include="generated/kubeadm_init_phase_kubeconfig_scheduler.md" />}} +{{< tab name="super-admin" include="generated/kubeadm_init_phase_kubeconfig_super-admin.md" />}} {{< /tabs >}} ## kubeadm init phase control-plane {#cmd-phase-control-plane} diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md index f090a9f3c6a78..0fbeb13e93040 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md @@ -32,8 +32,9 @@ following steps: arguments, lowercased if necessary. 1. Writes kubeconfig files in `/etc/kubernetes/` for the kubelet, the controller-manager and the - scheduler to use to connect to the API server, each with its own identity, as well as an - additional kubeconfig file for administration named `admin.conf`. + scheduler to use to connect to the API server, each with its own identity. Also + additional kubeconfig files are written, for kubeadm as administrative entity (`admin.conf`) + and for a super admin user that can bypass RBAC (`super-admin.conf`). 1. Generates static Pod manifests for the API server, controller-manager and scheduler. In case an external etcd is not provided, @@ -157,9 +158,9 @@ List of feature gates: {{< table caption="kubeadm feature gates" >}} Feature | Default | Alpha | Beta | GA :-------|:--------|:------|:-----|:---- +`EtcdLearnerMode` | `true` | 1.27 | 1.29 | - `PublicKeysECDSA` | `false` | 1.19 | - | - `RootlessControlPlane` | `false` | 1.22 | - | - -`EtcdLearnerMode` | `false` | 1.27 | - | - {{< /table >}} {{< note >}} @@ -168,6 +169,10 @@ Once a feature gate goes GA its value becomes locked to `true` by default. Feature gate descriptions: +`EtcdLearnerMode` +: With this feature gate enabled, when joining a new control plane node, a new etcd member will be created +as a learner and promoted to a voting member only after the etcd data are fully aligned. 
+ `PublicKeysECDSA` : Can be used to create a cluster that uses ECDSA certificates instead of the default RSA algorithm. Renewal of existing ECDSA certificates is also supported using `kubeadm certs renew`, but you cannot @@ -179,14 +184,10 @@ for `kube-apiserver`, `kube-controller-manager`, `kube-scheduler` and `etcd` to If the flag is not set, those components run as root. You can change the value of this feature gate before you upgrade to a newer version of Kubernetes. -`EtcdLearnerMode` -: With this feature gate enabled, when joining a new control plane node, a new etcd member will be created -as a learner and promoted to a voting member only after the etcd data are fully aligned. - List of deprecated feature gates: {{< table caption="kubeadm deprecated feature gates" >}} -Feature | Default +Feature | Default :-------|:-------- `UpgradeAddonsBeforeControlPlane` | `false` {{< /table >}} @@ -212,12 +213,16 @@ List of removed feature gates: {{< table caption="kubeadm removed feature gates" >}} Feature | Alpha | Beta | GA | Removed :-------|:------|:-----|:---|:------- -`UnversionedKubeletConfigMap` | 1.22 | 1.23 | 1.25 | 1.26 `IPv6DualStack` | 1.16 | 1.21 | 1.23 | 1.24 +`UnversionedKubeletConfigMap` | 1.22 | 1.23 | 1.25 | 1.26 {{< /table >}} Feature gate descriptions: +`IPv6DualStack` +: This flag helps to configure components dual stack when the feature is in progress. For more details on Kubernetes +dual-stack support see [Dual-stack support with kubeadm](/docs/setup/production-environment/tools/kubeadm/dual-stack-support/). + `UnversionedKubeletConfigMap` : This flag controls the name of the {{< glossary_tooltip text="ConfigMap" term_id="configmap" >}} where kubeadm stores kubelet configuration data. With this flag not specified or set to `true`, the ConfigMap is named `kubelet-config`. @@ -228,10 +233,6 @@ or `kubeadm upgrade apply`), kubeadm respects the value of `UnversionedKubeletCo (during `kubeadm join`, `kubeadm reset`, `kubeadm upgrade ...`), kubeadm attempts to use unversioned ConfigMap name first; if that does not succeed, kubeadm falls back to using the legacy (versioned) name for that ConfigMap. -`IPv6DualStack` -: This flag helps to configure components dual stack when the feature is in progress. For more details on Kubernetes -dual-stack support see [Dual-stack support with kubeadm](/docs/setup/production-environment/tools/kubeadm/dual-stack-support/). - ### Adding kube-proxy parameters {#kube-proxy} For information about kube-proxy parameters in the kubeadm configuration see: @@ -291,7 +292,7 @@ for etcd and CoreDNS. #### Custom sandbox (pause) images {#custom-pause-image} -To set a custom image for these you need to configure this in your +To set a custom image for these you need to configure this in your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}} to use the image. Consult the documentation for your container runtime to find out how to change this setting; @@ -386,8 +387,9 @@ DNS name or an address of a load balancer. kubeadm certs certificate-key ``` -Once the cluster is up, you can grab the admin credentials from the control-plane node -at `/etc/kubernetes/admin.conf` and use that to talk to the cluster. +Once the cluster is up, you can use the `/etc/kubernetes/admin.conf` file from +a control-plane node to talk to the cluster with administrator credentials or +[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users). 
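
For example, you might use the admin credentials directly on a control plane node, or generate a separate kubeconfig for another person and grant that identity permissions through RBAC afterwards (the user name `alice` below is only a placeholder):

```shell
# On a control plane node: use the administrator credentials directly...
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get nodes

# ...or generate a kubeconfig for an additional user ("alice" is a placeholder)
# and bind permissions to that user with RBAC afterwards.
kubeadm kubeconfig user --client-name=alice > alice.conf
```
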
Note that this style of bootstrap has some relaxed security guarantees because it does not allow the root CA hash to be validated with diff --git a/content/en/docs/reference/using-api/api-concepts.md b/content/en/docs/reference/using-api/api-concepts.md index 844d0042a5fb5..3e50fabb94313 100644 --- a/content/en/docs/reference/using-api/api-concepts.md +++ b/content/en/docs/reference/using-api/api-concepts.md @@ -317,7 +317,7 @@ The `content-encoding` header indicates that the response is compressed with `gz ## Retrieving large results sets in chunks -{{< feature-state for_k8s_version="v1.9" state="beta" >}} +{{< feature-state for_k8s_version="v1.29" state="stable" >}} On large clusters, retrieving the collection of some resource types may result in very large responses that can impact the server and client. For instance, a cluster @@ -325,9 +325,7 @@ may have tens of thousands of Pods, each of which is equivalent to roughly 2 KiB encoded JSON. Retrieving all pods across all namespaces may result in a very large response (10-20MB) and consume a large amount of server resources. -Provided that you don't explicitly disable the `APIListChunking` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), the -Kubernetes API server supports the ability to break a single large collection request +The Kubernetes API server supports the ability to break a single large collection request into many smaller chunks while preserving the consistency of the total request. Each chunk can be returned sequentially which reduces both the total size of the request and allows user-oriented clients to display results incrementally to improve responsiveness. diff --git a/content/en/docs/reference/using-api/deprecation-guide.md b/content/en/docs/reference/using-api/deprecation-guide.md index 0133cbbb577c0..a5560a880b5f2 100644 --- a/content/en/docs/reference/using-api/deprecation-guide.md +++ b/content/en/docs/reference/using-api/deprecation-guide.md @@ -20,6 +20,19 @@ deprecated API versions to newer and more stable API versions. ## Removed APIs by release +### v1.32 + +The **v1.32** release will stop serving the following deprecated API versions: + +#### Flow control resources {#flowcontrol-resources-v132} + +The **flowcontrol.apiserver.k8s.io/v1beta3** API version of FlowSchema and PriorityLevelConfiguration will no longer be served in v1.32. + +* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1** API version, available since v1.29. +* All existing persisted objects are accessible via the new API +* Notable changes in **flowcontrol.apiserver.k8s.io/v1**: + * The PriorityLevelConfiguration `spec.limited.nominalConcurrencyShares` field only defaults to 30 when unspecified, and an explicit value of 0 is not changed to 30. + ### v1.29 The **v1.29** release will stop serving the following deprecated API versions: @@ -28,8 +41,10 @@ The **v1.29** release will stop serving the following deprecated API versions: The **flowcontrol.apiserver.k8s.io/v1beta2** API version of FlowSchema and PriorityLevelConfiguration will no longer be served in v1.29. -* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1beta3** API version, available since v1.26. +* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1** API version, available since v1.29, or the **flowcontrol.apiserver.k8s.io/v1beta3** API version, available since v1.26. 
* All existing persisted objects are accessible via the new API +* Notable changes in **flowcontrol.apiserver.k8s.io/v1**: + * The PriorityLevelConfiguration `spec.limited.assuredConcurrencyShares` field is renamed to `spec.limited.nominalConcurrencyShares` and only defaults to 30 when unspecified, and an explicit value of 0 is not changed to 30. * Notable changes in **flowcontrol.apiserver.k8s.io/v1beta3**: * The PriorityLevelConfiguration `spec.limited.assuredConcurrencyShares` field is renamed to `spec.limited.nominalConcurrencyShares` diff --git a/content/en/docs/setup/best-practices/certificates.md b/content/en/docs/setup/best-practices/certificates.md index 8bcf5f7e1ecc4..f8af369c80412 100644 --- a/content/en/docs/setup/best-practices/certificates.md +++ b/content/en/docs/setup/best-practices/certificates.md @@ -95,6 +95,12 @@ Required certificates: | kube-apiserver-kubelet-client | kubernetes-ca | system:masters | client | | | front-proxy-client | kubernetes-front-proxy-ca | | client | | +{{< note >}} +Instead of using the super-user group `system:masters` for `kube-apiserver-kubelet-client` +a less privileged group can be used. kubeadm uses the `kubeadm:cluster-admins` group for +that purpose. +{{< /note >}} + [1]: any other IP or DNS name you contact your cluster on (as used by [kubeadm](/docs/reference/setup-tools/kubeadm/) the load balancer stable IP and/or DNS name, `kubernetes`, `kubernetes.default`, `kubernetes.default.svc`, `kubernetes.default.svc.cluster`, `kubernetes.default.svc.cluster.local`) @@ -184,12 +190,13 @@ you need to provide if you are generating all of your own keys and certificates: You must manually configure these administrator account and service accounts: -| filename | credential name | Default CN | O (in Subject) | -|-------------------------|----------------------------|-------------------------------------|----------------| -| admin.conf | default-admin | kubernetes-admin | system:masters | -| kubelet.conf | default-auth | system:node:`` (see note) | system:nodes | -| controller-manager.conf | default-controller-manager | system:kube-controller-manager | | -| scheduler.conf | default-scheduler | system:kube-scheduler | | +| filename | credential name | Default CN | O (in Subject) | +|-------------------------|----------------------------|-------------------------------------|------------------------| +| admin.conf | default-admin | kubernetes-admin | `` | +| super-admin.conf | default-super-admin | kubernetes-super-admin | system:masters | +| kubelet.conf | default-auth | system:node:`` (see note) | system:nodes | +| controller-manager.conf | default-controller-manager | system:kube-controller-manager | | +| scheduler.conf | default-scheduler | system:kube-scheduler | | {{< note >}} The value of `` for `kubelet.conf` **must** match precisely the value of the node name @@ -197,6 +204,22 @@ provided by the kubelet as it registers with the apiserver. For further details, [Node Authorization](/docs/reference/access-authn-authz/node/). {{< /note >}} +{{< note >}} +In the above example `` is implementation specific. Some tools sign the +certificate in the default `admin.conf` to be part of the `system:masters` group. +`system:masters` is a break-glass, super user group can bypass the authorization +layer of Kubernetes, such as RBAC. Also some tools do not generate a separate +`super-admin.conf` with a certificate bound to this super user group. + +kubeadm generates two separate administrator certificates in kubeconfig files. 
+One is in `admin.conf` and has `Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. +`kubeadm:cluster-admins` is a custom group bound to the `cluster-admin` ClusterRole. +This file is generated on all kubeadm managed control plane machines. + +Another is in `super-admin.conf` that has `Subject: O = system:masters, CN = kubernetes-super-admin`. +This file is generated only on the node where `kubeadm init` was called. +{{< /note >}} + 1. For each config, generate an x509 cert/key pair with the given CN and O. 1. Run `kubectl` as follows for each config: @@ -213,6 +236,7 @@ These files are used as follows: | filename | command | comment | |-------------------------|-------------------------|-----------------------------------------------------------------------| | admin.conf | kubectl | Configures administrator user for the cluster | +| super-admin.conf | kubectl | Configures super administrator user for the cluster | | kubelet.conf | kubelet | One required for each node in the cluster. | | controller-manager.conf | kube-controller-manager | Must be added to manifest in `manifests/kube-controller-manager.yaml` | | scheduler.conf | kube-scheduler | Must be added to manifest in `manifests/kube-scheduler.yaml` | @@ -221,6 +245,7 @@ The following files illustrate full paths to the files listed in the previous ta ``` /etc/kubernetes/admin.conf +/etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf diff --git a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md index 20c93e290b7a3..61b62893288c4 100644 --- a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md +++ b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md @@ -265,11 +265,19 @@ export KUBECONFIG=/etc/kubernetes/admin.conf ``` {{< warning >}} -Kubeadm signs the certificate in the `admin.conf` to have `Subject: O = system:masters, CN = kubernetes-admin`. -`system:masters` is a break-glass, super user group that bypasses the authorization layer (e.g. RBAC). -Do not share the `admin.conf` file with anyone and instead grant users custom permissions by generating -them a kubeconfig file using the `kubeadm kubeconfig user` command. For more details see -[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users). +The kubeconfig file `admin.conf` that `kubeadm init` generates contains a certificate with +`Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. The group `kubeadm:cluster-admins` +is bound to the built-in `cluster-admin` ClusterRole. +Do not share the `admin.conf` file with anyone. + +`kubeadm init` generates another kubeconfig file `super-admin.conf` that contains a certificate with +`Subject: O = system:masters, CN = kubernetes-super-admin`. +`system:masters` is a break-glass, super user group that bypasses the authorization layer (for example RBAC). +Do not share the `super-admin.conf` file with anyone. It is recommended to move the file to a safe location. + +See +[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users) +on how to use `kubeadm kubeconfig user` to generate kubeconfig files for additional users. {{< /warning >}} Make a record of the `kubeadm join` command that `kubeadm init` outputs. 
You @@ -605,7 +613,7 @@ version as kubeadm or one version older. Example: * kubeadm is at {{< skew currentVersion >}} -* kubelet on the host must be at {{< skew currentVersion >}} or {{< skew currentVersionAddMinor -1 >}} +* kubelet on the host must be at {{< skew currentVersion >}}, {{< skew currentVersionAddMinor -1 >}}, {{< skew currentVersionAddMinor -2 >}} or {{< skew currentVersionAddMinor -3 >}} ### kubeadm's skew against kubeadm diff --git a/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md b/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md new file mode 100644 index 0000000000000..a2dffdc702c22 --- /dev/null +++ b/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md @@ -0,0 +1,187 @@ +--- +title: Change the Access Mode of a PersistentVolume to ReadWriteOncePod +content_type: task +weight: 90 +min-kubernetes-server-version: v1.22 +--- + + + +This page shows how to change the access mode on an existing PersistentVolume to +use `ReadWriteOncePod`. + +## {{% heading "prerequisites" %}} + +{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} + +{{< note >}} +The `ReadWriteOncePod` access mode graduated to stable in the Kubernetes v1.29 +release. If you are running a version of Kubernetes older than v1.29, you might +need to enable a feature gate. Check the documentation for your version of +Kubernetes. +{{< /note >}} + +{{< note >}} +The `ReadWriteOncePod` access mode is only supported for +{{< glossary_tooltip text="CSI" term_id="csi" >}} volumes. +To use this volume access mode you will need to update the following +[CSI sidecars](https://kubernetes-csi.github.io/docs/sidecar-containers.html) +to these versions or greater: + +* [csi-provisioner:v3.0.0+](https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.0.0) +* [csi-attacher:v3.3.0+](https://github.com/kubernetes-csi/external-attacher/releases/tag/v3.3.0) +* [csi-resizer:v1.3.0+](https://github.com/kubernetes-csi/external-resizer/releases/tag/v1.3.0) +{{< /note >}} + +## Why should I use `ReadWriteOncePod`? + +Prior to Kubernetes v1.22, the `ReadWriteOnce` access mode was commonly used to +restrict PersistentVolume access for workloads that required single-writer +access to storage. However, this access mode had a limitation: it restricted +volume access to a single *node*, allowing multiple pods on the same node to +read from and write to the same volume simultaneously. This could pose a risk +for applications that demand strict single-writer access for data safety. + +If ensuring single-writer access is critical for your workloads, consider +migrating your volumes to `ReadWriteOncePod`. + + + +## Migrating existing PersistentVolumes + +If you have existing PersistentVolumes, they can be migrated to use +`ReadWriteOncePod`. Only migrations from `ReadWriteOnce` to `ReadWriteOncePod` +are supported. + +In this example, there is already a `ReadWriteOnce` "cat-pictures-pvc" +PersistentVolumeClaim that is bound to a "cat-pictures-pv" PersistentVolume, +and a "cat-pictures-writer" Deployment that uses this PersistentVolumeClaim. + +{{< note >}} +If your storage plugin supports +[Dynamic provisioning](/docs/concepts/storage/dynamic-provisioning/), +the "cat-picutres-pv" will be created for you, but its name may differ. To get +your PersistentVolume's name run: + +```shell +kubectl get pvc cat-pictures-pvc -o jsonpath='{.spec.volumeName}' +``` +{{< /note >}} + +And you can view the PVC before you make changes. 
Either view the manifest +locally, or run `kubectl get pvc -o yaml`. The output is similar +to: + +```yaml +# cat-pictures-pvc.yaml +kind: PersistentVolumeClaim +apiVersion: v1 +metadata: + name: cat-pictures-pvc +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 1Gi +``` + +Here's an example Deployment that relies on that PersistentVolumeClaim: + +```yaml +# cat-pictures-writer-deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: cat-pictures-writer +spec: + replicas: 3 + selector: + matchLabels: + app: cat-pictures-writer + template: + metadata: + labels: + app: cat-pictures-writer + spec: + containers: + - name: nginx + image: nginx:1.14.2 + ports: + - containerPort: 80 + volumeMounts: + - name: cat-pictures + mountPath: /mnt + volumes: + - name: cat-pictures + persistentVolumeClaim: + claimName: cat-pictures-pvc + readOnly: false +``` + +As a first step, you need to edit your PersistentVolume's +`spec.persistentVolumeReclaimPolicy` and set it to `Retain`. This ensures your +PersistentVolume will not be deleted when you delete the corresponding +PersistentVolumeClaim: + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' +``` + +Next you need to stop any workloads that are using the PersistentVolumeClaim +bound to the PersistentVolume you want to migrate, and then delete the +PersistentVolumeClaim. Avoid making any other changes to the +PersistentVolumeClaim, such as volume resizes, until after the migration is +complete. + +Once that is done, you need to clear your PersistentVolume's `spec.claimRef.uid` +to ensure PersistentVolumeClaims can bind to it upon recreation: + +```shell +kubectl scale --replicas=0 deployment cat-pictures-writer +kubectl delete pvc cat-pictures-pvc +kubectl patch pv cat-pictures-pv -p '{"spec":{"claimRef":{"uid":""}}}' +``` + +After that, replace the PersistentVolume's list of valid access modes to be +(only) `ReadWriteOncePod`: + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"accessModes":["ReadWriteOncePod"]}}' +``` + +{{< note >}} +The `ReadWriteOncePod` access mode cannot be combined with other access modes. +Make sure `ReadWriteOncePod` is the only access mode on the PersistentVolume +when updating, otherwise the request will fail. +{{< /note >}} + +Next you need to modify your PersistentVolumeClaim to set `ReadWriteOncePod` as +the only access mode. You should also set the PersistentVolumeClaim's +`spec.volumeName` to the name of your PersistentVolume to ensure it binds to +this specific PersistentVolume. + +Once this is done, you can recreate your PersistentVolumeClaim and start up your +workloads: + +```shell +# IMPORTANT: Make sure to edit your PVC in cat-pictures-pvc.yaml before applying. You need to: +# - Set ReadWriteOncePod as the only access mode +# - Set spec.volumeName to "cat-pictures-pv" + +kubectl apply -f cat-pictures-pvc.yaml +kubectl apply -f cat-pictures-writer-deployment.yaml +``` + +Lastly you may edit your PersistentVolume's `spec.persistentVolumeReclaimPolicy` +and set to it back to `Delete` if you previously changed it. + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' +``` + +## {{% heading "whatsnext" %}} + +* Learn more about [PersistentVolumes](/docs/concepts/storage/persistent-volumes/). +* Learn more about [PersistentVolumeClaims](/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims). 
+* Learn more about [Configuring a Pod to Use a PersistentVolume for Storage](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/) diff --git a/content/en/docs/tasks/administer-cluster/encrypt-data.md b/content/en/docs/tasks/administer-cluster/encrypt-data.md index 5a86d2df14c81..1ceed92ac3d94 100644 --- a/content/en/docs/tasks/administer-cluster/encrypt-data.md +++ b/content/en/docs/tasks/administer-cluster/encrypt-data.md @@ -248,7 +248,7 @@ The following table describes each available provider. - kms v2 (beta) + kms v2 Uses envelope encryption scheme with DEK per API server. Strongest Fast @@ -259,14 +259,10 @@ The following table describes each available provider. Data is encrypted by data encryption keys (DEKs) using AES-GCM; DEKs are encrypted by key encryption keys (KEKs) according to configuration in Key Management Service (KMS). - Kubernetes defaults to generating a new DEK at API server startup, which is then - reused for object encryption. - If you enable the KMSv2KDF - feature gate, - Kubernetes instead generates a new DEK per encryption from a secret seed. - Whichever approach you configure, the DEK or seed is also rotated whenever the KEK is rotated.
+ Kubernetes generates a new DEK per encryption from a secret seed. + The seed is rotated whenever the KEK is rotated.
A good choice if using a third party tool for key management. - Available in beta from Kubernetes v1.27. + Available as stable from Kubernetes v1.29.
Read how to configure the KMS V2 provider. @@ -538,4 +534,3 @@ To allow automatic reloading, configure the API server to run with: * Read about [decrypting data that are already stored at rest](/docs/tasks/administer-cluster/decrypt-data/) * Learn more about the [EncryptionConfiguration configuration API (v1)](/docs/reference/config-api/apiserver-encryption.v1/). - diff --git a/content/en/docs/tasks/administer-cluster/kms-provider.md b/content/en/docs/tasks/administer-cluster/kms-provider.md index 921e13d29fe81..6ed39227fb235 100644 --- a/content/en/docs/tasks/administer-cluster/kms-provider.md +++ b/content/en/docs/tasks/administer-cluster/kms-provider.md @@ -9,9 +9,17 @@ weight: 370 This page shows how to configure a Key Management Service (KMS) provider and plugin to enable secret data encryption. In Kubernetes {{< skew currentVersion >}} there are two versions of KMS at-rest encryption. -You should use KMS v2 if feasible because KMS v1 is deprecated (since Kubernetes v1.28). -However, you should also read and observe the **Caution** notices in this page that highlight specific -cases when you must not use KMS v2. KMS v2 offers significantly better performance characteristics than KMS v1. +You should use KMS v2 if feasible because KMS v1 is deprecated (since Kubernetes v1.28) and disabled by default (since Kubernetes v1.29). +KMS v2 offers significantly better performance characteristics than KMS v1. + +{{< caution >}} +This documentation is for the generally available implementation of KMS v2 (and for the +deprecated version 1 implementation). +If you are using any control plane components older than Kubernetes v1.29, please check +the equivalent page in the documentation for the version of Kubernetes that your cluster +is running. Earlier releases of Kubernetes had different behavior that may be relevant +for information security. +{{< /caution >}} ## {{% heading "prerequisites" %}} @@ -24,7 +32,7 @@ you have selected. Kubernetes recommends using KMS v2. (if you are running a different version of Kubernetes that also supports the v2 KMS API, switch to the documentation for that version of Kubernetes). - If you selected KMS API v1 to support clusters prior to version v1.27 - or if you have a legacy KMS plugin that only supports KMS v1, + or if you have a legacy KMS plugin that only supports KMS v1, any supported Kubernetes version will work. This API is deprecated as of Kubernetes v1.28. Kubernetes does not recommend the use of this API. @@ -35,80 +43,36 @@ you have selected. Kubernetes recommends using KMS v2. * Kubernetes version 1.10.0 or later is required -* Your cluster must use etcd v3 or later +* For version 1.29 and later, the v1 implementation of KMS is disabled by default. + To enable the feature, set `--feature-gates=KMSv1=true` to configure a KMS v1 provider. -### KMS v2 -{{< feature-state for_k8s_version="v1.27" state="beta" >}} - -* For version 1.25 and 1.26, enabling the feature via kube-apiserver feature gate is required. -Set `--feature-gates=KMSv2=true` to configure a KMS v2 provider. - For environments where all API servers are running version 1.28 or later, and you do not require the ability - to downgrade to Kubernetes v1.27, you can enable the `KMSv2KDF` feature gate (a beta feature) for more - robust data encryption key generation. The Kubernetes project recommends enabling KMS v2 KDF if those - preconditions are met. 
- * Your cluster must use etcd v3 or later -{{< caution >}} -The KMS v2 API and implementation changed in incompatible ways in-between the alpha release in v1.25 -and the beta release in v1.27. Attempting to upgrade from old versions with the alpha feature -enabled will result in data loss. - ---- +### KMS v2 +{{< feature-state for_k8s_version="v1.29" state="stable" >}} -Running mixed API server versions with some servers at v1.27, and others at v1.28 _with the -`KMSv2KDF` feature gate enabled_ is **not supported** - and is likely to result in data loss. -{{< /caution >}} +* Your cluster must use etcd v3 or later +## KMS encryption and per-object encryption keys + The KMS encryption provider uses an envelope encryption scheme to encrypt data in etcd. The data is encrypted using a data encryption key (DEK). The DEKs are encrypted with a key encryption key (KEK) that is stored and managed in a remote KMS. -With KMS v1, a new DEK is generated for each encryption. +If you use the (deprecated) v1 implementation of KMS, a new DEK is generated for each encryption. -With KMS v2, there are two ways for the API server to generate a DEK. -Kubernetes defaults to generating a new DEK at API server startup, which is then reused -for resource encryption. However, if you use KMS v2 _and_ enable the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), then -Kubernetes instead generates a new DEK **per encryption**: the API server uses a +With KMS v2, a new DEK is generated **per encryption**: the API server uses a _key derivation function_ to generate single use data encryption keys from a secret seed combined with some random data. -Whichever approach you configure, the DEK or seed is also rotated whenever the KEK is rotated -(see `Understanding key_id and Key Rotation` section below for more details). +The seed is rotated whenever the KEK is rotated +(see the _Understanding key_id and Key Rotation_ section below for more details). The KMS provider uses gRPC to communicate with a specific KMS plugin over a UNIX domain socket. The KMS plugin, which is implemented as a gRPC server and deployed on the same host(s) as the Kubernetes control plane, is responsible for all communication with the remote KMS. -{{< caution >}} - -If you are running virtual machine (VM) based nodes that leverage VM state store with this feature, -using KMS v2 is **insecure** and an information security risk unless you also explicitly enable -the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/). - -With KMS v2, the API server uses AES-GCM with a 12 byte nonce (8 byte atomic counter and 4 bytes random data) for encryption. -The following issues could occur if the VM is saved and restored: - -1. The counter value may be lost or corrupted if the VM is saved in an inconsistent state or restored improperly. - This can lead to a situation where the same counter value is used twice, resulting in the same nonce being used - for two different messages. -2. If the VM is restored to a previous state, the counter value may be set back to its previous value, -resulting in the same nonce being used again. - -Although both of these cases are partially mitigated by the 4 byte random nonce, this can compromise -the security of the encryption. - -If you have enabled the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) _and_ are using KMS v2 -(not KMS v1), the API server generates single use data encryption keys from a secret seed. 
-This eliminates the need for a counter based nonce while avoiding nonce collision concerns. -It also removes any specific concerns with using KMS v2 and VM state store. - -{{< /caution >}} - ## Configuring the KMS provider To configure a KMS provider on the API server, include a provider of type `kms` in the @@ -197,10 +161,14 @@ Then use the functions and data structures in the stub file to develop the serve ##### KMS v2 {#developing-a-kms-plugin-gRPC-server-notes-kms-v2} -* KMS plugin version: `v2beta1` +* KMS plugin version: `v2` - In response to procedure call `Status`, a compatible KMS plugin should return `v2beta1` as `StatusResponse.version`, + In response to the `Status` remote procedure call, a compatible KMS plugin should return its KMS compatibility + version as `StatusResponse.version`. That status response should also include "ok" as `StatusResponse.healthz` and a `key_id` (remote KMS KEK ID) as `StatusResponse.key_id`. + The Kubernetes project recommends you make your plugin + compatible with the stable `v2` KMS API. Kubernetes {{< skew currentVersion >}} also supports the + `v2beta1` API for KMS; future Kubernetes releases are likely to continue supporting that beta version. The API server polls the `Status` procedure call approximately every minute when everything is healthy, and every 10 seconds when the plugin is not healthy. Plugins must take care to optimize this call as it will be @@ -258,20 +226,20 @@ Then use the functions and data structures in the stub file to develop the serve API server restart is required to perform KEK rotation. {{< caution >}} - Because you don't control the number of writes performed with the DEK, + Because you don't control the number of writes performed with the DEK, the Kubernetes project recommends rotating the KEK at least every 90 days. {{< /caution >}} * protocol: UNIX domain socket (`unix`) - The plugin is implemented as a gRPC server that listens at UNIX domain socket. - The plugin deployment should create a file on the file system to run the gRPC unix domain socket connection. - The API server (gRPC client) is configured with the KMS provider (gRPC server) unix - domain socket endpoint in order to communicate with it. - An abstract Linux socket may be used by starting the endpoint with `/@`, i.e. `unix:///@foo`. - Care must be taken when using this type of socket as they do not have concept of ACL - (unlike traditional file based sockets). - However, they are subject to Linux networking namespace, so will only be accessible to + The plugin is implemented as a gRPC server that listens at UNIX domain socket. + The plugin deployment should create a file on the file system to run the gRPC unix domain socket connection. + The API server (gRPC client) is configured with the KMS provider (gRPC server) unix + domain socket endpoint in order to communicate with it. + An abstract Linux socket may be used by starting the endpoint with `/@`, i.e. `unix:///@foo`. + Care must be taken when using this type of socket as they do not have concept of ACL + (unlike traditional file based sockets). + However, they are subject to Linux networking namespace, so will only be accessible to containers within the same pod unless host networking is used. ### Integrating a KMS plugin with the remote KMS @@ -363,10 +331,6 @@ The following table summarizes the health check endpoints for each KMS version: These healthcheck endpoint paths are hard coded and generated/controlled by the server. 
The indices for individual healthchecks corresponds to the order in which the KMS encryption config is processed. -At a high level, restarting an API server when a KMS plugin is unhealthy is unlikely to make the situation better. -It can make the situation significantly worse by throwing away the API server's DEK cache. Thus the general -recommendation is to ignore the API server KMS healthz checks for liveness purposes, i.e. `/livez?exclude=kms-providers`. - Until the steps defined in [Ensuring all secrets are encrypted](#ensuring-all-secrets-are-encrypted) are performed, the `providers` list should end with the `identity: {}` provider to allow unencrypted data to be read. Once all resources are encrypted, the `identity` provider should be removed to prevent the API server from honoring unencrypted data. For details about the `EncryptionConfiguration` format, please check the diff --git a/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md b/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md index dc1a01abb00ad..0d6c9859dc97f 100644 --- a/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md +++ b/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md @@ -98,6 +98,12 @@ read-write by a single Node. It defines the [StorageClass name](/docs/concepts/s `manual` for the PersistentVolume, which will be used to bind PersistentVolumeClaim requests to this PersistentVolume. +{{< note >}} +This example uses the `ReadWriteOnce` access mode, for simplicity. For +production use, the Kubernetes project recommends using the `ReadWriteOncePod` +access mode instead. +{{< /note >}} + Create the PersistentVolume: ```shell diff --git a/content/en/docs/tasks/configure-pod-container/configure-service-account.md b/content/en/docs/tasks/configure-pod-container/configure-service-account.md index e5530ec2a78ad..002fc3708e965 100644 --- a/content/en/docs/tasks/configure-pod-container/configure-service-account.md +++ b/content/en/docs/tasks/configure-pod-container/configure-service-account.md @@ -1,6 +1,6 @@ --- reviewers: -- bprashanth +- enj - liggitt - thockin title: Configure Service Accounts for Pods @@ -184,6 +184,16 @@ ServiceAccount. You can request a specific token duration using the `--duration` command line argument to `kubectl create token` (the actual duration of the issued token might be shorter, or could even be longer). +When the `ServiceAccountTokenNodeBinding` and `ServiceAccountTokenNodeBindingValidation` +features are enabled and the `KUBECTL_NODE_BOUND_TOKENS` enviroment variable is set to `true`, +it is possible to create a service account token that is directly bound to a `Node`: + +```shell +KUBECTL_NODE_BOUND_TOKENS=true kubectl create token build-robot --bound-object-kind Node --bound-object-name node-001 --bound-object-uid 123...456 +``` + +The token will be valid until it expires or either the assocaited `Node` or service account are deleted. + {{< note >}} Versions of Kubernetes before v1.22 automatically created long term credentials for accessing the Kubernetes API. This older mechanism was based on creating token Secrets @@ -408,6 +418,39 @@ You can configure this behavior for the `spec` of a Pod using a [projected volume](/docs/concepts/storage/volumes/#projected) type called `ServiceAccountToken`. +The token from this projected volume is a {{}} (JWT). 
+The JSON payload of this token follows a well defined schema - an example payload for a pod bound token: + +```yaml +{ + "aud": [ # matches the requested audiences, or the API server's default audiences when none are explicitly requested + "https://kubernetes.default.svc" + ], + "exp": 1731613413, + "iat": 1700077413, + "iss": "https://kubernetes.default.svc", # matches the first value passed to the --service-account-issuer flag + "jti": "ea28ed49-2e11-4280-9ec5-bc3d1d84661a", # ServiceAccountTokenJTI feature must be enabled for the claim to be present + "kubernetes.io": { + "namespace": "kube-system", + "node": { # ServiceAccountTokenPodNodeInfo feature must be enabled for the API server to add this node reference claim + "name": "127.0.0.1", + "uid": "58456cb0-dd00-45ed-b797-5578fdceaced" + }, + "pod": { + "name": "coredns-69cbfb9798-jv9gn", + "uid": "778a530c-b3f4-47c0-9cd5-ab018fb64f33" + }, + "serviceaccount": { + "name": "coredns", + "uid": "a087d5a0-e1dd-43ec-93ac-f13d89cd13af" + }, + "warnafter": 1700081020 + }, + "nbf": 1700077413, + "sub": "system:serviceaccount:kube-system:coredns" +} +``` + ### Launch a Pod using service account token projection To provide a Pod with a token with an audience of `vault` and a validity duration diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md index 28b3493a0468c..d34c835ed2ee7 100644 --- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md +++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md @@ -749,8 +749,12 @@ validations are not supported by ratcheting under the implementation in Kubernet - `not` - any validations in a descendent of one of these fields - `x-kubernetes-validations` - For Kubernetes {{< skew currentVersion >}}, CRD validation rules](#validation-rules) are ignored by - ratcheting. This may change in later Kubernetes releases. + For Kubernetes 1.28, CRD validation rules](#validation-rules) are ignored by + ratcheting. Starting with Alpha 2 in Kubernetes 1.29, `x-kubernetes-validations` + are ratcheted. + + Transition Rules are never ratcheted: only errors raised by rules that do not + use `oldSelf` will be automatically ratcheted if their values are unchanged. - `x-kubernetes-list-type` Errors arising from changing the list type of a subschema will not be ratcheted. For example adding `set` onto a list with duplicates will always @@ -767,19 +771,13 @@ validations are not supported by ratcheting under the implementation in Kubernet - `additionalProperties` To remove a previously specified `additionalProperties` validation will not be ratcheted. - +- `metadata` + Errors arising from changes to fields within an object's `metadata` are not + ratcheted. ### Validation rules -{{< feature-state state="beta" for_k8s_version="v1.25" >}} - - -Validation rules are in beta since 1.25 and the `CustomResourceValidationExpressions` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled by default to -validate custom resource based on _validation rules_. You can disable this feature by explicitly -setting the `CustomResourceValidationExpressions` feature gate to `false`, for the -[kube-apiserver](/docs/reference/command-line-tools-reference/kube-apiserver/) component. This -feature is only available if the schema is a [structural schema](#specifying-a-structural-schema). 
+{{< feature-state state="stable" for_k8s_version="v1.29" >}} Validation rules use the [Common Expression Language (CEL)](https://github.com/google/cel-spec) to validate custom resource values. Validation rules are included in @@ -1177,6 +1175,34 @@ The `fieldPath` field does not support indexing arrays numerically. Setting `fieldPath` is optional. +#### The `optionalOldSelf` field {#field-optional-oldself} + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +The feature [CRDValidationRatcheting](#validation-ratcheting) must be enabled in order to +make use of this field. + +The `optionalOldSelf` field is a boolean field that alters the behavior of [Transition Rules](#transition-rules) described +below. Normally, a transition rule will not evaluate if `oldSelf` cannot be determined: +during object creation or when a new value is introduced in an update. + +If `optionalOldSelf` is set to true, then transition rules will always be +evaluated and the type of `oldSelf` be changed to a CEL [`Optional`](https://pkg.go.dev/github.com/google/cel-go/cel#OptionalTypes) type. + +`optionalOldSelf` is useful in cases where schema authors would like a more +control tool [than provided by the default equality based behavior of ][#validation-ratcheting] +to introduce newer, usually stricter constraints on new values, while still +allowing old values to be "grandfathered" or ratcheted using the older validation. + +Example Usage: + +| CEL | Description | +|-----------------------------------------|-------------| +| `self.foo == "foo" || (oldSelf.hasValue() && oldSelf.value().foo != "foo")` | Ratcheted rule. Once a value is set to "foo", it must stay foo. But if it existed before the "foo" constraint was introduced, it may use any value | +| [oldSelf.orValue(""), self].all(x, ["OldCase1", "OldCase2"].exists(case, x == case)) || ["NewCase1", "NewCase2"].exists(case, self == case) || ["NewCase"].has(self)` | "Ratcheted validation for removed enum cases if oldSelf used them" | +| oldSelf.optMap(o, o.size()).orValue(0) < 4 || self.size() >= 4 | Ratcheted validation of newly increased minimum map or list size | + + #### Validation functions {#available-validation-functions} Functions available include: diff --git a/content/en/docs/tasks/network/extend-service-ip-ranges.md b/content/en/docs/tasks/network/extend-service-ip-ranges.md new file mode 100644 index 0000000000000..fdce843c68c41 --- /dev/null +++ b/content/en/docs/tasks/network/extend-service-ip-ranges.md @@ -0,0 +1,184 @@ +--- +reviewers: +- thockin +- dwinship +min-kubernetes-server-version: v1.29 +title: Extend Service IP Ranges +content_type: task +--- + + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +This document shares how to extend the existing Service IP range assigned to a cluster. + + +## {{% heading "prerequisites" %}} + +{{< include "task-tutorial-prereqs.md" >}} + +{{< version-check >}} + + + +## API + +Kubernetes clusters with kube-apiservers that have enabled the `MultiCIDRServiceAllocator` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the `networking.k8s.io/v1alpha1` API, +will create a new ServiceCIDR object that takes the well-known name `kubernetes`, and that uses an IP address range +based on the value of the `--service-cluster-ip-range` command line argument to kube-apiserver. 
+ +```sh +kubectl get servicecidr +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17d +``` + +The well-known `kubernetes` Service, that exposes the kube-apiserver endpoint to the Pods, calculates +the first IP address from the default ServiceCIDR range and uses that IP address as its +cluster IP address. + +```sh +kubectl get service kubernetes +``` +``` +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kubernetes ClusterIP 10.96.0.1 443/TCP 17d +``` + +The default Service, in this case, uses the ClusterIP 10.96.0.1, that has the corresponding IPAddress object. + +```sh +kubectl get ipaddress 10.96.0.1 +``` +``` +NAME PARENTREF +10.96.0.1 services/default/kubernetes +``` + +The ServiceCIDRs are protected with {{}}, to avoid leaving Service ClusterIPs orphans; +the finalizer is only removed if there is another subnet that contains the existing IPAddresses or +there are no IPAddresses belonging to the subnet. + +## Extend the number of available IPs for Services + +There are cases that users will need to increase the number addresses available to Services, previously, increasing the Service range was a disruptive operation that could also cause data loss. With this new feature users only need to add a new ServiceCIDR to increase the number of available addresses. + +### Adding a new ServiceCIDR + +On a cluster with a 10.96.0.0/28 range for Services, there is only 2^(32-28) - 2 = 14 IP addresses available. The `kubernetes.default` Service is always created; for this example, that leaves you with only 13 possible Services. + +```sh +for i in $(seq 1 13); do kubectl create service clusterip "test-$i" --tcp 80 -o json | jq -r .spec.clusterIP; done +``` +``` +10.96.0.11 +10.96.0.5 +10.96.0.12 +10.96.0.13 +10.96.0.14 +10.96.0.2 +10.96.0.3 +10.96.0.4 +10.96.0.6 +10.96.0.7 +10.96.0.8 +10.96.0.9 +error: failed to create ClusterIP service: Internal error occurred: failed to allocate a serviceIP: range is full +``` + +You can increase the number of IP addresses available for Services, by creating a new ServiceCIDR +that extends or adds new IP address ranges. + +```sh +cat