Merge pull request #86014 from eromanova97/OBSDOCS-1550

OBSDOCS-1550: Add assemblies for 'Configuring the core platform monito…
openshift · Dec 16, 2024 · 0f93cce
2 parents a686ac7 + 56c0e9f
commit 0f93cce
Showing 10 changed files with 265 additions and 53 deletions.
diff --git a/modules/monitoring-configurable-monitoring-components.adoc b/modules/monitoring-configurable-monitoring-components.adoc
@@ -2,53 +2,89 @@
 //
 // * observability/monitoring/configuring-the-monitoring-stack.adoc
 
-[id="configurable-monitoring-components_{context}"]
-= Configurable monitoring components
+:_mod-docs-content-type: REFERENCE
 
-This table shows the monitoring components you can configure and the keys used to specify the components in the
-ifndef::openshift-dedicated,openshift-rosa[]
-`cluster-monitoring-config` and
-endif::openshift-dedicated,openshift-rosa[]
-`user-workload-monitoring-config` `ConfigMap` objects.
+// The ultimate solution DOES NOT NEED separate IDs, it is just needed for now so that the tests will not break
+
+// tag::CPM[]
+[id="configurable-monitoring-components-cpm_{context}"]
+= Configurable monitoring components for core platform monitoring
+// end::CPM[]
+
+// tag::UWM[]
+[id="configurable-monitoring-components-uwm_{context}"]
+= Configurable monitoring components for monitoring for user-defined projects
+// end::UWM[]
+
+// Set attributes to distinguish between cluster monitoring example (core platform monitoring - CPM) and user workload monitoring (UWM) examples.
+// tag::CPM[]
+:configmap-name: cluster-monitoring-config
+:alertmanager: alertmanagerMain
+:prometheus: prometheusK8s
+:thanosname: Thanos Querier
+:thanos: thanosQuerier
+// end::CPM[]
+// tag::UWM[]
+:configmap-name: user-workload-monitoring-config
+:alertmanager: alertmanager
+:prometheus: prometheus
+:thanosname: Thanos Ruler
+:thanos: thanosRuler
+// end::UWM[]
+
+This table shows the monitoring components you can configure and the keys used to specify the components in the `{configmap-name}` config map.
 
+// tag::UWM[]
 ifdef::openshift-dedicated,openshift-rosa[]
 [WARNING]
 ====
-Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services.
+Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red{nbsp}Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services.
 ====
 endif::openshift-dedicated,openshift-rosa[]
+// end::UWM[]
 
-ifndef::openshift-dedicated,openshift-rosa[]
-.Configurable monitoring components
+// tag::CPM[]
+.Configurable core platform monitoring components
+// end::CPM[]
+// tag::UWM[]
+.Configurable monitoring components for user-defined projects
+// end::UWM[]
 [options="header"]
 |====
-|Component |cluster-monitoring-config config map key |user-workload-monitoring-config config map key
-|Prometheus Operator |`prometheusOperator` |`prometheusOperator`
-|Prometheus |`prometheusK8s` |`prometheus`
-|Alertmanager |`alertmanagerMain` | `alertmanager`
-|kube-state-metrics |`kubeStateMetrics` |
-|monitoring-plugin | `monitoringPlugin` |
-|openshift-state-metrics |`openshiftStateMetrics` |
-|Telemeter Client |`telemeterClient` |
-|Metrics Server |`metricsServer` |
-|Thanos Querier |`thanosQuerier` |
-|Thanos Ruler | |`thanosRuler`
+|Component |{configmap-name} config map key
+|Prometheus Operator |`prometheusOperator`
+|Prometheus |`{prometheus}`
+|Alertmanager |`{alertmanager}`
+|{thanosname} | `{thanos}`
+// tag::CPM[]
+|kube-state-metrics |`kubeStateMetrics`
+|monitoring-plugin | `monitoringPlugin`
+|openshift-state-metrics |`openshiftStateMetrics`
+|Telemeter Client |`telemeterClient`
+|Metrics Server |`metricsServer`
+// end::CPM[]
 |====
 
-[NOTE]
+[WARNING]
 ====
-The Prometheus key is called `prometheusK8s` in the `cluster-monitoring-config` `ConfigMap` object and `prometheus` in the `user-workload-monitoring-config` `ConfigMap` object.
+Different configuration changes to the `ConfigMap` object result in different outcomes:
+
+* The pods are not redeployed. Therefore, there is no service outage.
+
+* The affected pods are redeployed:
+
+** For single-node clusters, this results in a temporary service outage.
+
+** For multi-node clusters, because of high availability, the affected pods are gradually rolled out and the monitoring stack remains available.
+
+** Configuring and resizing a persistent volume always results in a service outage, regardless of high availability.
+
+Each procedure that requires a change in the config map includes its expected outcome.
 ====
-endif::openshift-dedicated,openshift-rosa[]
 
-ifdef::openshift-dedicated,openshift-rosa[]
-.Configurable monitoring components
-[options="header"]
-|===
-|Component |user-workload-monitoring-config config map key
-|Alertmanager |`alertmanager`
-|Prometheus Operator |`prometheusOperator`
-|Prometheus |`prometheus`
-|Thanos Ruler |`thanosRuler`
-|===
-endif::openshift-dedicated,openshift-rosa[]
+// Unset the source code block attributes just to be safe.
+:!configmap-name:
+:!alertmanager:
+:!prometheus:
+:!thanosname:
+:!thanos:
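For orientation, the keys in the core platform monitoring table are set under `data.config.yaml` in the `cluster-monitoring-config` config map. The following is a minimal sketch rather than content from this commit. It assumes the config map lives in the `openshift-monitoring` namespace (the namespace named elsewhere in this diff) and uses `logLevel` purely as an illustrative setting.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Each top-level key matches the "config map key" column
    # for core platform monitoring.
    prometheusK8s:
      logLevel: debug   # illustrative value, not part of this commit
    alertmanagerMain:
      logLevel: warn    # illustrative value
    thanosQuerier:
      logLevel: info    # illustrative value
```

With the UWM tag selected, the same structure would presumably apply to `user-workload-monitoring-config`, using the `prometheus`, `alertmanager`, and `thanosRuler` keys instead.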
diff --git a/modules/monitoring-configuring-metrics-collection-profiles.adoc b/modules/monitoring-configuring-metrics-collection-profiles.adoc
@@ -6,15 +6,8 @@
 [id="configuring-metrics-collection-profiles_{context}"]
 = Configuring metrics collection profiles
 
-[IMPORTANT]
-====
-[subs="attributes+"]
-Using a metrics collection profile is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete.
-Red Hat does not recommend using them in production.
-These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
-
-For more information about the support scope of Red Hat Technology Preview features, see link:https://access.redhat.com/support/offerings/techpreview[https://access.redhat.com/support/offerings/techpreview].
-====
+:FeatureName: Metrics collection profile
+include::snippets/technology-preview.adoc[]
 
 By default, Prometheus collects metrics exposed by all default metrics targets in {product-title} components.
 However, you might want Prometheus to collect fewer metrics from a cluster in certain scenarios:

diff --git a/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc b/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc
@@ -4,7 +4,7 @@
 
 :_mod-docs-content-type: CONCEPT
 [id="granting-users-permission-to-monitor-user-defined-projects_{context}"]
-= Granting users permission to monitor user-defined projects
+= Granting users permissions for monitoring for user-defined projects
 
 As a cluster administrator, you can monitor all core {product-title} and user-defined projects.
 

diff --git a/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc b/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc
@@ -111,3 +111,7 @@ If monitoring components remain in a `Pending` state after configuring the `node
 ====
 
 . Save the file to apply the changes. The components specified in the new configuration are automatically moved to the new nodes, and the pods affected by the new configuration are redeployed.
+
+// Unset the source code block attributes just to be safe.
+:!configmap-name:
+:!namespace-name:
diff --git a/...rvability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc b/...rvability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc
@@ -6,7 +6,36 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-TBD
+The {product-title} installation program provides only a small number of configuration options before installation. Configuring most {product-title} framework components, including the cluster monitoring stack, happens after the installation.
 
+This section explains which monitoring components can be configured and how to prepare to configure the monitoring stack.
 
+[IMPORTANT]
+====
+* Not all configuration parameters for the monitoring stack are exposed.
+Only the parameters and fields listed in the xref:../../../observability/monitoring/config-map-reference-for-the-cluster-monitoring-operator.adoc#cluster-monitoring-operator-configuration-reference[Config map reference for the {cmo-full}] are supported for configuration.
+
+* The monitoring stack imposes additional resource requirements. Consult the computing resources recommendations in xref:../../../scalability_and_performance/recommended-performance-scale-practices/recommended-infrastructure-practices.adoc#scaling-cluster-monitoring-operator[Scaling the {cmo-full}] and verify that you have sufficient resources.
+====
+
+// Configurable monitoring components
+include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM]
+
+// Preparing to configure the monitoring stack
+[id="preparing-to-configure-the-monitoring-stack_{context}"]
+== Preparing to configure the monitoring stack
+
+You can configure core platform monitoring by creating and updating the `cluster-monitoring-config` config map. This config map configures the {cmo-first}, which in turn configures the components of the default monitoring stack.
+
+include::modules/monitoring-creating-cluster-monitoring-configmap.adoc[leveloffset=+2]
+
+// Granting users permissions for core platform monitoring
+include::modules/monitoring-granting-users-permissions-for-core-platform-monitoring.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+* TBD
+
+include::modules/monitoring-granting-user-permissions-using-the-web-console.adoc[leveloffset=+2]
+include::modules/monitoring-granting-user-permissions-using-the-cli.adoc[leveloffset=+2]
 
diff --git a/.../configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc b/.../configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc
@@ -6,7 +6,28 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-TBD
+You can configure a local or external Alertmanager instance to route alerts from Prometheus to endpoint receivers. You can also attach custom labels to all time series and alerts to add useful metadata information.
 
+//Configuring external Alertmanager instances
+include::modules/monitoring-configuring-external-alertmanagers.adoc[leveloffset=1,tags=**;CPM;!UWM]
 
+//Configuring secrets for Alertmanager
+include::modules/monitoring-configuring-secrets-for-alertmanager.adoc[leveloffset=1]
 
+include::modules/monitoring-adding-a-secret-to-the-alertmanager-configuration.adoc[leveloffset=2,tags=**;CPM;!UWM]
+
+//Attaching additional labels to your time series and alerts
+include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Disabling the local Alertmanager
+include::modules/monitoring-disabling-the-local-alertmanager.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
diff --git a/...bility/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc b/...bility/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc
@@ -6,7 +6,30 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-TBD
+Configure the collection of metrics to monitor how cluster components and your own workloads are performing.
 
+You can send ingested metrics to remote systems for long-term storage and add cluster ID labels to the metrics to identify the data coming from different clusters.
 
+// Configuring remote write storage
+include::modules/monitoring-configuring-remote-write-storage.adoc[leveloffset=+1,tags=**;CPM;!UWM]
 
+include::modules/monitoring-supported-remote-write-authentication-settings.adoc[leveloffset=+2]
+
+include::modules/monitoring-example-remote-write-authentication-settings.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+include::modules/monitoring-example-remote-write-queue-configuration.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Adding cluster ID labels to metrics
+include::modules/monitoring-adding-cluster-id-labels-to-metrics.adoc[leveloffset=+1]
+
+include::modules/monitoring-creating-cluster-id-labels-for-metrics.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
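As an illustration of what the remote write and cluster ID label modules included here configure, the following is a hedged sketch of a `cluster-monitoring-config` entry. The endpoint URL is hypothetical, and the `__tmp_openshift_cluster_id__` source label and `cluster_id` target label are assumptions about the included modules, which this diff does not show.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      remoteWrite:
      # Hypothetical endpoint; replace with your remote write receiver.
      - url: "https://remote-write-endpoint.example.com/api/v1/write"
        writeRelabelConfigs:
        # Assumption: copies the temporary cluster ID label into a
        # permanent label on every metric sent to remote storage.
        - sourceLabels:
          - __tmp_openshift_cluster_id__
          targetLabel: cluster_id
          action: replace
```

Authentication settings are omitted here. The included modules on supported remote write authentication settings cover them.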
diff --git a/...nfiguring-core-platform-monitoring/configuring-performance-and-scalability.adoc b/...nfiguring-core-platform-monitoring/configuring-performance-and-scalability.adoc
@@ -6,7 +6,67 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-TBD
+You can configure the monitoring stack to optimize the performance and scale of your clusters. The following documentation provides information about how to distribute the monitoring components and control the impact of the monitoring stack on CPU and memory resources.
 
+// Using node selectors to move monitoring components
+
+include::modules/monitoring-using-node-selectors-to-move-monitoring-components.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+include::modules/monitoring-moving-monitoring-components-to-different-nodes.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+include::modules/monitoring-assigning-tolerations-to-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Setting the body size limit for metrics scraping
+include::modules/monitoring-setting-the-body-size-limit-for-metrics-scraping.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+[id="managing-cpu-and-memory-resources-for-monitoring-components_{context}"]
+== Managing CPU and memory resources for monitoring components
+
+You can ensure that the containers that run monitoring components have enough CPU and memory resources by specifying values for resource limits and requests for those components.
+
+You can configure these limits and requests for core platform monitoring components in the `openshift-monitoring` namespace.
+
+include::modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+include::modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+// Configuring metrics collection profiles
+include::modules/monitoring-configuring-metrics-collection-profiles.adoc[leveloffset=+1]
+include::modules/monitoring-choosing-a-metrics-collection-profile.adoc[leveloffset=+2]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Using pod topology spread constraints for monitoring components
+include::modules/monitoring-using-pod-topology-spread-constraints-for-monitoring.adoc[leveloffset=1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+include::modules/monitoring-configuring-pod-topology-spread-constraints.adoc[leveloffset=2,tags=**;CPM;!UWM]
 
 
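To make the "Managing CPU and memory resources for monitoring components" section added above more concrete, here is a minimal sketch of resource requests and limits for one core platform component. The numeric values are placeholders chosen for illustration, not recommendations from this commit.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      resources:
        # Placeholder values; size them for your own cluster.
        requests:
          cpu: 200m
          memory: 500Mi
        limits:
          cpu: 500m
          memory: 3Gi
```

Requests are what the scheduler reserves for the container and limits cap what it can consume, so limits set too low can cause throttling or out-of-memory restarts of the monitoring pods.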
diff --git a/...monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc b/...monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc
@@ -6,7 +6,50 @@ include::_attributes/common-attributes.adoc[]
 
 toc::[]
 
-TBD
+Store and record your metrics and alerting data, configure logs to specify which activities are recorded, control how long Prometheus retains stored data, and set the maximum amount of disk space for the data. These actions help you protect your data and use them for troubleshooting.
 
+// Configuring persistent storage
+include::modules/monitoring-configuring-persistent-storage.adoc[leveloffset=+1]
 
+include::modules/monitoring-configuring-a-persistent-volume-claim.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+include::modules/monitoring-resizing-a-persistent-volume.adoc[leveloffset=+2,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Modifying the retention time and size for Prometheus metrics data
+
+include::modules/monitoring-modifying-retention-time-and-size-for-prometheus-metrics-data.adoc[leveloffset=+1,tags=**;CPM;!UWM]
+
+include::modules/monitoring-modifying-the-retention-time-for-thanos-ruler-metrics-data.adoc[leveloffset=+2]
+
+// Configuring audit logs for Metrics Server
+include::modules/monitoring-configuring-audit-logs-for-metrics-server.adoc[leveloffset=+1]
+
+// Setting log levels for monitoring components
+include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM]
+
+// Enabling the query log file for Prometheus
+include::modules/monitoring-setting-query-log-file-for-prometheus.adoc[leveloffset=+1,tags=**;CPM;!UWM]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
+
+// Enabling query logging for Thanos Querier
+include::modules/monitoring-enabling-query-logging-for-thanos-querier.adoc[leveloffset=+1]
+
+[role="_additional-resources"]
+.Additional resources
+
+* TBD
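As a sketch of what the retention module included in this assembly configures, the following `cluster-monitoring-config` entry sets both a retention time and a retention size for Prometheus. The values are illustrative assumptions, not defaults taken from this commit.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      # Illustrative values: keep data for 24 hours, capped at 10 GB on disk.
      retention: 24h
      retentionSize: 10GB
```

When both settings are present, Prometheus removes data once either threshold is reached.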