Merge pull request #86014 from eromanova97/OBSDOCS-1550
OBSDOCS-1550: Add assemblies for 'Configuring the core platform monito…
stevsmit authored Dec 16, 2024
2 parents a686ac7 + 56c0e9f commit 0f93cce
Showing 10 changed files with 265 additions and 53 deletions.
106 changes: 71 additions & 35 deletions modules/monitoring-configurable-monitoring-components.adoc
@@ -2,53 +2,89 @@
//
// * observability/monitoring/configuring-the-monitoring-stack.adoc

[id="configurable-monitoring-components_{context}"]
= Configurable monitoring components
:_mod-docs-content-type: REFERENCE

This table shows the monitoring components you can configure and the keys used to specify the components in the
ifndef::openshift-dedicated,openshift-rosa[]
`cluster-monitoring-config` and
endif::openshift-dedicated,openshift-rosa[]
`user-workload-monitoring-config` `ConfigMap` objects.
// The ultimate solution DOES NOT NEED separate IDs, it is just needed for now so that the tests will not break

// tag::CPM[]
[id="configurable-monitoring-components-cpm_{context}"]
= Configurable monitoring components for core platform monitoring
// end::CPM[]

// tag::UWM[]
[id="configurable-monitoring-components-uwm_{context}"]
= Configurable monitoring components for monitoring for user-defined projects
// end::UWM[]

// Set attributes to distinguish between cluster monitoring example (core platform monitoring - CPM) and user workload monitoring (UWM) examples.
// tag::CPM[]
:configmap-name: cluster-monitoring-config
:alertmanager: alertmanagerMain
:prometheus: prometheusK8s
:thanosname: Thanos Querier
:thanos: thanosQuerier
// end::CPM[]
// tag::UWM[]
:configmap-name: user-workload-monitoring-config
:alertmanager: alertmanager
:prometheus: prometheus
:thanosname: Thanos Ruler
:thanos: thanosRuler
// end::UWM[]

This table shows the monitoring components you can configure and the keys used to specify the components in the `{configmap-name}` config map.

// tag::UWM[]
ifdef::openshift-dedicated,openshift-rosa[]
[WARNING]
====
Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services.
Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red{nbsp}Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services.
====
endif::openshift-dedicated,openshift-rosa[]
// end::UWM[]

ifndef::openshift-dedicated,openshift-rosa[]
.Configurable monitoring components
// tag::CPM[]
.Configurable core platform monitoring components
// end::CPM[]
// tag::UWM[]
.Configurable monitoring components for user-defined projects
// end::UWM[]
[options="header"]
|====
|Component |cluster-monitoring-config config map key |user-workload-monitoring-config config map key
|Prometheus Operator |`prometheusOperator` |`prometheusOperator`
|Prometheus |`prometheusK8s` |`prometheus`
|Alertmanager |`alertmanagerMain` | `alertmanager`
|kube-state-metrics |`kubeStateMetrics` |
|monitoring-plugin | `monitoringPlugin` |
|openshift-state-metrics |`openshiftStateMetrics` |
|Telemeter Client |`telemeterClient` |
|Metrics Server |`metricsServer` |
|Thanos Querier |`thanosQuerier` |
|Thanos Ruler | |`thanosRuler`
|Component |{configmap-name} config map key
|Prometheus Operator |`prometheusOperator`
|Prometheus |`{prometheus}`
|Alertmanager |`{alertmanager}`
|{thanosname} | `{thanos}`
// tag::CPM[]
|kube-state-metrics |`kubeStateMetrics`
|monitoring-plugin | `monitoringPlugin`
|openshift-state-metrics |`openshiftStateMetrics`
|Telemeter Client |`telemeterClient`
|Metrics Server |`metricsServer`
// end::CPM[]
|====

[NOTE]
[WARNING]
====
The Prometheus key is called `prometheusK8s` in the `cluster-monitoring-config` `ConfigMap` object and `prometheus` in the `user-workload-monitoring-config` `ConfigMap` object.
Different configuration changes to the `ConfigMap` object result in different outcomes:
* The pods are not redeployed. Therefore, there is no service outage.
* The affected pods are redeployed:
** For single-node clusters, this results in a temporary service outage.
** For multi-node clusters, because of high availability, the affected pods are gradually rolled out and the monitoring stack remains available.
** Configuring and resizing a persistent volume always results in a service outage, regardless of high availability.
Each procedure that requires a change in the config map includes its expected outcome.
====
endif::openshift-dedicated,openshift-rosa[]

ifdef::openshift-dedicated,openshift-rosa[]
.Configurable monitoring components
[options="header"]
|===
|Component |user-workload-monitoring-config config map key
|Alertmanager |`alertmanager`
|Prometheus Operator |`prometheusOperator`
|Prometheus |`prometheus`
|Thanos Ruler |`thanosRuler`
|===
endif::openshift-dedicated,openshift-rosa[]
// Unset the source code block attributes just to be safe.
:!configmap-name:
:!alertmanager:
:!prometheus:
:!thanosname:
:!thanos:
11 changes: 2 additions & 9 deletions modules/monitoring-configuring-metrics-collection-profiles.adoc
@@ -6,15 +6,8 @@
[id="configuring-metrics-collection-profiles_{context}"]
= Configuring metrics collection profiles

[IMPORTANT]
====
[subs="attributes+"]
Using a metrics collection profile is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete.
Red Hat does not recommend using them in production.
These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see link:https://access.redhat.com/support/offerings/techpreview[https://access.redhat.com/support/offerings/techpreview].
====
:FeatureName: Metrics collection profile
include::snippets/technology-preview.adoc[]

By default, Prometheus collects metrics exposed by all default metrics targets in {product-title} components.
However, you might want Prometheus to collect fewer metrics from a cluster in certain scenarios:
@@ -4,7 +4,7 @@

:_mod-docs-content-type: CONCEPT
[id="granting-users-permission-to-monitor-user-defined-projects_{context}"]
= Granting users permission to monitor user-defined projects
= Granting users permissions for monitoring for user-defined projects

As a cluster administrator, you can monitor all core {product-title} and user-defined projects.

@@ -111,3 +111,7 @@ If monitoring components remain in a `Pending` state after configuring the `node
====

. Save the file to apply the changes. The components specified in the new configuration are automatically moved to the new nodes, and the pods affected by the new configuration are redeployed.

// Unset the source code block attributes just to be safe.
:!configmap-name:
:!namespace-name:
@@ -6,7 +6,36 @@ include::_attributes/common-attributes.adoc[]

toc::[]

TBD
The {product-title} installation program provides only a small number of configuration options before installation. Configuring most {product-title} framework components, including the cluster monitoring stack, happens after the installation.

This section explains which monitoring components can be configured and how to prepare to configure the monitoring stack.

[IMPORTANT]
====
* Not all configuration parameters for the monitoring stack are exposed.
Only the parameters and fields listed in the xref:../../../observability/monitoring/config-map-reference-for-the-cluster-monitoring-operator.adoc#cluster-monitoring-operator-configuration-reference[Config map reference for the {cmo-full}] are supported for configuration.
* The monitoring stack imposes additional resource requirements. Consult the computing resources recommendations in xref:../../../scalability_and_performance/recommended-performance-scale-practices/recommended-infrastructure-practices.adoc#scaling-cluster-monitoring-operator[Scaling the {cmo-full}] and verify that you have sufficient resources.
====

// Configurable monitoring components
include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM]

// Preparing to configure the monitoring stack
[id="preparing-to-configure-the-monitoring-stack_{context}"]
== Preparing to configure the monitoring stack

You can configure the core platform monitoring by creating and updating the `cluster-monitoring-config` config map. This config map configures the {cmo-first}, which in turn configures the components of the default monitoring stack.
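As a minimal sketch of what such a config map might look like, the following example sets a single option for the Prometheus component under its `prometheusK8s` key. The `24h` retention value is an illustrative assumption, not a recommendation.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Each top-level key selects one configurable monitoring component.
    prometheusK8s:
      retention: 24h  # illustrative value (assumption)
```

Other components are configured the same way, by adding their key (for example `alertmanagerMain` or `thanosQuerier`) at the same level as `prometheusK8s`.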

include::modules/monitoring-creating-cluster-monitoring-configmap.adoc[leveloffset=+2]

// Granting users permissions for core platform monitoring
include::modules/monitoring-granting-users-permissions-for-core-platform-monitoring.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources
* TBD

include::modules/monitoring-granting-user-permissions-using-the-web-console.adoc[leveloffset=+2]
include::modules/monitoring-granting-user-permissions-using-the-cli.adoc[leveloffset=+2]

@@ -6,7 +6,28 @@ include::_attributes/common-attributes.adoc[]

toc::[]

TBD
You can configure a local or external Alertmanager instance to route alerts from Prometheus to endpoint receivers. You can also attach custom labels to all time series and alerts to add useful metadata.
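The following sketch shows how both settings might appear together in the `cluster-monitoring-config` config map. The label names, label values, and the Alertmanager endpoint are hypothetical placeholders; a real external instance typically also needs authentication and TLS settings, which are omitted here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      # Custom labels attached to time series and alerts.
      externalLabels:
        region: eu          # hypothetical label
        environment: prod   # hypothetical label
      # Route alerts to an external Alertmanager instance.
      additionalAlertmanagerConfigs:
      - scheme: https
        apiVersion: v2
        staticConfigs:
        - alertmanager.example.com:9093  # hypothetical endpoint
```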

//Configuring external Alertmanager instances
include::modules/monitoring-configuring-external-alertmanagers.adoc[leveloffset=+1,tags=**;CPM;!UWM]

//Configuring secrets for Alertmanager
include::modules/monitoring-configuring-secrets-for-alertmanager.adoc[leveloffset=+1]

include::modules/monitoring-adding-a-secret-to-the-alertmanager-configuration.adoc[leveloffset=+2,tags=**;CPM;!UWM]

//Attaching additional labels to your time series and alerts
include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
// Disabling the local Alertmanager
include::modules/monitoring-disabling-the-local-alertmanager.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* TBD
@@ -6,7 +6,30 @@ include::_attributes/common-attributes.adoc[]

toc::[]

TBD
Configure the collection of metrics to monitor how cluster components and your own workloads are performing.

You can send ingested metrics to remote systems for long-term storage and add cluster ID labels to the metrics to identify the data coming from different clusters.
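As an illustration, the following sketch combines both ideas: a remote write target and a relabel rule that copies the cluster ID into a `cluster_id` label. The endpoint URL and the `cluster_id` label name are assumptions chosen for this example, and authentication settings are omitted.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      remoteWrite:
      - url: "https://remote-write-endpoint.example.com"  # hypothetical endpoint
        writeRelabelConfigs:
        # Copy the temporary cluster ID source label into a permanent label.
        - sourceLabels:
          - __tmp_openshift_cluster_id__
          targetLabel: cluster_id   # label name is an assumption
          action: replace
```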

// Configuring remote write storage
include::modules/monitoring-configuring-remote-write-storage.adoc[leveloffset=+1,tags=**;CPM;!UWM]

include::modules/monitoring-supported-remote-write-authentication-settings.adoc[leveloffset=+2]

include::modules/monitoring-example-remote-write-authentication-settings.adoc[leveloffset=+2,tags=**;CPM;!UWM]

include::modules/monitoring-example-remote-write-queue-configuration.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
// Adding cluster ID labels to metrics
include::modules/monitoring-adding-cluster-id-labels-to-metrics.adoc[leveloffset=+1]

include::modules/monitoring-creating-cluster-id-labels-for-metrics.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
@@ -6,7 +6,67 @@ include::_attributes/common-attributes.adoc[]

toc::[]

TBD
You can configure the monitoring stack to optimize the performance and scale of your clusters. The following documentation provides information about how to distribute the monitoring components and control the impact of the monitoring stack on CPU and memory resources.
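For example, distributing a component might look like the following sketch, which pins Prometheus to labeled nodes and lets it tolerate a taint. The node label and the toleration key and value are hypothetical and must match labels and taints that exist in your cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      # Schedule the component only on nodes that carry this label.
      nodeSelector:
        monitoring: prometheus   # hypothetical node label
      # Allow scheduling on nodes tainted with key1=value1:NoSchedule.
      tolerations:
      - key: "key1"              # hypothetical taint key
        operator: "Equal"
        value: "value1"          # hypothetical taint value
        effect: "NoSchedule"
```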

// Using node selectors to move monitoring components

include::modules/monitoring-using-node-selectors-to-move-monitoring-components.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* TBD
include::modules/monitoring-moving-monitoring-components-to-different-nodes.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
include::modules/monitoring-assigning-tolerations-to-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
// Setting the body size limit for metrics scraping
include::modules/monitoring-setting-the-body-size-limit-for-metrics-scraping.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* TBD
[id="managing-cpu-and-memory-resources-for-monitoring-components_{context}"]
== Managing CPU and memory resources for monitoring components

You can ensure that the containers that run monitoring components have enough CPU and memory resources by specifying values for resource limits and requests for those components.

You can configure these limits and requests for core platform monitoring components in the `openshift-monitoring` namespace.
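A minimal sketch of such a configuration follows, assuming the `resources` field under a component key. The CPU and memory values are illustrative assumptions only and should be sized for your own cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      resources:
        # Upper bound the container may consume (illustrative values).
        limits:
          cpu: 500m
          memory: 3Gi
        # Amount reserved for the container at scheduling time (illustrative values).
        requests:
          cpu: 200m
          memory: 500Mi
```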

include::modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]

include::modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM]

// Configuring metrics collection profiles
include::modules/monitoring-configuring-metrics-collection-profiles.adoc[leveloffset=+1]
include::modules/monitoring-choosing-a-metrics-collection-profile.adoc[leveloffset=+2]

[role="_additional-resources"]
.Additional resources

* TBD

// Using pod topology spread constraints for monitoring components
include::modules/monitoring-using-pod-topology-spread-constraints-for-monitoring.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* TBD

include::modules/monitoring-configuring-pod-topology-spread-constraints.adoc[leveloffset=+2,tags=**;CPM;!UWM]


@@ -6,7 +6,50 @@ include::_attributes/common-attributes.adoc[]

toc::[]

TBD
Store and record your metrics and alerting data, configure logs to specify which activities are recorded, control how long Prometheus retains stored data, and set the maximum amount of disk space for the data. These actions help you protect your data and use it for troubleshooting.
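The following sketch shows how several of these settings might be combined for Prometheus in one config map. The storage class name and all sizes and durations are hypothetical values chosen for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 24h        # how long metrics data is kept (illustrative)
      retentionSize: 10GB   # maximum disk space for metrics data (illustrative)
      logLevel: debug       # log verbosity for the component
      # Persistent storage so that data survives pod restarts.
      volumeClaimTemplate:
        spec:
          storageClassName: my-storage-class  # hypothetical storage class
          resources:
            requests:
              storage: 40Gi                   # illustrative size
```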

// Configuring persistent storage
include::modules/monitoring-configuring-persistent-storage.adoc[leveloffset=+1]

include::modules/monitoring-configuring-a-persistent-volume-claim.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
include::modules/monitoring-resizing-a-persistent-volume.adoc[leveloffset=+2,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
// Modifying the retention time and size for Prometheus metrics data

include::modules/monitoring-modifying-retention-time-and-size-for-prometheus-metrics-data.adoc[leveloffset=+1,tags=**;CPM;!UWM]

include::modules/monitoring-modifying-the-retention-time-for-thanos-ruler-metrics-data.adoc[leveloffset=+2]

// Configuring audit logs for Metrics Server
include::modules/monitoring-configuring-audit-logs-for-metrics-server.adoc[leveloffset=+1]

// Setting log levels for monitoring components
include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM]

// Enabling the query log file for Prometheus
include::modules/monitoring-setting-query-log-file-for-prometheus.adoc[leveloffset=+1,tags=**;CPM;!UWM]

[role="_additional-resources"]
.Additional resources

* TBD
// Enabling query logging for Thanos Querier
include::modules/monitoring-enabling-query-logging-for-thanos-querier.adoc[leveloffset=+1]

[role="_additional-resources"]
.Additional resources

* TBD