diff --git a/docs/README-resource-policy.md b/docs/resource-policy/README.md
similarity index 100%
rename from docs/README-resource-policy.md
rename to docs/resource-policy/README.md
diff --git a/docs/resource-policy/developers-guide/architecture.md b/docs/resource-policy/developers-guide/architecture.md
new file mode 100644
index 000000000..130df98e5
--- /dev/null
+++ b/docs/resource-policy/developers-guide/architecture.md
@@ -0,0 +1,205 @@
+# Architecture
+## Overview
+The NRI Resource Policy plugin (NRI-RP from here on) is an add-on for controlling
+container resource allocation on Kubernetes nodes.
+NRI-RP plugs into the NRI interface provided by the container runtime implementation.
+The NRI-RP may alter the container resource allocation depending on
+NRI-RP keeps track of the states of all containers running on a Kubernetes
+node. Whenever it receives an NRI request that results in changes to the
+resource allocation of any container (container creation, deletion, or
+resource assignment update request), NRI-RP runs the built-in policy
+algorithm. This policy makes a decision about how the assignment of
+resources should be updated. The policy can make changes to any
+container in the system, not just the one associated with the received
+NRI request. NRI-RP's internal state tracking cache provides an abstraction
+for modifying containers and the policy uses this abstraction for recording its
+Many aspects of NRI-RP are configurable. These include, for instance,
+configuration of the resource assignment algorithm for the policy.
+Although NRI-RP can be configured using a static configuration file,
+the preferred way to configure all NRI-RP instances in a cluster is to use
+Kubernetes CRDs and ConfigMaps.
+## Components
+### [Node Agent](/pkg/resmgr/agent/)
+The node agent is a component internal to NRI-RP itself. All interactions
+by NRI-RP with the Kubernetes Control Plane go through the node agent with
+the node agent performing any direct interactions on behalf of NRI-RP.
+The agent interface implements the following functionality:
+ - pushing updated external configuration data to NRI-RP
+ - updating resource capacity of the node
+ - getting, setting, or removing labels on the node
+ - getting, setting, or removing annotations on the node
+ - getting, setting, or removing taints on the node
+The config interface is defined and has its gRPC server running in
+NRI-RP. The agent acts as a gRPC client for this interface. The low-level
+cluster interface is defined and has its gRPC server running in the agent,
+with the [convenience layer](/pkg/resmgr/agent) defined in NRI-RP.
+NRI-RP acts as a gRPC client for the low-level plumbing interface.
+Additionally, the stock node agent that comes with NRI-RP implements schemes for:
+ - configuration management for all NRI-RP instances
+ - management of dynamic adjustments to container resource assignments
+### [Resource Manager](/pkg/resmgr/)
+NRI-RP implements an event processing pipeline. In addition to NRI events,
+it processes a set of other events that are not directly related to or the
+result of NRI requests.
+These events are typically internally generated within NRI-RP. They can be
+the result of changes in the state of some containers or the utilization
+of a shared system resource, which potentially could warrant an attempt to
+rebalance the distribution of resources among containers to bring the system
+closer to an optimal state. Some events can also be generated by policies.
+The Resource Manager component of NRI-RP implements the basic control
+flow of the processing pipeline. It passes control to all the
+necessary sub-components of NRI-RP at the various phases of processing a
+request or an event. Additionally, it serializes the processing of these,
+making sure there is at most one request or event being processed at any
+point in time.
+The high-level control flow of the request processing pipeline is as follows:
+A. If the request does not need policy processing, let it bypass the processing
+pipeline; hand it off for logging, then relay it to the server and the
+corresponding response back to the client.
+B. If the request needs to be intercepted for policy processing, do the following:
+ 1. Lock the processing pipeline serialization lock.
+ 2. Look up/create cache objects (pod/container) for the request.
+ 3. If the request has no resource allocation consequences, do proxying
+ (step 6).
+ 4. Otherwise, invoke the policy layer for resource allocation:
+ - Pass it on to the configured active policy, which will:
+ - Allocate resources for the container.
+ - Update the assignments for the container in the cache.
+ - Update any other containers affected by the allocation in the cache.
+ 5. Invoke the controller layer for post-policy processing, which will:
+ - Collect controllers with pending changes in their domain of control.
+ - For each, invoke the post-policy processing function corresponding to
+ the request.
+ - Clear pending markers for the controllers.
+ 6. Proxy the request:
+ - Relay the request to the server.
+ - Send update requests for any additional affected containers.
+ - Update the cache if/as necessary based on the response.
+ - Relay the response back to the client.
+ 7. Release the processing pipeline serialization lock.
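The request flow above can be sketched as follows. This is only an illustration of the control flow, not NRI-RP's actual code: `Pipeline`, `Request`, and the trace strings are hypothetical names introduced here to show how a single lock serializes steps 1 to 7 and how a request without resource allocation consequences skips directly to proxying.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical stand-in for an intercepted NRI request."""
    container: str
    affects_resources: bool


@dataclass
class Pipeline:
    """Hypothetical sketch of the serialized request pipeline."""
    lock: threading.Lock = field(default_factory=threading.Lock)
    cache: dict = field(default_factory=dict)
    trace: list = field(default_factory=list)

    def handle(self, req: Request) -> None:
        with self.lock:                                   # steps 1 and 7
            self.cache.setdefault(req.container, {})      # step 2
            if req.affects_resources:                     # step 3
                self.trace.append(f"policy:{req.container}")       # step 4
                self.trace.append(f"controllers:{req.container}")  # step 5
            self.trace.append(f"proxy:{req.container}")   # step 6


pipeline = Pipeline()
pipeline.handle(Request("c0", affects_resources=True))
pipeline.handle(Request("c1", affects_resources=False))
print(pipeline.trace)
```

Using a context manager for the lock guarantees that step 7 runs even if a policy or controller step raises, which is one reasonable way to keep the pipeline from staying locked after an error.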
+The high-level control flow of the event processing pipeline is one of the
+following, based on the event type:
+ - For policy-specific events:
+ 1. Engage the processing pipeline lock.
+ 2. Call policy event handler.
+ 3. Invoke the controller layer for post-policy processing (same as step 5 for requests).
+ 4. Release the pipeline lock.
+ - For metrics events:
+ 1. Perform collection/processing/correlation.
+ 2. Engage the processing pipeline lock.
+ 3. Update cache objects as/if necessary.
+ 4. Request rebalancing as/if necessary.
+ 5. Release pipeline lock.
+ - For rebalance events:
+ 1. Engage the processing pipeline lock.
+ 2. Invoke policy layer for rebalancing.
+ 3. Invoke the controller layer for post-policy processing (same as step 5 for requests).
+ 4. Release the pipeline lock.
+### [Cache](/pkg/resmgr/cache/)
+The cache is a shared internal storage location within NRI-RP. It tracks the
+runtime state of pods and containers known to NRI-RP, as well as the state
+of NRI-RP itself, including the active configuration and the state of the
+active policy. The cache is saved to permanent storage in the filesystem and
+is used to restore the runtime state of NRI-RP across restarts.
+The cache provides functions for querying and updating the state of pods and
+containers. This is the mechanism used by the active policy to make resource
+assignment decisions. The policy simply updates the state of the affected
+containers in the cache according to the decisions.
+The cache's ability to associate and track changes to containers with
+resource domains is used to enforce policy decisions. The generic controller
+layer first queries which containers have pending changes, then invokes each
+controller for each container. The controllers use the querying functions
+provided by the cache to decide if anything in their resource/control domain
+needs to be changed and then act accordingly.
+Access to the cache needs to be serialized. However, this serialization is
+not provided by the cache itself. Instead, the cache relies on callers to make sure
+proper protection is in place against concurrent read-write access. The
+request and event processing pipelines in the resource manager use a lock to
+serialize request and event processing and consequently access to the cache.
+If a policy needs to do processing unsolicited by the resource manager, in other
+words processing other than handling the internal policy backend API calls from the
+resource manager, then it should inject a policy event into the resource
+manager's event loop. This causes a callback from the resource manager to
+the policy's event handler with the injected event as an argument and with
+the cache properly locked.
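The event injection pattern described above might look roughly like this. It is a sketch under stated assumptions: `ResourceManager`, `inject_event`, and `process_pending` are hypothetical names, and the real event loop lives in NRI-RP's Go code. The point is only that the policy enqueues an event without touching the cache, and its handler is later called back with the pipeline lock held.

```python
import queue
import threading


class ResourceManager:
    """Hypothetical sketch: events are queued, then handled under the pipeline lock."""

    def __init__(self, policy_handler):
        self.lock = threading.Lock()
        self.events = queue.Queue()
        self.policy_handler = policy_handler

    def inject_event(self, event):
        # Called by the policy; only enqueues and never touches the cache.
        self.events.put(event)

    def process_pending(self):
        # Drain the queue, invoking the policy's handler with the lock held.
        while True:
            try:
                event = self.events.get_nowait()
            except queue.Empty:
                return
            with self.lock:
                self.policy_handler(event, cache_locked=self.lock.locked())


seen = []
rm = ResourceManager(lambda event, cache_locked: seen.append((event, cache_locked)))
rm.inject_event("rebalance-hint")
rm.process_pending()
print(seen)
```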
+### [Generic Policy Layer](/pkg/resmgr/policy/policy.go)
+The generic policy layer defines the abstract interface the rest of NRI-RP
+uses to interact with policy implementations and takes care of the details
+of activating and dispatching calls through to the configured active policy.
+### [Generic Resource Controller Layer](/pkg/resmgr/control/control.go)
+The generic resource controller layer defines the abstract interface the rest
+of NRI-RP uses to interact with resource controller implementations and takes
+care of the details of dispatching calls to the controller implementations
+for post-policy enforcement of decisions.
+### [Metrics Collector](/pkg/metrics/)
+The metrics collector gathers a set of runtime metrics about the containers
+running on the node. NRI-RP can be configured to periodically evaluate this
+collected data to determine how optimal the current assignment of container
+resources is and to attempt a rebalancing/reallocation if it is deemed
+both possible and necessary.
+### [Policy Implementations](/cmd/)
+#### [Topology Aware](/cmd/topology-aware/)
+A topology-aware policy capable of handling multiple tiers/types of memory,
+typically a DRAM/PMEM combination configured in 2-layer memory mode.
+#### [Balloons](/cmd/balloons/)
+The balloons policy gives the user fine-grained control over how
+compute resources are distributed to workloads.
+#### [Template](/cmd/template/)
+The template policy can be used as a base for developing new policies.
+It provides hooks that the policy developer can fill in to define fine-grained
+control over how compute resources are distributed to workloads.
+Do not edit the template policy directly; copy it under a new name and edit the copy.
diff --git a/docs/resource-policy/developers-guide/e2e-test.md b/docs/resource-policy/developers-guide/e2e-test.md
new file mode 100644
index 000000000..06adb92f5
--- /dev/null
+++ b/docs/resource-policy/developers-guide/e2e-test.md
@@ -0,0 +1,118 @@
+# End-to-End tests
+## Prerequisites
+- `docker`
+- `vagrant`
+## Usage
+Run policy tests:
+cd test/e2e
+[VAR=VALUE...] ./run_tests.sh policies.test-suite
+Run tests only for a certain policy or topology, or only a selected test:
+cd test/e2e
+[VAR=VALUE...] ./run_tests.sh policies.test-suite[/POLICY[/TOPOLOGY[/testNN-*]]]
+Get help on the available `VAR=VALUE` settings with `./run.sh help`.
+`run_tests.sh` calls `run.sh` to execute the selected tests.
+Therefore the same `VAR=VALUE` definitions apply to both scripts.
+## Test phases
+In the *setup phase* `run.sh` creates a virtual machine unless it
+already exists. When it is running, tests create a single-node cluster
+and deploy the `nri-resource-policy` DaemonSet on it.
+In the *test phase* `run.sh` runs a test script. *Test scripts* are
+`bash` scripts that can use helper functions for running commands and
+observing the status of the virtual machine and software running on it.
+In the *tear down phase* `run.sh` copies logs from the virtual machine
+and finally stops or deletes the virtual machine, if requested.
+## Test modes
+- `test` mode runs fast and reports `Test verdict: PASS` or
+ `FAIL`. The exit status is zero if and only if a test passed.
+Currently only the normal test mode is supported.
+## Running from scratch and quick rerun in existing virtual machine
+The test will use the `vagrant`-managed virtual machine named in the
+`vm_name` environment variable. The default name is constructed
+from the topology, Linux distribution, and runtime name in use.
+If a virtual machine already exists, the test will be run on it.
+Otherwise the test will create a virtual machine from scratch.
+You can delete a virtual machine by going to the VM directory and
+giving the command `make destroy`.
+## Custom topologies
+If you change the NUMA node topology of an existing virtual machine, you
+must delete the virtual machine first. Otherwise the `topology` variable
+is ignored and the test will run in the existing NUMA configuration.
+The `topology` variable is a JSON array of objects. Each object
+defines one or more NUMA nodes. Keys in objects:
+"mem" mem (RAM) size on each NUMA node in this group.
+ The default is "0G".
+"nvmem" nvmem (non-volatile RAM) size on each NUMA node
+ in this group. The default is "0G".
+"cores" number of CPU cores on each NUMA node in this group.
+ The default is 0.
+"threads" number of threads on each CPU core.
+ The default is 2.
+"nodes" number of NUMA nodes on each die.
+ The default is 1.
+"dies" number of dies on each package.
+ The default is 1.
+"packages" number of packages.
+ The default is 1.
+Run the test in a VM with two NUMA nodes. There are 4 CPUs (two cores, two
+threads per core by default) and 4G RAM in each node:
+e2e$ vm_name=my2x4 topology='[{"mem":"4G","cores":2,"nodes":2}]' ./run.sh
+Run the test in a VM with 32 CPUs in total: there are two packages
+(sockets) in the system, each containing two dies. Each die contains
+two NUMA nodes, each node contains 2 CPU cores, and each core contains
+two threads. In addition, there is a NUMA node with 16G of non-volatile
+memory (NVRAM) but no CPUs:
+e2e$ vm_name=mynvram topology='[{"mem":"4G","cores":2,"nodes":2,"dies":2,"packages":2},{"nvmem":"16G"}]' ./run.sh
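The CPU counts quoted in the two examples follow from multiplying the per-group keys together, using the documented defaults for any key that is omitted. The helper below is a hypothetical illustration of that arithmetic (`total_cpus` is not part of `run.sh`); it assumes each group contributes `cores * threads * nodes * dies * packages` CPUs.

```python
import json


def total_cpus(topology_json: str) -> int:
    """Sum CPUs over all NUMA node groups, applying the documented defaults."""
    total = 0
    for group in json.loads(topology_json):
        cores = group.get("cores", 0)        # CPU cores per NUMA node
        threads = group.get("threads", 2)    # threads per core
        nodes = group.get("nodes", 1)        # NUMA nodes per die
        dies = group.get("dies", 1)          # dies per package
        packages = group.get("packages", 1)  # packages in this group
        total += cores * threads * nodes * dies * packages
    return total


print(total_cpus('[{"mem":"4G","cores":2,"nodes":2}]'))
print(total_cpus('[{"mem":"4G","cores":2,"nodes":2,"dies":2,"packages":2},{"nvmem":"16G"}]'))
```

For the first topology this gives 2 x 2 x 2 = 8 CPUs (4 per NUMA node), and for the second 2 x 2 x 2 x 2 x 2 = 32 CPUs, with the NVRAM-only group contributing none because its `cores` value defaults to 0.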
+## Test output
+All test output is saved under the directory in the environment
+variable `outdir` if the `run.sh` script is executed as is. The default
+output directory in this case is `./output`.
+For the standard e2e-tests run by `run_tests.sh`, the output directory
+name is constructed from the Linux distribution, the container runtime name,
+and the machine topology in use.
+For example, the output directory `n4c16-generic-fedora37-containerd`
+indicates a system with four nodes and 16 CPUs, running Fedora 37 with
+containerd as the container runtime.
+Executed commands with their output, exit status and timestamps are
+saved under the `output/commands` directory.
diff --git a/docs/resource-policy/developers-guide/figures/nri-resource-policy.png b/docs/resource-policy/developers-guide/figures/nri-resource-policy.png
new file mode 100644
index 000000000..fdb385c4e
Binary files /dev/null and b/docs/resource-policy/developers-guide/figures/nri-resource-policy.png differ
diff --git a/docs/developers-guide/index.rst b/docs/resource-policy/developers-guide/index.rst
similarity index 78%
rename from docs/developers-guide/index.rst
rename to docs/resource-policy/developers-guide/index.rst
index 4a1b83f39..dc83815f6 100644
--- a/docs/developers-guide/index.rst
+++ b/docs/resource-policy/developers-guide/index.rst
@@ -4,5 +4,4 @@ Developer's Guide
:maxdepth: 1
- policy-writers-guide.md
diff --git a/docs/developers-guide/testing.rst b/docs/resource-policy/developers-guide/testing.rst
similarity index 100%
rename from docs/developers-guide/testing.rst
rename to docs/resource-policy/developers-guide/testing.rst
diff --git a/docs/developers-guide/unit-test.md b/docs/resource-policy/developers-guide/unit-test.md
similarity index 100%
rename from docs/developers-guide/unit-test.md
rename to docs/resource-policy/developers-guide/unit-test.md
diff --git a/docs/resource-policy/index.rst b/docs/resource-policy/index.rst
new file mode 100644
index 000000000..48d2cdd44
--- /dev/null
+++ b/docs/resource-policy/index.rst
@@ -0,0 +1,17 @@
+.. NRI Resource Policy documentation master file
+Resource Policy Plugin
+.. toctree::
+ :maxdepth: 2
+ :caption: Contents:
+ introduction.md
+ quick-start.md
+ installation.md
+ setup.md
+ policy/index.rst
+ node-agent.md
+ developers-guide/index.rst
diff --git a/docs/resource-policy/installation.md b/docs/resource-policy/installation.md
new file mode 100644
index 000000000..3b2f72d87
--- /dev/null
+++ b/docs/resource-policy/installation.md
@@ -0,0 +1,26 @@
+# Installation
+## Installing from sources
+You will need at least `git`, {{ '`golang '+ '{}'.format(golang_version) + '`' }} or newer,
+`GNU make`, `bash`, `find`, `sed`, `head`, `date`, and `install` to be able to build and install
+from sources.
+Although not recommended, you can install NRI Resource Policy from sources:
+ git clone https://github.com/containers/nri-plugins
+ make && make images
+After the images are created, you can copy the tar images from `build/images` to
+the target device and deploy the relevant DaemonSet deployment file, which is also
+found in the images directory.
+For example, you can deploy the topology-aware resource policy like this:
+ cd build/images
+ ctr -n k8s.io image import nri-resource-policy-topology-aware-image-321ca3aad95e.tar
+ kubectl apply -f nri-resource-policy-topology-aware-deployment.yaml
diff --git a/docs/resource-policy/introduction.md b/docs/resource-policy/introduction.md
new file mode 100644
index 000000000..9c5e47260
--- /dev/null
+++ b/docs/resource-policy/introduction.md
@@ -0,0 +1,12 @@
+# Introduction
+NRI Resource Policy is an NRI container runtime plugin. It connects
+to the container runtime implementation (containerd, cri-o) via the NRI API.
+The main purpose of the NRI resource plugin is to apply hardware-aware
+resource allocation policies to the containers running in the system.
+There are different policies available, each with a different set of
+goals in mind and implementing different hardware allocation strategies. The
+details of whether and how a container resource request is altered or
+if extra actions are performed depend on which policy plugin is running
+and how that policy is configured.
diff --git a/docs/node-agent.md b/docs/resource-policy/node-agent.md
similarity index 50%
rename from docs/node-agent.md
rename to docs/resource-policy/node-agent.md
index 2bf6da34e..ea7a57840 100644
--- a/docs/node-agent.md
+++ b/docs/resource-policy/node-agent.md
@@ -1,29 +1,27 @@
-# Node Agent
+# Dynamic Configuration
-CRI Resource Manager can be configured dynamically using the CRI Resource
-Manager Node Agent and Kubernetes\* ConfigMaps. The agent is built in the
-NRI resource plugin.
+NRI Resource Policy plugin can be configured dynamically using ConfigMaps.
-The agent monitors two ConfigMaps for the node, a primary node-specific one
+The plugin daemon monitors two ConfigMaps for the node, a primary node-specific one
and a secondary group-specific or default one, depending on whether the node
belongs to a configuration group. The node-specific ConfigMap always takes
precedence over the others.
The names of these ConfigMaps are
-1. `cri-resmgr-config.node.$NODE_NAME`: primary, node-specific configuration
-2. `cri-resmgr-config.group.$GROUP_NAME`: secondary group-specific node
+1. `nri-resource-policy-config.node.$NODE_NAME`: primary, node-specific configuration
+2. `nri-resource-policy-config.group.$GROUP_NAME`: secondary group-specific node
-3. `cri-resmgr-config.default`: secondary: secondary default node
+3. `nri-resource-policy-config.default`: secondary default node
You can assign a node to a configuration group by setting the
-`cri-resource-manager.intel.com/group` label on the node to the name of
+`resource-policy.nri.io/group` label on the node to the name of
the configuration group. You can remove a node from its group by deleting
the node group label.
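The naming and precedence rules above can be summarized in a short sketch. The function `configmap_names` is a hypothetical helper written for illustration, not part of the plugin; it only encodes the rule that the node-specific ConfigMap comes first and that the secondary one is the group-specific ConfigMap when the node has a group, and the default one otherwise.

```python
from typing import List, Optional


def configmap_names(node_name: str, group_name: Optional[str] = None) -> List[str]:
    """Return ConfigMap names in precedence order: primary first, then secondary."""
    primary = f"nri-resource-policy-config.node.{node_name}"
    if group_name:
        # The node carries the group label, so the group ConfigMap is secondary.
        secondary = f"nri-resource-policy-config.group.{group_name}"
    else:
        # No group label: fall back to the default ConfigMap.
        secondary = "nri-resource-policy-config.default"
    return [primary, secondary]


print(configmap_names("worker-1"))
print(configmap_names("worker-1", "batch"))
```

The node name `worker-1` and group name `batch` are made-up values used only to exercise the helper.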
There is a
-[sample ConfigMap spec](/sample-configs/nri-resmgr-configmap.example.yaml)
+[sample ConfigMap spec](/sample-configs/nri-resource-policy-configmap.example.yaml)
that contains a node-specific, a group-specific, and a default ConfigMap
example. See [any available policy-specific documentation](policy/index.rst)
for more information on the policy configurations.
diff --git a/docs/policy/balloons.md b/docs/resource-policy/policy/balloons.md
similarity index 94%
rename from docs/policy/balloons.md
rename to docs/resource-policy/policy/balloons.md
index 4f752c77c..f6a9c80da 100644
--- a/docs/policy/balloons.md
+++ b/docs/resource-policy/policy/balloons.md
@@ -47,17 +47,15 @@ min and max frequencies on CPU cores and uncore.
## Deployment
-### Install cri-resmgr
-Deploy cri-resmgr on each node as you would for any other policy. See
-[installation](../installation.md) for more details.
+Deploy nri-resource-policy-balloons on each node as you would for any
+other policy. See [installation](../installation.md) for more details.
## Configuration
The balloons policy is configured using the yaml-based configuration
-system of CRI-RM. See [setup and
-usage](../setup.md#setting-up-cri-resource-manager) for more details
-on managing the configuration.
+system of nri-resource-policy.
+See [setup and usage](../setup.md#setting-up-nri-resource-policy) for
+more details on managing the configuration.
### Parameters
@@ -193,9 +191,9 @@ of a single container (`CONTAINER_NAME`). The last two annotations set
the default balloon type for all containers in the pod.
-balloon.balloons.cri-resource-manager.intel.com/container.CONTAINER_NAME: BT
-balloon.balloons.cri-resource-manager.intel.com/pod: BT
-balloon.balloons.cri-resource-manager.intel.com: BT
+balloon.balloons.resource-policy.nri.io/container.CONTAINER_NAME: BT
+balloon.balloons.resource-policy.nri.io/pod: BT
+balloon.balloons.resource-policy.nri.io: BT
If a pod has no annotations, its namespace is matched to the
@@ -211,7 +209,7 @@ the `BalloonTypes` configuration.
In order to enable more verbose logging and metrics exporting from the
balloons policy, enable instrumentation and policy debugging from the
-CRI-RM global config:
+nri-resource-policy global config:
diff --git a/docs/policy/container-affinity.md b/docs/resource-policy/policy/container-affinity.md
similarity index 90%
rename from docs/policy/container-affinity.md
rename to docs/resource-policy/policy/container-affinity.md
index 6a1a90dab..d4a1ab2e2 100644
--- a/docs/policy/container-affinity.md
+++ b/docs/resource-policy/policy/container-affinity.md
@@ -2,10 +2,10 @@
## Introduction
-Some policies allow the user to give hints about how particular containers
-should be *co-located* within a node. In particular these hints express whether
-containers should be located *'close'* to each other or *'far away'* from each
-other, in a hardware topology sense.
+The topology-aware resource policy allows the user to give hints about how
+particular containers should be *co-located* within a node. In particular these
+hints express whether containers should be located *'close'* to each other or
+*'far away'* from each other, in a hardware topology sense.
Since these hints are interpreted always by a particular *policy implementation*,
the exact definitions of 'close' and 'far' are also somewhat *policy-specific*.
@@ -27,8 +27,8 @@ Policies try to place a container
## Affinity Annotation Syntax
-*Affinities* are defined as the `cri-resource-manager.intel.com/affinity` annotation.
-*Anti-affinities* are defined as the `cri-resource-manager.intel.com/anti-affinity`
+*Affinities* are defined as the `resource-policy.nri.io/affinity` annotation.
+*Anti-affinities* are defined as the `resource-policy.nri.io/anti-affinity`
annotation. They are specified in the `metadata` section of the `Pod YAML`, under
`annotations` as a dictionary, with each dictionary key being the name of the
*container* within the Pod to which the annotation belongs to.
@@ -36,7 +36,7 @@ annotation. They are specified in the `metadata` section of the `Pod YAML`, unde
- cri-resource-manager.intel.com/affinity: |
+ resource-policy.nri.io/affinity: |
- scope:
key: key-ref
@@ -55,13 +55,13 @@ metadata:
weight: w
-An anti-affinity is defined similarly but using `cri-resource-manager.intel.com/anti-affinity`
+An anti-affinity is defined similarly but using `resource-policy.nri.io/anti-affinity`
as the annotation key.
- cri-resource-manager.intel.com/anti-affinity: |
+ resource-policy.nri.io/anti-affinity: |
- scope:
key: key-ref
@@ -197,7 +197,7 @@ container `wolf`.
- cri-resource-manager.intel.com/affinity: |
+ resource-policy.nri.io/affinity: |
- match:
key: name
@@ -205,7 +205,7 @@ metadata:
- sheep
weight: 5
- cri-resource-manager.intel.com/anti-affinity: |
+ resource-policy.nri.io/anti-affinity: |
- match:
key: name
@@ -223,9 +223,9 @@ one needs to give just the names of the containers, like in the example below.
- cri-resource-manager.intel.com/affinity: |
+ resource-policy.nri.io/affinity: |
container3: [ container1 ]
- cri-resource-manager.intel.com/anti-affinity: |
+ resource-policy.nri.io/anti-affinity: |
container3: [ container2 ]
container4: [ container2, container3 ]
@@ -243,14 +243,14 @@ The equivalent annotation in full syntax would be
- cri-resource-manager.intel.com/affinity: |+
+ resource-policy.nri.io/affinity: |+
- match:
key: labels/io.kubernetes.container.name
operator: In
- container1
- cri-resource-manager.intel.com/anti-affinity: |+
+ resource-policy.nri.io/anti-affinity: |+
- match:
key: labels/io.kubernetes.container.name
diff --git a/docs/policy/cpu-allocator.md b/docs/resource-policy/policy/cpu-allocator.md
similarity index 88%
rename from docs/policy/cpu-allocator.md
rename to docs/resource-policy/policy/cpu-allocator.md
index 149ab669e..8d7eb0419 100644
--- a/docs/policy/cpu-allocator.md
+++ b/docs/resource-policy/policy/cpu-allocator.md
@@ -1,6 +1,6 @@
# CPU Allocator
-CRI Resource Manager has a separate CPU allocator component that helps policies
+NRI Resource Policy has a separate CPU allocator component that helps policies
make educated allocation of CPU cores for workloads. Currently all policies
utilize the built-in CPU allocator. See policy specific documentation for more
@@ -14,7 +14,7 @@ request "near" each other in order to minimize memory latencies between CPUs.
## CPU Prioritization
The CPU allocator also does automatic CPU prioritization by detecting CPU
-features and their configuration parameters. Currently, CRI Resource Manager
+features and their configuration parameters. Currently, NRI Resource Policy
supports CPU priority detection based on the `intel_pstate` scaling
driver in the Linux CPUFreq subsystem, and, Intel Speed Select Technology
@@ -26,7 +26,7 @@ priority CPUs for high priority workloads.
### Intel Speed Select Technology (SST)
-CRI Resource Manager supports detection of all Intel Speed Select Technology
+NRI Resource Policy supports detection of all Intel Speed Select Technology
(SST) features, i.e. Speed Select Technology Performance Profile (SST-PP), Base
Frequency (SST-BF), Turbo Frequency (SST-TF) and Core Power (SST-CP).
@@ -47,7 +47,7 @@ and their parameterization:
### Linux CPUFreq
CPUFreq based prioritization only takes effect if Intel Speed Select Technology
-(SST) is disabled (or not supported). CRI-RM divides CPU cores into priority
+(SST) is disabled (or not supported). NRI-RP divides CPU cores into priority
classes based on two parameters:
- base frequency
diff --git a/docs/policy/index.rst b/docs/resource-policy/policy/index.rst
similarity index 84%
rename from docs/policy/index.rst
rename to docs/resource-policy/policy/index.rst
index 35e1647a6..35bd500de 100644
--- a/docs/policy/index.rst
+++ b/docs/resource-policy/policy/index.rst
@@ -7,6 +7,4 @@ Policies
- blockio.md
- rdt.md
diff --git a/docs/policy/topology-aware.md b/docs/resource-policy/policy/topology-aware.md
similarity index 90%
rename from docs/policy/topology-aware.md
rename to docs/resource-policy/policy/topology-aware.md
index b1659c7f8..9904e891d 100644
--- a/docs/policy/topology-aware.md
+++ b/docs/resource-policy/policy/topology-aware.md
@@ -39,8 +39,9 @@ dies, sockets, and finally the whole of the system at the root node. Leaf NUMA
nodes are assigned the memory behind their controllers / zones and CPU cores
with the smallest distance / access penalty to this memory. If the machine
has multiple types of memory separately visible to both the kernel and user
-space, for instance both DRAM and [PMEM](https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html), each zone of special type of memory
-is assigned to the closest NUMA node pool.
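+space, for instance both DRAM and
+[PMEM](https://www.intel.com/content/www/us/en/products/memory-storage/optane-dc-persistent-memory.html),
+each zone of special type of memory is assigned to the closest NUMA node pool.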
Each non-leaf pool node in the tree is assigned the union of the resources of
its children. So in practice, dies nodes end up containing all the CPU cores
@@ -118,7 +119,7 @@ The `topology-aware` policy has the following features:
## Activating the Policy
You can activate the `topology-aware` policy by using the following configuration
-fragment in the configuration for `cri-resmgr`:
+fragment in the configuration for `nri-resource-policy-topology-aware`:
@@ -131,10 +132,9 @@ policy:
The policy has a number of configuration options which affect its default behavior.
These options can be supplied as part of the
-[dynamic configuration](../setup.md#using-cri-resource-manager-agent-and-a-configmap)
+[dynamic configuration](../setup.md#using-nri-resource-policy-agent-and-a-configmap)
received via the [`node agent`](../node-agent.md), or in a fallback or forced
-[configuration file](../setup.md#using-a-local-configuration-from-a-file). These
-configuration options are
+configuration file. These configuration options are
- `PinCPU`
* whether to pin workloads to assigned pool CPU sets
@@ -247,11 +247,11 @@ following Pod annotation.
# opt in container C1 to shared CPU core allocation
- prefer-shared-cpus.cri-resource-manager.intel.com/container.C1: "true"
+ prefer-shared-cpus.resource-policy.nri.io/container.C1: "true"
# opt in the whole pod to shared CPU core allocation
- prefer-shared-cpus.cri-resource-manager.intel.com/pod: "true"
+ prefer-shared-cpus.resource-policy.nri.io/pod: "true"
# selectively opt out container C2 from shared CPU core allocation
- prefer-shared-cpus.cri-resource-manager.intel.com/container.C2: "false"
+ prefer-shared-cpus.resource-policy.nri.io/container.C2: "false"
Opting in to exclusive allocation happens by opting out from shared allocation,
@@ -265,11 +265,11 @@ allocation using the following Pod annotation.
# opt in container C1 to isolated exclusive CPU core allocation
- prefer-isolated-cpus.cri-resource-manager.intel.com/container.C1: "true"
+ prefer-isolated-cpus.resource-policy.nri.io/container.C1: "true"
# opt in the whole pod to isolated exclusive CPU core allocation
- prefer-isolated-cpus.cri-resource-manager.intel.com/pod: "true"
+ prefer-isolated-cpus.resource-policy.nri.io/pod: "true"
# selectively opt out container C2 from isolated exclusive CPU core allocation
- prefer-isolated-cpus.cri-resource-manager.intel.com/container.C2: "false"
+ prefer-isolated-cpus.resource-policy.nri.io/container.C2: "false"
These Pod annotations have no effect on containers which are not eligible for
@@ -277,12 +277,12 @@ exclusive allocation.
### Implicit Hardware Topology Hints
-`CRI Resource Manager` automatically generates HW `Topology Hints` for devices
+`NRI Resource Policy` automatically generates HW `Topology Hints` for devices
assigned to a container, prior to handing the container off to the active policy
for resource allocation. The `topology-aware` policy is hint-aware and normally
-takes topology hints into account when picking the best pool to allocate
-resources. Hints indicate optimal `HW locality` for device access and they can
-alter significantly which pool gets picked for a container.
+takes topology hints into account when picking the best pool to allocate resources.
+Hints indicate optimal `HW locality` for device access and they can alter
+significantly which pool gets picked for a container.
Since device topology hints are implicitly generated, there are cases where one
would like the policy to disregard them altogether. For instance, when a local
@@ -295,11 +295,11 @@ pool selection using the following Pod annotations.
# only disregard hints for container C1
- topologyhints.cri-resource-manager.intel.com/container.C1: "false"
+ topologyhints.resource-policy.nri.io/container.C1: "false"
# disregard hints for all containers by default
- topologyhints.cri-resource-manager.intel.com/pod: "false"
+ topologyhints.resource-policy.nri.io/pod: "false"
# but take hints into account for container C2
- topologyhints.cri-resource-manager.intel.com/container.C2: "true"
+ topologyhints.resource-policy.nri.io/container.C2: "true"
Topology hint generation is globally enabled by default. Therefore, using the
@@ -336,8 +336,8 @@ begin with. Cold start is configured like this in the pod metadata:
- memory-type.cri-resource-manager.intel.com/container.container1: dram,pmem
- cold-start.cri-resource-manager.intel.com/container.container1: |
+ memory-type.resource-policy.nri.io/container.container1: dram,pmem
+ cold-start.resource-policy.nri.io/container.container1: |
duration: 60s
@@ -348,9 +348,9 @@ future release:
- cri-resource-manager.intel.com/memory-type: |
+ resource-policy.nri.io/memory-type: |
container1: dram,pmem
- cri-resource-manager.intel.com/cold-start: |
+ resource-policy.nri.io/cold-start: |
duration: 60s
@@ -386,7 +386,7 @@ every two seconds from DRAM to PMEM.
## Container memory requests and limits
-Due to inaccuracies in how `cri-resmgr` calculates memory requests for
+Due to inaccuracies in how `nri-resource-policy` calculates memory requests for
pods in QoS class `Burstable`, you should either use `Limit` for setting
the amount of memory for containers in `Burstable` pods to provide `cri-resmgr`
with an exact copy of the resource requirements from the Pod Spec as an extra
@@ -427,6 +427,6 @@ For example:
- prefer-reserved-cpus.cri-resource-manager.intel.com/pod: "true"
- prefer-reserved-cpus.cri-resource-manager.intel.com/container.special: "false"
+ prefer-reserved-cpus.resource-policy.nri.io/pod: "true"
+ prefer-reserved-cpus.resource-policy.nri.io/container.special: "false"
diff --git a/docs/resource-policy/quick-start.md b/docs/resource-policy/quick-start.md
new file mode 100644
index 000000000..254007682
--- /dev/null
+++ b/docs/resource-policy/quick-start.md
@@ -0,0 +1,85 @@
+# Quick-start
+The following describes the minimum number of steps to get started with the NRI
+Resource Policy plugin.
+## Pre-requisites
+- containerd or cri-o container runtime installed and running, with the
+ NRI feature enabled.
+- kubelet installed on your nodes
+Note that both containerd and cri-o must have NRI support enabled.
+For containerd, NRI is currently only available in release 1.7beta or later.
+For cri-o it is recommended to use version 1.26.0 or later.
+## Setup NRI Resource Policy Plugin
+First, compile the resource policy plugins and create the deployment images.
+git clone https://github.com/containers/nri-plugins
+cd nri-plugins && make && make images
+### Deploy Daemonset
+The build/ directory will contain the needed images and deployment
+files. Copy the plugin .yaml file and the corresponding image file to
+the node and deploy it there.
+For example:
+ls build/images
+ nri-resource-policy-balloons-deployment.yaml
+ nri-resource-policy-balloons-image-ed6fffe77071.tar
+ nri-resource-policy-topology-aware-deployment.yaml
+ nri-resource-policy-topology-aware-image-9797e8de7107.tar
+Copy the nri-resource-policy-topology-aware-deployment.yaml and the
+latest tar file generated by the `make images` command to the node.
+The following creates a fresh config file and backs up the old one if it exists:
+[ -f /etc/containerd/config.toml ] && cp /etc/containerd/config.toml /etc/containerd/config.toml.backup
+containerd config default > /etc/containerd/config.toml
+Edit the `/etc/containerd/config.toml` file, change `disable = true` to
+`disable = false` in the `plugins."io.containerd.nri.v1.nri"` section, and restart containerd.
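The manual edit can also be scripted. The following is a minimal sketch, not part of the project: `enable_nri` is a hypothetical helper name, and it assumes the generated config contains a `[plugins."io.containerd.nri.v1.nri"]` section header with a `disable = true` line before the next section header. It keeps a copy of the original file with an `.orig` suffix.

```shell
# Hypothetical helper: switch `disable = true` to `disable = false`, but only
# between the NRI plugin section header and the next section header.
enable_nri() {
    config="$1"
    # -i.orig edits the file in place and keeps the original as <file>.orig
    sed -i.orig \
        -e '/\[plugins\."io\.containerd\.nri\.v1\.nri"\]/,/^[[:space:]]*\[/ s/disable = true/disable = false/' \
        "$config"
}
```

It would be called as `enable_nri /etc/containerd/config.toml`, followed by a restart of containerd using whatever service manager the node uses.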
+If you are running cri-o, NRI can be enabled like this:
+mkdir -p /etc/crio/crio.conf.d
+cat > /etc/crio/crio.conf.d/10-enable-nri.conf <
` option.
+When using the agent, it is also possible to provide an initial fallback for
+configuration using the `--fallback-config `. This file is
+used before the very first configuration is successfully acquired from the agent.
+See the [Node Agent][agent] about how to set up and configure the agent.
+## Logging and debugging
+You can control logging with the klog command line options or by setting the
+corresponding environment variables. You can get the name of the environment
+variable for a command line option by prepending the `LOGGER_` prefix to the
+capitalized option name without any leading dashes. For instance, setting the
+environment variable `LOGGER_SKIP_HEADERS=true` has the same effect as using
+the `-skip_headers` command line option.
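The naming rule above can be sketched as a small shell function. This is only an illustration of the rule as described; `option_to_env` is a hypothetical helper, not something shipped with NRI Resource Policy.

```shell
# Derive the environment variable name for a command line logging option:
# strip leading dashes, capitalize, and prepend the LOGGER_ prefix.
option_to_env() {
    name="$1"
    # Strip any leading dashes.
    while [ "${name#-}" != "$name" ]; do
        name="${name#-}"
    done
    # Upper-case the option name and add the prefix.
    printf 'LOGGER_%s\n' "$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')"
}

option_to_env -skip_headers    # prints LOGGER_SKIP_HEADERS
```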
+Additionally, the `LOGGER_DEBUG` environment variable controls debug logs.
+These are globally disabled by default. You can turn on full debugging by
+setting `LOGGER_DEBUG='*'`.
+When using environment variables, be careful which configuration you pass to
+NRI Resource Policy using a file or ConfigMap. The environment is treated
+as default configuration but a file or a ConfigMap has higher precedence.
+If something is configured in both, the environment will only be in effect
+until the configuration is applied. However, in such a case if you later
+push an updated configuration to NRI Resource Policy with the overlapping
+settings removed, the original ones from the environment will be in effect again.
+For debug logs, the settings from the configuration are applied in addition
+to any settings in the environment. That said, if you turn something on in
+the environment but off in the configuration, it will be turned off.
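For settings other than debug logs, the precedence between the environment and an applied configuration can be pictured with the sketch below. It is a simplified model of the behaviour described in this section, not the actual implementation, and `effective_setting` is a hypothetical name.

```shell
# Hypothetical model of the precedence rule: a value from the applied
# configuration (file or ConfigMap) wins; the environment is only the default.
effective_setting() {
    env_value="$1"      # value taken from the environment, may be empty
    config_value="$2"   # value from the applied configuration, may be empty
    printf '%s\n' "${config_value:-$env_value}"
}

effective_setting true false   # configured in both: the configuration wins
effective_setting true ""      # removed from the configuration: environment applies
```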
+[agent]: node-agent.md
diff --git a/docs/security.md b/docs/security.md
deleted file mode 100644
index 74e4a096f..000000000
--- a/docs/security.md
+++ /dev/null
@@ -1,4 +0,0 @@
-# Reporting a Potential Security Vulnerability
-Please visit [intel.com/security](https://intel.com/security) to report
-security issues.
diff --git a/docs/setup.md b/docs/setup.md
deleted file mode 100644
index 3855d66e5..000000000
--- a/docs/setup.md
+++ /dev/null
@@ -1,210 +0,0 @@
-# Setup and Usage
-If you want to give CRI Resource Manager a try, here is the list of things
-you need to do, assuming you already have a Kubernetes\* cluster up and
-running, using either `containerd` or `cri-o` as the runtime.
- 0. [Install](installation.md) CRI Resource Manager.
- 1. Set up kubelet to use CRI Resource Manager as the runtime.
- 2. Set up CRI Resource Manager to use the runtime with a policy.
-For kubelet you do this by altering its command line options like this:
- kubelet --container-runtime=remote \
- --container-runtime-endpoint=unix:///var/run/cri-resmgr/cri-resmgr.sock
-For CRI Resource Manager, you need to provide a configuration file, and also
-a socket path if you don't use `containerd` or you run it with a different
-socket path.
- # for containerd with default socket path
- cri-resmgr --force-config --runtime-socket unix:///var/run/containerd/containerd.sock
- # for cri-o
- cri-resmgr --force-config --runtime-socket unix:///var/run/crio/crio.sock
-The choice of policy to use along with any potential parameters specific to
-that policy are taken from the configuration file. You can take a look at the
-[sample configurations](/sample-configs) for some minimal/trivial examples.
-For instance, you can use
-as `` to activate the topology aware policy with memory
-tiering support.
-**NOTE**: Currently, the available policies are a work in progress.
-## Setting up kubelet to use CRI Resource Manager as the runtime
-To let CRI Resource Manager act as a proxy between kubelet and the CRI
-runtime, you need to configure kubelet to connect to CRI Resource Manager
-instead of the runtime. You do this by passing extra command line options to
-kubelet as shown below:
- kubelet --container-runtime=remote \
- --container-runtime-endpoint=unix:///var/run/cri-resmgr/cri-resmgr.sock
-## Setting up CRI Resource Manager
-Setting up CRI Resource Manager involves pointing it to your runtime and
-providing it with a configuration. Pointing to the runtime is done using
-the `--runtime-socket ` and, optionally, the `--image-socket `.
-For providing a configuration there are two options:
- 1. use a local configuration YAML file
- 2. use the [CRI Resource Manager Node Agent][agent] and a `ConfigMap`
-The former is easier to set up and it is also the preferred way to run CRI
-Resource Manager for development, and in some cases testing. Setting up the
-latter is a bit more involved but it allows you to
- - manage policy configuration for your cluster as a single source, and
- - dynamically update that configuration
-### Using a local configuration from a file
-This is the easiest way to run CRI Resource Manager for development or
-testing. You can do it with the following command:
- cri-resmgr --force-config --runtime-socket
-When started this way, CRI Resource Manager reads its configuration from the
-given file. It does not fetch external configuration from the node agent and
-also disables the config interface for receiving configuration updates.
-### Using CRI Resource Manager Agent and a ConfigMap
-This setup requires an extra component, the
-[CRI Resource Manager Node Agent][agent],
-to monitor and fetch configuration from the ConfigMap and pass it on to CRI
-Resource Manager. By default, CRI Resource Manager automatically tries to
-use the agent to acquire configuration, unless you override this by forcing
-a static local configuration using the `--force-config ` option.
-When using the agent, it is also possible to provide an initial fallback for
-configuration using the `--fallback-config `. This file is
-used before the very first configuration is successfully acquired from the
-Whenever a new configuration is acquired from the agent and successfully
-taken into use, this configuration is stored in the cache and becomes
-the default configuration to take into use the next time CRI Resource
-Manager is restarted (unless that time the --force-config option is used).
-While CRI Resource Manager is shut down, any cached configuration can be
-cleared from the cache using the --reset-config command line option.
-See the [Node Agent][agent] about how to set up and configure the agent.
-### Changing the active policy
-Currently, CRI Resource Manager disables changing the active policy using
-the [agent][agent]. That is, once the active policy is recorded in the cache,
-any configuration received through the agent that requests a different policy
-is rejected. This limitation will be removed in a future version of
-CRI Resource Manager.
-However, by default CRI Resource Manager allows you to change policies during
-its startup phase. If you want to disable this, you can pass the command line
-option `--disable-policy-switch` to CRI Resource Manager.
-If you run CRI Resource Manager with disabled policy switching, you can still
-switch policies by clearing any policy-specific data stored in the cache while
-CRI Resource Manager is shut down. You can do this by using the command line
-option `--reset-policy`. The whole sequence of switching policies this way is
- - stop cri-resmgr (`systemctl stop cri-resource-manager`)
- - reset policy data (`cri-resmgr --reset-policy`)
- - change policy (`$EDITOR /etc/cri-resource-manager/fallback.cfg`)
- - start cri-resmgr (`systemctl start cri-resource-manager`)
-## Kata Containers
-[Kata Containers](https://katacontainers.io/) is an open source container
-runtime, building lightweight virtual machines that seamlessly plug into the
-containers ecosystem.
-In order to enable Kata Containers in a Kubernetes-CRI-RM stack, both
-Kubernetes and the Container Runtime need to be aware of the new runtime
- * The Container Runtime can only be CRI-O or containerd, and needs to
- have the runtimes enabled in their configuration files.
- * Kubernetes must be made aware of the CRI-O/containerd runtimes via a
- "RuntimeClass"
- [resource](https://kubernetes.io/docs/concepts/containers/runtime-class/)
-After these prerequisites are satisfied, the configuration file for the
-target Kata Container, must have the flag "SandboxCgroupOnly" set to true.
-As of Kata 2.0 this is the only way Kata Containers can work with the
-Kubernetes cgroup naming conventions.
- ```toml
- ...
- # If enabled, the runtime will add all the kata processes inside one dedicated cgroup.
- # The container cgroups in the host are not created, just one single cgroup per sandbox.
- # The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
- # The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
- # The sandbox cgroup is constrained if there is no container type annotation.
- # See: https://godoc.org/github.com/kata-containers/runtime/virtcontainers#ContainerType
- sandbox_cgroup_only=true
- ...
- ```
-### Reference
-If you have a pre-existing Kubernetes cluster, for an easy deployement
-follow this [document](https://github.com/kata-containers/packaging/blob/master/kata-deploy/README.md#kubernetes-quick-start).
-Starting from scratch:
- * [Kata installation guide](https://github.com/kata-containers/kata-containers/tree/2.0-dev/docs/install#manual-installation)
- * [Kata Containers + CRI-O](https://github.com/kata-containers/documentation/blob/master/how-to/run-kata-with-k8s.md)
- * [Kata Containers + containerd](https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md)
- * [Kubernetes Runtime Class](https://kubernetes.io/docs/concepts/containers/runtime-class/)
- * [Cgroup and Kata containers](https://github.com/kata-containers/kata-containers/blob/stable-2.0.0/docs/design/host-cgroups.md)
-## Running with Untested Runtimes
-CRI Resource Manager is tested with `containerd` and `CRI-O`. If any other runtime is
-detected during startup, `cri-resmgr` will refuse to start. This default behavior can
-be changed using the `--allow-untested-runtimes` command line option.
-## Logging and debugging
-You can control logging with the klog command line options or by setting the
-corresponding environment variables. You can get the name of the environment
-variable for a command line option by prepending the `LOGGER_` prefix to the
-capitalized option name without any leading dashes. For instance, setting the
-environment variable `LOGGER_SKIP_HEADERS=true` has the same effect as using
-the `-skip_headers` command line option.
-Additionally, the `LOGGER_DEBUG` environment variable controls debug logs.
-These are globally disabled by default. You can turn on full debugging by
-setting `LOGGER_DEBUG='*'`.
-When using environment variables, be careful which configuration you pass to
-CRI Resource Manager using a file or ConfigMap. The environment is treated
-as default configuration but a file or a ConfigMap has higher precedence.
-If something is configured in both, the environment will only be in effect
-until the configuration is applied. However, in such a case if you later
-push an updated configuration to CRI Resource Manager with the overlapping
-settings removed, the original ones from the environment will be in effect
-For debug logs, the settings from the configuration are applied in addition
-to any settings in the environment. That said, if you turn something on in
-the environment but off in the configuration, it will be turned off
-[agent]: node-agent.md