Releases: intel/cri-resource-manager

v0.5.0: Improved policies, bug fixes, better test coverage.

23 Feb 09:16
64fe33f

This release brings general stability and correctness improvements. It merges the memory tiering policy into
the original topology-aware policy and includes a number of important fixes for resource accounting and assignment.

Major Changes

  • policies:

    • Add new podpools policy for pod-granularity workload placement
    • topology-aware: merge topology-aware and memory tiering policies
    • topology-aware: honor CPU reservation/reserved CPU set in configuration
    • topology-aware: unify syntax for per container and pod annotated preferences
  • RDT:

    • split out RDT manipulation code to a self-contained package, https://github.com/intel/goresctrl
    • implement operating modes (Disabled, Discovery, Full)
    • add option to disable RDT monitoring
    • support L2 cache allocation
  • CPU allocator (used by topology-aware and podpools policies):

    • detect CPU priority levels with Intel Speed Select Technology (SST)

Bug Fixes

  • policies:

    • topology-aware: several significant CPU and memory accounting fixes
    • topology-aware: fixes in gradually relaxed memory pinning for OOM-prevention
    • topology-aware: better handling of bounding and reserved resources
    • topology-aware: fix assignment of CPU-less memory zones
    • topology-aware: fix building sparse topology trees
  • RDT:

    • use root class as a fallback for missing classes
    • empty class implies root class
    • do forceful RDT (re-)configuration
  • resource-manager:

    • force full reallocation when switching policies
    • run post-update hooks after reconfiguration
    • save cache at startup
  • config:

    • handle composite structs in Module.validate()
  • cache:

    • (over)write cache file atomically
  • testing:

    • e2e: fix clearing cri-resmgr cache on uninstall
    • e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
  • documentation:

    • fix static-pools debug logging instructions
    • sample-configs: sample configuration fixes

Other Improvements

  • policies:

    • topology-aware: more regular annotation interpretation for CPU allocation preferences
  • resource-manager:

    • dump extra data for message disambiguation
    • flush logs after every request/event processed
  • cache:

    • log name on pod/container removal
  • cri-resmgr:

    • increase allowed service journal log bursts
  • logging:

    • switch logger to use klog
  • testing:

    • e2e: add tests for memset expansion in topology-aware policy
    • e2e: add vm-put-docker-image to the vm library
    • e2e: allow user override for VM_SSH_USER over distro-ssh-user
    • e2e: generalize templating any file with instantiate()
    • e2e: set imagePullPolicy on every test pod
    • e2e: support namespaced kubectl create from templates
    • e2e: unified memory-type and cold-start annotation syntax
    • e2e: update dynamic page demotion tests
    • e2e: update podpools tests to pass with new cpuallocator
    • e2e: update tests on pinning reserved CPUs
    • benchmark: add memtier_benchmark for memcached/redis
  • documentation:

    • improve RDT documentation

List of Merged PRs

  • PR #528: build: include only cri-resmgr in binary dist tarballs
  • PR #529: docs: fix static-pools debug logging instructions
  • PR #530: memtier/c4pmem4/test03-coldstart: don't jump the gun
  • PR #536: .github: update issue template for new releases
  • PR #537: docs: minor fixes in html template customization
  • PR #538: docs: use 'release branch' as the current version in versions menu
  • PR #540: e2e: support namespaced kubectl create from templates
  • PR #541: e2e: fix clearing cri-resmgr cache on uninstall
  • PR #542: e2e: generalize templating any file with instantiate()
  • PR #543: memtier: implement reserved CPUs pool
  • PR #545: resource-manager: run post-update hooks after reconfiguration
  • PR #546: go.mod: update to Kubernetes v1.19.4
  • PR #547: scripts: helper for maintaining replace lines in go.mod
  • PR #549: benchmark: add memtier_benchmark for memcached/redis
  • PR #550: test/functional: prevent read/write data race in klog
  • PR #553: docs: quote text containing '<' and '>' using `` in affinity docs
  • PR #555: scripts/update-gh-pages: more intelligent http redirect
  • PR #556: e2e: allow user override for VM_SSH_USER over distro-ssh-user
  • PR #557: Improve CPU prioritization
  • PR #560: e2e: add vm-put-docker-image to the vm library
  • PR #561: memtier: rework building of pool tree by HW topology
  • PR #562: docs: improve rdt documentation
  • PR #563: memtier/pool test: fix fd leakage causing test panics with more data
  • PR #566: Kata container support
  • PR #567: config: handle composite structs in Module.validate()
  • PR #568: control/rdt: add option to disable rdt monitoring
  • PR #570: page-migrate: add cache-like container.GetPodID()
  • PR #571: config: fix typo in log message
  • PR #572: control/rdt: fix and simplify handling of implicit disabling
  • PR #573: control/rdt: empty class implies root class
  • PR #574: control/rdt: implement assignAll()
  • PR #575: control/rdt: do forceful rdt (re-)configuration
  • PR #576: control/rdt: correct usage of checkIdle() in configNotify()
  • PR #577: control/rdt: implement operating modes
  • PR #579: memtier: don't imply error by signature for functions that never fail
  • PR #580: docs: use an explicit version of recommonmark
  • PR #581: rdt: accept missing default classes in Discovery mode
  • PR #583: docs: refer to the latest release in the installation instructions
  • PR #586: rdt: use root class as a fallback to missing classes
  • PR #587: e2e: set imagePullPolicy on every test pod
  • PR #588: memtier: unify syntax for annotated preferences
  • PR #589: memtier: fix build error introduced by improper, unrebased merging of both #524 and #543
  • PR #590: memtier: more regular annotation interpretation for CPU allocation preferences
  • PR #591: fix: nil pointer dereference on updateSharedAllocations(nil)
  • PR #592: e2e: unified memory-type and cold-start annotation syntax
  • PR #594: policy/builtin/*: fix outdated comment about PolicyName
  • PR #595: docs: recognize/handle .md-links to element IDs
  • PR #596: server,resource-manager: flush logs after every request/event processed
  • PR #597: resource-manager: rename 'memtier' policy to 'topology-aware'
  • PR #598: podpools: policy for pod-granularity workload placement
  • PR #599: rdt: fix order of params passed to GetTasksInContainer()
  • PR #600: test: drop stale rdt testdata
  • PR #601: topology-aware: improved topology tree/node dump
  • PR #602: cpuallocator: add CPU priority levels
  • PR #604: e2e: properly set VM_COMPOSE_YAML when reloading existing vm-configs
  • PR #606: Extended detection of Intel Speed Selection Technology (SST)
  • PR #607: klog: skip headers for journald by default
  • PR #608: cri-resmgr: increase allowed service journal log bursts
  • PR #609: fixes: topology-aware policy cpu/memory accounting fixes
  • PR #610: resource-manager: force full reallocation when switching policies
  • PR #612: topology-aware: force reserved/kube-system containers to the root
  • PR #613: e2e: add tests for memset expansion in topology-aware policy
  • PR #614: resource-manager,dump: dump extra data for message disambiguation
  • PR #615: topology-aware: better and more readable logs
  • PR #616: topology-aware: memory accounting and memset expansion fixes
  • PR #617: resource-manager: catch containers earlier when they are gone
  • PR #618: e2e: update podpools tests to pass with new cpuallocator
  • PR #622: topology-aware: use normal as fallback for reserved
  • PR #623: e2e: update tests on pinning reserved CPUs
  • PR #624: topology-aware: use prettyMem() in log messages
  • PR #625: cache: (over)write cache file atomically
  • PR #626: resource-manager: save cache at startup
  • PR #627: cache: log name on pod/container removal
  • PR #628: rdt: support L2 cache allocation
  • PR #629: topology-aware: fix filtering out nodes with insufficient memory
  • PR #630: topology-aware: fix moving up memory grant
  • PR #631: pkg/sysfs: clarifying comment on getCPUMapping()
  • PR #632: e2e: update dynamic page demotion tests
  • PR #634: sample-configs: make cri-resmgr-configmap.example.yaml usable
  • PR #636: podpools: fix reflect JSON tag typo

v0.4.1: Improved documentation, end-to-end testing, bug fixes.

30 Nov 11:56
4e26f8c

The documentation in this release has been overhauled, with significant structural improvements and additional content compared to previous releases. End-to-end test coverage has been vastly extended and the test framework significantly improved. This release also contains a number of important bug fixes and a few other functional improvements. The following list is non-exhaustive.

Bug fixes

  • agent:
    • refuse to start if NODE_NAME environment variable is not specified
  • memtier policy:
    • fix updating containers after shared pool changes
    • honor CPU isolation opt-out preference
    • honor allowed CPUs in resource discovery
    • fix PMEM-only NUMA node assignment for weird topologies
  • static-pools policy:
    • make dynamic (re-)configuration work properly
    • look for cmk isolate when parsing container command line
    • re-load legacy config on config update
    • only take pools configuration from legacy config
    • improved sanity check on pool configuration
    • fix node tainting
  • cri-resmgr:
    • fill in defaults for unspecified values in configuration

Other Improvements

  • cri-resmgr:
    • dump outbound requests if debugging is enabled for the 'cri/relay' source
  • resource controllers:
    • page-migrate: split out page-migration into a controller of its own
  • e2e test framework:
    • vastly improved test coverage on multiple distros
  • builds:
    • build binary dist tarballs

Differences from Rolling Master

With the exception of the PRs listed below, all PRs in the inclusive range #411 - #527 have been cherry-picked or back-ported from the rolling master branch to this release. The omitted PRs have been excluded for backwards compatibility or similar reasons:

  • #525: cri-resmgr: reuse 'rdt' logger for the split out rdt package
  • #490: rdt: use goresctrl 
  • #497: pkg/log: switch logger to use klog
  • #472: e2e: add tests for static-pools
  • #489: static-pools: slight refactoring and renaming
  • #483: static-pools: lazier node updates
  • #475: static-pools: drop all cmdline flags

v0.4.0: Improved support for Memory Tiering, Binary packages

24 Sep 14:19
@kad
15878a4

Major changes

  • 'topology-aware' policy superseded by 'memtier'
  • support for cold start of containers
  • support for dynamic demotion of memory
  • support for limiting container top tier/DRAM memory usage (requires kernel support)
  • support for externally adjusting container resource assignments
  • multi-die aware resource allocation
  • binary distribution with packages for popular Linux distributions and images at Docker Hub

Detailed changelog

Policies

  • 'topology-aware' policy superseded by 'memtier', which
    • is a forked and improved version of 'topology-aware'
    • has the same basic functionality
    • has a number of improvements and extra functionality:
      • multi-die topology support
      • multi-tier (DRAM/PMEM) memory support
      • top tier/DRAM memory limiting
      • container 'cold start' support: force containers initially exclusively to PMEM
      • experimental dynamic page demotion: periodically move least-used pages from DRAM to PMEM
      • experimental support for dynamic external adjustments to container resource assignments
    • has a bunch of resource assignment/allocation fixes (which are not backported to 'topology-aware' any more)
    • will in the next release replace 'topology-aware' altogether
  • static-pools:
    • compatibility back-ports from CMK: advertise CPUs in 'shared', 'infra' pools via CMK_CPUS_SHARED, CMK_CPUS_INFRA environment variables
  • common:
    • support for new Pod annotation controls:
      • opt out from automatic topology hint generation:
        • topologyhints.cri-resource-manager.intel.com/pod: false
        • topologyhints.cri-resource-manager.intel.com/container.$name: false
      • set DRAM/top tier memory limit:
        • toptierlimit.cri-resource-manager.intel.com/pod: $limit
        • toptierlimit.cri-resource-manager.intel.com/container.$name: $limit
    • make simple container affinities always implicitly symmetric
    • limit user-defined container affinity to [-1000,1000]
    • re-trigger pod cgroupfs parent directory and QoS class discovery if necessary
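
The pod- and container-scoped annotation keys listed above follow a common pattern: `<feature>.cri-resource-manager.intel.com/pod` or `<feature>.cri-resource-manager.intel.com/container.$name`. The Go sketch below only illustrates how such keys are composed. The helper `annotationKey`, the container name `db`, and the limit value `2G` are hypothetical; these notes do not specify the format of `$limit`.

```go
package main

import "fmt"

// Annotation key domain as it appears in the release notes.
const domain = "cri-resource-manager.intel.com"

// annotationKey builds "<feature>.<domain>/pod" when container is empty and
// "<feature>.<domain>/container.<name>" otherwise.
// Hypothetical helper for illustration only.
func annotationKey(feature, container string) string {
	if container == "" {
		return feature + "." + domain + "/pod"
	}
	return feature + "." + domain + "/container." + container
}

func main() {
	// Opt the whole pod out of automatic topology hint generation.
	fmt.Printf("%s: %q\n", annotationKey("topologyhints", ""), "false")

	// Limit top tier/DRAM memory for one container. "db" and "2G" are
	// placeholders; the value format is an assumption.
	fmt.Printf("%s: %q\n", annotationKey("toptierlimit", "db"), "2G")
}
```

The resulting key/value pairs would go under `metadata.annotations` of the pod specification.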

Resource controllers

  • RDT:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
    • fix crash/misplaced logging of group deletion
  • Block I/O:
    • remove controller-level class name mapping
    • don't consider assignment to a default class an error if no classes are defined
  • CRI:
    • properly send out generated/queued UpdateContainerResources requests

Data collectors

  • cgroupstats:
    • use/report container IDs
    • fix hugetlb size parsing
  • avx:
    • switch to cilium/ebpf from iovisor/gobpf

cri-resmgr

  • new command line options:
    • reset cached configuration: --reset-config
    • reset cached policy data: --reset-policy
  • always set up node agent connection, even when running with --force-config
  • allow switching policies during startup, unless started with --disable-policy-switch

Packaging

  • install the sample fallback config as a fallback, not as the real config file
  • use /etc/default for defaults on Debian-based distros
  • support Ubuntu 20.04, openSUSE 15.2

Documentation

Testing

  • end-to-end test framework added

v0.3.1: Packaging and build fixes

16 Jun 11:13
52d7160
Pre-release

This v0.3.1 patch release adds packaging and build fixes on top of the v0.3.0 release.

Changes:

  • feature: add command line options for resetting the active policy in the cache and allow this to happen automatically during startup if necessary
  • fix: make the NUMA CPU/memory attachment detection code work with older kernels
  • fix: move from gobpf to Cilium-based AVX eBPF implementation to address build issues on older kernels
  • fix: add targets for containerized cross-builds for distro packages

v0.3.0: Memory management improvements

20 May 13:49
@kad
e8861cf
Pre-release
  • added memory-tiering policy: topology-aware policy with support for DRAM, PMEM (Intel Optane DC) and HBM (High Bandwidth Memory) allocation
  • added blockio controller: class-based control over block I/O using the cgroupfs blkio controller
  • added support for metrics collection:
    • collection of raw metrics data, exporting to Prometheus
    • AVX512 usage: collect per container AVX512 instruction usage, tag containers accordingly
  • rdt controller improvements: disjoint partitioning, L3 and memory bandwidth monitoring, and Intel RDT metrics
  • new annotations:
    • assign full pod or a container to block I/O or RDT class:
      • rdtclass.cri-resource-manager.intel.com/container.$container: class-name
      • rdtclass.cri-resource-manager.intel.com/pod: class-name
      • blockioclass.cri-resource-manager.intel.com/container.$container: class-name
      • blockioclass.cri-resource-manager.intel.com/pod: class-name
    • memtier policy preference for type of memory allocated to a container:
      • memory-type.cri-resource-manager.intel.com:
        $container: [dram,][pmem,][hbm]
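
As a rough illustration of the annotations above, the Go sketch below composes class annotation keys and a memory-type annotation value. It assumes, based only on the layout shown in these notes, that the memory-type value is a list of `container: types` lines with a comma-separated subset of `dram`, `pmem` and `hbm`. The helper names, container names and class names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Memory types named in the release notes.
var knownTypes = map[string]bool{"dram": true, "pmem": true, "hbm": true}

// classAnnotationKey builds the key for assigning a pod (empty container) or
// a single container to an "rdt" or "blockio" class. Hypothetical helper.
func classAnnotationKey(controller, container string) string {
	prefix := controller + "class.cri-resource-manager.intel.com"
	if container == "" {
		return prefix + "/pod"
	}
	return prefix + "/container." + container
}

// memoryTypeValue renders per-container memory type preferences as
// "container: type[,type...]" lines, sorted by container name.
// The value layout is an assumption based on the notes above.
func memoryTypeValue(prefs map[string][]string) (string, error) {
	names := make([]string, 0, len(prefs))
	for name := range prefs {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		types := prefs[name]
		if len(types) == 0 {
			return "", fmt.Errorf("container %q: no memory types given", name)
		}
		for _, t := range types {
			if !knownTypes[t] {
				return "", fmt.Errorf("container %q: unknown memory type %q", name, t)
			}
		}
		fmt.Fprintf(&b, "%s: %s\n", name, strings.Join(types, ","))
	}
	return b.String(), nil
}

func main() {
	// "web", "cache" and "gold" are placeholder names.
	fmt.Println(classAnnotationKey("rdt", "web") + ": gold")

	value, err := memoryTypeValue(map[string][]string{
		"web":   {"dram"},
		"cache": {"dram", "pmem"},
	})
	if err != nil {
		fmt.Println("invalid preferences:", err)
		return
	}
	fmt.Print("memory-type.cri-resource-manager.intel.com:\n" + value)
}
```

Validating the type names up front keeps typos from reaching the pod specification, where they would only surface at container creation time.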

v0.2.0: More generic runtime configuration handling.

20 Nov 12:28
8965a19

Implement a more general, unified mechanism for handling runtime configuration.

v0.1.0: First publicly available release

07 Nov 08:58
@kad
677f234

Initial release for the project with major functionality available in alpha state.

Note: this is a pre-production alpha release. Not for production use!