Replace sample inventory by auto-generated variable references from defaults #11897

VannTen · 2025-01-16T14:29:09Z

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

Remove sample inventory
Generate variables reference from all defaults files

The two mains reasons to do this are:

the sample inventory is a pain to work with when making defaults changes / changing structure (see Adapt checksums and versions to new hashes updater #11890, Drop explicit k8s_cluster group in CI inventory #11858 (comment)). It tends to mask broken defaults
it encourages bad usage : see discussion in Add description of how to manage only Kubespray custom configuration in a separate git repository #10645 and Ansible variables: reorganize defaults, documentation and inventory samples #10697

Which issue(s) this PR fixes:
Fixes #11876

Special notes for your reviewer:

/label ci-extended
@MrFreezeex
@ant31

Does this PR introduce a user-facing change?:

The sample inventory is removed

k8s-ci-robot · 2025-01-16T14:29:13Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot · 2025-01-16T14:29:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: VannTen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [VannTen]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

VannTen · 2025-01-16T14:29:32Z

/ok-to-test

This is an alternative to the sample inventory, which does not need manual updating.

Sometimes the change done by pre-commit are not obvious, this should help.

ant31 · 2025-01-17T11:02:39Z

It makes maintenance easier (no need to sync variables back to the samples) but not for users.
From 'nicely' separated variables by domain (auth / network / gpu...) we get a single ~7000 lines file reference.
It removes the quick "cp -r samples", tweak two vars and ready to go.

The sample was here to show-up the variables that users most likely want to edit (e.g cluster version, oidc...) and not all of them for an easier on boarding / maintenance.

About #10645

Even if some users are smart and store in a custom git repository files only from the inventory/mycluster directory, there are still a lot of files!

How is it an issue to commit ~20 files ? Why is it better to commit 2? The workflow stays the same (commit/push/pull etc...)
Also I would rather have 20 files with 20 vars in them well organized, than one with 400.

ant31 · 2025-01-17T11:12:44Z

To balance both side (show users subset of things to 'edit' and easier maintenance) something like:

generate-inventory myclusterA --cloud gcp --network cilium --with-gpu --with-oidc --container=containerd

and would create something like:

.myclusterA/group_vars/k8s_cluster/k8s-cluster.yml
.myclusterA/group_vars/k8s_cluster/addons.yml
.myclusterA/group_vars/k8s_cluster/k8s-net-cilium.yml
.myclusterA/group_vars/etcd.yml
.myclusterA/group_vars/all
.myclusterA/group_vars/all/gcp.yml
.myclusterA/group_vars/all/containerd.yml
.myclusterA/group_vars/all/etcd.yml
.myclusterA/group_vars/all/gpu.yml
.myclusterA/group_vars/all/all.yml
.myclusterA/inventory.ini

With all lines 'commented' to show that those are already set in default and can be removed if needed

Later it can help also existing users during upgrades as we have would have more control on var inside inventories:

upgrade-inventory myclusterA

VannTen · 2025-01-17T13:03:46Z

It removes the quick "cp -r samples", tweak two vars and ready to go.

Yes, that's the point, you still should only need to add two vars, for all the rest, kubespray defaults should handle it.
Instead of editing the file you're just creating it.

The sample was here to show-up the variables that users most likely want to edit (e.g cluster version, oidc...) and not all of them for an easier on boarding / maintenance.

There is lots of stuff in the inventory/sample, and the majority is setup specific (cloud provider / container engine / network plugin). So while it certainly started that way, IMO is does not reflect the current state.

$ find inventory/sample/ -type f -exec wc -l {} \;
139 inventory/sample/group_vars/all/all.yml
9 inventory/sample/group_vars/all/aws.yml
40 inventory/sample/group_vars/all/azure.yml
59 inventory/sample/group_vars/all/containerd.yml
2 inventory/sample/group_vars/all/coreos.yml
9 inventory/sample/group_vars/all/cri-o.yml
59 inventory/sample/group_vars/all/docker.yml
16 inventory/sample/group_vars/all/etcd.yml
10 inventory/sample/group_vars/all/gcp.yml
22 inventory/sample/group_vars/all/hcloud.yml
17 inventory/sample/group_vars/all/huaweicloud.yml
55 inventory/sample/group_vars/all/oci.yml
116 inventory/sample/group_vars/all/offline.yml
69 inventory/sample/group_vars/all/openstack.yml
24 inventory/sample/group_vars/all/upcloud.yml
32 inventory/sample/group_vars/all/vsphere.yml
38 inventory/sample/group_vars/etcd.yml
283 inventory/sample/group_vars/k8s_cluster/addons.yml
386 inventory/sample/group_vars/k8s_cluster/k8s-cluster.yml
132 inventory/sample/group_vars/k8s_cluster/k8s-net-calico.yml
380 inventory/sample/group_vars/k8s_cluster/k8s-net-cilium.yml
51 inventory/sample/group_vars/k8s_cluster/k8s-net-custom-cni.yml
18 inventory/sample/group_vars/k8s_cluster/k8s-net-flannel.yml
63 inventory/sample/group_vars/k8s_cluster/k8s-net-kube-ovn.yml
73 inventory/sample/group_vars/k8s_cluster/k8s-net-kube-router.yml
6 inventory/sample/group_vars/k8s_cluster/k8s-net-macvlan.yml
64 inventory/sample/group_vars/k8s_cluster/k8s-net-weave.yml
20 inventory/sample/inventory.ini

About #10645

Even if some users are smart and store in a custom git repository files only from the inventory/mycluster directory, there are still a lot of files!

How is it an issue to commit ~20 files ? Why is it better to commit 2? The workflow stays the same (commit/push/pull etc...) Also I would rather have 20 files with 20 vars in them well organized, than one with 400.

The issue is not having 20 files, it's having more than you need and not knowing if you need it.

If you copy the sample inventory, unless you're recording everything in your commit message (which most people don't, and most people don't look at git history either), 6 months after, you won't remember if the variable you have in your inventory is there because you put it deliberately, or because it was in the sample inventory.
(I'm speaking of experience here. When I first started working on the clusters I'm currently managing, the sample inventory was copied for each cluster, and it was a little nightmare to figure out what was needed and what wasn't

generate-inventory
update-inventory

I think this would be very hard to maintain effectively, especially update-inventory (unless we only support unmodified inventory, but franky that would make it a bit useless). How would we ensure we distinguishes between just a copied default and an intentional choices ?

ant31 · 2025-01-17T14:11:18Z

There is lots of stuff in the inventory/sample, and the majority is setup specific (cloud provider / container engine / network plugin). So while it certainly started that way, IMO is does not reflect the current state.

Then I would rather go in that direction.

The issue is not having 20 files, it's having more than you need and not knowing if you need it.

Then it goes in the wrong direction, as a new user you would have no idea what you need what you don't need because we won't cherry pick few of them anymore (can argue there are too many already). They would just have a list of the 7000 knobs they can tweak.

If you copy the sample inventory, unless you're recording everything in your commit message (which most people don't, and most people don't look at git history either), 6 months after, you won't remember if the variable you have in your inventory is there because you put it deliberately, or because it was in the sample inventory.

Any non-commented value is intentional.

I think this would be very hard to maintain effectively, especially update-inventory (unless we only support unmodified inventory, but franky that would make it a bit useless). How would we ensure we distinguishes between just a copied default and an intentional choices ?

If it's set it's intentional.

Another side effect is that changing a default in a role could break clusters, by copying a subset of default in the inventory, that the user "should" have reviewed it freeze the user configuration to a certain state.

Let's keep providing a curated sample inventory (generated or not), I agree the current one is not optimal and too large, it's still better than none.

ant31 · 2025-01-17T14:14:04Z

.ansible-lint

@@ -37,5 +37,6 @@ exclude_paths:
  - tests/files/custom_cni/cilium.yaml
  - venv
  - .github
+  - docs/ansible/variable_reference.yml


let's keep grouping the variables by domain in multiple files.
then let's have people the cherry pick/copy past what they need:

cp docs/ansible/variables/cloud/gcp.yml inventory/mycluster/group_vars/
cp docs/ansible/variables/network/calico.yml inventory/mycluster/group_vars/

Like, one file per role defaults ? That's mostly redundant with the defaults file themselves, no ?
Especially if we recreate the hierarchy.

it's already redundant with roles.

not all per role but some hierarchy especially when configuration are exclusive:

calico vs weave vs cilium...

crio vs docker vs ....

gcp vs azure ...

and then maybe some hierarchy but not down to the role:

kubernetes/vars.yml (all cluster configs)

apps/vars.yaml (all apps)

but can comes later too. need a bit more though on it.

ant31 · 2025-01-17T14:15:07Z

contrib/network-storage/heketi/README.md

@@ -12,16 +12,16 @@ Heketi provides a CLI that provides users with a means to administer the deploym

 ## Install

-Copy the inventory.yml.sample over to inventory/sample/k8s_heketi_inventory.yml and change it according to your setup.
+Copy the inventory.yml.sample over to `<inventory_path>/k8s_heketi_inventory.yml` and change it according to your setup.


https://github.com/heketi/heketi is abandoned let's drop it in another PR.

VannTen · 2025-01-17T15:02:15Z

Then it goes in the wrong direction, as a new user you would have no idea what you need what you don't need because we won't cherry pick few of them anymore (can argue there are too many already). They would just have a list of the 7000 knobs they can tweak.

But you don't need to look at the reference. I think the playbook runs without any variable defined in inventory just fine (would need to check but I think anything we define in CI is optional).

> If you copy the sample inventory, unless you're recording everything in your commit message (which most people don't, and most people don't look at git history either), 6 months after, you won't remember if the variable you have in your inventory is there because you put it deliberately, or because it was in the sample inventory. Any non-commented value is intentional.

Unfortunately, it's not. Users are copying this directly, just look at the questions in slack ``` $ find inventory/sample/ -type f -exec cat {} + | grep -E '^[a-z0-9_]+:' | wc -l 128 ``` At a guick glance, there is not a lot there that would need to be edited on a typical setup. (maybe LB stuff, not even sure). Besides, the commented values have a bad effect as well. If a user copied the sample inventory (with commented value and comments explaining it) in version N of kubespray, and now they upgraded a couple of times and are in N+3, with the same inventory. Suppose we changed/deprecated a variable `variable_a`. `variable_a` will still be in the user inventory, commented out. If they try to change it (add oidc aauth for instance), or copy the sample to a new cluster, they'll have outdated documentation, or even a variable which Kubespray won't consider at all. Docs should be kept in sync with Kubespray itself, and we can't do that if they're in the users inventories.

Another side effect is that changing a default in a role could break clusters, by copying a subset of default in the inventory, that the user "should" have reviewed it freeze the user configuration to a certain state.

But not all do, judging from the questions we get in slack. (And I don't think a majority do either) We can't rely on this to avoid breaking changes breaking stuff, because we can't guarantee user inventories anyway.

VannTen · 2025-01-17T15:18:06Z

I've linked this PR in the slack channel in the hopes we can have some users join the discussion, so we can have more info on how the sample inventory is used.

Nathanael-Mtd · 2025-01-20T08:59:13Z

I understand that sample is complex and a pain to maintain, I'm ok with that removal for that reason.
But on my side, when I started Kubernetes 4 years ago, it helped me SO MUCH to understand Kubernetes settings and features and was a great resource to start deploying a cluster.

Also I use it to upgrade my clusters by comparing my inventory folder with the sample one.

I don't know how new users will start using Kubespray after that removal, and I'm affraid they will probably find another solutions to deploy clusters if they don't have a tool to generate inventory files with important default values.

vdveldet · 2025-01-20T10:58:29Z

Jumped away from the samples inventory a long time ago and retain only the basics in 1 file.
I always use the container version as runtime (so no issues with versions and dependency as such) and load the inventory-dir in the container.

My inventory looks like this :

.
├── group_vars
│   └── all
│       └── configuration.yml
├── host_vars
│   ├── kubnode01
│   │   └── os.yml
│   ├── kubnode02
│   │   └── os.yml
│   ├── kubnode03
│   │   └── os.yml
│   ├── kubnode04
│   │   └── os.yml
│   └── kubnode05
│       └── os.yml
└── hosts.yml

I'm parsing the default values to get most of the information for the variable names and meaning
Basically find . -type d -name defaults will give that info and from there find want I need and trust the defaults.

When I explain the inventory to a colleague, they always came back with the question. "Why is there a separation in k8s_cluster and all in the group_vars" so we removed everything not needed that created this confusion, resulting in configuration.yml

VannTen · 2025-01-20T16:51:34Z

But on my side, when I started Kubernetes 4 years ago, it helped me SO MUCH to understand Kubernetes settings and features and was a great resource to start deploying a cluster.

Could you explicit ? Did you use it as documentation, reference, copying the inventory ?

Also I use it to upgrade my clusters by comparing my inventory folder with the sample one.

Like, something like `git diff <tag> <old_tag> -- inventory/sample` ? It was my impression that given we miss updating it pretty regularly.

I don't know how new users will start using Kubespray after that removal, and I'm affraid they will probably find another solutions to deploy clusters if they don't have a tool to generate inventory files with important default values.

The sample should not be 'default values' => that's what role defaults are for. And ideally (and it's seems to work given #11898) there is not much to define unless you need to customize (at which point you need to edit your inventory anyway).

Nathanael-Mtd · 2025-01-20T17:07:44Z

Could you explicit ? Did you use it as documentation, reference, copying the inventory ?

I used as reference and started new cluster by copying and editing the sample folder to a new one to custom my cluster.

Like, something like git diff <tag> <old_tag> -- inventory/sample ? It was my impression that given we miss updating it pretty regularly.

I use a file comparaison tool (an VSCode extension) with sample folder as source and my inventory folder as destination.

The sample should not be 'default values' => that's what role defaults are for. And ideally (and it's seems to work given #11898) there is not much to define unless you need to customize (at which point you need to edit your inventory anyway).

Yeah sure, but I used it to get some "default keys/values" with comments, without making search on the whole codebase to find keys I need to change (I ended up to done that for specific things).

Overall I'm ok with the sample folder removal, because this PR adds generated variables reference file, it will be more reliable.
On my futures cluster upgrades, I will probably use that file as a source for maintaining custom cluster configs, and it will be so better than sample folder comparaison.

Remove sample inventory

2cf1701

k8s-ci-robot requested review from ErikJiang and MrFreezeex January 16, 2025 14:29

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 16, 2025

k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Jan 16, 2025

VannTen force-pushed the cleanup/remove_sample_inventory branch 5 times, most recently from c3aebda to 09c9998 Compare January 17, 2025 08:54

VannTen added 4 commits January 17, 2025 10:54

Generate variables reference from all defaults files

fc1349b

This is an alternative to the sample inventory, which does not need manual updating.

Adapt CI/vagrant to run without sample inventory

a38a689

docs: remove reference to sample inventory

79e4b2d

ci: show pre-commit diff on failure

008c8c9

Sometimes the change done by pre-commit are not obvious, this should help.

VannTen force-pushed the cleanup/remove_sample_inventory branch from 09c9998 to 008c8c9 Compare January 17, 2025 09:54

VannTen marked this pull request as ready for review January 17, 2025 09:58

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 17, 2025

k8s-ci-robot requested a review from cyclinder January 17, 2025 09:58

ant31 reviewed Jan 17, 2025

View reviewed changes

VannTen mentioned this pull request Jan 17, 2025

Run CI without the sample inventory #11898

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace sample inventory by auto-generated variable references from defaults #11897

Replace sample inventory by auto-generated variable references from defaults #11897

VannTen commented Jan 16, 2025 •

edited

Loading

k8s-ci-robot commented Jan 16, 2025

k8s-ci-robot commented Jan 16, 2025

VannTen commented Jan 16, 2025

ant31 commented Jan 17, 2025

ant31 commented Jan 17, 2025 •

edited

Loading

VannTen commented Jan 17, 2025

ant31 commented Jan 17, 2025

ant31 Jan 17, 2025

VannTen Jan 17, 2025

ant31 Jan 17, 2025

ant31 Jan 17, 2025

VannTen commented Jan 17, 2025 via email

VannTen commented Jan 17, 2025

Nathanael-Mtd commented Jan 20, 2025

vdveldet commented Jan 20, 2025

VannTen commented Jan 20, 2025 via email

Nathanael-Mtd commented Jan 20, 2025

Replace sample inventory by auto-generated variable references from defaults #11897

Are you sure you want to change the base?

Replace sample inventory by auto-generated variable references from defaults #11897

Conversation

VannTen commented Jan 16, 2025 • edited Loading

k8s-ci-robot commented Jan 16, 2025

k8s-ci-robot commented Jan 16, 2025

VannTen commented Jan 16, 2025

ant31 commented Jan 17, 2025

ant31 commented Jan 17, 2025 • edited Loading

VannTen commented Jan 17, 2025

ant31 commented Jan 17, 2025

ant31 Jan 17, 2025

Choose a reason for hiding this comment

VannTen Jan 17, 2025

Choose a reason for hiding this comment

ant31 Jan 17, 2025

Choose a reason for hiding this comment

ant31 Jan 17, 2025

Choose a reason for hiding this comment

VannTen commented Jan 17, 2025 via email

VannTen commented Jan 17, 2025

Nathanael-Mtd commented Jan 20, 2025

vdveldet commented Jan 20, 2025

VannTen commented Jan 20, 2025 via email

Nathanael-Mtd commented Jan 20, 2025

VannTen commented Jan 16, 2025 •

edited

Loading

ant31 commented Jan 17, 2025 •

edited

Loading