WIP: implement a secret vars store in nixpkgs #370444

Lassulus · 2025-01-03T01:49:04Z

This allows to create secrets (and public files) outside or inside the nix store in a more delclarative way. This is shipped with an example (working) implementation of an on-machine storage. The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.

I wanted to upstream this, since we use code similiar to this in https://clan.lol now for a while. Happy to discuss it a bit more. I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name :)

I'm not sure if this code is working yet, I will try to develop it a bit more in the next coming days, but I wanted to get this out earlier so other people can throw in some feedback already :)

lucasew · 2025-01-03T23:51:26Z

How this compares to sops-nix?

Lassulus · 2025-01-04T00:06:20Z

How this compares to sops-nix?

This enhances sops-nix, It is more of an upstream evolution of agenix-rekey.

The gist is: This supports declaring how to generate the secrets you store in sops-nix in nixpkgs in the service module.

I will add a declaration to a service soon, so it is easier to understand :)

Lassulus · 2025-01-07T20:39:57Z

alright, removed some features to make the first implementation easier.
Some things are still missing in the on-machine backend like dependencies and prompts. but the rough shape should be more clear now :)

nixos-discourse · 2025-01-08T15:35:18Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2025-01-08-clan-weekly-changelog/58501/1

oddlama · 2025-01-08T22:04:17Z

This looks awesome, I'll have to play around with it a bit more to understand all the details.

Have you already thought about how this would work when someone wants to use vars/secrets in the initrd in stage1? I imagine there could be some kind of hen-egg problem since the initrd will be generated pretty early on when a new generation is activated. The configured vars/secrets may probably not exist at that time, so this might require activating twice (?)

lucasew · 2025-01-08T22:29:23Z

I am curious about how it would work in stage 1 because AFAIK / is not mounted yet in this phase, so how it will read the host key to decrypt the secrets.

Lassulus · 2025-01-09T11:40:25Z

Have you already thought about how this would work when someone wants to use vars/secrets in the initrd in stage1? I imagine there could be some kind of hen-egg problem since the initrd will be generated pretty early on when a new generation is activated.

The current code here doesn't run during activation but creates a script that can be run before activation. in clan.lol we solved that issue differently by having a requiredBy option which can take "activation". We would create the files necessary for activation beforehand in that case, but I think this adds too much complexity to this initial PR.
But this can be something we can add later on top of it.
For now just having all the files created in one phase would be easier and since the creation script can be run at any time (even before installation) they should be available for all phases.

Another problem is also the prompts which some vars can ask for. Having the generation run in the activation would make prompting for input very hard. So I think a better way would be to prompt in nixos-install/nixos-rebuild, so before running activation. But this is also something I will add in a later iteration :)

Lassulus · 2025-01-09T12:55:34Z

I am curious about how it would work in stage 1 because AFAIK / is not mounted yet in this phase, so how it will read the host key to decrypt the secrets.

This is not really in scope for this PR. The current on-machine implementation I added here, doesn't encrypt the secrets (yet?) so there should be no issue with stage-1. secrets can still be added to initrd if they are required to early boot and they should be available at that time.

nixos/modules/system/vars/options.nix

oddlama · 2025-01-12T13:03:18Z

nixos/modules/system/vars/options.nix

+            prompts = lib.mkOption {
+              description = ''
+                A set of prompts to ask the user for values.
+                Prompts are available to the generator script as files.
+                For example, a prompt named 'prompt1' will be available via $prompts/prompt1
+              '';
+              default = { };


Can we add a value option or something similar to each prompt? I would like to be able to pre-define the value of the prompt to ensure that a user can always opt-in to a purely declarative configuration.

Btw, I currently don't understand when an author of a nixos module would decide to utilize a prompt instead of a generated random value. If I am writing a generator for my personal configuration, I can understand that I might want to use prompt for certain things - but when would I want to use a prompt in an upstream module and force interactivity on every user?

I'm just a little worried right now that introducing prompts without a way to declaratively set their value will sooner or later result in some generators being added to nixpkgs which depend on some form of interactivity. Therefore I'd like to have a way to specify the value of a prompt ahead of time, so the generation script can then just copy the specified file to prompts/$name instead of asking the user a question.

we mainly use prompts for API keys, which we usually don't want to leak into the config. I guess we can think about pre declaring things, this would also make the testing code a bit easier which I worked on last time I touched that code. But this increases the surface every backend implementation has to take care of.

oddlama · 2025-01-12T13:51:56Z

nixos/modules/system/vars/options.nix

+                    owner = lib.mkOption {
+                      description = "The user name or id that will own the file.";
+                      default = "root";
+                    };
+                    group = lib.mkOption {
+                      description = "The group name or id that will own the file.";
+                      default = "root";
+                    };
+                    mode = lib.mkOption {
+                      type = lib.types.strMatching "^[0-7]{3}$";
+                      description = "The unix file mode of the file. Must be a 3-digit octal number.";
+                      default = "400";
+                    };


These are currently not used by the on machine generator. Please ignore if that is still in the works :)

I was thinking of removing them again, since the chmod/etc needs to happen after activation anyway and the user/group has to exist for that to work. But we use those definitions in clan.lol generators every now and then.

If you remove these, everything will be owned by root I presume? That might make it hard to provision secrets for services which don't use LoadCredential

ideally we want to fix that in the preStart or with tmpfiles.d but there could be certain cases where it would be necessary to do it beforehand. But maybe it would be better to have a separate activation-script/service for that, that takes a file somewhere on the filesystem and links/copies it to the target? but tmpfiles also does that already, so not sure

I'm also not sure what would be the best action here, I guess I'm mostly just used to being able to specify it on the secret itself

nixos/modules/system/vars/options.nix

This allows to create secrets (and public files) outside or inside the nix store in a more delclarative way. This is shipped with an example (working) implementation of an on-machine storage. The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.

nixos/modules/system/vars/on-machine-backend.nix

jfly

I'm excited about this!

Sorry for the pretty myopic/nitpicky review, hopefully it's more useful than noisy. Here are some more high level thoughts:

The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.

I'd be interested to hear more about how this would work. Is the idea that a tool like sops-nix would provide a "backend" (similar to the on-machine backend you've included here?). It seems like the bulk of the logic in this PR is in the generate-vars script in the on-machine backend. Are all backends going to have a similar large chunk of logic (generating inputs, prompting, invoking generators with tempdirs, moving files)?
I think it would be nice to lightly document just what a "backend" is (how to create one). For example, are they all supposed to provide a globally installed generate-vars command?
I personally find the generate-vars shell script to be surprisingly readable, but hard to confirm is robust. I think I saw a few missing shell escapes (I left comments about them). I'd feel more sure about things if we were instead generating a JSON config file for something written in python (or something) to deal with. (Similar to how sops-nix works under the hood if you are familiar with that.)
I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name

I personally am not bothered by the word "vars". I assume this isn't called "secrets" because these things sometimes have a secret and a public part (as in your syncthing example)?
I did find the use of the word "var" to be a little confusing while reading the code. Specifically, the relationship between a "var", a "generator" and a "file". IIUC, a "var" has multiple "files" in it, and each file may be secret or public. A "generator" can generate these files. But what's the difference between a var and a generator? They sort of seem to be the same thing. I also saw the word "fact" floating around. I assume that's an old name, or something from clan?

jfly · 2025-01-26T10:03:15Z

nixos/modules/services/networking/syncthing.nix

+      ];
+      script = ''
+        syncthing generate --config "$out"
+        < "$out"/config.xml grep -oP '(?<=<device id=")[^"]+' | uniq > "$out"/syncthing.pub


nit: I'd feel better about this if we somehow asserted that the result of this is exactly 1 ID. I imagine things would get weird if this started returning multiple IDs or no ID at all.

Or, perhaps we could ask upstream syncthing if they'd be open to a change to syncthing generate to persist the id somewhere?

jfly · 2025-01-26T10:11:36Z

nixos/modules/system/vars/on-machine-backend.nix

+      OUT_DIR=''${OUT_DIR:-${cfg.fileLocation}}
+
+      # check if all files are present or all files are missing
+      # if not, they are in an incosistent state and we bail out


nit:

Suggested change

# if not, they are in an incosistent state and we bail out

# if not, they are in an inconsistent state and we bail out

jfly · 2025-01-26T10:20:28Z

nixos/modules/system/vars/options.nix

+          A module to be imported in every vars.files.<name> submodule.
+          Used by backends to define the `path` attribute.
+
+          Takes the file as an arument and returns maybe an attrset with should at least contain the `path` attribute.


nit:

Suggested change

Takes the file as an arument and returns maybe an attrset with should at least contain the `path` attribute.

Takes the file as an argument and returns maybe an attrset with should at least contain the `path` attribute.

jfly · 2025-01-26T10:24:29Z

nixos/modules/system/vars/options.nix

+                    name = lib.mkOption {
+                      type = lib.types.str;
+                      description = ''
+                        name of the public fact


"fact" feels like a new piece of vocabulary here. Is this perhaps a term from clan? Should this be "generated file" (or "var")?

Related: what does "public" mean here? Are there also private facts?

jfly · 2025-01-26T10:30:10Z

nixos/modules/system/vars/options.nix

+                      description = ''
+                        Whether the file should be deployed to the target machine.
+
+                        Enable this if the generated file is only used as an input to other generators.


nit: This admonition seems a little bit at odds with the fact that this defaults to true. I personally would expect an option that's documented as "enable this only if ..." to default to false.

idea: Would it be better if we could get rid of this option and instead deduce it based on if the generated file is used as an input to other generators? Right now, I think the API here doesn't allow us to know that (generators depend on other generators). Perhaps it would be better for generators to depend on specific files from other generators? In addition to allowing us to remove this option, it might also move some issues "further left" (such as if generator B depends on generator A and assumes that generator A will create a certain file, but never actually creates the file. We could detect that error when running A, rather than waiting for B to (hopefully) fail.

should be disable. We cannot deduce this automatically since a file can be referenced in another generator and still be needed to be deployed. We also cannot depend on single files by other generators, because files are stored as a unit. For the on-machine backend this option doesn't really make sense though, since we have no other storage location than where the files should be anyway.

We cannot deduce this automatically since a file can be referenced in another generator and still be needed to be deployed.

Sorry, I'm not quite following this sentence: if a file is referenced in another generator, then doesn't that mean that it does need to be deployed?

I think I'm unclear on where the generators are supposed to run. Is it on the target machine, or on some other machine doing the build?

well that depends on the backend that runs the generators. The declarations of the generators inside nixpkgs should be independent of where they run.
What I wanted to say, is that a generator can be a dependency of another generator but the secret file that is depended on can still be required by the system to be present during runtime. for example ssh-ca, where we would generate a certificate and a private key in a generator and later have a ssh key be generated that needs the private key to sign the public key, so it gets trusted automatically. In that case we would need to deploy the certificate from the first generator to the machine and the public key, private key and certificate from the second generator.

OK, I think that makes sense.

I think the part I'm missing is how these generated files actually get "wired up" to running services. Is that present in the syncthing example in this PR? (I don't think it is).

If we have some way of expressing "this file created by this generator is needed by this service" (which feels like something we'd have to do anyway), then wouldn't that let us deduce if the file should be deployed to the machine?

https://github.com/NixOS/nixpkgs/pull/370444/files#diff-fd4c91fd302760b673b3e9b6c8d3474f0ae371005b6be1baff552d0aa7ffe223R639-R640

jfly · 2025-01-28T00:36:46Z

nixos/modules/system/vars/on-machine-backend.nix

+        all_files_missing=true
+        all_files_present=true
+        ${lib.concatMapStringsSep "\n" (file: ''
+          if test -e ${file.path} ; then


Does this need shell escaping? (could file.path have spaces in it?)

jfly · 2025-01-28T00:39:49Z

nixos/modules/system/vars/on-machine-backend.nix

+            ${promptCmd.${prompt.type}}
+            echo -n "$prompt_value" > "$prompts"/${prompt.name}
+          '') (lib.attrValues gen.prompts)}
+          echo "Generating vars for ${gen.name}"


Does this need shell escaping? Could gen.name contain a double quote? I think so.

jfly · 2025-01-28T00:46:23Z

nixos/modules/system/vars/on-machine-backend.nix

+          ${lib.concatMapStringsSep "\n" (file: ''
+            OUT_FILE="$OUT_DIR"/${if file.secret then "secret" else "public"}/${file.generator}/${file.name}
+            mkdir -p "$(dirname "$OUT_FILE")"
+            mv "$out"/'${file.name}' "$OUT_FILE"


This '${file.name}' feels like a "poor man's escaping". I'm pretty sure that file.name could itself contain a ', so I don't think this works. Can we use lib.escapeShellArg here instead?

jfly · 2025-01-28T00:48:32Z

nixos/modules/system/vars/on-machine-backend.nix

+            mkdir -p "$(dirname "$OUT_FILE")"
+            mv "$out"/'${file.name}' "$OUT_FILE"
+          '') (lib.attrValues gen.files)}
+          rm -rf "$out"


nit: This is unnecessary because of the trap 'rm -rf $out' EXIT up above, right?

ah, the trap only runs on exit, it's better to clean up the files if they are no longer needed earlier. the trap is just in case the script exists unexpectedly

jfly · 2025-01-28T01:11:35Z

nixos/modules/services/networking/syncthing.nix

@@ -621,6 +621,21 @@ in {

  config = mkIf cfg.enable {

+    vars.generators.syncthing = {


I'm not understanding the glue between this generator and the syncthing service. How do the files created by this generator actually get wired up into the running service?

Lassulus · 2025-02-01T06:55:58Z

1. > The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.
   
   
   I'd be interested to hear more about how this would work. Is the idea that a tool like sops-nix would provide a "backend" (similar to the on-machine backend you've included here?). It seems like the bulk of the logic in this PR is in the `generate-vars` script in the on-machine backend. Are all backends going to have a similar large chunk of logic (generating inputs, prompting, invoking generators with tempdirs, moving files)?

yes basically every implementation has to take care of executing the scripts and storing the generated files in the location, also deployment of those files needs to be taken care of by a backend.

2. I think it would be nice to lightly document just what a "backend" is (how to create one). For example, are they all supposed to provide a globally installed `generate-vars` command?

no, the generate-vars is mostly only useful in this usecase, since generating and deploying of files happens at the same time. Otherwise I would do deployment in activation. But we cannot prompt for inputs there. So this was moved into a script that can be manually run at any time.

3. I personally find the `generate-vars` shell script to be surprisingly readable, but hard to confirm is robust. I think I saw a few missing shell escapes (I left comments about them). I'd feel more sure about things if we were instead generating a JSON config file for something written in python (or something) to deal with. (Similar to how sops-nix works under the hood if you are familiar with that.)

sure, you can just dump the vars config from nix with json. But that increases the complexity of the on-machine implementation, I want a simple one that people can read and understand how to adapt. sops-nix probably would interact more directly with the vars instead of shell scripts.

4. > I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name
   
   
   I personally am not bothered by the word "vars". I assume this isn't called "secrets" because these things sometimes have a secret and a public part (as in your syncthing example)?
   I did find the use of the word "var" to be a little confusing while reading the code. Specifically, the relationship between a "var", a "generator" and a "file". IIUC, a "var" has multiple "files" in it, and each file may be secret or public. A "generator" can generate these files. But what's the difference between a var and a generator? They sort of seem to be the same thing. I also saw the word "fact" floating around. I assume that's an old name, or something from clan?

yeah, fact is the old name for vars. we can't really call them generators, because that is already used by other parts of nixos. but in the end vars are generators that generate secret or public files :D

Lassulus · 2025-02-01T07:01:02Z

pushed some changes, so some of the nitpicks should be more or less solved now. primarily restricted the allowed characters in certain names, so we can make escaping easier.

a-kenji · 2025-02-01T08:40:28Z

Possible names for vars could possibly be:

vars
keys
facts
more?

Would be happy if someone has more inputs on this.
As already said we already have generators themselves covered as a toplevel concept.

But what this would be is some form of key/var/fact-generator / generating mechanism.

Lassulus requested a review from oddlama January 3, 2025 01:49

github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jan 3, 2025

Lassulus force-pushed the nixos-vars branch from 4a478cc to e027034 Compare January 7, 2025 20:38

github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Jan 7, 2025

Lassulus force-pushed the nixos-vars branch 3 times, most recently from 3f0acbe to 339d2bc Compare January 7, 2025 23:43

Lassulus force-pushed the nixos-vars branch 3 times, most recently from 46b881c to b6b0bed Compare January 12, 2025 00:50

oddlama reviewed Jan 12, 2025

View reviewed changes

Lassulus force-pushed the nixos-vars branch from b6b0bed to fc64ee4 Compare January 13, 2025 10:49

Mic92 reviewed Jan 17, 2025

View reviewed changes

nixos/modules/system/vars/on-machine-backend.nix Show resolved Hide resolved

h7x4 added 8.has: module (new) This PR adds a module in `nixos/` 8.has: tests This PR has tests labels Jan 17, 2025

oddlama mentioned this pull request Jan 27, 2025

generator.dependencies as a list feels awkward in some cases oddlama/agenix-rekey#69

Closed

jfly reviewed Jan 28, 2025

View reviewed changes

WIP: nixos/syncthing: manage secrets with vars

e8a6fef

Lassulus force-pushed the nixos-vars branch from fc64ee4 to e8a6fef Compare February 1, 2025 07:00

	# if not, they are in an incosistent state and we bail out
	# if not, they are in an inconsistent state and we bail out

	Takes the file as an arument and returns maybe an attrset with should at least contain the `path` attribute.
	Takes the file as an argument and returns maybe an attrset with should at least contain the `path` attribute.

		@@ -621,6 +621,21 @@ in {

		config = mkIf cfg.enable {

		vars.generators.syncthing = {

WIP: implement a secret vars store in nixpkgs #370444

Are you sure you want to change the base?

WIP: implement a secret vars store in nixpkgs #370444

Conversation

Lassulus commented Jan 3, 2025

lucasew commented Jan 3, 2025

Lassulus commented Jan 4, 2025

Lassulus commented Jan 7, 2025

nixos-discourse commented Jan 8, 2025

oddlama commented Jan 8, 2025

lucasew commented Jan 8, 2025

Lassulus commented Jan 9, 2025

Lassulus commented Jan 9, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Lassulus Jan 12, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jfly left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Lassulus commented Feb 1, 2025

Lassulus commented Feb 1, 2025

a-kenji commented Feb 1, 2025

Lassulus Jan 12, 2025 •

edited

Loading