-
-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The Plan, phase 1 #1341
base: master
Are you sure you want to change the base?
The Plan, phase 1 #1341
Conversation
f2bd544
to
454a716
Compare
|
||
{ | ||
options = { | ||
system.primaryUser = lib.mkOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have an option to disable primary user, so that dogfooding finds problems with not having a primary user sooner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized the default is null
, so never mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, leaving it at the default will ensure your system doesn’t depend on any primary user and give a hopefully‐helpful error message if you set anything that requires it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in practice, it may be useful for us to
nest this intousers.users.*.homebrew
instead, at the expense of
being an unsupported setup if used to its full potential. Since
that would be a breaking change to the inteface anyway, I think
addinghomebrew.user
for now is acceptable.
I understand the motive here, but are you sure it's not better to make more breaking changes in one go? I'm worried that going through another great migration would agitate some.
I previously thought that too – that we should figure out the correct interfaces and write good migration documentation and tools and get all the breaking changes over with in one go – but the end result was that this work stalled out for years because there’s a lot of API design and implementation work to get things to a good state: we need to decide which things should be global or per‐user; we need to decide what should stay in nix-darwin and what should be deduplicated with Home Manager; for things like our If we tried to do it all at once, this work would have spent even longer in development hell and be unreviewably big. For the longest time i wasn’t sure how to resolve that, but after discussing on Matrix I realized that introducing a In the case of Homebrew, |
229a7c6
to
a387b4b
Compare
Most of the system.defaults stuff should really move to home-manager anyway, as you said. It's a natural fit and it works well. |
I've been running this branch on my machine for the past week and had zero issues with it, thanks for these great improvements! |
Been running this for a few days since I got back from vacation. Noticing issues with nix store on reboots, haven't pinned down what's going on atm. Not sure if I need to do something different, I just tested copying the branch and rebasing it on master. Running my normal rebuild mentions the sudo requirement, doing a sudo rebuild mentions I need a primary user, adding a primary user and then it rebuilds. Just noting symptoms:
2025-03-08 14:57:39.712350 <Warning> Could not find and/or execute program specified by service: 2: No such file or directory: /nix/store/fa3jr2wxgbiyk37h0imzhhj13c3p6gqv-yabai-7.1.10/bin/yabai
2025-03-08 14:57:39.712351 <Error> Service could not initialize: access(/nix/store/fa3jr2wxgbiyk37h0imzhhj13c3p6gqv-yabai-7.1.10/bin/yabai, X_OK) failed with errno 2 - No such file or directory, error 0x6f - Invalid or missing Program/ProgramArguments
2025-03-08 14:57:39.712352 <Error> initialization failure: 24D70: xpcproxy + 36768 [1088][E4C3A3C3-0D57-3E4D-8388-880BC7F5F19E]: 0x6f
2025-03-08 14:57:39.712353 <Notice> Service setup event to handle failure and will not launch until it fires.
2025-03-08 14:57:39.712354 <Error> Missing executable detected. Job: 'org.nixos.yabai' Executable: '/nix/store/fa3jr2wxgbiyk37h0imzhhj13c3p6gqv-yabai-7.1.10/bin/yabai' |
That’s very confusing. I’m not sure how that could result from these changes. Are you sure you get the same result if you run the base commit that you rebased on top of? Were there any conflicts during the rebase that you had to resolve? |
Yeah, sorry... not a lot of information, I know... Probably can just disregard it for now... I'll try bisecting and see where it might have come from. Even switching back to master I get the behavior still. |
This is the exact reason |
This is using the nix-darwin service. I do see that it doesn't contain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's finally happening! This looks really good in general — left a couple notes and questions but I think many of these changes may just have to be breaking.
@@ -15,6 +15,8 @@ in | |||
{ | |||
homebrew.enable = true; | |||
|
|||
homebrew.user = "test-homebrew-user"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding how the tests work, but doesn't this need the user test-homebrew-user
to exist?
Although I suspect that a later commit will address this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test systems don’t actually get activated; we just build the system and check the resulting package. (Sadly.) So this is fine.
@@ -1,6 +1,8 @@ | |||
{ config, pkgs, lib, ... }: | |||
|
|||
{ | |||
system.primaryUser = "test-defaults-user"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concern about the user needing to exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Same answer.)
modules/system/launchd.nix
Outdated
# Set up user launchd services in ~/Library/LaunchAgents | ||
echo "setting up user launchd services..." | ||
|
||
${concatStringsSep "\n" (launchdVariables config.launchd.user.envVariables)} | ||
${concatStringsSep "\n" (launchdVariables "sudo -u ${user} --" config.launchd.user.envVariables)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using sudo -u ${user}
does feel like one of those weird corner-cases where launchd will just fail to do what you expect it to — I like the idea of using launchctl asuser
, especially considering it hasn't been replaced with anything else (other than using a domain target for the given user), but I do notice the man page says
This executes the given command in as similar an execution context as possible to that of the target user's bootstrap. Adopted attributes include the Mach bootstrap namespace, exception server and security audit session. It does not modify the process' credentials (UID, GID, etc.) or adopt any user-specific environment variables. It affects only the Mach bootstrap context and directly- related attributes.
(emphasis mine). This might mean that the way to do it is
sudo -u ${user} -- launchctl asuser $(id -u ${user})
which is ugly but seems like it carries over more of the appropriate context.
That said, if this is working as-is in multi-user setups I'm also inclined to leave it as is until we swap to the new launchctl commands. My concern is mostly around activation for users who aren't graphically logged-in. This is already sometimes a pain point for me when I darwin-rebuild over SSH, and so I imagine that we may run into similar issues here if user A (logged in graphically) runs darwin-rebuild and the launchd activation doesn't work properly for user B (maybe not logged in, maybe logged in in a separate graphical session, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I haven’t tested this extensively; if someone can confirm that it seems to work as‐is then I’d be inclined to just leave it until we start using the recommended commands rather than adding more deprecated ones. I’m certainly curious how this goes with actiavtion over SSH.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mirkolenz @khaneliman and anyone else who's running this branch: any chance you manage any multi-user nix-darwin systems? Have you noticed any strangeness with launchd activation, especially if you've happened to run activations as one user while a different user is actively signed in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or activation over SSH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also occurring to me that even if this doesn't work perfectly, it should continue to work at least as well as the situation before this PR. I'd say that if anyone reports an issue with this, we can try and track it down, but otherwise we can just leave it as you have it here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All systems I tested this on are single-user only, sorry...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in, because people will usually be doing sudo darwin-rebuild …
as their primary user, and so the sudo sudo --user=primary-user
will cancel out the UID stuff and the Mach bootstrap context will be inherited from their graphical session? I suppose so. But ideally this should work even if you set a different primary user, too, so it’d be good to know one way or another. I just haven’t had the time to set up a VM to test a separate admin user, or SSH activation, or whatever. (I tested all this on my development machine because I love danger.)
@@ -9,6 +9,10 @@ let | |||
"defaults write ${domain} '${key}' $'${strings.escape [ "'" ] (generators.toPlist { } value)}'"; | |||
|
|||
defaultsToList = domain: attrs: mapAttrsToList (writeDefault domain) (filterAttrs (n: v: v != null) attrs); | |||
userDefaultsToList = domain: attrs: map | |||
(cmd: "sudo --user=${escapeShellArg config.system.primaryUser} -- ${cmd}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you use sudo -u ${user}
for launchctl
but sudo --user=${user}
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason at all. I’ve moved to --user=
across the board.
config.nix.enable | ||
&& !config.nixpkgs.flake.setNixPath | ||
&& config.system.stateVersion < 6 | ||
&& options.environment.darwinConfig.highestPrio == (mkOptionDefault {}).priority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate that I can't think of a better way to check that this option wasn't changed from the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only way I've done it, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it’s awful and I hate it. But I hate not giving good error messages when doing complicated migrations even more.
default = | ||
config.users.users.${config.system.primaryUser}.home or "/Users/${config.system.primaryUser}"; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this break anything if config.system.primaryUser
is set to null
(the default)? If I'm reading it correctly, it's only an issue if primaryUserHome
is read by something, in which case primaryUser
needs to be set anyway, but I wonder what the error message would look like. Maybe it's caught by nix.nixPath
requiring primaryUser
to be set? Not sure that works for environment.darwinConfig
though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’ll be an evaluation error if anything depends on it. But, yes, that stuff should set system.requiresPrimaryUser
(as environment.darwinConfig
does) and therefore hopefully give a helpful error message before anything that uses it would get evaluated. "${null}"
is also an error so I’m not sure allowing it to be null
would improve anything. Maybe this could be a more descriptive throw
in that case? But I haven’t triggered it so far in testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
lib.warnIf (config.system.primaryUser == null) "`system.primaryUser` is unset but `system.primaryUserHome` is being read" config.users.users.${config.system.primaryUser}.home or "/Users/${config.system.primaryUser}";
?
On the other hand, it seems like this problem will only be hit due to a future mistake on our part (reading primaryUserHome
in an option without adding that option to requiresPrimaryUser
), so we could just leave this as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making it a throwIf
might be a good idea, and seems pretty harmless. I’ll think about it. The main thing is that I wish we could make incorrect uses of config.system.primaryUser
fail loudly in the same way, but that’s harder.
system.activationScripts.postActivation.text = mkAfter '' | ||
nix-channel --remove darwin || true | ||
|
||
${optionalString (config.system.primaryUser != null) '' | ||
sudo \ | ||
--user=${config.system.primaryUser} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not shell escaping the primary user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly anyone who tries to set a username that requires shell escaping has it coming. But, yeah, fixed.
@@ -65,7 +65,7 @@ in | |||
ln -sfn /run/current-system /nix/var/nix/gcroots/current-system | |||
fi | |||
|
|||
${config.system.activationScripts.etcChecks.text} | |||
${config.system.activationScripts.checks.text} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one thing I am concerned about here is if checks
has stuff that needs to be run in a graphical context. Based on a quick search through our in-tree checks it seems like this change would cause errors for the activation daemon from modules/users/default.nix
in some cases, it might cause errors for our other checks, and might for out-of-tree checks if people were assuming that checks
would only run in a graphical session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The activation daemon only activates system configurations that were previously current. So in theory it should never be trying to delete users and so on. What other checks do you think might be problematic here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. I don't have any problematic checks I can think of, and they should be read-only anyway, so I'm good with this.
${cfg.activationScripts.checks.text} | ||
${cfg.activationScripts.createRun.text} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also a potential point where out-of-tree checks could break, if people assume that /run
will already exist. I think this is still a worthwhile change to make, but worth pointing out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could put it before checks
if this is considered a big problem; I just figured it’s best to change the system as little as possible if any checks are failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is definitely an improvement that we should make, just noting the potential issue here in case we see it in the future.
This adds an optional explicit `homebrew.user` option that allows users to avoid setting `system.primaryUser`, partly as a proof of concept of what the interfaces should look like in the future. Homebrew only officially support one global installation, so a singleton matches upstream’s expectations; in practice, it may be useful for us to nest this into `users.users.*.homebrew` instead, at the expense of being an unsupported setup if used to its full potential. Since that would be a breaking change to the inteface anyway, I think adding `homebrew.user` for now is acceptable. (I think one native Apple Silicon and one Rosetta 2 Homebrew installation – under `/opt/homebrew` and `/usr/local` respectively – may be exceptions to this lack of upstream support, but that would be complicated to support even with `users.users.*.homebrew`.) I’m not entirely sure where in system activation this should go. Probably after the user defaults and launch agents stuff, to match the existing logic in user activation, and I lean towards doing it as late as possible; too early and we might not have the users and groups required to bootstrap a Homebrew installation set up, but as Homebrew installations could be fiddly and fail, doing it in the middle could leave a partially‐activated system. Probably it should be done in a launch agent or something instead, but this is my best guess as to the appropriate place for now. The downside is that activation scripts generally won’t be able to assume that the Homebrew prefix is populated according to the current configuration, but they probably shouldn’t be depending on that anyway?
I’m not *completely* certain that this handles user agents correctly. There is a deprecated command, `launchctl asuser`, that executes a command in the Mach bootstrap context of another user`. <https://scriptingosx.com/2020/08/running-a-command-as-another-user/> claims that this is required when loading and unloading user agents, but I haven’t tested this. Our current launchd agent logic is pretty weird and broken already anyway, so unless this actively regresses things I’d lean towards keeping it like this until we can move over entirely to `launchctl bootstrap`/`launchctl kickstart`, which aren’t deprecated and can address individual users directly. Someone should definitely test it more extensively than I have, though.
System activation scripts shouldn’t (and soon won’t be able to) rely on `$HOME` being the primary user’s.
These can’t be relied upon in a post‐user‐activation world. Technically a breaking change, if anyone has their home directory outside of `/Users` or is using `root` for this, but, well, I did my best and these are legacy defaults anyway.
(With some tweaks to handle `nix.enable` and order it at a more sensible position in the `$PATH`.) The installers actually install Nix into `root`’s profile for some reason, which means that the path’s prioritization backfires when the script runs as root and we’re managing the Nix installation. When running `darwin-rebuild` as a normal user, this wasn’t a problem. Maybe we should just have a check to make sure there’s no conflicting Nix in `root`’s profile – it seems pretty bad for `root` to get the wrong Nix – but it would trigger for almost everyone, which seems kind of annoying. I guess we could automatically remove it from `root`’s profile if it matches what’s in `/nix/var/nix/profiles/default`… This reverts commit 02232f7.
The `activate-system` daemon will now run all the checks, which seems like probably a good idea anyway?
The checks should no longer depend on `/run`, so this avoids modifying the system before they run.
a387b4b
to
0d9daef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that covers everything I saw! Another pair of eyes certainly wouldn't hurt but this LGTM
Hey, I've just not long ago moved to flakes and don't have a lot of experience here. Could someone give me rough instructions how I would test it with flake setup? I would like to help 🙂 based on what I am seeing in the PR already started migrating away from services and config options which require primaryUser. |
You can test it by adding the following snippet to your flake: inputs.nix-darwin = {
url = "github:lnl7/nix-darwin/pull/1341/merge";
inputs.nixpkgs.follows = "nixpkgs"; # if you want to lock it to the same rev as your main nixpkgs instance
}; |
So it is a bit strange for me with launch agents 🙂 I've had couple launch agents defined via
|
3 sounds like expected behavior. 1, unfortunately, also sounds like expected behavior, since the code to handle removing those LaunchAgents has itself been removed. 2 is surprising to me though, unless the LaunchAgent plists are put at a different path? |
It’s finally time.
This PR eliminates user activation, addressing #96 by following the first phase of the plan from #1016 (comment), fulfilling a goal I started drafting patches for in 2023 before I was even a nix-darwin maintainer.
I strongly recommend reviewing this one commit‐by‐commit. See the commit messages for much more detail and for things I’m still uncertain about. I’ve been running this on my own system for a while now, but it could definitely use a staged roll‐out before we actually hit the merge button. We should also probably also reach out to authors of third‐party modules that will be affected once it’s mostly baked, though at least anything that breaks should mostly break loudly. (Adding text to our existing
activationScripts.{checks,userDefaults,userLaunchd,homebrew}.text
definitions is an exception – those will run asroot
now. But I can’t think of a particularly graceful way to handle that case, and anyway it’s not exactly something we fully support in the first place.)Closes: #96