Launch cockpit-session via socket activation on /run/cockpit/session #16808

allisonkarlitskaya · 2022-01-10T18:06:04Z

This paves the way for removing the static cockpit users in #16811, and fixes cockpit on bootc images.

A nice side effect of this is that we can now connect to unix sockets from cockpitauth, which is useful for https://github.com/allisonkarlitskaya/cockpit-cloud

Fixes #21201

Web server: Increased sandboxing, `setuid` removal, `bootc` support

Cockpit's web server already had low privilege levels, but previously used a setuid helper program for user logins. This helper program had restricted permissions and was only executable by Cockpit (through group ownership). Its sole purpose was to run at the system level and immediately drop permissions to log in to a specific user account. However, the binary was still setuid, and setuid should be minimized for security reasons.

This release removes the setuid flag from the binary. The helper now starts via systemd socket activation, with the Cockpit web server connecting to it using a protected UNIX socket in the /run directory. This approach enables tighter sandbox security by preventing the login session from being a direct descendant of the web server process. It also resolves an issue with Cockpit running on bootc images.
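As a rough illustration of the client side of this arrangement, the sketch below connects to a UNIX socket by path. It is not taken from cockpit-ws: the function name `connect_unix_socket` is hypothetical, and the use of a stream socket is an assumption about how `/run/cockpit/session` is set up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to a UNIX stream socket at `path`.
 * Returns a connected fd, or -1 with errno set. */
static int
connect_unix_socket (const char *path)
{
  assert (path != NULL);

  struct sockaddr_un addr;
  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;

  /* sun_path is a small fixed-size buffer; refuse paths that do not fit */
  if (strlen (path) >= sizeof addr.sun_path)
    {
      errno = ENAMETOOLONG;
      return -1;
    }
  strcpy (addr.sun_path, path);

  /* Assumption: the session socket is a stream socket */
  int fd = socket (AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
  if (fd < 0)
    return -1;

  if (connect (fd, (struct sockaddr *) &addr, sizeof addr) < 0)
    {
      /* close() may clobber errno, so preserve the connect() error */
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }

  return fd;
}
```

A caller would use it as `int fd = connect_unix_socket ("/run/cockpit/session");`. With socket activation, systemd owns the listening socket and starts the helper for each accepted connection, so the web server never has to fork or exec a privileged binary itself.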

All Cockpit components now run as dynamic users created at startup using the DynamicUser= systemd feature. Existing systems may still have a cockpit-ws user (and very old systems might even have associated TLS certificates). However, this cockpit-ws user is no longer required and can be safely deleted.

allisonkarlitskaya · 2024-11-21T13:24:28Z

src/session/client-certificate.c

+
+  /* read_proc_pid_cgroup() already ensures that, but just in case we refactor this: this is *essential* for the
+   * subsequent comparison */
+  if (ws_cgroup[ws_cgroup_length - 1] != '\n')


Thanks for allowing me to be pedantic about this <3

mvollmer

Thanks!
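To illustrate why the trailing newline matters, here is a small sketch of a length-limited comparison between the first line of a certificate file and the ws cgroup line. The comparison itself is not shown in the quoted diff, so both the function name `cert_file_matches_cgroup` and the use of `strncmp()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `file_contents` begins with exactly the
 * line `ws_cgroup`, which must include its terminating '\n'. */
static bool
cert_file_matches_cgroup (const char *file_contents, const char *ws_cgroup)
{
  assert (file_contents != NULL && ws_cgroup != NULL);

  size_t ws_cgroup_length = strlen (ws_cgroup);

  /* Without the terminating newline the check below would degrade into a
   * prefix match: "0::/a/b" would also accept a file starting "0::/a/bc". */
  if (ws_cgroup_length == 0 || ws_cgroup[ws_cgroup_length - 1] != '\n')
    return false;

  return strncmp (file_contents, ws_cgroup, ws_cgroup_length) == 0;
}
```

Because the newline is part of the compared bytes, a cgroup whose name merely starts with the expected one no longer matches.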

mvollmer · 2024-11-21T13:55:46Z

src/session/client-certificate.c

+    }
+  else {
+    ws_pid_dirfd = open_proc_pid (getpid ());
+    debug ("failed to read stdin peer credentials: %m; not in socket mode, reading cgroup from my own pid %d", ws_pid_dirfd);


This gets removed in a later commit. Can we squash that?

This is mostly me being picky: Technically, that commit (session: Support Unix socket mode for client certificates) still requires the fallback. It is made obsolete by "ws: connect to cockpit-session via socket", but that one also cannot land before. So it's a very short transitionary period (just over two commits), but nevertheless for "I can build and test or revert a random commit" correctness I think we should keep it.

Thanks! Then let me review it: errno might be overwritten here, no? And it prints a FD instead of a PID, is that intended?

ah yes, that's in the wrong order, the debug() should come before the open_proc_pid() of course. Thanks, let me fix that completely dead piece of code!

When cockpit-session's stdin is a Unix socket, it is being spawned by cockpit-ws through cockpit-session@.service. In that case it doesn't make sense to look at its own cgroup, but we need to check the cgroup of the socket peer (i.e. cockpit-ws).

We must guard against PID recycling attacks:

1. Mallory logs into cockpit, gets ws pid M, and hacks ws: connect to session, forks, keeps the session fd in a different process, and kills pid M.
2. Mallory waits until Alice logs in again and happens to get ws pid M (which can happen with a sufficient number of forks, social engineering, and some luck).
3. cockpit-session checks that pid M is in cgroup /cockpit/alice, and starts an alice session for Mallory's ws.

(Note: SO_PEERCRED gives you pid/uid/gid at the time connect() was made.)

Thus require that the peer (ws) must have started earlier than cockpit-session. This is the same approach that polkit uses as a fallback if pidfds are not available: https://github.com/polkit-org/polkit/blob/main/src/polkit/polkitunixprocess.c

Note that pidfds don't help us: There is no API to directly get from a pidfd to a cgroup, startup time, or /proc/<pid> dirfd; this has to happen via `pidfd_getpid()` and opening /proc/pid. But that's exactly what we want to avoid, and thus is pointless (they are also only available since kernel 6.5).
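The start-time comparison relies on the `starttime` field (field 22) of `/proc/<pid>/stat`. A minimal sketch of extracting it, assuming the file contents have already been read into a string, could look like this. The names `parse_stat_starttime` and `peer_started_before` are hypothetical and not taken from the Cockpit sources, and the strict `<` comparison is an assumption about what "started earlier" means.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Extract field 22 (starttime, in clock ticks since boot) from the contents
 * of /proc/<pid>/stat. Returns 0 on success, -1 on a parse error. */
static int
parse_stat_starttime (const char *stat, unsigned long long *starttime)
{
  assert (stat != NULL && starttime != NULL);

  /* Field 2 (comm) is parenthesised and may itself contain spaces or ')',
   * so resume parsing after the *last* closing parenthesis. */
  const char *p = strrchr (stat, ')');
  if (p == NULL)
    return -1;
  p++;

  /* Skip fields 3 (state) through 21; the next token is field 22. */
  for (int field = 3; field < 22; field++)
    {
      p += strspn (p, " ");
      size_t len = strcspn (p, " ");
      if (len == 0)
        return -1;
      p += len;
    }
  p += strspn (p, " ");
  if (*p < '0' || *p > '9')
    return -1;

  errno = 0;
  char *end;
  unsigned long long value = strtoull (p, &end, 10);
  if (errno != 0 || end == p)
    return -1;

  *starttime = value;
  return 0;
}

/* Assumption: "started earlier" means a strictly smaller start time. */
static bool
peer_started_before (unsigned long long peer, unsigned long long self)
{
  return peer < self;
}
```

The session helper would parse its own stat file and that of the socket peer, then refuse to continue unless `peer_started_before (peer_start, own_start)` holds.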

Unless the configuration file specifies otherwise, we now spawn cockpit-session by connecting to /run/cockpit/session if that exists. We leave the cockpit_ws_session_program variable in place to allow the tests to override things.

Update the unit files for cockpit-ws to ensure that the socket is available when cockpit-ws is running.

Adjust TestConnection.testBasic accordingly: When running cockpit-session via unix socket activation, its group permissions are irrelevant. Break the socket instead.

Also adjust the reverse proxy tests which start the `cockpit-ws` binary directly to ensure that the session socket is available. A custom production setup which doesn't use cockpit.socket will have to `Requires/After=cockpit-session.socket` as well to continue to function (as running cockpit-ws as root is undesirable).

Drop `testAuthUnixPath`, as that's now the default. Instead, add a new `testAuthDirectSession` which tests a custom cockpit-ws setup that directly runs cockpit-session.

Co-Authored-By: Martin Pitt <[email protected]>

systemd spawns this for us now, so we don't need the setuid bit anymore. Clean up the statoverride in the Debian packaging on upgrades.

This avoids an alternative code path which is unlikely to happen in practice, and which we don't test anywhere.

martinpitt · 2024-11-21T16:50:23Z

🌟

bluca · 2024-11-21T22:53:19Z

src/session/client-certificate.c

+
+  /* Guard against pid recycling: If a malicious user captures ws, keeps the socket in a forked child and exits
+    * the original pid, they can trap a different user to login, get the old pid (pointing ot their cgroup), and
+    * capture their session. To prevent that, require that ws must have started earlier than ourselves. */


You are probably already aware, but just in case, in polkit there were CVEs due to comparing processes in this way for authentication purposes, as there are ways for an attacker to control the start timestamp as recorded by the kernel, by holding the forking process at a particular time. See comment at: https://github.com/polkit-org/polkit/blob/main/src/polkit/polkitunixprocess.c#L59

These days we use pidfds, which cannot be recycled, and only fall back on heuristics on old kernels.

Thanks for this comment. We discussed this an awful lot. Please see this thread:

#16808 (comment)

It would be helpful to know a bit more about what the pidfd peercred thing actually solves: does it mean that for as long as the kernel is willing to return that pidfd to us that it then won't reissue that pid? That would be a very big help indeed, but it would also mean that we don't actually need to use the pidfd to gain that benefit...

Or (in the case that the process that called connect() is gone) will it issue us a pidfd that points to a process that doesn't exist anymore and possibly the pid of that process already got used?

I admittedly stopped looking into the polkit code after @martinpitt determined that it wasn't available on many of our target platforms and expressed reservations about implementing the fallback, but it would be useful to know how this ought to work, and I do believe that it will be a worthwhile thing to pursue in the future.

In the meantime, though, please understand that this code is only relevant on systems that support logging in with TLS client certificates, and this is only one part of a defence-in-depth strategy:

- only cockpit-ws can connect to the socket, as we control access to it via its group ownership
- we use a "nonce" system: a randomly-generated certificate file name is created in cockpit-tls. That nonce is passed only to the process that's supposed to be logging in using that client certificate. That file is written to a temporary directory only accessible by root, and lives only as long as the TCP connection is open. In order to convince cockpit-session that you are allowed to login, you need to know the name of the file in that directory.
- the first line of that file contains the name of the cgroup of the process that is expected to be logging in using that file and we perform that double check based on the cgroup lookup we're discussing here.

We also limit the time that the login process is allowed to linger to 60 seconds in an attempt to make it more difficult to wait for a recycled PID.

One of the attack scenarios that we consider possible in Cockpit is that one legitimate user could exploit a weakness in the large C code base of cockpit-ws to take over control of the process and use it to gain control over a logged-in session of another user. This is the main reason why we go out of our way to isolate client-certificate-identified sessions from each other. But: I notice that we could be doing more to limit what cockpit-ws is able to do, via the large number of sandboxing options available in systemd unit files. I'm not sure, for example, that it wouldn't be possible for one cockpit-ws instance to gain control of another via ptrace(), as they're running as the same user. We restrict things quite a lot with our custom SELinux policy, but it would be good to look into locking things down with systemd. I'll open an issue for that.

> It would be helpful to know a bit more about what the pidfd peercred thing actually solves: does it mean that for as long as the kernel is willing to return that pidfd to us that it then won't reissue that pid? That would be a very big help indeed, but it would also mean that we don't actually need to use the pidfd to gain that benefit...

A pidfd can never be recycled: when the process exits, it simply becomes invalid and can no longer be resolved to a pid, even if a process with the same pid has since appeared. With recent kernels you can use statx() to compare two pidfds directly for equality; we do this in systemd now, for example. On slightly older kernels you get two pidfds, resolve each pidfd -> pid, compare the pids, and then afterwards check again that both pidfds are still valid. That is more laborious than statx(), but equally safe against all races.
You can get the pidfd of a socket peer via SO_PEERPIDFD. This is safe because it's the kernel that gives you the pidfd, so you can always be sure it refers to the process that started the communication over the socket; it cannot be faked.

In summary, it is a feature designed exactly to avoid the kind of race conditions and vulnerabilities that have plagued polkit and other system software that need to authenticate processes.

> In the meantime, though, please understand that this code is only relevant on systems that support logging in with TLS client certificates, and this is only one part of a defence-in-depth strategy

Of course, it's entirely up to you, and it depends on use cases, threat model, etc. I only mentioned it because I found the link to this PR on Mastodon, and having had to deal with all of this recently in polkit, I thought I'd give a heads-up, as it's a fairly recent kernel feature and not very well known yet. But if it's not necessary for the use case here, that's entirely fine; this was just intended as an FYI, not an RFE.

@bluca Thanks! My main woe is that just getting a pidfd doesn't help us much. We need to know the peer's startup time (currently parsed from /proc/pid/stat) and cgroup (/proc/pid/cgroup), and pidfds have neither these APIs nor "give me the corresponding /proc/pid".

If SO_PEERPIDFD is guaranteed to give the pid fd at the time of connect(), and resolving it fails once it's gone, that is a good hardening. We'd still have to resolve it to a pid and open /proc/pid, but it would replace the time comparison on newer kernels -- and we'd keep the time comparison as a fallback on older ones.

We won't need /stat any more, but we very much need the cgroup. That's the reason for doing all this dance 😁 (we use that to identify the correct service instance for the TLS certificate that we receive, and encode the cert's SHA into the unit's instance name)

Thanks for confirming the pidfd mechanics! That's how I understood it, and that will actually be helpful.

Right, in the future we'll hopefully have a SO_PEERCGROUP or so, work was in progress. For now one can ask systemd to translate from pidfd to cgroup in various ways, like sd_pidfd_get_cgroup()

In fact if you want the unit, there's APIs to get that directly, via D-Bus

We don't have D-Bus in cockpit-session. It's a minimal program which should have essentially zero dependencies. But we'll do the pidfd robustification, thanks!

Done in #21341. Thanks @bluca!

@bluca

We can get a reliable, PID recycling resistant, /proc query for cockpit-session's ws peer (i.e. the far end of its stdin Unix socket) by getting a pidfd instead of an ucred. This is always the pidfd for the process that started the communication, it cannot be recycled. If the original process does go away, querying the pidfd will just fail, even if a new process with the same pid comes along. We still need to "resolve" the pidfd to a pid to open /proc/pid/cgroup (there is no direct kernel API to get a pidfd's cgroup). But validate the pid *after* that query to ensure it didn't get recycled. This is much easier and safer to do than parsing /proc/pid/stat. However, this requires kernel 6.5, so is not yet available in e.g. Debian 12 or RHEL 9. So keep the pid+time comparison fallback for these older OSes. Thanks to @bluca for the helpful technical advice! cockpit-project#16808 (comment) https://issues.redhat.com/browse/COCKPIT-1207
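The pidfd lookup described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from #21341: the helper names are hypothetical, the fallback value 77 for `SO_PEERPIDFD` is the asm-generic one (some architectures use a different number), and the pid is resolved by reading the `Pid:` line from `/proc/self/fdinfo/<pidfd>` rather than through any particular libc wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SO_PEERPIDFD
#define SO_PEERPIDFD 77 /* assumption: asm-generic value, for older headers */
#endif

/* Ask the kernel for a pidfd of the process on the other end of the socket.
 * Returns -1 (e.g. ENOPROTOOPT) on kernels without SO_PEERPIDFD. */
static int
get_peer_pidfd (int socket_fd)
{
  int pidfd = -1;
  socklen_t len = sizeof pidfd;

  if (getsockopt (socket_fd, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &len) < 0)
    return -1;
  return pidfd;
}

/* Find the "Pid:" line in the fdinfo text of a pidfd. The kernel reports -1
 * there once the process is gone; parse failures also yield -1. */
static long
parse_fdinfo_pid (const char *fdinfo)
{
  assert (fdinfo != NULL);

  const char *line = fdinfo;
  while (line != NULL && *line != '\0')
    {
      if (strncmp (line, "Pid:", 4) == 0)
        {
          char *end;
          errno = 0;
          long pid = strtol (line + 4, &end, 10);
          if (errno != 0 || end == line + 4)
            return -1;
          return pid;
        }
      line = strchr (line, '\n');
      if (line != NULL)
        line++;
    }
  return -1;
}

/* Resolve a pidfd to a pid; returns -1 if that is not (or no longer) possible. */
static long
pidfd_read_pid (int pidfd)
{
  char path[64];
  char buffer[1024];

  snprintf (path, sizeof path, "/proc/self/fdinfo/%d", pidfd);
  int fd = open (path, O_RDONLY | O_CLOEXEC);
  if (fd < 0)
    return -1;

  ssize_t n = read (fd, buffer, sizeof buffer - 1);
  close (fd);
  if (n < 0)
    return -1;

  buffer[n] = '\0';
  return parse_fdinfo_pid (buffer);
}
```

A caller would resolve the pid once, read `/proc/<pid>/cgroup`, and then call `pidfd_read_pid()` again: if the second result differs from the first (or is -1), the original peer has gone away and the cgroup that was read must be discarded.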

Cockpit 330 dropped the cockpit-session suid permissions and moved to systemd socket activation [1]. It is now a regular root:root 755 file, so drop the fileinfo entry. [1] cockpit-project/cockpit#16808

allisonkarlitskaya requested a review from martinpitt January 10, 2022 18:06

allisonkarlitskaya temporarily deployed to cockpit-dist January 10, 2022 18:12 Inactive

allisonkarlitskaya force-pushed the cockpit-session-socket branch from aaefe97 to 7ebf9e6 Compare January 11, 2022 07:56

allisonkarlitskaya mentioned this pull request Jan 11, 2022

use DynamicUser=yes for all cockpit components #16811

Merged

2 tasks

allisonkarlitskaya added the blocked Don't land until something else happens first (see task list) label Jan 11, 2022

allisonkarlitskaya temporarily deployed to cockpit-dist January 11, 2022 08:03 Inactive

allisonkarlitskaya mentioned this pull request Jan 11, 2022

ws: add support for connecting to unix sockets #16819

Merged

martinpitt removed their request for review February 2, 2022 06:41

martinpitt removed the blocked Don't land until something else happens first (see task list) label Feb 2, 2022

martinpitt marked this pull request as draft February 2, 2022 06:41

allisonkarlitskaya force-pushed the cockpit-session-socket branch from 7ebf9e6 to 1330a6a Compare February 10, 2022 14:34

allisonkarlitskaya temporarily deployed to cockpit-dist February 10, 2022 14:40 Inactive

allisonkarlitskaya force-pushed the cockpit-session-socket branch from 1330a6a to ca43353 Compare February 10, 2022 15:06

allisonkarlitskaya added the blocked Don't land until something else happens first (see task list) label Feb 10, 2022

allisonkarlitskaya temporarily deployed to cockpit-dist February 10, 2022 15:11 Inactive

KKoukiou added the review-2022-12 label Dec 14, 2022

martinpitt mentioned this pull request Jan 4, 2023

systemd: Use sysusers.d to create ws users #18112

Closed

1 task

martinpitt added needs-rebase and removed blocked Don't land until something else happens first (see task list) review-2022-12 labels Jan 4, 2023

martinpitt force-pushed the cockpit-session-socket branch from ca43353 to 9ed95a5 Compare January 4, 2023 17:57

martinpitt removed the needs-rebase label Jan 4, 2023

martinpitt force-pushed the cockpit-session-socket branch from 9ed95a5 to 6778062 Compare January 4, 2023 18:02

martinpitt temporarily deployed to cockpit-dist January 4, 2023 18:07 — with GitHub Actions Inactive

martinpitt force-pushed the cockpit-session-socket branch from 6778062 to f1e5a6a Compare January 4, 2023 18:12

martinpitt temporarily deployed to cockpit-dist January 4, 2023 18:17 — with GitHub Actions Inactive

martinpitt added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Jan 4, 2023

martinpitt force-pushed the cockpit-session-socket branch from 286aa1c to 32d55a9 Compare November 21, 2024 13:12

martinpitt requested a review from mvollmer November 21, 2024 13:13

allisonkarlitskaya commented Nov 21, 2024

martinpitt force-pushed the cockpit-session-socket branch 2 times, most recently from 09ed600 to c6695e3 Compare November 21, 2024 13:35

mvollmer approved these changes Nov 21, 2024

martinpitt and others added 5 commits November 21, 2024 15:28

tools: Move cockpit-session.socket to cockpit-ws package

4e32648

cockpit-session: stop installing setuid root

110b45f

systemd spawns this for us now, so we don't need the setuid bit anymore. Clean up the statoverride in the Debian packaging on upgrades.

session: Only support for cert auth with cockpit-session.socket

72bfb6c

This avoids an alternative code path which is unlikely to happen in practice, and which we don't test anywhere.

martinpitt force-pushed the cockpit-session-socket branch from c6695e3 to 72bfb6c Compare November 21, 2024 14:28

martinpitt merged commit 6a49346 into cockpit-project:main Nov 21, 2024
85 checks passed

martinpitt deleted the cockpit-session-socket branch November 21, 2024 16:50

bluca reviewed Nov 21, 2024

allisonkarlitskaya mentioned this pull request Nov 22, 2024

Investigate systemd unit sandboxing options #21299

Closed

allisonkarlitskaya added the release-note label Nov 22, 2024

martinpitt mentioned this pull request Nov 27, 2024

session: Use pidfd for determining ws peer cgroup #21341

Merged

martinpitt mentioned this pull request Dec 4, 2024

Remove cockpit-session rpminspect/rpminspect-data-fedora#60

Merged

martinpitt removed the release-note label Dec 5, 2024
