pid_eins, (edited )
@pid_eins@mastodon.social avatar

4️⃣ Here's the 4th installment of my series of posts highlighting key new features of the upcoming v256 release of systemd.

You might be aware of systemd's per-service setting "ProtectSystem=". When used, it ensures the service lives in its own mount namespace, detached from the host's, and various key directories become read-only to the service, in particular /usr/. This reflects the fact that there's very little code that should ever be able to write to /usr/.

pid_eins, (edited )
@pid_eins@mastodon.social avatar

/usr/ is where the OS' code and other resources are located, and by making /usr/ read-only to the service the ability to modify the OS is taken away, bringing substantial security benefits.

Hence: turn this option on, unless you are (or wrap) a package manager (basically).
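For illustration, a minimal sketch of turning the per-service option on via a drop-in. The unit name "example.service" and the drop-in file name are hypothetical; "yes" is the mode described above (/usr/ becomes read-only to the service), while systemd.exec(5) also documents the stricter "full" and "strict" values.

```ini
# /etc/systemd/system/example.service.d/protect-system.conf
# (hypothetical unit and file name, for illustration only)
[Service]
# Make /usr/ (and the boot loader directories) read-only for this service.
ProtectSystem=yes
```

After adding a drop-in like this, "systemctl daemon-reload" followed by a restart of the service should apply it.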

With systemd v256 we are taking the concept one step further: there's now a system-wide knob of the same name, that is applied to the host file system as a whole, very early when PID 1 initializes.
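A sketch of what setting the system-wide knob could look like, assuming it lives in the [Manager] section of system.conf and accepts a boolean or "auto", as systemd-system.conf(5) for v256 describes it. The drop-in file name is hypothetical.

```ini
# /etc/systemd/system.conf.d/protect-system.conf
# (hypothetical file name, for illustration only)
[Manager]
# Ask PID 1 to make /usr/ read-only for the whole host, early at boot.
# "no" opts out; "auto" is the default behaviour.
ProtectSystem=yes
```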

pid_eins,
@pid_eins@mastodon.social avatar

Now you might wonder why that is useful? There are two use cases for this we had in mind.

The first one is the initrd, i.e. the little mini OS image that the kernel invokes as the first userspace after initializing. It's unpacked by the kernel into a tmpfs instance and then invoked. The tmpfs instance is left writable, however. And this is where the global ProtectSystem= knob can add security: when PID 1 initializes early in the initrd we'll immediately mount the /usr/ subtree read-only.
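If you want to check whether /usr/ really ended up read-only, here is a small Python sketch. It only inspects the mount flags reported by statvfs(); the helper names are made up for illustration, and the answer reflects the mount namespace of the process that runs it.

```python
import os


def flags_read_only(flags: int) -> bool:
    """Return True if the statvfs flag word has the read-only bit set."""
    return bool(flags & os.ST_RDONLY)


def is_read_only(path: str = "/usr") -> bool:
    """Return True if the mount backing `path` is flagged read-only.

    This looks at the mount as seen from the calling process, so a service
    with its own mount namespace may see a different answer than the host.
    """
    return flags_read_only(os.statvfs(path).f_flag)
```

Calling is_read_only("/usr") from a root shell on the host (or in the initrd's emergency shell, if Python is available there) would then tell you whether the knob took effect.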

pid_eins,
@pid_eins@mastodon.social avatar

I think that is particularly relevant since initrds are relatively exposed: they are often set up to do some networking to find the initial configuration or root file system, and at the same time, at least in a TPM world, might get access to certain secrets that the later system will not get anymore (e.g. an FDE volume key might get released from the TPM to the OS during the initrd, but becomes inaccessible via PCR policy+measurement subsequently, to lock things down nicely).

pid_eins,
@pid_eins@mastodon.social avatar

The other use case is regular, package-based systems (e.g. rpm- or dpkg-based systems). If a package manager is updated to automatically re-mount /usr/ writable around the actual package manager operations, the system is generally more secure, as the semantics move closer to the immutable semantics of image-based systems. Of course, true image-based systems offer a much greater level of security, because they make modification of the OS impossible entirely (instead…

pid_eins,
@pid_eins@mastodon.social avatar

… of just restricting it to a specific time window), but if you are stuck in a package-based world, I believe it's still a major improvement.

Note that ProtectSystem= defaults to "on" now in the initrd, but to "off" on the root fs. Our assumption was that we could get away with turning this on for the initrd. Turns out the assumption was wrong: it did break dracut (i.e. Fedora's initrd implementation), but this has been fixed now, so we'll leave the default as it is.

pid_eins,
@pid_eins@mastodon.social avatar

(In case you wonder: other initrd implementations, such as Debian's or Arch's, weren't affected, as it appears. It's trivial to turn off the feature btw, for worlds where utmost compatibility matters more than security.)

And that's all for now. I hope this installment was interesting. See you soon for installment Nr. 5.

nogweii,
@nogweii@nogweii.net avatar

@pid_eins Any thoughts on making an option in the system.conf to apply ProtectSystem by default in all services spawned by the system manager (basically, flipping the default) without changing how /usr/ is mounted? Or does that not really change things whilst still causing compatibility issues?

pid_eins,
@pid_eins@mastodon.social avatar

@nogweii well, the per-service knob implies a separate mount namespace, leaving the main mount namespace unaffected. The global knob we introduced in v256 otoh touches the main mount namespace, which is a lot more impactful since so many services for various reasons run in the main mount namespace.

danderson,
@danderson@hachyderm.io avatar

@pid_eins My mount ns knowledge is rusty: can these package managers remount r/w in an isolated namespace, so everything else on the system still can't write during the install window? Or right now are they forced to "unlock" the system globally?

pid_eins,
@pid_eins@mastodon.social avatar

@danderson they can unlock locally.
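A rough sketch of what "unlocking locally" could look like for a wrapper around a package manager, in Python. It only builds the command line; running it needs root. Assumptions, none of them from the thread: util-linux's unshare and mount are available, /usr/ is a mount point in the host namespace, and a bind remount is enough to flip it writable. The function name is hypothetical.

```python
import shlex


def unlocked_command(argv: list[str], path: str = "/usr") -> list[str]:
    """Build an argv that runs `argv` in a private mount namespace
    in which `path` has been remounted read-write.

    The remount happens inside the new namespace only, so the host's
    view of `path` is left as it was.
    """
    # Assumption: `path` is a mount point, so a bind remount can flip it to rw.
    script = f'mount -o remount,bind,rw {shlex.quote(path)} && exec "$@"'
    # The second "sh" becomes $0 of the script; `argv` becomes "$@".
    return ["unshare", "--mount", "--", "sh", "-c", script, "sh", *argv]
```

For example, passing unlocked_command(["dnf", "upgrade"]) to subprocess.run() as root would run the upgrade with a writable /usr/ that only that process tree sees.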

juliank,
@juliank@mastodon.social avatar

@danderson @pid_eins While it sounds reasonable at first, it ultimately breaks down when packages need to mount or unmount or remount file systems in their maintainer scripts.

There may be many such packages in the wild and someone needs to do the analysis, change them all to do mounting in systemd units instead, before you can safely use mount namespaces in package managers.
