pid_eins

@pid_eins@mastodon.social

⛵ I write software. ⛵

pid_eins, 1 month ago to random

1️⃣3️⃣ Here's the 13th installment of posts highlighting key new features of the upcoming v256 release of systemd.

ssh is widely established as the mechanism for controlling Linux systems remotely, both interactively and with automated tools. It not only provides means for secure authentication and communication for a tty/shell, but also does this for file transfers (sftp), and IPC communication (D-Bus or Varlink).

pid_eins, 1 month ago

@kccqzy yes.

pid_eins, 1 month ago

Thing is, though, that there are better ways to communicate locally: AF_VSOCK is an alternative to AF_INET/AF_INET6 for local communication between VMs and hosts. In many ways it behaves very similarly to TCP, similarly enough that you can just do ssh-via-AF_VSOCK. As opposed to AF_INET/AF_INET6 it requires next to no configuration: you really just have to enable the knob in your VMM, and have a somewhat non-ancient Linux distribution as guest.
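To illustrate how close AF_VSOCK is to TCP from a program's point of view, here is a minimal Python sketch that opens a stream connection to a guest and reads whatever greeting the peer sends. The function name, the timeout and the default port are assumptions made for illustration (port 22 is the one mentioned elsewhere in this thread), and it presumes a Linux host whose Python exposes `socket.AF_VSOCK`.

```python
import socket

SSH_PORT = 22  # assumed default; the thread mentions sshd on AF_VSOCK port 22


def read_greeting(cid: int, port: int = SSH_PORT, timeout: float = 5.0) -> str:
    """Connect to a guest over AF_VSOCK and return the first data it sends.

    Hypothetical helper: the only real difference to a TCP client is the
    address family and the (CID, port) address tuple.
    """
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect((cid, port))  # (CID, port) instead of (host, port)
        return sock.recv(256).decode("ascii", errors="replace").strip()
```

Calling `read_greeting(cid)` with the CID of a running guest should return the server's identification line if sshd is listening there; no IP address, routing or firewall setup is involved.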

pid_eins, 1 month ago

@NekkoDroid Specifically, there's now PollLimitIntervalSec=, PollLimitBurst=. If this rate limit is hit, then the effect is not that the socket unit will be put in a failed state, but instead we'll just stop watching the socket for the rest of the time window. Once the time has passed we'll start watching again. This basically means that if the system is hit with a flood of connection attempts, we'll pause processing them for a while, and then resume processing them after the configurable…

pid_eins, 1 month ago

@NekkoDroid … time window passed.

With that in effect I think we should be really safe and robust against DoS, because we actually never deny service, but at the same time we also don't let our CPU use grow unbounded.

Does that make any sense?
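The "pause instead of fail" behaviour described above can be modelled in a few lines. This is only a toy simulation in Python under assumed semantics (a fixed window that restarts once it has elapsed); the class and parameter names are made up and this is not systemd's implementation.

```python
class PollLimiter:
    """Toy model of a rate limit that pauses processing instead of failing."""

    def __init__(self, interval_sec: float, burst: int) -> None:
        self.interval_sec = interval_sec
        self.burst = burst
        self.window_start = float("-inf")  # forces a fresh window on first use
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if an event at time `now` is processed, False if paused."""
        # Start a new window once the previous one has fully elapsed.
        if now - self.window_start >= self.interval_sec:
            self.window_start = now
            self.count = 0
        if self.count >= self.burst:
            # Limit hit: stop watching until the current window ends.
            return False
        self.count += 1
        return True


limiter = PollLimiter(interval_sec=2.0, burst=3)
times = [0.0, 0.1, 0.2, 0.3, 1.9, 2.0, 2.1]
decisions = [limiter.allow(t) for t in times]
print(decisions)
```

The events at 0.3 and 1.9 are skipped, but nothing enters a failed state: as soon as the window ends at 2.0, processing picks up again.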

pid_eins, 1 month ago

Binding sshd to AF_VSOCK is one thing, but it would be useless if we couldn't also connect to it via an ssh client. Hence v256 also includes a small client-side plugin for ssh that makes it possible to connect to such a VM by specifying "ssh vsock/$CID" (where $CID is the vsock address — one of which each VM gets automatically assigned).
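As a rough illustration of the "vsock/$CID" naming, the following Python sketch splits such a host string into its CID. The function name and the validation rules are assumptions for illustration; this is not how the actual ssh plugin is implemented.

```python
def parse_vsock_target(host: str) -> int:
    """Extract the CID from a host string of the form 'vsock/<CID>'."""
    prefix = "vsock/"
    if not host.startswith(prefix):
        raise ValueError(f"not a vsock target: {host!r}")
    cid_text = host[len(prefix):]
    if not (cid_text.isascii() and cid_text.isdigit()):
        raise ValueError(f"CID must be a decimal number: {cid_text!r}")
    cid = int(cid_text)
    if cid > 0xFFFFFFFF:  # vsock CIDs are 32-bit unsigned integers
        raise ValueError(f"CID out of range: {cid}")
    return cid


print(parse_vsock_target("vsock/4711"))
```

The resulting CID, together with a port, is all an AF_VSOCK client needs as a destination address.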

pid_eins, 1 month ago

@NekkoDroid …distinguish between very high number of attempts because of misconfiguration or because of valid connection attempts.

Now, this rate limit has been configurable for a long, long time; people just never bothered to. You can simply turn it off in the socket unit via TriggerLimitIntervalSec=/TriggerLimitBurst=.

So, the DoS never really was a DoS, it was at best a misconfiguration.

With v255 we added a separate ratelimit to socket units that behaves differently, btw.

pid_eins, 1 month ago

And there's more: the generator also binds sshd to two AF_UNIX sockets for good measure, given we are already at it and it's easy. One is a socket in /run/host/, if that exists and is a mount point. This is a scheme defined for full-OS containers: the idea is that the container manager makes that dir also available on the host, so that you can connect from the host to the container via ssh-over-AF_UNIX without bothering with network config and setup.

pid_eins, 1 month ago

The other one is a generic socket in /run/, which is always created and can be used to acquire privs locally.

This automatic ssh-via-AF_VSOCK logic is particularly useful in conjunction with the notification mechanism detailed in installment 3, i.e. sd_notify() notifications from PID 1 to its VMM via AF_VSOCK.

pid_eins, 1 month ago

Because it means a VMM can start a VM, wait for the X_SYSTEMD_UNIT_ACTIVE=ssh-access.target sd_notify() message and then immediately connect to the VM via ssh-over-AF_VSOCK. Reliably, without wasting time, without retries, without network config and setup, without preparing the guest much (well you do have to add sshd to it), just like that.
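On the VMM side, waiting for that message boils down to parsing incoming notification datagrams. The Python sketch below assumes the usual sd_notify() wire format of newline-separated `KEY=VALUE` assignments; the helper names are hypothetical, and receiving the datagram from the AF_VSOCK socket is left out.

```python
from typing import Dict


def parse_notify_datagram(data: bytes) -> Dict[str, str]:
    """Split one notification datagram into its KEY=VALUE fields."""
    fields: Dict[str, str] = {}
    for line in data.decode("utf-8", errors="replace").splitlines():
        key, sep, value = line.partition("=")
        if sep and key:  # ignore lines without an assignment
            fields[key] = value
    return fields


def ssh_is_ready(data: bytes) -> bool:
    """True if the datagram announces that ssh-access.target is active."""
    fields = parse_notify_datagram(data)
    return fields.get("X_SYSTEMD_UNIT_ACTIVE") == "ssh-access.target"


print(ssh_is_ready(b"X_SYSTEMD_UNIT_ACTIVE=ssh-access.target\n"))
```

A VMM would call something like `ssh_is_ready()` on every datagram it receives from the guest and open the ssh connection as soon as it returns true, rather than polling port 22 in a retry loop.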

And that's all for this time. We still have more to cover, hence stay tuned.

pid_eins, 1 month ago

@NekkoDroid So there's a programmable rate limit on socket units. If you hit that too often we assume that something is wrong (i.e. that the service in question dies quickly after activation and never can process its connections). Hence, in order to not consume unbounded CPU we will eventually give up trying to activate the service. This safety feature can be seen as a DoS: if there simply is a very high number of connections, the rate limit is hit too. After all, it cannot…

pid_eins, 1 month ago

And ssh-via-AF_VSOCK is precisely what we are doing in systemd v256: a small new unit generator (i.e. a plugin that extends systemd's unit tree dynamically) detects if AF_VSOCK is available and sshd is installed, and if so binds AF_VSOCK/port 22 to sshd, via socket activation. Or in other words: in environments where AF_VSOCK is a thing, sshd will now just work, without any extra configuration and at minimal cost of resources (because lazy socket activation rocks).

pid_eins, 29 days ago (edited 29 days ago) to random

1️⃣4️⃣ Here's the 14th installment of posts highlighting key new features of the upcoming v256 release of systemd.

This one is going to be a quick one. Previously, you had to specify a block device name when invoking systemd-cryptenroll, to specify which encrypted volume to enroll your PKCS11/TPM2/FIDO2 device to. This is now optional. If no device is specified, then the tool will now automatically look for the device behind the /var/ directory and operate on that.
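To give a feel for what "the device behind a directory" means, here is a deliberately simplified Python sketch that maps a path to the block device number of its file system and then to a kernel device name via sysfs. The function names are hypothetical, and this is not what systemd-cryptenroll does internally: for an encrypted volume this lookup would name the device-mapper node rather than the underlying LUKS partition, and some file systems (btrfs, for example) report device numbers with no sysfs entry.

```python
import os
from typing import Optional


def backing_device_number(path: str = "/var") -> str:
    """Return 'MAJOR:MINOR' of the device holding the file system at `path`."""
    st = os.stat(path)
    return f"{os.major(st.st_dev)}:{os.minor(st.st_dev)}"


def backing_device_name(path: str = "/var",
                        sysfs: str = "/sys/dev/block") -> Optional[str]:
    """Resolve the device number to a kernel name such as 'dm-0', if known."""
    link = os.path.join(sysfs, backing_device_number(path))
    if not os.path.exists(link):
        return None  # e.g. anonymous device numbers without a sysfs entry
    return os.path.basename(os.path.realpath(link))
```

The point of the sketch is only that the starting point is a mounted path rather than a name like "/dev/sda" typed by the user.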

pid_eins, 29 days ago

This of course makes the tool a bit more comfortable to use, since you don't have to figure out the backing block device of your encrypted volume first. But it's actually a security measure too:

On Linux referencing block devices by their device names such as "/dev/sda" is a bit problematic when it comes to security, since these names are reused on unplug/replug. Thus, there's a good chance that various security-relevant operations executed on disks can be tricked, …

pid_eins, 29 days ago

1. You have a fully encrypted root fs, with /var/ being placed on the root fs too.

2. You have an immutable root fs, but /var/ is mounted writable.

In both these cases using /var/ as the path to search the backing block device for will work, while using / instead would not work for the 2nd case.

Also note, that this mechanism is automatically disabled when a destructive operation is used (i.e. an existing key slot shall be wiped), for robustness reasons.

pid_eins, 29 days ago

@marie indeed! fixed!

pid_eins, 29 days ago

By automatically detecting the right block device the attack window for enrollment operations becomes a lot smaller, and once the last gaps in the diskseq kernel apis are fixed we can fully lock down things so that we can guarantee that it's really the right device we operate on and not some trick device.

You might wonder why we derive the backing device of /var/ rather than that of the root file system for this automatic mechanism: that's because we generally focus on two ways to set up a system:

pid_eins, 29 days ago

And that's all for now, stay tuned for episode 15 soon.

pid_eins, 29 days ago

@neingeist indeed! fixed!

pid_eins, 29 days ago

… detect such tricks, and refuse operation. In systemd we thus started to make more and more use of diskseq, for example via /dev/disk/by-diskseq/ device symlinks, or by referencing block devices via their diskseq numbers whenever possible. This work is currently incomplete; there are gaps in various places (for example, there's no userspace API to query the diskseq number from a mounted file system that is backed by a block device).
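The store-then-verify idea behind diskseq can be sketched in Python as follows. It assumes the counter is readable from a `diskseq` attribute in the disk's sysfs directory (check this path against your kernel's documentation); the class and function names are made up for illustration, and the check is still racy between verification and use, which is part of why the thread calls the work incomplete.

```python
import os


def read_diskseq(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the current diskseq counter of a whole-disk device such as 'sda'."""
    with open(os.path.join(sysfs_root, device, "diskseq")) as f:
        return int(f.read().strip())


class DiskseqGuard:
    """Remember a device's diskseq on first use and verify it later."""

    def __init__(self, device: str, sysfs_root: str = "/sys/block") -> None:
        self.device = device
        self.sysfs_root = sysfs_root
        self.expected = read_diskseq(device, sysfs_root)

    def verify(self) -> None:
        """Raise if the name now refers to different media than at first use."""
        current = read_diskseq(self.device, self.sysfs_root)
        if current != self.expected:
            raise RuntimeError(
                f"{self.device}: diskseq changed from {self.expected} to {current}"
            )
```

A tool would create the guard when it first opens the device and call `verify()` before each security-relevant operation, refusing to continue if the counter has moved.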

pid_eins, 29 days ago

… by quickly removing one disk, and plugging in another, so that the device node goes away, and comes back under the same name, but suddenly referring to a different device.

There has been work on addressing this problem, specifically there's now a "diskseq" counter exposed on Linux block devices that changes on every media change and device unplug. Thus, if on first use of a device the diskseq number is stored away, and then verified on each subsequent operation it's possible to…

pid_eins, 27 days ago to random

1️⃣5️⃣ Here's the 15th installment of posts highlighting key new features of the upcoming v256 release of systemd.

systemd integrates with many components of the OS. Due to this it links against various external libraries. Generic distributions – which typically enable all features a package provides – usually have to deal with relatively large dependency trees in cases like this.

pid_eins, 27 days ago (edited 27 days ago)

All our binaries now contain an ELF "note" describing these "weak" deps that can be processed in a similar way as regular ELF dependencies.

The format of these notes is described here:

https://github.com/systemd/systemd/blob/main/docs/ELF_DLOPEN_METADATA.md

There's now work ongoing to process this data automatically at rpm and dpkg build time, so that we get the best of both worlds: "weak" dependencies and proper metadata to declare them consistently.
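As a sketch of how packaging tooling might consume such a note, the Python snippet below groups library names by priority from a note payload. It assumes the payload is a JSON array of objects with `soname`, `feature`, `description` and `priority` fields, and that a missing priority means "recommended"; both assumptions should be checked against the linked specification. The example entries are invented, and extracting the payload from the ELF file itself is not shown.

```python
import json
from typing import Dict, List


def group_by_priority(payload: str) -> Dict[str, List[str]]:
    """Map each priority level to the sonames declared with it."""
    grouped: Dict[str, List[str]] = {}
    for entry in json.loads(payload):
        # Assumed default when the field is absent; verify against the spec.
        priority = entry.get("priority", "recommended")
        grouped.setdefault(priority, []).extend(entry["soname"])
    return grouped


# Hypothetical payload, for illustration only.
example = json.dumps([
    {"soname": ["libfoo.so.1"], "feature": "foo",
     "description": "Enables the foo feature", "priority": "suggested"},
    {"soname": ["libbar.so.2", "libbar.so.1"], "feature": "bar",
     "description": "Enables the bar feature"},
])
print(group_by_priority(example))
```

A package build step could translate the groups into the distribution's own weak-dependency fields, which is the kind of automation the post refers to.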

pid_eins, 27 days ago (edited 27 days ago)

systemd is often used in smaller environments, i.e. run in containers, in the initrd or similar. Hence a large dependency tree is problematic.

Because of that in systemd we started to turn a large number of our dependencies from regular ones to dlopen() ones: instead of always requiring some shared library we only load it the moment we need it. This means we can gracefully degrade our feature set if certain libraries are not available.
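The load-on-demand pattern can be illustrated in Python, where `ctypes.CDLL` loads a shared library at runtime via the platform's dynamic loader. This is a sketch of the general technique, not systemd's C code; the helper name and the library name in the usage line are placeholders.

```python
import ctypes
from typing import Optional


def try_load(soname: str) -> Optional[ctypes.CDLL]:
    """Load a shared library only when needed; return None if unavailable."""
    try:
        return ctypes.CDLL(soname)
    except OSError:
        # Library not installed: the caller disables the related feature.
        return None


# Hypothetical optional dependency; the feature simply turns off if missing.
compression = try_load("libexample-compression.so.1")
print("compression support:", compression is not None)
```

Because the library is only requested at the moment the feature is used, a minimal container or initrd image that omits it still runs, just with that feature switched off.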

pid_eins, 27 days ago

We have been doing this for quite a while, so that in v256 22 of our dependencies have been reworked like that.

This has the benefit that on a typical system the systemd binary itself only pulls in the C library (including libm), libmount, libselinux, libaudit and libseccomp.

Net result: we have a tiny required dependency footprint, but can still provide a large feature set, if the optional deps happen to be installed.

pid_eins, 27 days ago

This comes at a price though: since our library dependencies are now dlopen() based they do not show up in the ELF metadata of our binaries anymore. And package managers such as dpkg/rpm generally look at that, and automatically translate those ELF dependencies into packaging dependencies.

Hence: at first glance, this means that we regress on this front: previously automatically determined dependencies have to be encoded manually again.

With v256 we are doing something about this.

kernellogger, 28 days ago to linux

The TPM bus encryption and integrity protection changes prepared by @jejb and @jarkko were merged for #Linux 6.10: https://git.kernel.org/torvalds/c/b19239143e393d4b52b3b9a17c7ac07138f2cfd4

"[…] The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. […]"

[1] https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/

pid_eins, 28 days ago

@jarkko @kernellogger @jejb systemd's disk encryption stuff actually has been using encrypted sessions for a long long time.

gnubyte, 1 month ago to random

@pid_eins Hi Lennart,
I saw in an article that you’re looking to replace sudo? I’m not sure how accurate that is, but for brevity's sake my ask is: is it possible instead to keep sudo and just make run0 an alternative option?

Why I’m asking is because this would break so many scripts and packages. Pip and docker-compose changes have already done a lot of damage, let alone replacing this keyword which is used everywhere. Is there no way to modify sudo to achieve what you need instead?

pid_eins, 1 month ago

@gnubyte nobody is taking sudo away. It's about offering something better for OSes that don't have to care about legacy so much.
