ricci,
@ricci@discuss.systems avatar

Hey! Let's talk about #SSH and #security!

If you've ever looked at SSH server logs you know what I'm about to say: Any SSH server connected to the public Internet is getting bombarded by constant attempts to log in. Not just a few of them. A lot of them. Sometimes even dozens per second. And this problem is not going away; it is, in fact, getting worse. And attackers' behavior is changing.

The graph attached to this post shows the number of attempted SSH logins per day to one of @cloudlab s clusters over a four-year period. It peaks at about 3.4 million login attempts per day.

This is part of a study we did on our production system, using logs of more than 640 million login attempts, covering more than 1,500 hosts on our side and observing more than 840 thousand incoming IP addresses.

A paper presenting our analysis and a new, highly effective means to block SSH brute force attacks ("Where The Wild Things Are: Brute-Force SSH Attacks In The Wild And How To Stop Them") will be presented next week at #NSDI24 by @sachindhke . The full paper is at https://www.flux.utah.edu/paper/singh-nsdi24

Let's dive in. 🧵

ricci,
@ricci@discuss.systems avatar

First things first: everyone "knows" that most brute force attacks are against the "root" account, right? This is certainly what earlier studies have found.

As it turns out, this used to be true, but it's not anymore. This graph shows that the fraction of brute force attacks using the username root was nearly 100% back in 2017, but it's been falling - by mid-2021, only around 20% off the attacks we saw were against root.

So, why? Well, we don't have a hotline to the attackers, but we have an educated guess from our own data and from many others' reporting: a lot of the usernames we see correspond to default usernames for #network #routers, specific #Linux distributions, specific server software, and #IoT devices. Basically, as we connect ever more stuff to the Internet (and generally try to protect the "root" account), attackers seem to be diversifying the accounts they are going after.

(There's a table of the top 100 usernames in the paper.)

ricci,
@ricci@discuss.systems avatar

How do attackers pick their targets? In a word, randomly.

This graph shows what we call "sequence bias" - if an attacking IP address moves from lower-numbered target IP addresses to larger ones, we assign it a sequence bias of 1; if it goes from larger to smaller, we assign it a bias of -1. As you can see in this graph, most of them have a sequence bias of around 0, meaning that once we see an attack, the next target is equally likely to be either higher or lower.

Basically, this means that most attackers are scanning the IPv4 address space at random.

ricci,
@ricci@discuss.systems avatar

OK, so what can we do about all these SSH brute force attacks?

We have a plan - actually, not just a plan, we run this in production on one of the @cloudlab clusters.

Let's start with this observation: if attackers are using a broad set of usernames, then we can use these username sets as a sort of signature. About half of attacking IP addresses only try one username, but that also means that about half are trying more than one - in fact some individual IP addresses tried more than 10,000 usernames!

What we do is this: we find sets of usernames that are used by more than one attacking IP address (actually it's a bit more complicated that this, details in the paper). This gives us "dictionaries" of usernames that are only used by attackers, and not any of our real users. We collect these dictionaries from the logfiles of a bunch of SSH servers, and combine them to form a Username Block List (UBL).

Now, all we have to do is: as soon as we see an IP address try a username from this UBL, we block it. That simple. We call this Dictionary Based Blocking (DBB).

How well does this work? We used logs from our clusters containing a total of 213 million login attempts, and it blocked 99.5% of all attempts, generating a false positives (accidentally locking out a real user) at a rate of just one about every five days.

But what about , you might ask? That's another method people use to block attacks against SSH (and other services) by locking out addresses that fail to log in more than X times in Y minutes. Well, with it's default settings, it only blocks about 66% of attacks, and it generates more than 5x as many false positives (graph attached). As it turns out, there is no way to tune fail2ban to get DBB's accuracy without a false positive rate that's orders of magnitude higher.

I said we run this in production - how well does that work? We run it on one of of CloudLab clusters that already had a firewall - subscribing to popular blocklists and running something very much like fail2ban. It's catching four-fifths of the attacks that were not already getting caught by these measures, and so far it hasn't caused a single false positive.

serge,
@serge@babka.social avatar

@ricci

This bit about the usernames is interesting.

Assuming the system is being used as a bastion/jump host, then are we still secure in key exchange, or should we start thinking about usernames the way we do about API bearer tokens, assigning long, randomly generated strings for both the username and the key?

Or at that point, do we just move onto another technology like VPNs (eg Wireguard)?

ricci,
@ricci@discuss.systems avatar

@serge Excellent question. My take is that if you are taking your SSH security seriously enough to be thinking about this, the answer is no. As long as you have locked down your logins to disable password auth and have disabled root logins, that's going to stop brute force attacks. (I will note that we're focusing in this paper on brute-force attacks and and what I'm saying is probably less applicable to targeted attacks.)

The real danger is devices that have default passwords and/or when users set easy to guess passwords on well-known accounts. As an example, one of usernames that shows up a lot in our dataset is 'pi', probably due to the fact that until painfully recently, that account had a well-known default password in Raspberry Pi OS / Raspbian.

That said, we did find that through spraying of lots of common personal names, attackers did manage to guess a non-trivial number of our users' actual usernames. But so long as you require your users to use keypairs, they're safe from password guessers.

nicolaromano,
@nicolaromano@qoto.org avatar

@ricci silly question... Why are numbers not summing up to 100%? Looks like over time the total is getting lower.

SoniEx2,
@SoniEx2@chaos.social avatar

@ricci @cloudlab @sachindhke do you know if switching to ipv6-only could solve the issue?

ricci,
@ricci@discuss.systems avatar

@SoniEx2 @cloudlab @sachindhke An excellent question that I can only speculate on right now, in part because our study only covers IPv4, and in part because I expect the landscape to change, but it's hard to predict exactly how.

In the short term, switching ssh and other services to #IPv6 only will likely reduce the brute force attacks you see by a lot. Our data suggests that attackers are hitting the IPv4 space at random, which is a perfectly good strategy for the relatively dense IPv4 space, but a terrible strategy for the gigantic IPv6 space. If I were an attacker doing brute force, I'd stick to the IPv4 space that's easy and has plenty of targets.

However, let's consider more sophisticated attackers, and/or a future world where we've moved entirely to IPv6. There are lots of things you can do to cut down the scanning space. Most IPv6 space is not even allocated, so you can just skip that. You can focus on specific prefixes used by large ISPs and cloud providers to increase your hit rate. You can use information about the way some devices use MAC addresses to generate part of their public address to target popular NIC and or IoT vendors. You can keep track of live IP addresses based on observed connections (eg. scan everyone who connects to your website.) You can try to enumerate DNS domains to look for targets (most DNS servers try to prevent this, but there are all kinds of attacks on DNS). You can share lists of the live addresses you find. And these are just off the top of my head, I'm sure people have come up with plenty more already, and will find plenty more in the future.

So, will we eventually reach a point where IPv6 scanning is as effective as IPv4 scanning is today? It seems unlikely, but scanning the entire IPv4 space in minutes seemed unlikely not too long ago. So in the long term, I wouldn't bet on security that depends on IPv6 being hard to scan. I would expect that we'll all want to keep up the same strategies of using keys, blocking attackers that we detect, etc.

One thing I would expect is for the patterns to change: right now acquiring a target is easy, so attacks that just try once and move on are common. On IPv6 - both now and in the future - I'd expect that the difficulty of finding targets means that once you find one, you're going to try a lot more usernames and passwords on it.

adelgado,
@adelgado@eu.mastodon.green avatar

@ricci @cloudlab @sachindhke my simple solution (like @AndresFreundTec mention) is to use random ports for SSHd. Give a bit of extra work but with automation via Ansible and Puppet is negligible.

categulario,
@categulario@mstdn.mx avatar

@ricci @cloudlab @sachindhke Amazing work!

I guess this can be extended to other protocols. For example in the http logs I see a lot of similar requests attempting known vulnerabilities.

vampirdaddy,
@vampirdaddy@chaos.social avatar

@ricci @cloudlab @sachindhke
Set "MaxStartups" to 3 or 5 in sshd_config for rate limiting

but most importantly ONLY allow keybased authentication, so no more bruteforce.

Solved the problem for me quite nicely.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • security
  • tacticalgear
  • DreamBathrooms
  • InstantRegret
  • magazineikmin
  • khanakhh
  • Youngstown
  • ngwrru68w68
  • slotface
  • everett
  • rosin
  • thenastyranch
  • kavyap
  • GTA5RPClips
  • cisconetworking
  • JUstTest
  • normalnudes
  • osvaldo12
  • ethstaker
  • mdbf
  • modclub
  • Durango
  • tester
  • provamag3
  • cubers
  • Leos
  • anitta
  • megavids
  • lostlight
  • All magazines