When @BeAware@social.beaware.live asks for help scaling out Mastodon because his... - Random

devnull, 2 months ago

When @BeAware asks for help scaling out Mastodon because his SINGLE USER INSTANCE is falling over, and he reveals that he's paying for an 8 vCPU server with 16GB of memory, and all the comments are talking about tweaking postgres.

What the flying fuck.

ryansingel, 2 months ago

@devnull
@BeAware

Rails every time.

No one ever remembers Twitter's S1 which revealed they paid Chris Fry $10M a year to save Twitter from Rails.

A simple LAMP stack was right there

devnull, 2 months ago

That kind of hardware is what we use for our most demanding enterprise level customers who were seeing 1000+ concurrent connections.

People ask us how to tweak Mongo or Redis to optimize NodeBB and 10 years in the answer is the same: the database is not your bottleneck (at least for us).

I don't have enough industry experience to say definitively, but when you start looking into tweaking your database to squeeze more juice out of it YOUR APP IS MAKING TOO MANY EXPENSIVE DATABASE CALLS.

BeAware, 2 months ago

@devnull I know it's overkill. I wanted it to be future-proof, but I am noobish with Linux, so I haven't learned how to optimize it. Hence the request. 😅 I'll gladly downgrade, but I have to learn what I can do and how to do it first. 🤷‍♂️

devnull, 2 months ago

@BeAware That's completely fair. If we're to truly democratize fedi software, you shouldn't need to have advanced sysop skills to administer it.

I'm just railing against the common expectation nowadays that "web apps are slow and resource intensive", because that's tantamount to giving up.

hrefna, 2 months ago

@devnull There's something of a tendency to try to compensate for slowness in the rest of the stack with the database and it can almost work if you are willing to throw enough money at vertically scaling the database.

Which is a shame, because mastodon then proceeds to NOT DO THE THINGS that would allow the database to scale more cleanly or cheaply.

hrefna, 2 months ago

@devnull I commented a while back that I can't tell what mastodon is actually targeting, because it doesn't seem to scale up or down particularly well.

devnull, 2 months ago

@hrefna the thing is, it's an incredibly low bar to clear.

You probably knew this already, but I'm proud of how fast NodeBB is. However, it's not like we spent $10M+ (as @ryansingel shared re: twitter) solving this "hard problem".

@baris and I literally just spent a couple weeks optimizing our code to not do stupid things, batched calls if able, rewrote lower level calls to optimize, etc. and continue to keep efficiency back-of-mind when writing new code.

devnull, 2 months ago

So to sit back and say "yeah there's nothing we can do about it, web apps are slow" is just wilful ignorance at best and learned helplessness at worst.

Easy to do fun things instead of tech debt; we didn't want to optimize NodeBB back then either, we just had a client breathing down our necks to fix it and fix it fast.

But Mastodon is not NodeBB from 10 years ago. This software is used by 1M times the amount of people that used NodeBB. Is that not motivation enough?

@hrefna @ryansingel

hrefna, 2 months ago

@devnull

Yuuuuup. For the vast majority of PSQL applications it's:

Reduce your number of connections, use pooling.

Use batching where you can.

Reuse results where you can.

Be particularly mindful of when you are doing deletes.

Make sure you are keeping up on database maintenance and that vacuum isn't too far behind.

For most problems that's it.

@ryansingel @baris
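
To make the "use batching where you can" point concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere; the `posts` table and the helper names are made up for illustration and are not taken from Mastodon or NodeBB. The only thing it shows is the difference in the number of queries issued.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT)")
    conn.executemany(
        "INSERT INTO posts (id, author) VALUES (?, ?)",
        [(i, f"user{i % 3}") for i in range(1, 11)],
    )
    return conn


def authors_one_by_one(conn, post_ids):
    """Naive version: one query (one round trip) per post id."""
    result, queries = {}, 0
    for pid in post_ids:
        row = conn.execute(
            "SELECT author FROM posts WHERE id = ?", (pid,)
        ).fetchone()
        queries += 1
        if row is not None:
            result[pid] = row[0]
    return result, queries


def authors_batched(conn, post_ids):
    """Batched version: a single query fetches every requested row."""
    ids = list(post_ids)
    if not ids:
        return {}, 0
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, author FROM posts WHERE id IN ({placeholders})", ids
    ).fetchall()
    return dict(rows), 1
```

Both helpers return the same mapping; the batched one does it in one query instead of one per id. Against a real networked database, each avoided query is also an avoided round trip, which is usually where the time goes.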

hrefna, 2 months ago

@devnull

I asked a while back if they had investigated partitioning for large servers and was told in no uncertain terms that they "didn't want to require an SRE org" (not true; it's also trivial to set up in PSQL, albeit with a painful migration) and that it would "only help if they also sharded" (also not remotely true, and I've worked on 10+ TB databases that prove it).

At that point I basically decided it wasn't worth my time to argue with people and try to convince them.

@ryansingel @baris

jenniferplusplus, 2 months ago

@hrefna
[nathan-fillion-speechless.gif]

But. But the indexes. Partitions also break up the indexes. You can keep them in reasonable amounts of memory again.
@devnull @ryansingel @baris

hrefna, 2 months ago

@jenniferplusplus

Exactly! This is one of the absolute biggest reasons to do partitioning. That and it makes time-based deletions trivial instead of hellishly expensive.

They also make vacuum much less resource intensive. Like orders of magnitude less in some circumstances.

@devnull @ryansingel @baris

jenniferplusplus, 2 months ago

@hrefna
It should even be a reasonably tractable implementation for mastodon. They already use DB-generated, monotonically increasing primary keys. For all the drawbacks of that strategy, they are at least time-ordered, in an application with a strong recency bias. So you would expect to overwhelmingly serve queries from a single partition.

@devnull @ryansingel @baris
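
A toy model of the idea, for anyone following along. This is plain Python, not PostgreSQL: the class, its method names, and the partition bounds are all hypothetical. It only illustrates two claims from the thread, namely that lookups for recent, monotonically increasing IDs land in the newest partition, and that removing old data becomes "drop one partition" rather than a row-by-row delete.

```python
from bisect import bisect_right


class RangePartitionedTable:
    """Toy range-partitioned table keyed by an increasing integer id."""

    def __init__(self, lower_bounds):
        # Each partition covers ids from its lower bound up to the next bound.
        self.lower_bounds = sorted(lower_bounds)
        self.partitions = [dict() for _ in self.lower_bounds]

    def partition_for(self, row_id):
        """Return the index of the partition that would hold row_id."""
        idx = bisect_right(self.lower_bounds, row_id) - 1
        if idx < 0:
            raise KeyError(f"no partition covers id {row_id}")
        return idx

    def insert(self, row_id, row):
        self.partitions[self.partition_for(row_id)][row_id] = row

    def get(self, row_id):
        return self.partitions[self.partition_for(row_id)].get(row_id)

    def drop_oldest(self):
        """Discard the oldest partition in one step; return how many rows it held."""
        self.lower_bounds.pop(0)
        return len(self.partitions.pop(0))
```

In this model each partition has its own small dictionary, which stands in for a per-partition index: a workload that mostly reads recent IDs only ever touches the last one.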

thisismissem, 2 months ago

@hrefna @devnull @ryansingel @baris compounding this is that there's a lot of really bad "scaling mastodon" advice out there that doesn't actually calculate the correct number of database connections, so you basically get locks there because there are no connections available.
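
A rough sketch of that connection arithmetic, under stated assumptions: every web thread, every background worker thread, and every streaming pool slot may hold one database connection at the same time. The field names and the example numbers are hypothetical and are not Mastodon's actual setting names; the defaults of 100 connections with 3 reserved for superusers match stock PostgreSQL.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deployment:
    """Hypothetical description of one instance's processes."""
    web_processes: int
    web_threads: int                # threads per web process
    worker_concurrency: List[int]   # threads per background worker process
    streaming_processes: int
    streaming_pool: int             # pool size per streaming process


def required_connections(d: Deployment) -> int:
    """Worst case: every thread and pool slot holds a connection at once."""
    web = d.web_processes * d.web_threads
    workers = sum(d.worker_concurrency)
    streaming = d.streaming_processes * d.streaming_pool
    return web + workers + streaming


def fits(d: Deployment, max_connections: int = 100, reserved: int = 3) -> bool:
    """Check the worst case against what the database will accept."""
    return required_connections(d) <= max_connections - reserved
```

For example, 2 web processes with 5 threads each, workers at 25 and 10 threads, and one streaming process with a pool of 10 need 55 connections, which fits. Raising the workers to two processes of 50 threads each pushes the total to 120, and the extra threads would simply wait for connections unless a pooler or a higher limit is put in place.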
