Tomorrow (Oct 18) is day 1 of P99 CONF! I'm excited to learn from:
🟪 Zhen Li & Widya Salim about the impact of latency at Twitter
🟪 Gwen Shapira about high performance on a low budget
🟪 @TimVereecke about noise-cancelling RUM
🟪 More great talks than will fit in this space
What do companies with fast websites have in common? They all (in my experience) use performance budgets to prevent regressions.
This post has been a long time coming. I did my best to distill all the best practices I've shared at conferences and during countless consultations with companies of all sizes. I hope people find it helpful!
My pals in BBC World Service have been doing some awesome work on "lite" versions of their news articles (other page types to follow).
They essentially serve the server-rendered React without the client-side hydration step, which means you end up with a simpler HTML+CSS page and no JS.
Page sizes drop significantly:
Fediverse traffic is pretty bursty and sometimes there will be a large backlog of Activities to send to your server, each of which involves a POST. This can hammer your instance and overwhelm the backend’s ability to keep up. Nginx provides a rate-limiting function which can accept POSTs at full speed and proxy them slowly through to your backend at whatever rate you specify.
For example, PieFed has a backend which listens on port 5000. Nginx listens on port 443 for POSTs from outside and sends them through to port 5000:
upstream app_server {
    server 127.0.0.1:5000 fail_timeout=0;
}
This will use up to 100 MB of RAM as a buffer and limit POSTs to 10 per second, per IP address. Adjust as needed. If the sender is using multiple IP addresses the rate limit will not be as effective. Put this directive outside your server {} block.
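Based on the description above (a 100 MB shared-memory zone, 10 POSTs per second per IP), the directive would look something like this sketch; the zone name `inboxlimit` is my own placeholder:

```nginx
# Rate-limiting zone — goes outside the server {} block.
# $binary_remote_addr keys the limit per client IP address;
# the zone holds client state in up to 100 MB of RAM;
# rate caps requests at 10 per second, per IP.
limit_req_zone $binary_remote_addr zone=inboxlimit:100m rate=10r/s;
```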
Then after our first location / {} block, add a second one that is a copy of the first except with one additional line (and change it to apply to location /inbox or whatever the inbox URL is for your instance):
300 is the maximum number of POSTs it will have in the queue. You can use limit_req_dry_run to test the rate limiting without actually doing any limiting – watch the nginx logs for messages while doing a dry run.
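Putting the pieces together, the second location block might look like the following sketch; it assumes the zone was named `inboxlimit` and that your existing location / block uses proxy_pass with similar headers:

```nginx
location /inbox {
    # The one additional line: queue up to 300 POSTs and drain them to the
    # backend at the configured rate; beyond 300, requests are rejected.
    limit_req zone=inboxlimit burst=300;
    # limit_req_dry_run on;  # log what would be limited, without limiting

    # Same proxy settings as your main location / block
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://app_server;
}
```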
It’s been a while since I set this up, so please let me know if I left anything crucial out or said something misleading.
For a very small instance with only a couple of concurrent users a CDN might not make much difference. But if you take a look at your web server logs you’ll quickly notice that every post / like / vote triggers a storm of requests from other instances to yours, looking up lots of different things. It’s easy to imagine how quickly this would overwhelm an instance once it gets even a little busy.
One of the first web performance tools people reach for is to use a CDN, like Cloudflare. But how much difference will it make? In this video I show you my web server logs before and after and compare them.
The short answer is – before CDN: 720 requests. After CDN: 100 requests.
Usually just turning on a CDN with default settings will not help very much; you’ll need to configure some caching rules or settings. By watching your server logs for a while you’ll get a sense of what needs to be cached, but check out mine as a starting point:
Beware of caching by URI path, because fediverse software will often return different data depending on the Accept header the requester sets. For example, on PieFed and Lemmy instances, a request from a web browser to /post/123 returns HTML to show the post to a person. But when that same URL is requested with the Accept: application/ld+json header set, the response is an ActivityPub representation of the post! You don’t want people getting ActivityPub data in their browsers, and you don’t want to serve HTML to other instances. Once you spot a URL you want to cache, use a tool like Postman to set the Accept header, make an ActivityPub-style request to your instance, and see whether you get back HTML or JSON.
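If you prefer the command line to Postman, curl can set the Accept header for you; the instance and post URL here are placeholders:

```shell
# Fetch the ActivityPub (JSON-LD) representation of a post
curl -s -H 'Accept: application/ld+json' https://example.instance/post/123

# Compare with the default response a browser would receive (HTML)
curl -s https://example.instance/post/123
```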
Another problem is that a response will often vary depending on whether the viewer is logged in, or on who is logged in. If you can figure out how to configure the CDN to pay attention to cookies (or whatever headers your platform uses for authentication), you might be able to cache things like /post/*… I couldn’t.
The things I’ve chosen to cache by URI Path above are ones that I know don’t vary by HTTP header or by authentication.
Although we can’t use the URI path a lot of the time, we can cache ActivityPub requests by detecting the Accept: application/ld+json header:
[Screenshot: CDN cache rule matching requests with the Accept: application/ld+json header]
This will cache all ActivityPub requests, regardless of URL. People browsing the same URLs as those used by ActivityPub will be unaffected as their requests won’t have the special HTTP header. I used a short TTL to avoid serving stale data when someone edits a post straight after creating it.
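For what it’s worth, in Cloudflare’s rule-expression language a match on that header can be written roughly like this — treat it as a sketch, since the fields available depend on your plan and rule type:

```
any(http.request.headers["accept"][*] contains "application/ld+json")
```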
There seems to be a deep vein of optimization here which I’ve only just started to dig into. These changes have already made a huge difference, and since my instance is under very little load, I’ll leave it there for now…
Google provides a tool called PageSpeed Insights which gives a website some metrics to assess how well it is put together and how fast it loads. There are a lot of technical details but in general green scores are good, orange not great and red is bad.
I tried to ensure the tests were similar for each platform by choosing a page that shows a list of posts, like https://mastodon.social/explore.
The rest don’t seem to have prioritized performance or chose a software architecture that cannot be made to perform well on these metrics. It will be very interesting to see how that affects the cost of running large instances and the longevity of the platforms. Time will tell.
Just spotted that the Google HAR analyser has a "download redacted version of this HAR" button. That's pretty cool, redacting by hand is massively tedious and error prone. https://toolbox.googleapps.com/apps/har_analyzer/
Those of us sitting here with our fiber internet and recent-model phones have it pretty good. But the “i” in iPhone stands for “inequality”. Most people in the world still have pretty bad internet and old, slow phones. For a platform to be widely adopted, and to serve the needs of those who often miss out, it needs to be frugal in network and CPU usage.
| | Lemmy | Kbin | PieFed |
| --- | --- | --- | --- |
| Home page | 4.5 MB | 1.65 MB | 700 KB – 930 KB |
| Viewing a post | 360 KB | 826 KB (varies) | 29 KB |
Home pages
Due to Lemmy’s JavaScript-heavy software architecture, visiting a Lemmy home page involves downloading several megabytes of data (4.5 MB in my test). And this only gets you 20 posts! Also, community thumbnails, even if displayed as a 22px by 22px icon, are served directly from their home instances, unresized, and can often be multiple megabytes each. The home page of lemmy.nz currently weighs in at over 9 MB.
Kbin’s home page comes in at a respectable 1.65 MB thanks to relying less on JavaScript. However, it is let down by not using loading="lazy" on images, so they all have to load immediately, and by generating post thumbnails that are twice as big as they need to be.
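The fix is the browser-native lazy-loading attribute; a minimal sketch (the src path is made up):

```html
<!-- Offscreen images with loading="lazy" are fetched only as the user
     scrolls near them, instead of all at once on page load. -->
<img src="/media/posts/thumb123.jpg" loading="lazy" alt="Post thumbnail">
```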
When viewing a post, we can assume various assets (CSS, JS and some images) are cached due to loading the home page first.
The picture looks similar when viewing a post, which is a bit surprising. One of the usual benefits of the JS-heavy SPA architecture used by Lemmy is that once the ‘app’ is loaded into the browser, subsequent pages only involve a small API call. However, going to a page in Lemmy involves two API calls (one for the post and one for the comments), both of which return quite a bit of data. If you look at the ‘get the comments on this post’ JSON response, you can see the developers have fallen into the classic SPA pitfall of “over-fetching”: they’re retrieving a whole haystack from the backend and then using JavaScript to find the needle they want, which means transferring the haystack over the internet. Ideally the backend would find the needle and send just that to the frontend.
Kbin sends more data than it needs to when viewing a post, again because it doesn’t use loading="lazy", which causes every commenter’s profile picture to be loaded at once. Making this simple fix would bring the weight down from ~800 KB to around 50 KB.
PieFed only sends 10 KB – 30 KB to show a post, but it varies depending on the number and length of comments. This could be reduced even more by minifying the HTML response but with PieFed under active development I prefer the source to be as readable as possible to aid in debugging.
This is no accident. It is the result of choices made very early on in the development process, well before any code was written. These choices were made based on certain priorities and values which will continue to shape PieFed in the future as it grows. In a world where digital access remains unequal, prioritizing accessible and fast-loading websites isn’t just about technology; it’s a step towards a more inclusive and equitable society.
Every year I revisit the topic of web performance budgets. Here's my updated guide, including:
✅ What are performance budgets?
✅ Why are they a crucial tool in fighting page speed regression?
✅ Best metrics to track
✅ Determining thresholds
✅ Pro tips
More important INP insights from @cliff, including:
🔵 Only 2/3 of mobile sites have "good" INP
🟢 Mobile INP = Android INP
🟡 Mobile INP has an even stronger correlation with bounce rate and conversions than desktop INP
Just finished recording my talk for #p99conf. I'm so happy to be sharing best practices and real-world tips about using performance budgets to fight regressions – and keep your users happy!
https://community.perfplanet.com/#webperformance This community forum has promise. We can't do everything in private fora; I think some web-facing materials and discourse would be really helpful. At any rate, I made one of the first threads, so I would say this lol
Your time is the most precious thing you have. When I talk to customers, one of the best things I hear is how much time they DON'T spend using @speedcurve:
"We actually don't log in much. We've set up performance budgets and deploy testing. We just wait to get alerts and then dive in to fix things."
Cookie consent popups and banners are everywhere – and they're silently hurting the speed and UX of your pages. @cliff explains common issues – and solutions – related to measuring performance for consent popups.
Another great analysis from @cliff. If your site uses a consent management platform (CMP), it's probably messing with your performance metrics.
Cliff breaks down the five most common issues – which affect all three Core Web Vitals, among other things – and he also provides some helpful scripting workarounds.
PERFORMANCE HERO • per-FAWR-muhns HEER-oh • noun • A person who has made a huge contribution to the #webperf and #ux community, without whom the web would be a sadder, slower place.