lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

One last thing before I turn away to the afternoon's work: I see a lot of advice about using a robots.txt file to halt "AI"-scraper bots, which would be great if it worked.

Since I put the preferred verbiage in my own site's robots.txt, my firewall has stopped over 1800 "AI"-scraping attempts.
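For anyone who hasn't set up the robots.txt part yet, here's a minimal sketch of what that "preferred verbiage" usually looks like, generated from a list. The bot tokens below are illustrative assumptions, not the list from this thread, and `build_robots_txt` is a hypothetical helper name. Check each crawler's own documentation for its current token.

```python
# Assumed example tokens; swap in whatever list you actually maintain.
AI_BOT_TOKENS = ["GPTBot", "ChatGPT-User", "CCBot", "Google-Extended"]


def build_robots_txt(tokens):
    """Return robots.txt text asking each listed crawler to stay out of the whole site."""
    # One group per crawler: a User-agent line followed by a site-wide Disallow.
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(AI_BOT_TOKENS), end="")
```

Keep in mind this is only a request. A crawler that ignores robots.txt will not be stopped by it, which is the whole point of the firewall numbers above.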

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

User-agent blocking at the firewall level is better. Sure, belt and suspenders, put the verbiage in your robots.txt, but ALSO keep your user agent blocks up to speed.

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

This isn't as hard as it sounds! Most firewalls have a way to add user-agent blocking. (Remember those asterisks!) Many CDNs, like Cloudflare, have the same function. Do a couple of DuckDuckGo searches and protect yourself.
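To show why the asterisks matter, here's a rough sketch of wildcard user-agent matching in Python. It is not any particular firewall's rule syntax, only an illustration of the idea. The patterns and the `is_blocked` helper are assumptions made up for this example.

```python
from fnmatch import fnmatchcase

# Assumed example patterns; the asterisks let the bot name match
# anywhere inside the much longer User-Agent header a crawler sends.
BLOCK_PATTERNS = ["*GPTBot*", "*CCBot*", "*ChatGPT-User*"]


def is_blocked(user_agent, patterns=BLOCK_PATTERNS):
    """Return True if the User-Agent string matches any wildcard pattern."""
    # Lowercase both sides so the comparison is case-insensitive.
    ua = user_agent.lower()
    return any(fnmatchcase(ua, pattern.lower()) for pattern in patterns)


if __name__ == "__main__":
    bot = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"
    browser = "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0"
    print(is_blocked(bot))                    # matched by "*GPTBot*"
    print(is_blocked(browser))                # no pattern matches
    print(is_blocked(bot, ["GPTBot"]))        # no asterisks, so no match
```

Without the asterisks, a pattern like `GPTBot` only matches a header that is exactly that string, so the real, longer header slips through.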

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

Also, I'd recommend getting rid of any Google Site Kit plugins etc., and anything Google-related. There's no telling what they'll slip into their code snippets next, my friends.

Just a friendly bit of advice.

jake,
@jake@posts.jacobhaddon.com avatar

@lilithsaintcrow Yeah, starting the slow process of removing Google Fonts from my sites for this reason.

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

@jake I should look that up. Goddammit, it’s neverending. sigh

jake,
@jake@posts.jacobhaddon.com avatar

@lilithsaintcrow I’ll take it for action and get something posted about it. … take it for action? Did I just say that? (Too much work today)

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

@jake I actually just went monkeying around on my site and found an easy way to do it with the theme I have installed, so…bonus? (I should be working, but…)

jake,
@jake@posts.jacobhaddon.com avatar

@lilithsaintcrow Fixed the font thing which was easier when I found the semi-colon I missed … tomorrow. Robots.txt.

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

@jake I really, REALLY recommend using user-agent blocking with a firewall as well, since the bots don't seem to obey the robots.txt. (So far, my firewall has blocked over 1800 scraping attempts by bots which should have obeyed the robots.txt.) So…belt and suspenders, I'd use both.

jake,
@jake@posts.jacobhaddon.com avatar

@lilithsaintcrow Will do. I’ve got this great page from Neil Clarke bookmarked to go through for info: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website/

lilithsaintcrow,
@lilithsaintcrow@raggedfeathers.com avatar

@jake Yep, I recommend that post in particular. Good luck! May we both triumph against the plagiarism pink sauce grift!
