rrwo,
@rrwo@floss.social

A website at work added the AI search engine YouBot to robots.txt and blocks all other requests from that user agent.

So now we get requests from the same IP address (in Amazon's netspace) claiming to be Googlebot.

slink,
@slink@fosstodon.org

@rrwo unsolicited PSA: always run a reverse + forward DNS check to validate it's really Googlebot https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot#automatic
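
For anyone who wants to script that check, here is a minimal sketch in Python. It assumes the reverse (PTR) name of a genuine Googlebot address ends in `googlebot.com` or `google.com`; confirm the current list of domains against the linked Google page before relying on it. The function name `is_verified_googlebot` and its two optional lookup parameters are made up for this example, and the parameters exist only so the logic can be tried without real DNS queries.

```python
import socket

# Hostname suffixes assumed to be valid for Google crawlers (an assumption for
# this sketch; check Google's documentation for the current list).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if `ip` passes a reverse + forward DNS check."""
    # Default to real DNS lookups; the hypothetical lookup parameters let the
    # logic be exercised with canned answers instead of network access.
    reverse_lookup = reverse_lookup or (
        lambda addr: socket.gethostbyaddr(addr)[0]
    )
    forward_lookup = forward_lookup or (
        lambda host: {info[4][0] for info in socket.getaddrinfo(host, None)}
    )
    try:
        # Step 1: reverse lookup (PTR record) of the connecting address.
        hostname = reverse_lookup(ip).rstrip(".").lower()
        # Step 2: the name must sit under one of the expected domains.
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: the forward lookup of that name must include the original
        # IP, otherwise the PTR record could simply have been spoofed.
        return ip in forward_lookup(hostname)
    except OSError:
        # No PTR record or failed resolution: treat the client as unverified.
        return False
```

The forward step is what makes this useful: whoever controls an IP range can set its PTR record to any name, but they cannot make Google's own DNS point that name back at their address. The comparison here is a plain string match, so IPv6 addresses would need normalising first (for instance with the `ipaddress` module).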

juandesant,
@juandesant@astrodon.social

@rrwo what about reporting as per https://about.you.com/es/youbot/?

rrwo,
@rrwo@floss.social

@juandesant

Thanks, but I think there's a difference between reporting a bot that seems to be making an honest mistake due to software bugs and one that outright lies about who it is after being told not to index a site.

Edit: also, I don't want to let these sorts of people know that I've caught on, lest they try to be more deceptive.

juandesant,
@juandesant@astrodon.social

@rrwo in any case, what you describe violates what they say they do. Maybe it was a third party pretending to be YouBot, and then Googlebot?
