Admin on the slrpnk.net Lemmy instance.

He/Him or what ever you feel like.

XMPP: povoq@slrpnk.net

Avatar is an image of a baby octopus.

  • 0 Posts
  • 2 Comments
Joined 2 years ago
cake
Cake day: September 19th, 2022

help-circle

  • It seems any somewhat easy to implement solution gets circumvented by them quickly. Some of the bots do respect robots.txt through if you explicitly add their self-reported user-agent (but they change it from time to time). This repo has a regularly updated list: https://github.com/ai-robots-txt/ai.robots.txt/

    In my experience, git forges are especially hit hard, and the only real solution I found is to put a login wall in front, which kinda sucks especially for open-source projects you want to self-host.

    Oh and recently the mlmym (old reddit) frontend for Lemmy seems to have started attracting AI scraping as well. We had to turn it off on our instance because of that.