(⌐■-■) Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives
Scraping for AI training may or may not be legal. But the effort crawlers put into evading detection and blocking is a smoking gun, an admission this scraping is not fair.
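For anyone wondering why an undeclared crawler matters: a no-crawl directive in robots.txt is matched against the user agent a bot announces, so it only binds bots that identify themselves. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `PerplexityBot` token, the example.com URL, and the `is_allowed` helper are all assumptions made for illustration, not taken from the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one declared AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that announces itself is covered by the Disallow rule.
declared = is_allowed(ROBOTS_TXT, "PerplexityBot", "https://example.com/post/1")

# The same crawler sending a generic browser user agent falls under "*".
generic = is_allowed(ROBOTS_TXT, "Mozilla/5.0", "https://example.com/post/1")

print(declared, generic)
```

The check is purely voluntary on the crawler's side, which is why a bot that swaps its user agent for a browser-like one sidesteps the directive entirely.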
#KI (AI) is running riot on the web, and #Admins are pushing back
My @campact column from May is right on topic today!
> Many thanks to all the admins who fight tirelessly to protect us users and the planet from the greed of AI. I hope this text contributes to a better understanding of this topic.
https://blog.campact.de/2025/05/ki-randaliert-im-netz-admins-halten-dagegen/
#SysAdmins #SystemadminAppreciationDay #FediAdmins #AI #KIScraping
#AIScraping #TDM #AdminLeiden #MastoAdmin #DataPoisoning #aitxt #GPT #GreenIT
A website appears to be scraping hashtags and creating AI articles, and then replying to the OG post
It stole one of my posts (https://oldfriends.live/@paul/114770093020700675) for its AI-created article, then spammed me from s00laiman@mastodon.social
It's doing it with #HashTagGames tags and other trending hashtags.
Edit: making links dead as it appears to serve malware now: www.trend247daily.com/articles
Article created from scraped post: www.trend247daily.com/article/mastering-the-art-of-the-productive-day-wake-up-look-busy-go-to-bed
See this thread above, unless the AI content spammer deletes its reply and breaks the thread.
I don't know where it is getting its content: from its Mastodon account ( s00laiman@mastodon.social ), RSS, or the API. If it has an application, I would hope staff@mastodon.social and moderation@mastodon.social would shut it down from scraping the API.
The web scraping is aggressive not just to hoard training data, but also to keep other AI bots from doing the same.
They're not satisfied with stealing all your content, they also want exclusivity by any means necessary.
Anyone who enjoys the many great information resources on the internet should know:
#KI (AI) is running riot on the web, and #Admins are pushing back so that we humans can browse undisturbed.
Read how admins describe their utterly frustrating but invisible defensive work against AI, on @campact's blog: https://blog.campact.de/2025/05/ki-randaliert-im-netz-admins-halten-dagegen/
Don't forget: July 25 is #SysAdminDay
Hi #Admins ,
Can you give me quotes that explain your fight against #AIScraping? I'm looking for (verbal) images, metaphors, comparisons, etc. that explain to non-techies what's going on. (efforts, goals, resources...)
I intend to publish your quotes in a text on @campact 's blog¹ (DE, German NGO).
The quotes should make your work visible in a generally understandable way
Because of its open structure, BlueSky offers no protection against AI scraping. The sticking point: the Firehose API. The why and the consequences are explained here: https://www.404media.co/someone-made-a-dataset-of-one-million-bluesky-posts-for-machine-learning-research/ #bluesky #aiscraping
Hey, pals! They dropped the list of the artists Midjourney has admitted to ripping off!
Time to call your lawyers and sue the shit out of these thieves!
Even if your name is not on the list, you should rage at this billion-dollar theft scheme.
We barely get to the end of the month while these criminals get billions and billions from investment firms!
Fuck Midjourney and their owners!
#art #AIScraping #SueMidjourney #JusticeForArtists
https://storage.courtlistener.com/recap/gov.uscourts.cand.407208/gov.uscourts.cand.407208.129.10.pdf