You're viewing the mastodon.acc.sunet.se public feed.

Replies

  • Jun 29, 2026, 6:31 PM

    @ilooo what am I supposed to put in for "how successful measures were" for measures I didn't try? There's no "not applicable" option.

    Screenshot of survey options:
    "How successful where those measures?
    Authentification: very successful / fairly succesful / temporarily successful / not at all successful
    Rate limiting/Throttling: very successful / fairly succesful "
  • Jun 29, 2026, 6:33 PM

    @ilooo @tante I had serious reservations about this survey just from the phrase "AI crawler pandemic" in the posting and the intro page, but I decided to try answering anyway.

    I made it a couple pages in before giving up. From my point of view the disaster that has developed in the last 12 months is the use of residential proxies for crawling. I don't know or care whether AI training is the motive. It's the proxies that are the problem regardless of the motive. /

  • Jun 29, 2026, 6:34 PM

    @ilooo @tante and it doesn't appear there's any way I can convey that information through this survey without giving them support I can't honestly give for an anti-AI thesis that I don't know is true and suspect isn't.

  • Jun 29, 2026, 6:41 PM

    @mattskala @tante well, you could have provided us with that info about residential proxies in the last text field. but thanks for the hint anyway.

  • Jun 29, 2026, 7:00 PM

    @ilooo @tante hey, I could only circumstantially infer whether something is AI crawling or something else, but also, I don't care about that distinction at all; it doesn't help at all to keep systems running. So most of the questionnaire is impossible to answer. "I guess it's an AI crawler, but I'm not sure. Let me arbitrarily assign blame here to a specific group of technologists!": is that really useful data?

  • Jun 29, 2026, 7:06 PM

    @funkylab @ilooo @tante For me it was very easy to understand what traffic came from agents and AI crawlers, and I filled out the survey accordingly. If you don't have that info, perhaps this survey is not for you.

  • Jun 29, 2026, 8:45 PM

    @despens @ilooo @tante how so? Do your clients carry an HTTP header field that says "this is an AI scraper"? No, they don't. I claim: you're just inferring from the current general situation and from the fact that the traffic patterns you see have changed, aren't you? So, you're putting speculation in a survey.

  • Jun 29, 2026, 8:50 PM

    @despens @ilooo @tante don't get me wrong, the survey is then still interesting, but it says something about how much traffic gets *attributed to a class of sources by human administrators*, which is something fundamentally different than how much traffic *comes from that class of sources*. It says something about how much of your troubles you attribute, not about how much of your troubles is really caused that way.
    We speculate; fine, but something to be aware of, not to "confidence away".

  • Jun 29, 2026, 8:55 PM

    @funkylab @despens @tante you are right! it would be great if the AI crawlers were honest. But then, the survey might not be necessary. You could just block them...

  • Jun 29, 2026, 8:58 PM

    @ilooo @despens @tante right, but the survey kind of conflates the premise of "measuring AI scraping load" with "measuring administrator anger towards AI scrapers", to put it pointedly :-)
    That's why I asked whether this is valuable data to you: are you interested in the former, or the latter?

  • Jun 29, 2026, 9:33 PM

    @ilooo @despens @tante I'm still not clear on *what* you're estimating here. Yes, you only get estimates, but you need to decide of what. I don't think you can change the survey now, though, so, hm, maybe I should stop riding this point.

  • Jun 29, 2026, 8:07 PM

    @blotosmetek @tante we had a discussion on that one: since the technical setup of your services might determine quite a bit how your services are affected, we decided against multiple choice. Makes sense?

  • Jun 29, 2026, 10:00 PM

    @ilooo participated as well but I think my answers aren't going to be super useful or indicative considering I have different roles and systems in a wide range of environments from commercial to open source to personal.

    I'd agree with some others that the survey could be split up, perhaps with impact estimates and similar divided up per environment that we manage. This should also help show that smaller projects may be harmed more so than large commercial sites.
