• Amberskin@europe.pub
    link
    fedilink
    English
    arrow-up
    73
    ·
    6 days ago

    Uh, are they admitting they are trying to circumvent technological protections set up to restrict access to a system?

    Isn’t that a literal computer crime?

  • Kissaki@feddit.org
    link
    fedilink
    English
    arrow-up
    108
    arrow-down
    1
    ·
    edit-2
    7 days ago

    Perplexity argues that a platform’s inability to differentiate between helpful AI assistants and harmful bots causes misclassification of legitimate web traffic.

    So, I assume Perplexity uses appropriate identifiable user-agent headers, to allow hosters to decide whether to serve them one way or another?
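For illustration, here is a minimal sketch of how a host could act on a declared User-Agent header. The agent names and the policy table are hypothetical placeholders, not real product tokens, and the approach only works if the client identifies itself honestly.

```python
# Hypothetical policy table: User-Agent substring -> action.
# The tokens below are made-up examples, not real crawler names.
POLICY = {
    "ExampleAssistant": "block",
    "ExampleCrawler": "throttle",
}


def decide(user_agent: str, default: str = "allow") -> str:
    """Return the action for a request based on its User-Agent header."""
    ua = user_agent.lower()
    for token, action in POLICY.items():
        # Case-insensitive substring match against the declared header.
        if token.lower() in ua:
            return action
    return default


if __name__ == "__main__":
    print(decide("Mozilla/5.0 (compatible; ExampleAssistant/1.0)"))
    print(decide("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))
```

A client that sends a generic browser User-Agent falls through to the default, which is exactly the case a header-based policy cannot handle.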

    • lime!@feddit.nu
      link
      fedilink
      English
      arrow-up
      39
      ·
      7 days ago

      yeah it’s almost like there was already a system for this in place

    • ubergeek@lemmy.today
      link
      fedilink
      English
      arrow-up
      11
      ·
      6 days ago

      And I’m assuming if the robots.txt states their User-Agent isn’t allowed to crawl, it obeys it, right? :P
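As a rough sketch of what obeying robots.txt looks like on the client side, Python's standard `urllib.robotparser` can answer the question before a fetch. The robots.txt content, the agent name `ExampleAssistant`, and the URL are assumptions made up for this example. Compliance is voluntary: nothing forces a client to run this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named agent is banned, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAssistant
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some/page"
    print(may_fetch(ROBOTS_TXT, "ExampleAssistant", url))
    print(may_fetch(ROBOTS_TXT, "SomeOtherAgent", url))
```

A well-behaved agent would call something like `may_fetch` with its own declared name and skip the request when it returns `False`.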

      • Kissaki@feddit.org
        link
        fedilink
        English
        arrow-up
        4
        ·
        6 days ago

        No, as per the article, their argument is that they are not web crawlers generating an index; they are user-action-triggered agents working live for the user.

        • ubergeek@lemmy.today
          link
          fedilink
          English
          arrow-up
          3
          ·
          6 days ago

          Except, it’s not a live user hitting 10 sites all at the same time, trying to crawl the entire site… Live users cannot do that.

          That said, if my robots.txt forbids them from hitting my site, as a proxy, they obey that, right?

    • Dr. Moose@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      5
      ·
      6 days ago

      It’s not up to the host to decide whom to serve content to. The web is intended to be user-agent agnostic.

    • tempest@lemmy.ca
      link
      fedilink
      English
      arrow-up
      29
      arrow-down
      1
      ·
      6 days ago

      CloudFlare has become an Internet protection racket and I’m not happy about it.

      • Laser@feddit.org
        link
        fedilink
        English
        arrow-up
        21
        ·
        6 days ago

        It’s been this from the very beginning. But they don’t fit the definition of a protection racket as they’re not the ones attacking you if you don’t pay up. So they’re more like a security company that has no competitors due to the needed investment to operate.

        • A1kmm@lemmy.amxl.com
          link
          fedilink
          English
          arrow-up
          4
          ·
          5 days ago

          Cloudflare are notorious for shielding cybercrime sites. You can’t even complain to Cloudflare about them; they’ll just forward your abuse complaint on to the likely dodgy host of the cybercrime site. They don’t even have a channel for complaints about network abuse of their DNS services.

          So they certainly are an enabler of the cybercriminals they purport to protect people from.

          • MithranArkanere@lemmy.world
            link
            fedilink
            English
            arrow-up
            3
            ·
            5 days ago

            Any internet service provider needs to be completely neutral. Not only in their actions, but also in their liability.
            Same goes for other services like payment processors.
            If companies that provide content-agnostic services are allowed to police the content, that opens the door to really nasty stuff.

            You can’t chop off everyone’s arms to stop a few people from stealing.

            If they think their services are being used in a reprehensible manner, what they need to do is alert the authorities, not act like vigilantes.

          • Laser@feddit.org
            link
            fedilink
            English
            arrow-up
            1
            ·
            5 days ago

            If they acted differently, they’d probably be liable for illegal activity that they proxy for (this is for example relevant for the DMCA safe harbor).

            Anyhow, when on their abuse page, I have an option for “Registrar”, which is used for “DNS abuse”, among others.

    • sunbeam60@lemmy.ml
      link
      fedilink
      English
      arrow-up
      10
      arrow-down
      1
      ·
      6 days ago

      They’re not. They’re using this as an excuse to become paid gatekeepers of the internet as we know it. All that’s happening is that Cloudflare is using this to maneuver into a position where they can say “nice traffic you’ve got there - would be a shame if something happened to it”.

      AI companies are crap.

      What Cloudflare is doing here is also crap.

      And we’re cheering it on.

    • Leon@pawb.social
      link
      fedilink
      English
      arrow-up
      15
      ·
      7 days ago

      I’m still holding out for Stephen Hawking to mail out Demon Summoning programs.

  • ubergeek@lemmy.today
    link
    fedilink
    English
    arrow-up
    47
    ·
    7 days ago

    Good. I went through my CF panel, and blocked some of those “AI Assistants” that by default were open, including Perplexity’s.

  • kreskin@lemmy.world
    link
    fedilink
    English
    arrow-up
    15
    arrow-down
    3
    ·
    edit-2
    6 days ago

    They can’t get their AI to check a box that says “I am not a robot”? I’d think that’d be a first-year comp sci student level task. And robots.txt files were basically always voluntary compliance anyway.

    • Dr. Moose@lemmy.world
      link
      fedilink
      English
      arrow-up
      18
      arrow-down
      1
      ·
      6 days ago

      Cloudflare actually fully fingerprints your browser and even sells that data. That’s your IP, TLS, operating system, full browser environment, installed extensions, GPU capabilities etc. It’s all tracked before the box even shows up; in fact, the box is there to give the runtime more time to fingerprint you.

      • tempest@lemmy.ca
        link
        fedilink
        English
        arrow-up
        7
        arrow-down
        1
        ·
        6 days ago

        Yeah and the worst part is it doesn’t fucking work for the one thing it’s supposed to do.

        The only thing it does is stop the stupidest low-effort scrapers and force the good ones to use a browser.

      • kreskin@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        arrow-down
        1
        ·
        5 days ago

        You’re not wrong, but it also lets more than 99.8% of the bot traffic through on text challenges. It’s like the TSA of website security. It’s mostly there to keep the user busy while Cloudflare places itself as a man in the middle of your encrypted connection to a third party. The only difference between Cloudflare and a malicious attacker is Cloudflare’s stated intention not to be evil. With that and 3 dollars I can buy myself a single hard shell taco from Taco Bell.