- cross-posted to:
- aboringdystopia@lemmy.world
cross-posted from: https://mander.xyz/post/34629331
cross-posted from: https://programming.dev/post/34472919
Unfortunately, robots.txt only stops the well-behaved scrapers. Even with a blanket disallow, you’ll still get loads of bots. Setting up the web server to block those user agents works a bit better, but even then there are bots out there crawling with regular browser user agents.
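
For anyone who wants to try the user agent approach, here's a rough sketch as a Python WSGI middleware. The names `BLOCKED_AGENT_SUBSTRINGS`, `is_blocked`, and `BlockScrapers` are made up for illustration, and the substrings are just example crawler tokens, not a complete or authoritative blocklist. The same idea applies to whatever web server you actually run.

```python
# Example substrings only (an assumption for illustration); adjust to the
# bots you actually see in your access logs.
BLOCKED_AGENT_SUBSTRINGS = ("GPTBot", "CCBot", "Bytespider")


def is_blocked(user_agent):
    """Return True if the User-Agent value contains a blocked substring."""
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_SUBSTRINGS)


class BlockScrapers:
    """Hypothetical WSGI middleware: answer 403 for blocked user agents."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the User-Agent request header as HTTP_USER_AGENT.
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else passes through to the wrapped application.
        return self.app(environ, start_response)
```

Wrap your app with `BlockScrapers(app)` to use it. As noted above, this does nothing against a bot that sends a regular browser user agent, since it only looks at the header the client chooses to send.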