Tired of fighting with Bots

praburam Member
edited April 5 in General

Bots are getting stronger day by day, so please suggest some ideas to block them. I tried blocking them via Cloudflare based on country, but somehow they get around it and keep scraping. 🥲

Comments

  • Levi Member

    Pull the plug or just localhost it. Nothing you can do about it. Nothing.

    Thanked by 2: rpqu, host_c
  • Hotmarer Member

    Use Anubis or similar solutions. In Cloudflare you can also enable similar bot protection.

    Thanked by 1: oloke
  • budi1413 Member

    cant fight them

    then join them

  • zed Member

    give in

  • suyadi92 Member

    @budi1413 said:
    cant fight them

    then join them

    Ref link possible?

  • yoursunny Member, IPv6 Advocate

    Mentally strong people allow bots into all web properties.
    "Block AI Bots" scope (deploys a Cloudflare-managed rule to block bots that we categorize as AI training crawlers from visiting your site): "Do not block (allow crawlers)". AI training bots will be allowed to scrape content.

  • Rubben Member
    edited April 5

    When you're scraping and the site has Cloudflare's Under Attack mode enabled, with the puzzle thingie to confirm you're a human, that can really fuck with bots. You should have it enabled.

    Or if you have found a way to circumvent it, let me know.

  • plumberg Veteran, Megathread Squad

    @praburam said:
    Bots are getting stronger day by day, so please suggest some ideas to block them. I tried blocking them via Cloudflare based on country, but somehow they get around it and keep scraping. 🥲

    Become a bot yourself.

    There's a saying (or I am saying): if you want to fight something or someone, get in their shoes and become one of them (figuratively). This will help you be on par with them, or close to it, when combating them.

    Hence, again, become a bot.

    Thanked by 2: host_c, tentor
  • emperor Member
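    location = /robots.txt {
        # Serve a deny-all robots.txt straight from nginx, no file on disk.
        # default_type sets the Content-Type of the "return" body;
        # add_header would send a second Content-Type header instead.
        default_type text/plain;
        return 200 "User-agent: *\nDisallow: /\n";
    }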
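
For what it's worth, here is a minimal Python sketch of what that deny-all robots.txt means to a crawler that actually honours it. It uses the standard library's `urllib.robotparser`; the `allowed` helper and the `ExampleBot` name are made up for illustration. Scrapers that ignore robots.txt are not affected by this at all.

```python
from urllib.robotparser import RobotFileParser

# The exact body returned by the nginx location block above.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a robots.txt-compliant crawler may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler name.
    print(allowed("ExampleBot", "/forum/thread/123"))
```

With the deny-all rules every path is refused for every user agent, so this only helps against bots that bother to ask.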
    
  • NushairAlvi 🚩 Host Rep Tag Suspended

    @praburam said:
    Bots are getting stronger day by day, so please suggest some ideas to block them. I tried blocking them via Cloudflare based on country, but somehow they get around it and keep scraping. 🥲

    Try Cloudflare Pro with the proxy on! I think it will help.

  • 384_cz Member
    edited April 5

    @Hotmarer said:
    Use Anubis or similar solutions. In Cloudflare you can also enable similar bot protection.

    Anubis requires JS and is way too bloated. Just call the police about the DDoS!
    A lawyer will help you as well.

  • sillycat Member

    Fix your website. You should be able to handle that much traffic.

    Thanked by 2: 384_cz, yoursunny
  • Hotmarer Member

    @384_cz said:

    @Hotmarer said:
    Use Anubis or similar solutions. In Cloudflare you can also enable similar bot protection.

    Anubis requires JS and is way too bloated. Just call the police about the DDoS!
    A lawyer will help you as well.

    But it will block most of the bots.

    Thanked by 1: 384_cz
  • forest Member
    edited April 5

    Just put Nepenthes on your server:
    https://zadzmo.org/code/nepenthes/

    Demo page (what the AI sees with Nepenthes):
    https://zadzmo.org/nepenthes-demo


    Or if you want something that uses less CPU but is less evil to the bots, use Iocaine:
    https://iocaine.madhouse-project.org/

    Demo page (what the AI sees with Iocaine):
    https://poison.madhouse-project.org/

    Thanked by 1: suyadi92
  • suyadi92 Member

    @forest said:
    Just put Nepenthes on your server:
    https://zadzmo.org/code/nepenthes/

    Demo page (what the AI sees with Nepenthes):
    https://zadzmo.org/nepenthes-demo


    Or if you want something that uses less CPU but is less evil to the bots, use Iocaine:
    https://iocaine.madhouse-project.org/

    Demo page (what the AI sees with Iocaine):
    https://poison.madhouse-project.org/

    warning from that page:

    but it works by providing them with a neverending stream of exactly what they are looking for. YOU ARE LIKELY TO EXPERIENCE SIGNIFICANT CONTINUOUS CPU LOAD.

  • forest Member
    edited April 5

    @suyadi92 said:

    @forest said:
    Just put Nepenthes on your server:
    https://zadzmo.org/code/nepenthes/

    Demo page (what the AI sees with Nepenthes):
    https://zadzmo.org/nepenthes-demo


    Or if you want something that uses less CPU but is less evil to the bots, use Iocaine:
    https://iocaine.madhouse-project.org/

    Demo page (what the AI sees with Iocaine):
    https://poison.madhouse-project.org/

    warning from that page:

    but it works by providing them with a neverending stream of exactly what they are looking for. YOU ARE LIKELY TO EXPERIENCE SIGNIFICANT CONTINUOUS CPU LOAD.

    Indeed (although that points to their software being poorly optimized). That is why Iocaine may be a better option, as it does not result in significant CPU load (but it also doesn't slow down bots as much).

    Thanked by 1: suyadi92
  • cmeerw Member

    Are all bots (equally) bad? How do you make sure you only block the bad bots? And no humans?

    Thanked by 1: MannDude
  • forest Member
    edited April 5

    @cmeerw said:
    Are all bots (equally) bad? How do you make sure you only block the bad bots? And no humans?

    You can whitelist IP ranges from well-behaved bots. I'm sure Cloudflare allows the Internet Archive and Google spiders, for example. As for people, there are generally three ways: IP reputation, fingerprinting, and proof-of-work.

    IP reputation simply works by querying IP databases (or your own database if you're a big CDN like Cloudflare that can see a lot of the internet at once). Fingerprinting is based on analyzing browser behavior. It's possible for bots to get around that by running a genuine browser rather than a lightweight script and interacting with it with something like Selenium, but that's heavier and forces the bot owner to expend more resources. Finally, proof-of-work involves completing a mathematical puzzle. It's automated in JavaScript, but it takes 100% CPU for a fraction of a second. That doesn't bother humans (too much) because it just slightly increases page load time, but it's a huge barrier for bots which would otherwise be able to visit thousands of pages per second.
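
To make the proof-of-work idea concrete, here is a rough hashcash-style sketch in Python. It is only an illustration of the general technique, not the scheme Anubis or Cloudflare actually ship: the `solve` and `verify` names, the `challenge:nonce` string format, and the choice of SHA-256 with a leading-zero-bits target are all assumptions made for this example.

```python
import hashlib
from itertools import count


def _hash_value(challenge: str, nonce: int) -> int:
    """SHA-256 of "challenge:nonce", read as a 256-bit big-endian integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: str, difficulty_bits: int) -> int:
    """Visitor side: brute-force a nonce whose hash has
    `difficulty_bits` leading zero bits. Costs about 2**difficulty_bits hashes."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _hash_value(challenge, nonce) < target:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: checking the answer costs a single hash."""
    return _hash_value(challenge, nonce) < (1 << (256 - difficulty_bits))


if __name__ == "__main__":
    # The server would hand out a fresh random challenge per visitor;
    # a fixed string is used here only to keep the demo short.
    nonce = solve("demo-challenge", 16)
    print(nonce, verify("demo-challenge", nonce, 16))
```

The asymmetry is the point: the visitor pays for tens of thousands of hashes per page while the server pays for one, which is cheap for a person loading a page and expensive for a scraper fetching thousands.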

  • Obelous Member
    edited April 6

    I don't bother fighting bots beyond basic measures like auth, rate limits, and firewalling off services that don't need to be open to the internet (e.g. SSH). In this case, I'm only referring to the bots scanning all types of shit, and not targeted scraping or similar.

    If it causes noticeable load (e.g. with git crawling), then sure, I'll put in a bit more effort.
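
As an example of the "rate limits" part, here is a minimal per-client token bucket in Python. It is a sketch under assumptions, not how nginx or Cloudflare implement their limits: the `TokenBucket` class, its parameters, and keying by client IP are all invented for illustration, and the addresses are documentation placeholders.

```python
import time


class TokenBucket:
    """Allow `burst` requests at once, then `rate` requests per second per client."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum tokens a client can hold
        self.clock = clock    # injectable so the limiter can be tested
        self._state = {}      # client -> (tokens, time of last request)

    def allow(self, client: str) -> bool:
        now = self.clock()
        # New clients start with a full bucket.
        tokens, last = self._state.get(client, (float(self.burst), now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._state[client] = (tokens - 1, now)
            return True
        self._state[client] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = TokenBucket(rate=1.0, burst=5)
    # 192.0.2.1 is a documentation-only placeholder address.
    print([limiter.allow("192.0.2.1") for _ in range(7)])
```

A polite crawler barely notices a limit like this, while a scraper hammering from a single address runs out of tokens quickly. It does nothing against scrapers spread over many addresses.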

    @forest said: Finally, proof-of-work involves completing a mathematical puzzle. It's automated in JavaScript, but it takes 100% CPU for a fraction of a second. That doesn't bother humans (too much) because it just slightly increases page load time, but it's a huge barrier for bots which would otherwise be able to visit thousands of pages per second.

    It bothers me, I hate Anubis, and I just close the site whenever I see the Anubis page.

  • MannDude Patron Provider, Veteran

    What is the resource bottleneck being exhausted?

    Everything I do is basically on 2GB of RAM or less, 1 vCPU. Nginx and PHP. No bot specific protection.

    What are we afraid of, exactly? That they'll access the publicly accessible data and information I willingly published to be accessible and available to everyone?

    Thanked by 1: raindog308
  • DrNutella

    I wonder if there’s something like a bot black hole I can create to trap them.

  • @sillycat said: Fix your website. You should be able to handle that much traffic.

    wdym my $12/y vps cant handle the LOAD

    Thanked by 1: sillycat
  • forest Member

    @MannDude said:
    What is the resource bottleneck being exhausted?

    Everything I do is basically on 2GB of RAM or less, 1 vCPU. Nginx and PHP. No bot specific protection.

    What are we afraid of, exactly? That they'll access the publicly accessible data and information I willingly published to be accessible and available to everyone?

    Bandwidth. There are many sites right now, especially medium-sized forums and blogs, that are being crippled by the tremendous traffic that these scrapers bring. I doubt many people would care if the bots were polite like web spiders.

  • forest Member
    edited April 6

    @Obelous said: It bothers me, I hate Anubis, and I just close the site whenever I see the Anubis page.

    I hate it too and often do as well. Unfortunately, the alternative for some sites might be a timeout or 503.

    @DrNutella said:
    I wonder if there’s something like a bot black hole I can create to trap them.

    A few posts up. ;)

    @forest said:
    Just put Nepenthes on your server:
    https://zadzmo.org/code/nepenthes/

    Demo page (what the AI sees with Nepenthes):
    https://zadzmo.org/nepenthes-demo


    Or if you want something that uses less CPU but is less evil to the bots, use Iocaine:
    https://iocaine.madhouse-project.org/

    Demo page (what the AI sees with Iocaine):
    https://poison.madhouse-project.org/

  • jcn50 Member
    edited April 6

    How can you have the AIRPLANE MODE activated and still be Wifi-connected??..

  • Obelous Member

    @jcn50 said: How can you have the AIRPLANE MODE activated and still be Wifi-connected??..

    Just enable airplane mode and then enable wifi...

    Thanked by 1: yoursunny