Someone is using 100s of Hetzner IP to bombard (crawl) my website

pkr Member
edited August 2022 in Help

Someone is bombarding my website using their crawler. All the IPs belong to Hetzner. I blocked ~50 IPs, but it has not stopped. What should I do? Is there any IP range of Hetzner so that I can block the full range?

Comments

  • https://bgp.he.net/AS24940#_prefixes
    Good luck with that.

    Looks like someone needs to refresh his Nginx rate limiting + captcha skills instead
    of blocking half of the internet. Next time it will be Amazon; will you do the same?

    Thanked by 1: martheen
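
The rate limiting suggested above is normally done in the web server's configuration rather than in application code. Purely to illustrate the idea, here is a minimal Python sketch of a per-client token bucket. The class name, parameters, and the choice of a token bucket are assumptions made for this example; nothing here is Nginx's own implementation.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests per second, bursts up to `burst`.

    Hypothetical helper for illustration only.
    """

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._buckets = {}  # client key -> (tokens left, time of last update)

    def allow(self, client_key):
        now = self.clock()
        # A client that has never been seen starts with a full bucket.
        tokens, last = self._buckets.get(client_key, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[client_key] = (tokens - 1, now)
            return True
        self._buckets[client_key] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = TokenBucketLimiter(rate=1, burst=2)
    # Three back-to-back requests from one client: the third is normally rejected.
    print([limiter.allow("192.0.2.1") for _ in range(3)])
```

When the traffic is spread over hundreds of addresses, as in this thread, a per-IP key may be too fine-grained; the same sketch works with a coarser key such as a network prefix.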
  • @pkr said:
    Someone is bombarding my website using their crawler. All the IPs belong to Hetzner. I blocked ~50 IPs, but it has not stopped. What should I do? Is there any IP range of Hetzner so that I can block the full range?

    What's the user agent?

  • MrRadic Patron Provider, Veteran

    @pkr said:
    Someone is bombarding my website using their crawler. All the IPs belong to Hetzner. I blocked ~50 IPs, but it has not stopped. What should I do? Is there any IP range of Hetzner so that I can block the full range?

    If you're behind Cloudflare, you can set up a firewall rule that applies a JS challenge to the Hetzner ASN instead of blocking it altogether. This will typically stop most bots without harming legitimate traffic.

  • @MrRadic said:

    @pkr said:
    Someone is bombarding my website using their crawler. All the IPs belong to Hetzner. I blocked ~50 IPs, but it has not stopped. What should I do? Is there any IP range of Hetzner so that I can block the full range?

    If you're behind Cloudflare, you can set up a firewall rule that applies a JS challenge to the Hetzner ASN instead of blocking it altogether. This will typically stop most bots without harming legitimate traffic.

    @MrRadic, do you recommend the JS Challenge over the Managed Challenge?

  • MikeA Member, Patron Provider

    What's the user agent?

  • @pkr said:
    Someone is bombarding my website using their crawler. All the IPs belong to Hetzner. I blocked ~50 IPs, but it has not stopped. What should I do? Is there any IP range of Hetzner so that I can block the full range?

    What is your website that he is crawling, lol? What kind of info is there?

  • @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

  • @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    This isn't even the OP

  • EthernetServers Member, Patron Provider
    edited August 2022

    You may like to consider using Cloudflare's WAF to block AS24940 for that specific domain: https://developers.cloudflare.com/waf/tools/ip-access-rules/create/

    There are other options outside of Cloudflare, but this is one of the easiest - especially if you are already using Cloudflare.

    EDIT: I didn't notice MrRadic's reply saying more or less the same thing. A challenge instead of a block would likely be fine as well.

    Or, if you don't want to do that, you could grab all of their prefixes from https://bgp.he.net/AS24940#_prefixes and block them at the iptables or .htaccess level, for example.
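
As a rough illustration of the prefix-blocking approach, the following Python sketch merges a list of CIDR prefixes, checks whether an address falls inside any of them, and prints matching iptables commands without running them. The prefixes are placeholders taken from the RFC 5737 documentation ranges, not real Hetzner prefixes, and the function names are made up for this example.

```python
import ipaddress

# Placeholder IPv4 prefixes (RFC 5737 documentation ranges), NOT real Hetzner prefixes.
RAW_PREFIXES = ["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24"]


def build_blocklist(prefixes):
    """Parse CIDR strings and merge adjacent or overlapping networks.

    Assumes all prefixes share one IP version; collapse_addresses raises
    TypeError if IPv4 and IPv6 networks are mixed.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(networks))


def is_blocked(ip, blocklist):
    """Return True if `ip` falls inside any network on the blocklist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)


def iptables_rules(blocklist):
    """Build (but do not execute) one DROP rule per network."""
    return [f"iptables -A INPUT -s {net} -j DROP" for net in blocklist]


BLOCKLIST = build_blocklist(RAW_PREFIXES)

if __name__ == "__main__":
    for rule in iptables_rules(BLOCKLIST):
        print(rule)
    print(is_blocked("192.0.2.200", BLOCKLIST))
```

Merging the prefixes first keeps the rule count down: the two /25 placeholders above collapse into a single /24.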

  • @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    @pkr != @Dazzle :D

  • Well, you can’t determine the range of IP addresses that way. Perhaps if you manage to find out the location, you can put a filter in place.

  • @econnreset said:

    @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    This isn't even the OP

    That domain is in the OP's signature.

    Q&A sites always attract scrapers, especially if the content is original. Even if the content was itself scraped from somewhere else, it still attracts lazy bloggers who scrape it and spin it or, better, translate it into other languages.

    I've seen it before, and it milks money from ads.

  • pkr Member

    Thanks, everyone, for the suggestions. I found a way to handle it.
    The crawler was hitting search.php on my site with billions of keyword combinations. I just redirected search.php, and now the server load is back to normal. I am not using Cloudflare (CF). My static content is delivered by PushrCDN.

    Thanked by 1: emg
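
The post above does not say how the search.php pattern was spotted. One possible way to find such a hot endpoint is to tally request paths in the web server's access log, sketched below in Python. The sketch assumes the common/combined log format, and the sample lines are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Captures the client IP and the request target of a common/combined-format log line.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST|HEAD) (\S+) [^"]*"')


def hot_paths(lines):
    """Count requests per URL path, ignoring the query string."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match:
            counts[urlsplit(match.group(2)).path] += 1
    return counts


# Invented sample lines; real input would come from the server's access log.
SAMPLE = [
    '192.0.2.10 - - [01/Aug/2022:10:00:00 +0000] "GET /search.php?q=foo+bar HTTP/1.1" 200 512 "-" "-"',
    '192.0.2.11 - - [01/Aug/2022:10:00:01 +0000] "GET /search.php?q=baz HTTP/1.1" 200 498 "-" "-"',
    '198.51.100.7 - - [01/Aug/2022:10:00:02 +0000] "GET /index.php HTTP/1.1" 200 1024 "-" "-"',
]

if __name__ == "__main__":
    for path, hits in hot_paths(SAMPLE).most_common():
        print(path, hits)
```

Grouping by path rather than by full URL is what makes a crawler that varies only the query string stand out as a single heavily hit endpoint.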
  • zcorps Member
    edited August 2022

    @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    Oh, I see, a Question2Answer site, lol. It reminded me of my past: I once had a Q2A site, but it attracted a lot of spammers. People would sign up and post links to their weed- and drug-selling sites, lol. It was hard to administer, so I deleted the project.

    • Your site looks like it has genuine content, but it seems it has not ranked yet. Its Alexa [dying] rank is above 100k.
  • pkr Member

    @zcorps said:

    @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    oh i see Question2Answers site lol reminded me of my past i had once Q2A site but it was attracting a lot of spammers , people would sign up and were posting there weeds drugs selling sites lol it was hard to administrate then i deleted the project.

    • your site looks like to have genuine content but seems like have not ranked yet. Alexa [ dying ] rank is above 100k.

    I am not posting regularly because I am busy finishing my graduation, so I am not working on its ranking.
    I have blocked registration; otherwise, users would have turned it into a Viagra-selling website.

    Thanked by 1: zcorps
  • @pkr said: otherwise users had made it Viagra selling website

    You sound as if it was something bad

    Thanked by 1: bulbasaur
  • @pkr said:

    @zcorps said:

    @Dazzle said:

    @zcorps said:

    what is your website that he is crawling lol what kind of info is there

    https://www.bitsdiscover.com/

    oh i see Question2Answers site lol reminded me of my past i had once Q2A site but it was attracting a lot of spammers , people would sign up and were posting there weeds drugs selling sites lol it was hard to administrate then i deleted the project.

    • your site looks like to have genuine content but seems like have not ranked yet. Alexa [ dying ] rank is above 100k.

    I am not posting regularly as I am busy with finishing my graduation; not working on its ranking.
    I have blocked registration, otherwise users had made it Viagra selling website.

    LOL, the same happened to me; they would post all sorts of shit :D
