Distributed scraping attack from random IPs

On June 25, 2026, one of our production Magento websites came under heavy load and stopped working. I looked into it and found that heavy traffic to the site was causing the load. The website is already behind the Cloudflare proxy, and we already have Nginx rate limiting in place for bots. When the issue happened, I initially enabled Nginx rate limiting and created a new Fail2Ban jail to automatically block abusive IPs. While monitoring the logs, I noticed the server was still under heavy load and PHP-FPM was repeatedly hitting its limits, eventually causing the website to return server errors.

After investigating further, it became clear that this wasn't a typical attack from a handful of IPs. It was a distributed scraping campaign where each IP was making only one or two requests, making traditional Nginx rate limiting and Fail2Ban largely ineffective, since no single IP exceeded the configured thresholds.
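One way to confirm this pattern from the access log is to measure how much of the traffic comes from IPs that stay under the per-IP limit. The following is a minimal Python sketch, not the actual analysis used here. The function names are made up for illustration, and it assumes the client IP is the first whitespace-separated field of each log line (behind Cloudflare, that only holds if Nginx is configured to restore the real visitor IP).

```python
from collections import Counter


def requests_per_ip(log_lines):
    """Count requests per client IP.

    Assumption: the IP is the first whitespace-separated field of each line.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(maxsplit=1)
        if parts:  # skip empty lines
            counts[parts[0]] += 1
    return counts


def share_below_threshold(counts, threshold):
    """Fraction of all requests sent by IPs with at most `threshold` requests."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    low = sum(n for n in counts.values() if n <= threshold)
    return low / total


if __name__ == "__main__":
    sample = [
        '198.51.100.1 - - "GET /women/tops.html?color=red HTTP/1.1" 200',
        '198.51.100.2 - - "GET /women/tops.html?size=M HTTP/1.1" 200',
        '203.0.113.9 - - "GET / HTTP/1.1" 200',
        '203.0.113.9 - - "GET /cart HTTP/1.1" 200',
        '203.0.113.9 - - "GET /checkout HTTP/1.1" 200',
    ]
    counts = requests_per_ip(sample)
    print(share_below_threshold(counts, threshold=2))
```

If most of the load sits below any threshold that real visitors would also hit, per-IP rate limiting and Fail2Ban have nothing to act on.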

Most of the requests were targeting Magento category and product pages with various filter combinations in the query string. These requests appeared to bypass Varnish caching, resulting in PHP generating the pages repeatedly, which explains why PHP-FPM became overloaded.
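To illustrate why filter combinations defeat the cache, and what normalizing them could look like, here is a hedged Python sketch. It sorts the query parameters and drops everything not on an allowlist, so that equivalent URLs collapse to one cache key. The allowlist contents and the function name are hypothetical, and on a real setup this logic would live in the cache layer rather than in Python. It also only helps against reordered or junk parameters, not against a scraper walking through genuinely distinct filter combinations.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist of filter parameters the shop actually uses.
ALLOWED_FILTERS = frozenset({"color", "size", "price", "p"})


def canonical_cache_key(url, allowed=ALLOWED_FILTERS):
    """Return path plus sorted, allowlisted query parameters."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)  # blank values are dropped by default
    kept = sorted((k, v) for k, v in pairs if k in allowed)
    # Scheme and host are left out so the key only depends on path and filters.
    return urlunsplit(("", "", parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    a = canonical_cache_key("https://shop.example/women/tops.html?size=M&color=red&utm_source=x")
    b = canonical_cache_key("https://shop.example/women/tops.html?color=red&size=M")
    print(a)
    print(a == b)
```

Counting distinct raw URLs against distinct canonical keys in the access log gives a rough idea of how much of the cache bypass is avoidable.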

While analyzing the access logs more closely, I noticed another interesting pattern. A large number of these requests were using randomly generated, but consistently old, Windows and Chrome user agents. Based on that observation, I created a Cloudflare WAF custom rule to apply a Managed Challenge to requests matching those characteristics. Within just a few minutes, the rule had already triggered around 1,360 challenges, confirming that a significant amount of the traffic matched the pattern.
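The exact rule is not reproduced here, but the kind of check it is based on can be sketched in Python for testing against log samples before putting it into a WAF rule. This is an assumption-laden illustration: the function name is made up, and `min_major` is a hypothetical cutoff for what counts as "old" that would need to be chosen from the actual traffic.

```python
import re

# Matches the major version in a "Chrome/<major>." token.
CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def is_outdated_chrome_on_windows(user_agent, min_major):
    """True if the UA claims Windows and a Chrome major version below `min_major`."""
    if "Windows NT" not in user_agent:
        return False
    match = CHROME_RE.search(user_agent)
    if match is None:
        return False
    return int(match.group(1)) < min_major


if __name__ == "__main__":
    old_ua = (
        "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"
    )
    print(is_outdated_chrome_on_windows(old_ua, min_major=100))
```

Running something like this over a day of normal traffic first gives an estimate of how many legitimate visitors would be challenged by the same rule.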

Since enabling the WAF rule, the server load has returned to normal, PHP-FPM is no longer being overwhelmed, and both the website and internal links are functioning normally again.

I'm still interested in understanding the root cause, though. Has anyone else encountered this type of distributed scraping against Magento, where thousands of IPs each make only a couple of requests specifically to filtered category/product pages? If so, how did you handle it without affecting legitimate users? I'm particularly interested in improving cache efficiency or identifying better ways to stop this type of traffic before it reaches the origin server. Since the rise of AI, I have been seeing these kinds of issues on random servers.

Comments

  • I notice a lack of Anubis; it stopped my repos getting hammered

  • edited 4:41PM

    @hostcurator said:
    I'm still interested in understanding the root cause, though. Has anyone else encountered this type of distributed scraping against Magento, where thousands of IPs each make only a couple of requests specifically to filtered category/product pages? If so, how did you handle it without affecting legitimate users? I'm particularly interested in improving cache efficiency or identifying better ways to stop this type of traffic before it reaches the origin server. Since the rise of AI, I have been seeing these kinds of issues on random servers.

    Given I don't do a lot of web hosting these days, I haven't, but I think it'll become more and more common. It's just logical. Protection is being ramped up and the scrapers follow suit. Consider yourself lucky they gave you such an easy out with the outdated agent strings. Sadly that's very, very easy to fix on the scrapers' side, and the more these oversights lead to successful blocks, the more they'll be avoided.

    To be brutally honest, I don't have a positive outlook as far as a functional web is concerned. As things currently stand, there's basically no way around turning it into a wasteland of increasingly convoluted captchas, and even that might not translate into a solution.

  • layer7 Member, Host Rep, LIR
    edited 4:59PM

    Hi,

    essentially you have the choice between:

    A ) paid protection ( akamai, cloudflare, ... )
    B ) a (higher) risk of false positives from GEO blocking, timing-based blocking, ASN / IP blocking and the like, which will filter out 90% of attackers while also filtering out 10% of legit users. That means you still have to deal with 10% of the attacker load and are left with only 90% of your legit user traffic

    Both are obviously annoying, but that's how it is.

    Depending on your location, I generally suggest blocking all public clouds, all hosting-type IPs, all VPN-type IPs and so on.

    There is usually no real reason for any hosting/cloud IP to access a webshop (other than to scrape information).

    Whether VPN access is required is also questionable. But that depends on your own location and target market.

    The times of the free internet are already long gone. And with improving bots / AI / ..., everything but IP / ASN / network-based filtering will soon turn completely ineffective. You can already see that captchas are losing effectiveness. A new captcha comes out, helps for some time, and then loses its effectiveness again. An endless cycle.

    So the best option is to filter by GEO / IP type / ASN. Not nice, but all in all the lesser evil.

  • slowservers Member, Host Rep

    Unfortunately, this is a pretty major thing that I'm not sure any one person has a good answer for.

    Mythic Beasts has encountered the same sort of behavior: https://www.mythic-beasts.com/blog/2025/04/01/abusive-ai-web-crawlers-get-off-my-lawn/

  • LEBUserJoe Member
    edited 5:21PM

    98% of bad actors rely on public proxies. A lot of public projects on GitHub let you retrieve the majority of these at an interval you set; add those proxies to an auto-drop list, either at server level or, ideally, at Cloudflare level using their rules.

    Stops scraping, most simple sign-up DDoS platforms, etc.
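As a rough illustration of the auto-drop list idea, the Python sketch below loads a list of proxy addresses or CIDR ranges and checks whether a client IP falls into it. It uses only the standard `ipaddress` module. The function names are hypothetical, and where the list comes from and how often it is refreshed are left open, since no specific project is named above.

```python
import ipaddress


def load_networks(entries):
    """Parse lines of IPs or CIDR ranges, skipping blanks and '#' comments."""
    networks = []
    for entry in entries:
        entry = entry.strip()
        if not entry or entry.startswith("#"):
            continue
        # strict=False accepts entries with host bits set, e.g. 203.0.113.5/24.
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_listed(ip, networks):
    """True if `ip` is inside any of the loaded networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    # Documentation-only example ranges, not a real proxy list.
    drop_list = load_networks(["203.0.113.0/24", "# comment", "198.51.100.7"])
    print(is_listed("203.0.113.9", drop_list))
    print(is_listed("192.0.2.1", drop_list))
```

A linear scan like this is fine for testing a list against log samples; for a large list on a live server, a firewall set or an edge-level IP list is the more realistic place for the lookup.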

    Truth is, it's a chasing game, and it depends on how much you want to suffer. GEO blocking sadly does nothing, since cheap rotating proxies are very accessible.

    Platforms like Akamai are great, truly great, but everything is bypassable. Sneaker bots have been selling bypasses to Cloudflare, fast, Akamai for years for a few $100/month

  • @layer7 said:
    There is usually no real reason for any hosting/cloud IP to access a webshop (other than to scrape information).

    Well, being able to concentrate my browsing on a fixed (DC) IP has served me well over the years keeping Paypal from freaking out while staying abroad but, yeah, chances are that's coming to an end. Paypal has already become way more aggressive lately and i wouldn't be surprised at all if one day they'd just tell me to come back with a normal IP.

  • layer7 Member, Host Rep, LIR

    @totally_not_banned said:
    Well, being able to concentrate my browsing on a fixed (DC) IP has served me well over the years keeping Paypal from freaking out while staying abroad but, yeah, chances are that's coming to an end. Paypal has already become way more aggressive lately and i wouldn't be surprised at all if one day they'd just tell me to come back with a normal IP.

    Hi,

    yes, unfortunately. With the broad availability of AI at the latest, developing something to get around a newly implemented filter has become faster and easier than ever.

    Soon there will be little point in "inventing" new filters if workarounds are available within hours or days.

    Maybe in the future you will be able to have a specific IP whitelisted with big websites if you are a paying / identified customer there, and that way keep access to the service even from a VPN IP.

    Let's see where the road goes...

  • rpqu Member

    @layer7 said:

    @totally_not_banned said:
    Well, being able to concentrate my browsing on a fixed (DC) IP has served me well over the years keeping Paypal from freaking out while staying abroad but, yeah, chances are that's coming to an end. Paypal has already become way more aggressive lately and i wouldn't be surprised at all if one day they'd just tell me to come back with a normal IP.

    Hi,

    yes, unfortunately. With the broad availability of AI at the latest, developing something to get around a newly implemented filter has become faster and easier than ever.

    Soon there will be little point in "inventing" new filters if workarounds are available within hours or days.

    Maybe in the future you will be able to have a specific IP whitelisted with big websites if you are a paying / identified customer there, and that way keep access to the service even from a VPN IP.

    Let's see where the road goes...

    Assign an IPv6 /128 for a paying customer's panel/API access.
