How do you defend your VPS from a DoS or DDoS attack, or from bot/AI scrapers?
So let's say you have a VPS with
1 vCore
1 GB RAM
10-20 GB SSD
1 Gbps or 10 Gbps
running Debian or Ubuntu and using ufw as a firewall (or iptables)
and running nginx as a reverse proxy. Is it possible to defend this, whether or not the provider has DDoS protection? I was thinking of using https://gitgud.io/fatchan/haproxy-protection/ (Basedflare) at first, then https://anubis.techaro.lol/ (Anubis) appeared. I recently found https://git.gammaspectra.live/git/go-away (go-away). I was thinking of
VPS (running nginx on port 82 or 81) & (Basedflare/anubis/go away on port 80/443) -- Cloudflare Tunnels -> Behind CF
or
VPS (running nginx on port 82 or 81) & (Basedflare/anubis/go away on port 80/443)
or
nginx but exposed on port 80/443
or
nginx behind CF tunnels
The exposed ones would use Certbot to get SSL certificates from Let's Encrypt, and the one using CF would just proxy port 80 (nginx) or ports 81-85 (running Basedflare, Anubis, or go-away).
Most of my servers use one of the last two setups: either exposed directly or behind CF.


Comments
You can harden a 1-core / 1 GB VPS against light DoS and bot floods, but once real DDoS traffic hits (hundreds of Mbps+), there’s not much you can do without upstream filtering — the pipe just gets saturated before your rules even kick in.
For light attacks or scraper spam though, you can make life hard for them:
Rate limit in Nginx: limit_req_zone is your friend. It’s lightweight and can slow down scraper bursts.
Use fail2ban: good for catching repeat offenders hitting the same endpoints.
CF Tunnel is the most painless solution if you don’t mind putting Cloudflare in front — it hides your origin IP and filters most junk automatically.
Tools like Anubis or go-away are fine for layer-7 junk (especially AI scrapers or malformed requests), but they won’t help against volumetric attacks.
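To show why rate limiting slows down scraper bursts, here is a minimal Python sketch of a leaky-bucket limiter, loosely modelled on the behaviour of nginx's limit_req. It is not nginx's code: the class name, the rate of 5 requests per second, and the burst of 10 are assumptions picked for illustration.

```python
class LeakyBucket:
    """Per-client leaky-bucket limiter (illustrative, not nginx's implementation)."""

    def __init__(self, rate, burst):
        self.rate = rate      # requests drained from the bucket per second
        self.burst = burst    # how far a client may run ahead of the steady rate
        self._state = {}      # client -> (excess requests, time of last accepted request)

    def allow(self, client, now):
        if client not in self._state:
            # First request from this client always passes.
            self._state[client] = (0.0, now)
            return True
        excess, last = self._state[client]
        # Drain for the elapsed time, then add the current request.
        excess = max(0.0, excess - (now - last) * self.rate + 1.0)
        if excess > self.burst:
            return False      # over the limit; rejected requests do not change state
        self._state[client] = (excess, now)
        return True


# Hypothetical numbers: 5 requests/second steady rate, burst of 10.
limiter = LeakyBucket(rate=5.0, burst=10)
burst_results = [limiter.allow("203.0.113.7", 0.0) for _ in range(20)]
```

In this sketch a client that fires 20 requests in the same instant gets the first request plus the burst allowance through, and the rest are rejected until the bucket drains. Other clients are tracked separately, so one noisy IP does not affect the rest.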
In short:
If your provider has decent DDoS protection, use it and add your local filtering as icing.
If they don’t, use Cloudflare (tunnel or proxy mode) — that’s about the only realistic line of defense for such a small VPS.
I’d personally go:
VPS → CF Tunnel → Nginx (port 81) — minimal exposure, simple setup, no certificate headaches.
If you don't want to use CF Tunnel, a simple firewall that blocks all IPs except the ones you want to allow to SSH (or any other open port you want to restrict) works too, in addition to the suggestions above.
You could also run CSF, but it would add too much overhead for such a tiny 1 GB VPS.
CSF is becoming abandoned
I don't really think you can defend your website against AI crawlers.
A few weeks ago, I posted AI summaries on LET here. How do you think they worked?
CF Crawl protection is a joke.
I don't think it would be wise to link that GitHub repo here.
For nginx I usually have a GeoLite2 Country database file to block countries like the UK or Australia with a 444 or 451 status code (444 is nginx-specific and closes the connection without a response). I do have rate limiting in place. I haven't used fail2ban for years. I have used CF tunnels for one of my VPSes and on my servers at home, behind nginx, for services.
I would usually block hosting ASNs at the CF firewall level.
It's all a question of how many resources you are willing to spend to crawl someone's website.
Residential proxies exist.
Yeah, you can block them, but that doesn't prevent anyone from crawling your website.
This would be a cat-and-mouse game, and I guess blocking bots would be a start, like blocking AWS for example. Honestly, it would be too much to check and block.
Safeline WAF is pretty good at blocking bots, and if you use the "Dynamic Protection" it will also help prevent AI scrapers from stealing your content: https://github.com/chaitin/safeline
Useless; you might as well not use any protection at all. There's a reason why DDoS protection costs as much as it does (not to the end client).
https://llm-brain-rot.github.io/
my im ai ignor
I am actually going through this crap right now. I put the sites behind Cloudflare and blocked all AI bots. I monitor the access.log* files with goaccess and mass-block ASNs that I recognize as data centers that do not sell internet access to residential customers. It's getting more difficult day by day, since they are now coming in through residential connections. Rate limiting is no use either, since they make only 1-10 requests per second from thousands of different IPs, trying to blend in with real users.
Sometimes it becomes impossible to block them and I have to turn on "Under Attack" mode on Cloudflare, but this degrades the user experience (captcha).
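The log review described above can be partly automated. Below is a rough Python sketch that groups access-log hits by network prefix, so that many low-rate IPs from the same block show up as one large entry. It assumes the client IP is the first field of each line (as in nginx's default combined format), and it uses /24 and /48 prefixes as a crude stand-in for an ASN lookup, which would need an external database. The function names are made up for this example.

```python
import ipaddress
from collections import Counter


def network_key(addr):
    """Map an IP address to its enclosing /24 (IPv4) or /48 (IPv6) network."""
    ip = ipaddress.ip_address(addr)          # raises ValueError if not an IP
    prefix = 24 if ip.version == 4 else 48   # assumed prefix sizes, not ASN boundaries
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))


def top_networks(log_lines, limit=10):
    """Count requests per network and return the busiest ones first."""
    hits = Counter()
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if not fields:
            continue                         # blank line
        try:
            hits[network_key(fields[0])] += 1
        except ValueError:
            continue                         # first field is not an IP address
    return hits.most_common(limit)


# Made-up sample lines using documentation-only IP ranges.
sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.9 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 512',
    '198.51.100.4 - - [01/Jan/2025:00:00:03 +0000] "GET /b HTTP/1.1" 200 512',
    'not-an-ip garbage line',
]
busiest = top_networks(sample)
```

To run it against a real log, pass an open file instead of the sample list, for example `top_networks(open("/var/log/nginx/access.log"))` if that is where your logs live. The result is only a hint about where traffic clusters; it does not identify residential proxies.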
If you get DDOSed: wait until they get bored or run out of money. They're paying $50 an hour on daddy's stolen credit card.
If you get AI scraped: make your site more efficient so you can handle it, then feed them false information.
This was what I was doing on my main domain (not subdomains) before, since I wanted to block every single hosting/proxy/VPN ASN to keep bots and scalpers from reaching my personal domain. I was looking for other ways to block them, but @fatchan DMed me this morning, so I'm talking to him at the moment.
How is this going to help? I get that there are some projects that will poison an LLM/AI by feeding it false data, but DDoS is a different problem at the L7/L4 level.
How does a search engine work?
By "scraping" all of the world's websites
By blocking all bots except Google, you are helping Google's monopoly
Why is bot protection on static sites so popular these days?
I accept CAPTCHA on forms however
Funny enough, I have nginx set up to hide from search engines, and my website is only linked on my Twitter, Discord, GitHub, and OSU! profiles. I still distrust bots/LLMs/AI, though, even though Comet and Atlas are just Chrome with AI built in.
It helps your mental sanity. DDOS is mostly a form of trolling. They want you to feel angry. If you get more and more angry, they win. If you ignore them, you win. You have to live with the server being inaccessible for a few days, though.
Now they are launching browsers; it wasn't sufficient until now.
Someone built a Docker container that provides a REST API for you to call, which starts multiple browser instances in the background.
I modified his code and turned the REST API into a transparent proxy.
It works better than expected.
It probably depends on what services I'm running. If it's a simple blog or homepage, eh, but if it's something I use every day like Nextcloud/copyparty or Jellyfin, then I would look at the logs, find which IP the attack is coming from, and cut off the reverse proxy or WireGuard connection to prevent my server from being overloaded (and ban the ASN/IP range after the attack).
It's unlikely your self hosted services get attacked if you are the only one that knows what's running there. Nobody's out there DDOSing random IPs for the fun of it.
True, but it doesn't stop someone from looking for subdomains or SSL certificate history (crt.sh). But given what I know, the risk for me is very low, since I tend to use a wildcard SSL certificate for my subdomains under nginx, and I have nginx set up to de-index the sites and hide them from being searchable anywhere.
Defending your VPS from DoS/DDoS attacks and AI/Bot scrapers requires layered protection: network-level hardening, application-level defenses, and smart rate control. Use a provider with built-in DDoS mitigation. This is the most effective and simplest protection. They use hardware scrubbers that clean traffic before it reaches your server. This will stop DoS, scraper, and brute-force bots without expensive appliances.
Is someone out there DDOSing random domains that say plex for the fun of it?
Zero. Just pointing out the risk, nothing else.
DDoS is hard to protect against without upstream filtering. The issue is that an attack can constantly fill your conntrack table, preventing new inbound connections. That can happen even if you try nftables (or legacy iptables) based protection like fail2ban, CrowdSec, etc. or manually writing your own nftables rules.
You really need upstream filtering (DDoS protection at your provider) or a third-party solution like Cloudflare, etc. If you do use a third-party service like Cloudflare, you need to ensure that only Cloudflare IPs can connect to your server, to prevent attackers from bypassing Cloudflare and connecting directly.
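As a sketch of the "only Cloudflare IPs can connect" step, the Python below turns a list of CIDR ranges into ufw allow rules and checks whether a given source address falls inside the list. The three ranges are an illustrative subset only; the current, complete list should be taken from what Cloudflare publishes. The function names are hypothetical, and the generated commands are printed rather than executed.

```python
import ipaddress

# Illustrative subset only; fetch the authoritative list from Cloudflare before use.
EXAMPLE_RANGES = ["173.245.48.0/20", "103.21.244.0/22", "2400:cb00::/32"]


def ufw_rules(ranges, ports=(80, 443)):
    """Build one 'ufw allow' command per range and port (nothing is executed)."""
    rules = []
    for cidr in ranges:
        net = ipaddress.ip_network(cidr)     # raises ValueError on a malformed range
        for port in ports:
            rules.append(f"ufw allow from {net} to any port {port} proto tcp")
    return rules


def from_allowed_range(addr, ranges):
    """Return True if addr belongs to any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(cidr) for cidr in ranges)


rules = ufw_rules(EXAMPLE_RANGES)
for rule in rules:
    print(rule)
```

These rules only restrict access if the default incoming policy is deny (`ufw default deny incoming`) and no broader allow rule for ports 80/443 exists. With a CF Tunnel the origin makes the outbound connection itself, so ports 80/443 can stay closed entirely.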
It's fairly trivial to portscan the entire IPv4 address range, so if you're running something, someone's going to find it at some point, even if you change the default port. (that's less likely if you run it on a random IPv6 address in your /64 range)
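A quick back-of-the-envelope calculation in Python shows the difference in scale. The probe rate of one million addresses per second is a hypothetical figure chosen for the example, not a measurement of any particular scanner.

```python
def scan_seconds(address_bits, probes_per_second):
    """Time to probe every address in a space of 2**address_bits addresses once."""
    return (2 ** address_bits) / probes_per_second


RATE = 1_000_000  # hypothetical probes per second

ipv4_minutes = scan_seconds(32, RATE) / 60                   # whole IPv4 space
ipv6_64_years = scan_seconds(64, RATE) / (365 * 24 * 3600)   # a single IPv6 /64

print(f"IPv4: about {ipv4_minutes:.0f} minutes")
print(f"One IPv6 /64: about {ipv6_64_years:,.0f} years")
```

At that assumed rate the whole IPv4 space takes a bit over an hour, while a single /64 would take hundreds of thousands of years. That only covers blind scanning; an address can still leak through DNS records, certificate logs, or outbound connections.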