Are you web scraping? How can VPN.fail help? :)
As you know, VPN.fail is a project dedicated to helping internet users from countries such as Iran, Russia or China bypass censorship and restrictions and access the free global internet.
Although our main focus is anti-censorship technology, our network inevitably also gets used by people from countries where the internet is free; we assume they just want to change their IP address.
So we're considering upgrading our VPN servers to offer a friendly solution for people who are web scraping.
If you're running a project that scrapes the web, we have some questions for you, so we can better understand how to help:
How many unique IPs do you need? How often do you change them?
How many locations (and what locations) are you utilizing?
What type of IPs are you using: data center, residential, or mobile?
Tell us about the tech stack you prefer. Do you connect directly to SOCKS/HTTP proxy, do you prefer using a web API, or other technologies?
What volumes do you generate when scraping (traffic, requests per day or per hour)?
What types of websites do you target for your scraping activity?
Last but not least, to ensure the scraping can be done without legal issues - what specific data are you interested in scraping?
Thank you 🙏
Comments
Welcome aboard 🚀
🚀🚀🚀 thanks!
Billions.
Change IP after each request.
193 countries and 9999 cities.
Cellular.
WireGuard.
112 Kbps.
Dealz.
CP.
Central Park
If you invite scrapers to your network you will risk that your human users will be blocked from the sites they want to visit.
This looks like a genuine call for feedback, so I'll give my honest replies.
Only a handful, really. Something in the range of 10-20 perhaps.
It depends on how often you plan to reuse different IPs.
If it's a pool of 20 different IP addresses, shared between all users, they can quickly be marked by various tools.
Northern European. Amsterdam and Frankfurt mainly.
This is just because of my servers' locations.
Datacenter is fine.
Residential would open the possibilities for way more use cases, but for the regular scraping I'm doing, a transparent proxy is fine.
Socks5 is absolutely preferred. This is the most universal, and allows for connecting in many different ways.
It depends heavily on what I'm actually scraping: at the low end 1-5 requests per day, at the high end 15-20 per minute.
Online stores and social media.
I'd like to add that if you plan to start selling premium subscriptions, a good way to get an initial load of cash is to do a Lifetime deal promotion.
Basically selling lifetime licenses to the first front runners, which will in turn help give you feedback.
I know this is how I prefer to pay for VPNs at least.
thanks - this is a very valid point. we're considering deploying separate IP addresses for scrapers, as they usually need a big number of IPs. regular VPN users don't mind sharing the same IP, it actually improves their privacy/anonymity
highly appreciate response 🙏 it's very helpful!
fair point about sharing a few IPs with lots of users. IPs should ideally be dedicated to each user. it could work with sharing the same IP between a very small number of users, if their use cases are compatible and don't generate trouble between each other
europe and US are good for bandwidth costs 👍 asia is what we see a bit challenging
we are thinking of starting tests with some extra IPs, which will of course be data center IPs
would go for socks5 myself, but still curious how popular API based solutions are
your usage sounds more than decent!
thanks - duly noted!
good to know you don't need a proxy in the Vatican!
FYI, IPv6 proxies exist & changing IP after each request isn't something outlandish for any type of proxy.
Banning by a /64 isn't unheard of, in fact I think it's kinda customary.
If I can get my hands on residential IPs, then most of the time even two or three of them are sufficient for my needs.
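To illustrate the /64 point: a site that bans by prefix treats every address inside the same /64 as one client, so rotating within that block does not help. A minimal sketch using Python's standard `ipaddress` module; the helper name `same_slash64` is hypothetical and the addresses come from the IPv6 documentation range, not from any real proxy.

```python
import ipaddress

def same_slash64(a: str, b: str) -> bool:
    """Return True if two IPv6 addresses fall inside the same /64 prefix."""
    # strict=False lets us pass a host address and get its enclosing network.
    net_a = ipaddress.ip_network(f"{a}/64", strict=False)
    return ipaddress.ip_address(b) in net_a

# Example addresses from the documentation range 2001:db8::/32 (placeholders).
print(same_slash64("2001:db8:0:1::10", "2001:db8:0:1:ffff::20"))  # same /64
print(same_slash64("2001:db8:0:1::10", "2001:db8:0:2::10"))       # different /64
```

So a pool only counts as "many IPs" against such a ban if its addresses are spread over many /64 blocks.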
1-2 IP per country.
6 Europe, 4 US, 4 APAC.
residential. I prefer to pay a friend for keeping my orangepi / rooted phone / nuc at their place, and then use them as exit nodes. but i do have a backup provider in case i suddenly need something. there are also devices that were donated for my use, but those don't count as my own property.
datacenter IP / LET vps are for hosting api endpoints.
each client acts independently using a preconfigured setup (just some standard headless chrome and archivebox instances). there are api endpoints available in case a scraper client is looking for "tasks". output data is sent to elasticsearch using filebeat. traffic is mostly over PPTP / wireguard.
I like the flexibility of soc devices. with the orange pi, for example, I created my own custom armbian with a specific bootstrap script, a la tailscale setup. plus they're cheap.
20-30 requests per day per website per IP, 1440 requests for dns queries per IP.
don't really care about legal issues, everyone working on my scraping project has agreed not to share the scraped data with the public. we only need text, no images. then again, the data is publicly accessible to begin with.
we don't do any form of cookie/session stuffing to see data that's behind a login gate. and if it's recaptcha or hcaptcha, we just outsource them to some indian captcha services.
How many unique IPs do you need? How often do you change them?
The more the better, to rotate them per request
No particular locations for my kinds of projects
Residential preferably, but seems any other would be fine for some of the target sites
http
It varies: sometimes 100k pages in a day, sometimes fewer
Corporate websites
Ecommerce websites
About pages
Services
Product pages
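For readers wondering what "rotate them per request" looks like in practice, here is a minimal round-robin sketch. The pool addresses and the helper name `proxy_rotation` are hypothetical placeholders; the dict shape matches what the third-party `requests` library accepts as its `proxies=` argument, but no request is actually sent here.

```python
from itertools import cycle
from typing import Dict, Iterator, List

def proxy_rotation(proxy_urls: List[str]) -> Iterator[Dict[str, str]]:
    """Return an endless iterator yielding one proxy mapping per request."""
    if not proxy_urls:
        raise ValueError("proxy pool is empty")
    # cycle() restarts from the first proxy once the pool is exhausted.
    return ({"http": url, "https": url} for url in cycle(proxy_urls))

# Hypothetical pool: hosts and ports are placeholders, not real proxies.
POOL = [
    "http://198.51.100.1:8080",
    "http://198.51.100.2:8080",
    "http://198.51.100.3:8080",
]

rotation = proxy_rotation(POOL)
for _ in range(4):
    # Each outgoing request would take the next mapping from the rotation.
    print(next(rotation)["http"])
```

With a pool of three, the fourth request reuses the first proxy, which is why "the more the better" applies when rotating on every request.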
do you need residential in specific geolocation, or any country works?
thanks for your input, and very interesting project monitoring country-wide internet censorship. do you publish this data somewhere?
specific countries (most of the time)
no. at the very least there is no plan yet to publish it. we're still looking for data in high-interest countries (like some countries in south asia and the middle east)
👍
got it. good luck with the project going forward!