MJJs, how does one get a Baidu Japan server? (111.108.96.0/20)
A service I operate has been getting scraped by (AI-related) bots that partially operate from 111.108.96.0/20, which is leased by Baidu Japan from KDDI according to WHOIS. I was just wondering whether anyone knows how the operator got servers in this IP range, since as far as my research indicates Baidu doesn't operate cloud services there.
It wouldn't surprise me if Baidu itself has been operating this scraper, but I was just curious whether there's a way for China-based entities to purchase Baidu Japan services.

Comments
Baidu doesn't sell any VPS/cloud services in Tokyo, unless my knowledge is outdated and they started doing so recently.
The bot is scraping for use in its search engine.
https://www.wsj.com/tech/ai/baidus-ai-assistant-reaches-milestone-of-200-million-monthly-active-users-2ad30bfb
It's definitely not search engine scraping: they have implemented functionality to register accounts on our platform, and they send POST requests with AI-generated content.
I suspect it's for AI benchmarking since we operate an automated grading service. Further investigation indicates that a Baidu engineer is definitely involved with the creation of this particular bot and they are continuing to create new accounts even as we ban them.
They also briefly switched to fake US residential IP addresses (under a residential AS but leased to third-party companies), but we started banning their /24s, so they had to switch back to ones that register as servers in common IP databases.
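For anyone doing the same kind of /24 banning, here is a rough sketch of how it could be scripted with Python's standard `ipaddress` module. It is not the poster's actual tooling: the function names, the `min_hits` threshold, and the sample addresses are all made up for illustration, and it assumes IPv4 only. The only value taken from the thread is the 111.108.96.0/20 range.

```python
import ipaddress
from collections import Counter

# Range named in the thread (leased by Baidu Japan from KDDI per WHOIS).
REPORTED_RANGE = ipaddress.ip_network("111.108.96.0/20")


def in_reported_range(ip: str) -> bool:
    """Return True if the address falls inside the reported /20."""
    return ipaddress.ip_address(ip) in REPORTED_RANGE


def subnets_to_ban(ips, min_hits=3):
    """Group offending IPv4 addresses by /24 and return the busy subnets.

    min_hits is a hypothetical threshold: a /24 is only returned once at
    least that many offending addresses have been seen inside it.
    """
    counts = Counter(
        # strict=False masks the host bits, e.g. 111.108.97.5 -> 111.108.97.0/24
        ipaddress.ip_network(f"{ip}/24", strict=False)
        for ip in ips
    )
    return sorted(net for net, hits in counts.items() if hits >= min_hits)


if __name__ == "__main__":
    # Hypothetical sample of offending client addresses pulled from logs.
    offenders = ["111.108.97.5", "111.108.97.9", "111.108.97.200", "203.0.113.7"]
    for net in subnets_to_ban(offenders):
        print(net, "inside reported /20:", net.subnet_of(REPORTED_RANGE))
```

Requiring several hits before banning a whole /24 is a judgment call: it lowers the chance of blocking unrelated users who share a subnet with a single bad address, at the cost of reacting a little more slowly.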
Baidu has previously been exposed for using subsidiary companies that engage in shady practices, such as large-scale ad fraud, where ad-clicking functions were embedded into apps downloaded by millions of users.
They likely scrape content, rewrite it, and use it elsewhere, where they can monetize it in some shady way.
You can name and shame them publicly by posting exactly what they are doing, backed up with logs and screenshots.
Otherwise, there isn’t much you can do beyond playing whack-a-mole by banning their IP addresses.