
need help and insight about huge visitor traffic spikes.

emre Member, LIR

Here is the situation I'm in, briefly; I need some insight and thoughts from you about how to solve this problem.

I've got a customer who is a social media phenomenon, meaning he has around 10-15 million followers on Snapchat and Instagram.

He also has an e-commerce website where he sells his own branded products, which happens to be hosted by poor me.

Twice a week he adds a new product to his website and posts about it on his social media accounts.

Normally his e-commerce website gets around 10 visitors per second, but after his social media posts traffic spikes to 35-40 thousand unique IPs per second.

Somehow I need to deal with these traffic spikes, and his website must never be slow to load, give an error, or stop responding. His business must keep going and new orders must be taken.

These 35-40 thousand visitor spikes last for at most 1 hour after his social media posts, and traffic returns to the normal 10 (ten) visitors per second afterwards.

So I need a system which can handle these visitor spikes.

I am thinking of load balancing, and according to my own tests each server can currently handle at most 2,000 visitors/unique IPs per second.

His website is a custom PHP/MySQL e-commerce site, nothing special: only 3-4 products on sale and only about 5-6 pages total.

So I need at least 20 servers per traffic spike.

But I only need these servers for about 1-2 hours around each spike.

What can you recommend? Has anyone here ever dealt with a problem like this?

I need some insight and brainstorming on this problem.

Thank you all for your thoughts :)

I am listening now..


Comments

  • Hxxx Member
    edited August 2017

    You could benefit from this stack:

    - Try a LAMP stack with Nginx in front.

    - You can also try a plain LEMP stack.

    - OPcache can be of benefit too.

    - Read about Redis.

  • Nginx FastCGI cache is also really nice. You can also try Varnish. For both, putting the cache directory on a ramdisk helps big time.
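
    For illustration, a minimal sketch of an Nginx FastCGI cache with its cache directory on a tmpfs ramdisk (the paths, sizes, domain and PHP-FPM socket are assumptions; adjust to your setup):

    # mount a small ramdisk for the cache (assumed path /var/cache/nginx-ram)
    mkdir -p /var/cache/nginx-ram
    mount -t tmpfs -o size=512m tmpfs /var/cache/nginx-ram

    # drop a cache config into conf.d (assumes stock nginx with php-fpm on a unix socket)
    cat > /etc/nginx/conf.d/fastcgi_cache.conf <<'EOF'
    fastcgi_cache_path /var/cache/nginx-ram levels=1:2 keys_zone=SITE:64m
                       max_size=400m inactive=10m;
    fastcgi_cache_key "$scheme$request_method$host$request_uri";

    server {
        listen 80;
        server_name shop.example.com;          # placeholder domain
        root /var/www/shop;

        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_pass unix:/run/php-fpm/www.sock;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_cache SITE;
            fastcgi_cache_valid 200 301 10s;   # short TTL keeps product pages fresh
            fastcgi_cache_use_stale updating error timeout;
            # never cache carts/checkout or logged-in sessions
            fastcgi_cache_bypass $cookie_PHPSESSID;
            fastcgi_no_cache $cookie_PHPSESSID;
        }
    }
    EOF
    nginx -t && systemctl reload nginx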

  • Amazon EC2 + load balancing should do the trick: when there's a mass traffic spike, spin up instances behind the balancer; when it dies down, stop them. If you're looking for a ready-made solution that can support this traffic, send me a PM. I can provide a reasonably priced solution.

  • You could try OpenLiteSpeed, which should speed up your web server, but you should talk to your customer about getting a load balancer and multiple servers behind it to offset the load. It all depends on how much your customer is paying you and how much he is willing to spend extra. Getting a CDN would be ideal; even Cloudflare for free can help by caching static resources (of course you still need a way to handle the checkout).

  • emre Member, LIR

    @hxxx @bugrakoc @dwtbf @Edmond thank you for your kind words,

    I will make the site as static as possible, swap the LAMP stack for a LEMP one,

    create a KVM master image with Nginx and deploy it on several idle servers,

    create a failover/load-balancing setup with Cloudflare, and see how it goes.

  • @emre said:
    @hxxx @bugrakoc @dwtbf @Edmond thank you for your kind words,

    I will make the site as static as possible, swap the LAMP stack for a LEMP one,

    create a KVM master image with Nginx and deploy it on several idle servers,

    create a failover/load-balancing setup with Cloudflare, and see how it goes.

    Cloudflare will hardly help. You'll spend more time and money fucking with that, whereas AWS will cost a lot less and work like a charm.

  • To add: if your customer could let you know in advance when he'll be posting his stuff, you could quickly spin up hourly servers from DO, Vultr, Linode, etc. to balance the load and reduce costs overall.
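
    For example, a rough sketch with DigitalOcean's doctl CLI, bringing pre-imaged web nodes up shortly before a post and tearing them down afterwards (the snapshot ID, size, region and naming are placeholders; Vultr and Linode have equivalent APIs):

    #!/bin/bash
    # assumes `doctl` is authenticated and a snapshot of the tuned web server exists
    SNAPSHOT_ID=12345678      # placeholder: ID of the golden image
    REGION=fra1
    COUNT=10                  # extra nodes to spin up for the spike

    # spin up N droplets from the snapshot
    for i in $(seq 1 $COUNT); do
      doctl compute droplet create "spike-web-$i" \
        --image "$SNAPSHOT_ID" --size s-2vcpu-4gb --region "$REGION" --wait
    done

    # ...add their IPs to the load balancer / DNS here...

    # an hour or two later, tear them down again
    doctl compute droplet list --format ID,Name --no-header \
      | awk '/spike-web-/ {print $1}' \
      | xargs -r -n1 doctl compute droplet delete -f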

  • Makenai Member
    edited August 2017

    Varnish and scheduled AWS auto scaling are what you are looking for. If your current DC is close to an AWS region you can keep the auto scaling group "sleeping" and send traffic via Route 53 only when the ASG has more than one instance.

    If the application is fully custom you might want to rework it to make use of caching, i.e. the cart becomes a request to an API which is not cacheable by the CDN; the same goes for checkout.

    Be aware that you must utilise the CDN heavily when using AWS load balancing, or else you will need to request load balancer pre-warming.
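
    As a sketch of the scheduled-scaling part (the group name and cron times below are made up; the ASG, launch configuration and Route 53 records have to exist already):

    # scale the group out shortly before the usual posting time (cron is in UTC)
    aws autoscaling put-scheduled-update-group-action \
      --auto-scaling-group-name shop-asg \
      --scheduled-action-name pre-post-scale-out \
      --recurrence "45 17 * * 1,4" \
      --min-size 4 --desired-capacity 10 --max-size 20

    # and back to "sleeping" two hours later
    aws autoscaling put-scheduled-update-group-action \
      --auto-scaling-group-name shop-asg \
      --scheduled-action-name post-spike-scale-in \
      --recurrence "45 19 * * 1,4" \
      --min-size 0 --desired-capacity 0 --max-size 20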

  • I have no idea what server specs you have now... but something like an E5-1650v3 / 16 GB RAM / 120 GB SSD should get the job done. The more RAM you have, the more room you have for Memcached.

    Put all static files (images, JS, CSS, ...) onto a CDN such as MaxCDN or KeyCDN. If you are looking for something cheaper, go with BunnyCDN.

    Get LiteSpeed (2-CPU license) over Apache. Set up custom LSCache (private cache) via .htaccess.
    Set up OPcache for PHP and Memcached for MySQL/MariaDB.

    If you are going with Cloudflare, make sure to use aggressive caching in your page rules, set the Edge TTL to something like a few days, and the Browser Cache TTL to at least 2 hours.
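
    For the OPcache part mentioned above, a minimal sketch of settings that suit a small, mostly-read site (the values and ini path are assumptions; adjust for your PHP build):

    # assumed ini path for PHP 7.x on CentOS; adjust for your distro/stack
    cat > /etc/php.d/10-opcache-tuning.ini <<'EOF'
    opcache.enable=1
    opcache.memory_consumption=192
    opcache.interned_strings_buffer=16
    opcache.max_accelerated_files=10000
    ; code doesn't change during a spike, so skip stat() calls on every request
    opcache.validate_timestamps=0
    EOF
    systemctl restart php-fpm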

  • MikePT Moderator, Patron Provider, Veteran

    @dwtbf said:
    Amazon EC2 + load balancing should do the trick: when there's a mass traffic spike, spin up instances behind the balancer; when it dies down, stop them. If you're looking for a ready-made solution that can support this traffic, send me a PM. I can provide a reasonably priced solution.

    Correct. Along with the DB too, HA multi-region if needed. I do all this stuff if you want me to, @OP.
    I've done this for very large companies.

  • First you have to monitor what is causing the load (maybe HTTP? maybe the database? maybe PHP?)... Then you can choose the best option and balance the load with HAProxy + Nginx cache + offloaded SQL + CDN.
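
    A few quick, non-invasive checks to see where the time goes during a spike (standard tools only; the MySQL credentials are placeholders):

    # CPU vs I/O wait vs memory pressure, sampled every 2 seconds
    vmstat 2 10

    # which processes are actually burning CPU: apache/php-fpm or mysqld?
    top -b -n 1 | head -n 25

    # socket summary and listening queue state during the spike
    ss -s
    ss -ltn

    # is the database the choke point? look for long-running or locked queries
    mysqladmin -u root -p processlist
    mysqladmin -u root -p extended-status | grep -E 'Threads_(connected|running)|Slow_queries'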

  • SplitIce Member, Host Rep

    Something no one has mentioned: "35-40 thousand unique IPs per second" will usually require multiple servers unless there is heavy caching, either on your backend or via a CDN / web accelerator service (feel free to contact us). A web accelerator / CDN, however, will only help you if there is some degree of cacheable content (images are the easy target, but caching dynamic content gives higher returns if possible).

    If your load is only MySQL, look into Galera (database clustering). If it's the NIC, look at the size of your images and at offloading their serving to a CDN. If it's dynamic page serving (which it likely is), look into the above.

    vovler said: But something like an E5-1650v3 / 16 GB RAM / 120 GB SSD

    There is no way you can know the resource requirements of the script being run. Furthermore, I doubt a single server is enough at those request rates. Likely in excess of 100k r/s; that's usually a multi-server problem (and you also want redundancy).

  • If I had a client with that many followers => site traffic, I would not be kicking stones around here

  • eva2000 Veteran
    edited August 2017

    emre said: He also has an e-commerce website where he sells his own branded products, which happens to be hosted by poor me.

    Twice a week he adds a new product to his website and posts about it on his social media accounts.

    Normally his e-commerce website gets around 10 visitors per second, but after his social media posts traffic spikes to 35-40 thousand unique IPs per second.

    Somehow I need to deal with these traffic spikes, and his website must never be slow to load, give an error, or stop responding. His business must keep going and new orders must be taken.

    These 35-40 thousand visitor spikes last for at most 1 hour after his social media posts, and traffic returns to the normal 10 (ten) visitors per second afterwards.

    So I need a system which can handle these visitor spikes.

    Get creative.

    • If it's a custom script and he has 100% control over when and how a new product announcement is made and where visitors are directed from his social media accounts, then make the landing page he points followers to a 100% static HTML product page first, and create separate yet identical static HTML landing pages for each social media account.
    • Think about how you are funnelling visitors from the social media accounts and analyse their visitor profiles, i.e. with Google Analytics and the social platforms' own analytics, so you can best design and lay out the HTML product landing pages and time when announcements are posted :)
    • Basically, break those 40k IP visitors/sec down into more manageable segments ;)

    There are so many creative ways you can tackle this, but yes, an optimised LEMP stack or an OpenLiteSpeed/LiteSpeed based web front end is a must :)
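
    As a minimal sketch of the static landing-page idea above (the URLs and paths are made up): render the product page once through PHP, snapshot it to plain HTML, and publish one copy per social channel so each crowd hits a file the web server can serve straight from memory.

    #!/bin/bash
    # assumed layout: dynamic page at /product/new on the PHP backend, static copies under /var/www/static
    ORIGIN="http://127.0.0.1:8080/product/new"   # placeholder backend address
    OUTDIR=/var/www/static

    mkdir -p "$OUTDIR"
    for channel in instagram snapchat; do
      # one identical landing page per channel, e.g. landing-instagram.html, landing-snapchat.html
      curl -fsS "$ORIGIN?src=$channel" -o "$OUTDIR/landing-$channel.html.tmp" \
        && mv "$OUTDIR/landing-$channel.html.tmp" "$OUTDIR/landing-$channel.html"
    done
    # run this from cron (or a loop) every few seconds around announcement time,
    # and point each social media bio/post at its own landing-<channel>.html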

  • emre Member, LIR

    eva2000 said: There are so many creative ways you can tackle this, but yes, an optimised LEMP stack or an OpenLiteSpeed/LiteSpeed based web front end is a must :)

    @eva2000, you are exactly right. And I am using Centmin Mod now in my own custom load-balanced solution.

    Tests are going pretty well.

  • @vovler said: Get LiteSpeed (2-CPU license) over Apache. Set up custom LSCache (private cache) via .htaccess. Set up OPcache for PHP and Memcached for MySQL/MariaDB.

    Why LiteSpeed over Apache? That just keeps you tied to a giant elephant (Apache).
    Better is a dedicated install of OpenLiteSpeed (with PHP 7.1.x) and OPcache (of course), and even using per-client throttling of static requests/second.
    To me that is a better web front end.

  • I see two ways to go. Either you continue down the, pardon me, idiot's route of not analyzing your problem but rather throwing resources at it, in which case, well, throw resources at it. As you quite probably will not want to keep massive server resources idling 90+% of the time, "just get a big fat monster dedi" is not sensible, which leaves you with "find a good cloud provider".

    Or you go the professional route, that is, you first analyze your problem and then find or develop an adequate solution, which obviously is more expensive in the short run.

    Let's have a look: 40k UIP translates to some 100k req/s, maybe even much more depending on how the site is structured, how many images there are, etc.

    The first thing, obviously, as some have already hinted, is to get the static stuff sorted. For that I personally wouldn't go with a CDN, as it's either lousy and troublesome or quite expensive; plus it's not needed, as a decent async/event-driven server can handle it quite well and cheaply. Just throw memory at it and use a caching server.

    The problem is the dynamic part. And in that regard you (or your client) have unfortunately chosen about the worst imaginable "solution", based on fat elephants in wheelchairs.
    Before I go on: I have built sites in that ballpark or even higher. One can do it with relatively modest resources (say 16 cores and 32 GB of memory), and what I tell you now is based on that experience.

    Assuming a load of 200k req/s, just to have a halfway realistic number to work with, that translates to 5 microseconds per request (1 s / 200,000). It should be blindingly obvious that that is not achievable with PHP + MySQL; the FastCGI and MySQL connections alone cost more than that. And forget funny PHP caches; they may be fine for 1k req/s but not for your situation.

    So:

    • Get rid of PHP as far as at all possible. Have it generate static pages 3 or 5 times per second, which are then cached by a fast caching aio web server, for everything that doesn't change state or data(base). In other words: have only orders, new customer registrations, etc., i.e. stuff that changes state and/or data, handled dynamically.
      Even if "show only" pages like product pages contain some dynamic data, e.g. "this product has already been sold xyz times", it almost certainly won't hurt if those are updated only 5 times per second.

    • Get rid of MySQL as far as at all possible. MySQL, as well as all its derivatives, is hobby crap. What you need (besides, quite probably, some resilience/backup) is a write-through in-memory database, and preferably not SQL; keep in mind that the SQL interpreter quite often eats up more time than the DB lookup itself.

    • Be sure to have your OS, CentOS Linux it seems, properly and smartly configured. Example: Linux keeps closed sockets "blocked" (TIME_WAIT) for quite a long time; you want to make sure it doesn't do that on your system. There are configs all over the place, e.g. sysctl.conf, (quite probably) php.ini, the socket open call parameters, etc. A sketch of the sysctl part follows below.
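
    For that last point, a sketch of the usual sysctl suspects for lots of short-lived connections (the values are common starting points, not gospel; test before relying on them):

    cat > /etc/sysctl.d/90-spike.conf <<'EOF'
    # more room for half-open and queued connections
    net.core.somaxconn = 8192
    net.ipv4.tcp_max_syn_backlog = 8192
    # reuse TIME_WAIT sockets for new outgoing connections sooner
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_fin_timeout = 15
    # wider ephemeral port range for proxy/upstream connections
    net.ipv4.ip_local_port_range = 10240 65000
    # allow more open file descriptors system-wide
    fs.file-max = 1048576
    EOF
    sysctl --system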

    Good luck

  • @SplitIce said:
    There is no way you can know the resource requirements of the script being run. Furthermore, I doubt a single server is enough at those request rates. Likely in excess of 100k r/s; that's usually a multi-server problem (and you also want redundancy).

    His website is a custom PHP/MySQL e-commerce site, nothing special: only 3-4 products on sale and only about 5-6 pages total.

    Wait... what's that? Either it's really poorly coded, or it should not be resource intensive. 3 product pages and 2 other pages (landing and checkout?).
    How many SQL calls would that need?
    Few.

    How many pages can you heavily cache? Probably all except checkout, if there are no cart/my-account areas; otherwise the headers would fuck up everything, and then you would have to go with a private cache.

    By the way, WHERE IS THE THROTTLE? Is your client's server running Apache or something else?

  • eva2000 Veteran
    edited August 2017

    @emre said:

    eva2000 said: There are so many creative ways you can tackle this, but yes, an optimised LEMP stack or an OpenLiteSpeed/LiteSpeed based web front end is a must :)

    @eva2000, you are exactly right. And I am using Centmin Mod now in my own custom load-balanced solution.

    Tests are going pretty well.

    Centmin Mod, especially in the latest beta, auto-optimises its LEMP stack based on detected server resources (CPU, memory, disk I/O, etc.), so it's fairly optimised out of the box, but there are additional advanced optimisations you can turn on too. For instance, if you're using PHP 7 (recommended, as it's roughly twice as fast as PHP 5.6), be sure to enable Profile Guided Optimisations: https://community.centminmod.com/threads/added-profile-guided-optimizations-to-boost-php-7-performance.8961/ - you can go further by modifying the Centmin Mod PHP-FPM PGO routines and adding your custom script's PHP code paths, to train PHP 7 to perform better for your specific code. Benchmark results for PHP 7.2 betas vs 7.1 vs 7.0 vs 5.6, with and without PGO, vs Remi yum distro-provided PHP RPMs: https://community.centminmod.com/threads/php-7-2-0-beta-2.12447/#post-52794

    The latest Centmin Mod beta also supports Nginx built against various compilers - native OS GCC versions, GCC 5.3.x, GCC 6.2.x, Clang 3.4, and soon Clang 4 and 5 and GCC 7 - as well as pairing with LibreSSL and OpenSSL (later BoringSSL), so you can figure out which pairing gives the best Nginx performance for your server's CPUs: https://community.centminmod.com/threads/centmin-mod-nginx-libressl-openssl-support-in-123-09beta01.11122/. Probably only really beneficial for newer CPUs, i.e. Intel Xeon E3 v4, v5, v6 and E5 v4 and newer, and AMD EPYC/Ryzen based processors (haven't tested yet).

    The latest Centmin Mod beta also has optional support for full HTTP/2 HPACK encoding in Nginx, provided by Cloudflare's HTTP/2 HPACK patch, instead of the partial HTTP/2 HPACK encoding Nginx normally provides, so you save some bandwidth at the HTTP header level too. Info: https://blog.cloudflare.com/hpack-the-silent-killer-feature-of-http-2/. A lot of traffic means big pipe requirements, so shaving some size off at the header level is nice anyway. When you set up Nginx, set NGINX_HPACK='y' in the persistent config file /etc/centminmod/custom_config.inc to enable it.

    The example on Cloudflare's blog with full HTTP/2 HPACK encoding shows up to 79% header savings, and as more requests are made it can be as high as the 92% savings I am seeing on my own Centmin Mod LEMP-powered HTTP/2 sites.

    url=https://blog.cloudflare.com
    for i in $(seq 1 4); do echo "h2load run $i"; h2load $url -n $i | tail -6 | head -1; done                     
    h2load run 1
    traffic: 27.83KB (28502) total, 541B (541) headers (space savings 27.96%), 27.21KB (27858) data
    h2load run 2
    traffic: 55.12KB (56443) total, 570B (570) headers (space savings 62.05%), 54.41KB (55716) data
    h2load run 3
    traffic: 82.41KB (84384) total, 599B (599) headers (space savings 73.41%), 81.62KB (83574) data
    h2load run 4
    traffic: 109.69KB (112327) total, 630B (630) headers (space savings 79.03%), 108.82KB (111432) data
    

    Don't forget Nginx Brotli compression support too if your site is HTTP/2 HTTPS based: https://community.centminmod.com/threads/how-to-use-brotli-compression-for-centmin-mod-nginx-web-servers.10688/.

    While the boosts are small, it's a free performance boost anyway :)

    There are plenty more optional/advanced options you can enable. If you want to learn more, here are 6 steps: https://community.centminmod.com/threads/guide-to-learning-more-about-centmin-mod.10838/ ^_^

  • Levi Member

    So...

    1. Find your bottleneck
    2. Go for amazon services

    Case closed.

  • jh Member
    edited August 2017

    If it's a sudden spike twice a week then I would seriously consider provisioning for the spike traffic all the time. Eliminating all the scaling would reduce the possibility of glitches and open up the possibility of cheaper hosting providers. It's not necessarily the best option but it's worth considering.

    Optimising the application would definitely be money well spent.

    We can help with both of these (especially the latter). Drop me a PM. We have a track record managing large scale e-commerce.

  • LTniger said: Go for amazon services

    Not always. AWS bandwidth costs need to be taken into account, at roughly US$90/TB, versus the revenue $$$ brought in by the spikes.

  • emre Member, LIR

    bsdguy said: I see two ways to go. Either you continue down the, pardon me, idiot's route of not analyzing your problem but rather throwing resources at it, in which case, well, throw resources at it. As you quite probably will not want to keep massive server resources idling 90+% of the time, "just get a big fat monster dedi" is not sensible.

    This is the solution to my problem, because as you can see I have not been talking about any budget since the first post.

    Budget is irrelevant at the moment. I need a solution fast.

    Although what you wrote later in your post is what I should do in an ideal world, I cannot do it. It must be PHP/MySQL; no other stack will work.

    Thank you for your great insight, though.

    If I had more time to spend on this project, say 3-4 more months, I would do exactly as you said.

    But it's too late now.

    I'll be stupid and throw more resources now.

  • emre Member, LIR

    vovler said: By the way, WHERE IS THE THROTTLE? Is your client's server running Apache or something else?

    Yes, standard Apache/PHP/MySQL at the moment.

    But that will change tonight :) to LEMP and a very heavily performance-modded CentOS 7.x, most probably.

  • emre Member, LIR

    some pics:

    I don't understand this one :)

  • eva2000 Veteran
    edited August 2017

    That isn't that much though: 90k unique IP visitors/day at peak, or ~3,400+ requests/s average on the peak day. But is that a normal day or a spike day?

  • vovler Member
    edited August 2017

    I know this is not a real-world benchmark and external stuff like images/CSS/JS is not downloaded (that's what CDNs are for), but with a $20 DO droplet with LiteSpeed (trial license) + PHP 7 and proper caching I managed to get this on WordPress.

    I believe it should perform even better on a lightweight custom-coded store.

  • Also, Cloudflare is handling 97% of your bandwidth.

    What are your server specs right now?

  • @emre said:

    I'll be stupid and throw more resources now.

    a) Find a good cloud provider with high reserves and be done.
    b) Optimize at least somewhat.
    c) @eva2000 seems to be right (based on the funny stat images) and your load is ca. 3,400 req/s average on a hot day, which should mean about 10k req/s at peak. Which means your job is much easier.
    d) At least run both FastCGI and MySQL over unix domain sockets (not TCP); a minimal sketch follows below.
    e) Be sure sysctl & friends are optimized for your scenario.
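
    For point d), a minimal sketch (the socket paths are typical CentOS defaults, but check your distro):

    # 1) php-fpm: listen on a unix socket instead of 127.0.0.1:9000
    #    (pool file assumed at /etc/php-fpm.d/www.conf; also set listen.owner/listen.group
    #     so nginx can open the socket)
    sed -i 's|^listen = .*|listen = /run/php-fpm/www.sock|' /etc/php-fpm.d/www.conf

    # 2) nginx: point fastcgi_pass at that socket instead of a TCP address, e.g.
    #      fastcgi_pass unix:/run/php-fpm/www.sock;

    # 3) PHP -> MySQL: connect to host "localhost" (which uses the local mysql.sock)
    #    rather than 127.0.0.1, which forces a TCP connection.

    systemctl restart php-fpm && nginx -t && systemctl reload nginx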

    @vovler said:

    [bla bla]

    I admire you. There is just this little thing: this isn't about 5 ms or 25 ms or 50 ms per request but about 10k req/s. Good luck trying that with your droplet thingy.

  • eva2000 Veteran
    edited August 2017

    bsdguy said: find a good cloud provider with high reserves and be done.

    Yup, some cloud providers (excluding AWS due to bandwidth costs) offer outbound network pipes of up to 10 Gbps coupled with hourly billing and a robust API, so you can pretty much script auto scale-up/down when needed. Pretty amazing what we can do these days compared to what we had available 10+ years ago :)
