How to scale a download server?

Rolter Member

Hi guys, so I have a website that serves a lot of downloads. During peak hours the CPU load average goes well beyond 100%, and disk I/O during peak hours is around 20MB/s, which I guess is the upper limit for my HDD (it is SATA).

I might switch to SSD for serving downloads. What do you guys think I should do, besides getting an SSD?


Comments

  • Nomad Member

    Maybe if you share some information other than having an HDD, people might recommend better and you can prevent needless multiple questions. Such as: is it a VPS, what's your web server software, RAM, CPU, etc.?

  • telephone Member
    edited May 2015

    A few questions so we can better understand the situation:

    • What specs are the VPS/Server?
    • How many downloads are initiated during peak hours? (100 req/s)
    • What's the average file size?
    • Are the files static? (Only updated every X days)
    • How are you serving the downloads? (HTTP via Nginx)
      • What settings are you using? (Your virtual host or conf file)
  • rds100 Member

    Split it across several servers? Use a CDN?
    What is there to download? How big is it?

  • Rolter Member
    edited May 2015

    The server is a dedicated server.

    Here are the specs:

    RAM - 16GB
    CPU - Xeon E3-1230 v3
    HDD - 2x 2TB
    Port speed - 1Gbps unmetered
    

    @telephone
    During the peak hours there are around 500 simultaneous downloads. The files are small, each around 2-8MB.

    The files are cached files, so yes, they are static. I am serving most files from the cache already, which has a total allocated space of 3TB. The files are served by nginx's proxy_pass / proxy_cache modules, but I have not implemented any real upstreams yet as I did not need any so far.

    @rds100 - I am using Cloudflare for caching the HTML pages, but other than that there is no CDN set up by myself yet.

    I have also made sure that there is no script or MySQL bottleneck, since MySQL has plenty of memory and a lot of RAM is unused even during peak, with Redis being used to cache some of the most volatile MySQL data.

  • vfuse Member, Host Rep

    Multiple A records, if you can afford to set up another server.
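
    Round-robin DNS here just means publishing several A records for the same hostname, so clients spread themselves across the servers. A minimal BIND-style sketch (the name and IPs are hypothetical, taken from the documentation range):

        download  300  IN  A  203.0.113.10
        download  300  IN  A  203.0.113.11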

  • chihcherng Veteran
    edited May 2015

    Serve the download files with BitTorrent? Or shrink the memory given to MySQL and hand it to the file buffer instead, so I/O speed won't be limited by your HDD?

  • Riz Member

    Where is your server currently located? What is your audience?

    Can always use rsync / btsync / syncthing to sync files between the servers, and offer geo based serving.

  • MarkTurner Member

    With a spec like that you can do a LOT more than 500 downloads of 2-8MB. I'm basing this on a busy mirror site we run, which pushes 700-800Mbps almost continuously with thousands of downloads going on.

    You need to optimise that machine. Are you serving the files straight from disk?

    What kernel optimisations have you run? What NIC is in there?

  • Rolter Member
    edited May 2015

    @MarkTurner

    Thank you for chiming in

    Yes, I am serving the files straight from disk, but from the nginx cache.

    Can you please give some suggestions on how I can optimise the kernel? About the NIC, here is the output of the command I ran:

    lspci | egrep -i --color 'network|ethernet'

    02:00.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)

  • black Member

    How much is nginx using as cache? Are people downloading your files in a random fashion (i.e. is a user just as likely to download any one file as any other)? If that's the case, there would be a lot of cache misses.

  • cmsjr123 Member

    Cache or not, you should be getting a lot more out of that. I have a straightforward dedicated server in France with just hardening, and it can do 80MB/s on its gbit line with no optimization, though that's with 2x 2TB discs in RAID 0.

    Bench the discs and make sure they're okay, or look at the network, would be my idea.

  • bsdguy Member

    What else is that server doing, other than serving downloads? And what exactly is your problem? I'm asking because it isn't clear to me beyond "cpu > 100%".

    Unless your server does major other jobs, from what you describe (nginx, mysql, redis) it sounds as if your problem were along the lines of "throwing too much technology supposed to enhance performance at a situation".

    As a general rule you might want to keep in mind that it's still the OS itself which does disk caching of files best. Another issue is the network, but serving 500 connections is not at all a significant load for a system like the one you described.
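
    For instance, you can see how much RAM the kernel is already devoting to the page cache, and (with the separate vmtouch tool, if installed) whether a given file is resident in it. A small sketch; the file path is hypothetical:

        free -m                                 # "cached" column = page cache in use
        vmtouch -v /var/cache/nginx/somefile    # residency map for one file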

  • msg7086 Member

    500 connections will use up the IOPS of a hard drive.

    A typical SATA hard drive can handle up to ~120 IOPS. More than that and you will have endless iowait on the CPU.
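
    To put rough numbers on that: ~120 IOPS shared by 500 concurrent readers is about 0.24 random reads per second per download, so throughput collapses long before the drive's sequential limit. A minimal fio sketch to measure the drive's actual random-read IOPS (the test path and size are placeholders; run it with the web server stopped):

        fio --name=randread --filename=/data/fio.test --size=4G \
            --rw=randread --bs=4k --direct=1 --ioengine=libaio \
            --iodepth=32 --runtime=60 --time_based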

  • Rolter Member
    edited May 2015

    @black - nginx cache is 3TB; since the total disk space is 4TB, I had allocated 3TB to the nginx cache.

    @cmsjr123 - That is what I am expecting too; I have thrown a lot more pressure at other servers of mine with the same config and they never had any problems at all.

    @bsdguy - It is a simple LNMP stack. The entire MySQL database is less than 200MB, and the Redis DB is only consuming 150MB of RAM. The only thing I can think of that might be keeping the server from serving downloads at good speeds is I/O, since it is a SATA HDD.

    Can you please tell me how the OS can serve cached files via HTTP without using nginx?

    @msg7086 - That is what I have been thinking as well.

    I have started using an SSD for serving downloads for now; I need to wait a few hours until the traffic peaks to see how the SSD performs.

    @MarkTurner - still waiting for you to chime in.

  • @Rolter said:
    The only thing I can think of that might be keeping the server from serving downloads at good speeds is I/O, since it is a SATA HDD.

    SATA HDDs can actually do 100 to 150MB/s.

    Unless it's SATA 1 and not 3, in which case it will be less, but 20MB/s is nowhere near a SATA drive's limit. The issue is elsewhere.

  • vfuse Member, Host Rep

    I used to run a site with about 19 million images of about 500KB-1MB each. With a regular HDD I could only get about 75-100Mbit at best; when I set up SSDs in the server, 800-1000Mbps was no problem at all, even with low I/O wait. Do you have high CPU I/O wait?

    How often do your files change? If they never change, proxy_store might be better:

    location /images/ {
        root       /data/www;
        error_page 404 = @fetch;
    }

    location @fetch {
        internal;

        proxy_pass         http://backend;
        proxy_store        on;
        proxy_store_access user:rw group:rw all:r;
        proxy_temp_path    /data/temp;

        root               /data/www;
    }

  • Rolter Member
    edited May 2015

    @vfuse - the files never change; once in the cache, they are served from the cache only.

    I am using proxy_cache instead of proxy_store, since I want to benefit from nginx auto-managing the cache.

    But yeah, other than that, my config is pretty similar to yours.

    About I/O wait, I have not checked it (do you know how I can check it?). I have only run iotop, and I was seeing 20MB/s of I/O when the problem started occurring, and it happened during the last peak hours as well.
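
    For reference, the usual way to check I/O wait is with the standard tools (a sketch; iostat ships in the sysstat package):

        iostat -x 5    # %iowait plus per-device utilisation, every 5 seconds
        vmstat 5       # the "wa" column is CPU time spent waiting on I/O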

  • bsdguy Member

    @Rolter said:
    bsdguy - It is a simple LNMP stack. The entire MySQL database is less than 200MB, and the Redis DB is only consuming 150MB of RAM. The only thing I can think of that might be keeping the server from serving downloads at good speeds is I/O, since it is a SATA HDD.

    A DB's size doesn't tell much about its activity and requirements (other than space).

    Some things you might want to look into are setting swappiness and blockdev --setra XXX.
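
    A hedged sketch of those two knobs (the values are only illustrative):

        sysctl -w vm.swappiness=10      # prefer keeping the page cache over swapping
        blockdev --setra 4096 /dev/sda  # readahead of 4096 sectors (2 MiB)
        blockdev --getra /dev/sda       # verify the current setting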

    But again, we can hardly give you any good advice without knowing more, e.g. your current memory usage. Fact is that your machine is damn powerful, yet you seem to experience problems. Sure, you could throw an SSD at it, but with 16GB RAM there should be better and cheaper ways.
    Again, I'm under the impression that you tend to simply throw hardware (like SSDs) or application caches at problems, which often doesn't help that much.

    One example I have in mind: you run mysql for a small DB, which might well be major overkill eating away resources.

  • tomle Member, LIR

    The question is why OP is using a disk cache at all when the files can be served directly from disk anyway. Either way they need to be read from the disk.

  • Rolter Member
    edited May 2015

    @bsdguy - I went with MySQL since it was easiest to set up at the time I was developing the site. About RAM: only 1.5-2GB is used, even during peak hours.

    But I think it is time for me to start scaling the setup. I was also reluctant to get an SSD, since I am getting much better performance from other servers with the exact same config as this one, and even higher traffic during peak hours.

    Again, I try my best to squeeze the last drop of performance from a server, but this time I am just unable to think of what could be wrong.

    The CPU load average goes to 35+ (it is an 8-core server) for a few hours during the peak and lingers around that.

    @tomle - because the files are cached from a remote server that I control and then served from the local cache, since that reduces latency and keeps RAM usage a lot lower; the central server only has to send the files once.

  • bsdguy Member
    edited May 2015

    @Rolter said:
    bsdguy - I went with MySQL since it was easiest to set up at the time I was developing the site. About RAM: only 1.5-2GB is used, even during peak hours.

    Well, I don't know about you, but a MySQL server with 200MB of data eating 1500-2000MB of RAM makes me think and scratch my head.

    So, based on what little I know about your situation I'd do two things:

    • throw that mysql thing away and replace it with something more sensible like SQLite (your DB is mostly read anyway, right?) or, if you want a full-blown SQL server, have a look at Firebird (a very fine SQL server that is undeservedly ignored by many)

    • find some kind of system to order your files, e.g. files < 2MB vs. larger files or, if you know it, frequently needed ones vs. rarely needed ones. Next I'd put the most frequently needed (or the many, many small) files into a simple RAM disk to take I/O load away from the SATA drives; see the sketch below.
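
    The RAM-disk part could look like this (mount point, size and paths are hypothetical):

        mkdir -p /mnt/hotfiles
        mount -t tmpfs -o size=4g tmpfs /mnt/hotfiles
        cp -a /data/www/popular/. /mnt/hotfiles/    # stage the hot files, then point the web server at them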

  • tomle Member, LIR

    @Rolter said:
    tomle - because the files are cached from a remote server that I control and then served from the local cache, since that reduces latency and keeps RAM usage a lot lower; the central server only has to send the files once.

    Right, OK. rsync the files over to the server and then serve them directly from disk, without the additional resource usage of the caching layer.
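
    Something along these lines (host and paths are hypothetical):

        rsync -avz --delete central.example.com:/data/files/ /data/www/files/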

  • xyz Member

    "CPU load" != CPU usage
    You can have an idle CPU and really high load.

    You need to investigate a number of things:

    • is your CPU actually running at 100%; if so, what is consuming it? (user, system, iowait?)
    • what process is actually using your system resources?
    • are you using SSL?
    • if you believe you're I/O bound, what does iostat indicate?
    • have you tried doing a disk benchmark with your webserver turned off to see what performance you're actually getting from your disks?

    To help others, I suggest posting the output of htop, dstat and iostat.
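
    For example (iostat comes from the sysstat package):

        iostat -x 5 3    # extended disk stats, 3 samples at 5-second intervals
        dstat -cdn 5     # CPU, disk and network, refreshed every 5 seconds
        htop             # interactive per-process CPU/memory view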

  • MikePT Moderator, Patron Provider, Veteran

    Have you tested the HDD performance with no load? Are you sure both drives are performing well? It seems that your CPU load is caused by iowait, which means your disks may not be performing well. Please double-check that. Such a huge load on a nice processor like yours is caused by iowait, not by the LNMP stack.

    If you're experiencing huge iowaits, it'll affect your CPU load by a lot.

    I run a cluster with 4TB of images, all on HDD, with only 2 processors and 2GB RAM; it's plenty for my case, more than enough.

  • Flashcache works nicely as a buffer IMO

  • Rolter Member
    edited May 2015

    @tomle - already using nginx proxy_cache.

    @MrGeneral and @xyz - I will post the output of htop, dstat and iostat after the next traffic surge, which is a few hours from now.

  • MikePT Moderator, Patron Provider, Veteran

    @Rolter said:
    tomle - already using nginx proxy_cache.

    MrGeneral and xyz - I will post the output of htop, dstat and iostat after the next traffic surge, which is a few hours from now.

    Great, run some HDD tests too.

  • Rolter Member

    @MrGeneral - can you give me the commands whose output you'd like to see?

  • MikePT Moderator, Patron Provider, Veteran

    Check the HDDs' health:

    smartctl -a /dev/sda | less

    Direct read:

    hdparm -t /dev/sda2

    Cached read:

    hdparm -T /dev/sda2

  • black Member

    What do you mean you have 3TB of nginx cache? If you're using your hard drive as the cache, then it's just copying one data block to another on the same disk. The nginx cache should be smaller than your physical RAM (perhaps 2/3 or 3/4 of it), so data is cached from the hard drive into RAM and then served from RAM.
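
    A hedged proxy_cache_path sketch along those lines (path and sizes are hypothetical; max_size caps the cache so the hot set can stay in RAM via the page cache):

        proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=downloads:100m
                         max_size=12g inactive=7d;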
