How to build a CDN?

Hi,

I'm slowly gathering some LEBs and I thought about building a small CDN for myself (for learning purposes).

What are important keywords that I have to look for?

Are there any articles worth reading?

The main purpose is to learn how a CDN works. Some questions that come into my mind are:

  • How to balance the load?
  • How to choose the nearest location to the client?
  • How to achieve HA?
  • How to synchronize the data?
  • How to serve the files (nginx) ?

Thanks in advance,
gehaxelt

Comments

  • AlexBarakovAlexBarakov Patron Provider, Veteran

    Anycasted IPs would be your best bet, I think. And maybe a distributed filesystem? GlusterFS, for example.

    Never built one myself, but shooting out some ideas.

    Thanked by 1gehaxelt
  • sepeisepei Member
    edited August 2014

    GeoDNS and anycast IPs (Rage4, for example) should be good keywords.

    Thanked by 1gehaxelt
  • Buy a few servers in different regions; good examples would be LA, Dallas, Jacksonville, New York, Amsterdam, etc.

    The best way to keep files in sync would be to set up an rsync cron job. I had one set up to run every 15 minutes that copied new files from a master server.
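
A minimal sketch of the pull-based sync described above, assuming each edge node runs it from cron. The host name and directory paths are hypothetical placeholders, not values from this thread; only standard rsync flags are used.

```python
import subprocess

# Hypothetical values: replace with your own master host and paths.
MASTER = "cdn-master.example.com"
SRC_DIR = "/var/www/cdn/"
DEST_DIR = "/var/www/cdn/"


def build_rsync_command(master, src_dir, dest_dir, delete=True):
    """Return the rsync argv that pulls src_dir from master into dest_dir."""
    cmd = ["rsync", "-az"]  # -a: archive mode, -z: compress in transit
    if delete:
        cmd.append("--delete")  # drop local files that were removed on the master
    cmd += [f"{master}:{src_dir}", dest_dir]
    return cmd


def pull_from_master():
    """Run one sync pass; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(MASTER, SRC_DIR, DEST_DIR), check=True)
```

Calling `pull_from_master()` from a small script and scheduling that script with a crontab entry such as `*/15 * * * *` would give the 15-minute interval mentioned above. Pulling from each edge (rather than pushing from the master) means a dead edge node never blocks the others.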

    Then, set up some sort of geo redirect, either through GeoDNS (Rage4 is a good example) or another way, like PHP; here's an example.

    GeoDNS and the PHP method should give you HA. Rage4 also allows you to have a failover, so if one server dies, it can be replaced with another one. I believe the PHP method has this covered too.
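
To make the geo-redirect-with-failover idea concrete, here is a rough sketch of the selection logic only. It assumes you already have approximate client coordinates from some GeoIP lookup (not shown); the node list, its coordinates, and the function names are illustrative assumptions, not how Rage4 or any particular PHP script works.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical node list: (name, latitude, longitude). Coordinates are approximate.
NODES = [
    ("los-angeles", 34.05, -118.24),
    ("dallas", 32.78, -96.80),
    ("new-york", 40.71, -74.01),
    ("amsterdam", 52.37, 4.90),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def pick_node(client_lat, client_lon, nodes=NODES, down=frozenset()):
    """Return the name of the nearest node that is not marked as down."""
    candidates = [n for n in nodes if n[0] not in down]
    if not candidates:
        raise RuntimeError("no healthy nodes left")
    nearest = min(candidates, key=lambda n: haversine_km(client_lat, client_lon, n[1], n[2]))
    return nearest[0]
```

For example, `pick_node(48.86, 2.35)` (roughly Paris) selects `amsterdam`, and passing `down={"amsterdam"}` falls back to the next-nearest node. Geographic distance is only a proxy for network latency, so treat the result as a starting point.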

    In my experience, that's one of the easiest ways of doing it.

    Thanked by 1gehaxelt
  • filefile Member

    NSONE is another DNS solution which may be useful. They have GeoDNS as well but also have a data source method which can influence results. You can hook up some existing solutions (server monitoring for example) or use their API to push stuff in.

    Thanked by 1gehaxelt
  • @avayl is from nsone

    I'm bringing up my own anycast network shortly and will be providing anycasted IP space, feel free to PM for more information.

  • StellaEVStellaEV Member
    edited September 2014

    @gehaxelt said:
    Hi,

    I'm slowly gathering some LEBs and I thought about building a small CDN for myself (for learning purposes).

    What are important keywords that I have to look for?

    Are there any articles worth reading?

    The main purpose is to learn how a CDN works. Some questions that come into my mind are:

    • How to balance the load?
    • How to choose the nearest location to the client?
    • How to achieve HA?
    • How to synchronize the data?
    • How to serve the files (nginx) ?

    Thanks in advance,
    gehaxelt

    For the GeoIP part:

    If you have an OS with a new enough version of BIND, you can scrape by without using any third-party DNS services. Have a gander at https://kb.isc.org/article/AA-01149/0/Using-the-GeoIP-Features-in-BIND-9.10.html (the original patch that I used is at https://code.google.com/p/bind-geoip/, which is now merged into BIND 9.10).

    BIND 9.10 has full City/Region/Country GeoIP support, with the ability to point individual US states, etc. at a certain IP.
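
As a rough illustration, the small helper below renders a `named.conf` fragment using the `geoip country` ACL element that BIND 9.10 introduced. The view name, zone name, and file path are made-up placeholders, and this assumes a BIND build with GeoIP support enabled and a country database installed. Check the generated text against the ISC article before using it.

```python
def geoip_view(name, countries, zone, zone_file):
    """Render one ACL + view pair for named.conf (illustrative only)."""
    # One "geoip country XX;" element per ISO country code.
    acl_entries = " ".join(f"geoip country {code};" for code in countries)
    return (
        f'acl "{name}" {{ {acl_entries} }};\n'
        f'view "{name}" {{\n'
        f'    match-clients {{ "{name}"; }};\n'
        f'    zone "{zone}" {{ type master; file "{zone_file}"; }};\n'
        f'}};\n'
    )


# Hypothetical example: clients from DE/FR/NL get the zone file holding EU node IPs.
EU_VIEW = geoip_view("eu", ["DE", "FR", "NL"], "cdn.example.com", "/etc/bind/zones/cdn.eu.db")
```

Printing `EU_VIEW` shows the fragment. Once views are in use, every zone has to live inside a view, so a catch-all view with `match-clients { any; };` would normally follow the regional ones.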

    Now, most people wouldn't pay for the accurate non-free MaxMind GeoIP database, so I decided to do a workaround to increase accuracy without having to pay more. (This was actually discovered by accident.)

    Cloudflare happens to have this nifty little feature where it automatically resolves CNAMEs for you (see http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root).

    So basically, set up a few BIND servers and make sure they are all working. You can sync the configuration, zones, etc. using a mix of btsync/syncthing and iwatch to detect when a file is updated.

    Then, set a subdomain in Cloudflare to point at the nameservers. For each address that you now want to be GeoIP-ed, just set the CNAME to the subdomain that's getting its IPs from the nameservers.

    Cloudflare will do the rest and make your GeoIP reasonably accurate. Note that this also depends on Cloudflare's locations, so you might sometimes see a few requests land slightly off. Then again, I have never noticed it going really off (e.g. requesting a subdomain while in the US and getting an IP in the EU).

    Note:
    If you're running Ubuntu or some other distro and are using the GeoIP package from the repositories, make sure the package is up to date. On Ubuntu, install geoip-database-contrib instead.

    A few more thoughts:

    Set up a monitoring system (e.g. uptimerobot) to switch the IPs when they fail. Make sure you have a low TTL for DNS records.
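
A rough sketch of what such a monitor decides, assuming a plain HTTP check against each node. The TTL value and the function names are assumptions for illustration; pushing the resulting list to your DNS (zone file rewrite, provider API, etc.) is deliberately left out because it depends entirely on your setup.

```python
import http.client
import urllib.request

DNS_TTL_SECONDS = 60  # assumption: low enough that a dead IP ages out of caches quickly


def http_probe(ip, timeout=3.0):
    """Return True if the node answers an HTTP request with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(f"http://{ip}/", timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # Covers timeouts, refused connections, HTTP error statuses, bad responses.
        return False


def records_to_publish(nodes, probe=http_probe):
    """nodes maps node name -> IP. Returns the IPs that should stay in DNS."""
    healthy = [ip for ip in nodes.values() if probe(ip)]
    # If nothing responds, the monitor itself may be cut off: keep all records
    # rather than publishing an empty set.
    return healthy or list(nodes.values())
```

The `probe` parameter is injectable so the decision logic can be tried without touching the network, for example `records_to_publish(nodes, probe=lambda ip: ip != "192.0.2.1")`.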

    Synchronizing the files now...

    I've had two successful ways of doing this.

    1. Use Varnish to pull from a central location (sync the Varnish configuration with btsync/syncthing + iwatch as indicated above).

    2. Use btsync/syncthing + iwatch to sync the files, and run nginx.

    Notably, if you're on a LEB VPS with low RAM and high disk space, ask Varnish to cache on disk instead. You may have to do some performance testing with each storage backend (file vs malloc) to see which one is best for you.

    Depending on whether your VPSes are fast enough or not, you may want to benchmark both methods, as method #1 relies on the connection speed of the origin VPS to serve the initial file (before it goes into the cache).
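
The trade-off in method #1 is easier to see in a toy model. The class below is not Varnish, just a minimal pull-through cache sketch (no expiry, no eviction, hypothetical names) showing that only the first request for a path pays the trip to the origin.

```python
class PullThroughCache:
    """Toy edge cache: fetch from the origin on a miss, serve locally afterwards."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> bytes
        self._store = {}
        self.origin_hits = 0             # how many requests had to go to the origin

    def get(self, path):
        if path not in self._store:
            # Cache miss: this request is as slow as the origin connection.
            self.origin_hits += 1
            self._store[path] = self._fetch(path)
        return self._store[path]
```

With method #2 (files synced ahead of time) there is no miss path at all, at the cost of storing every file on every node whether or not anyone requests it there.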

  • Rage4, LEB's & rsync > http://ldc.pw

    Thanked by 1gehaxelt
  • @wych said:
    Rage4, LEB's & rsync > http://ldc.pw

    Weird. Not loading for me. Sounds like a cool project though, from what I can see via Google cache.

  • wychwych Member
    edited September 2014

    @mikeyur said:

    Strange, it seems to be up. Any errors or anything?

  • @wych Looks like your DNS. I use OpenDNS and this is what they're seeing (I'd probably be hitting the Seattle pop):

    https://www.dropbox.com/s/a0ixeg115hsro1f/Screenshot 2014-09-01 01.59.56.png?dl=0

  • @mikeyur said:
    @wych Looks like your DNS. I use OpenDNS and this is what they're seeing (I'd probably be hitting the Seattle pop):

    https://www.dropbox.com/s/a0ixeg115hsro1f/Screenshot 2014-09-01 01.59.56.png?dl=0

    Thanks, will forward it on to Rage4.

  • If you wanna run your own GeoDNS servers, take a look at this: http://phix.me/geodns/
    Pretty minimal instructions, but it's definitely possible to get this working (I set up my DNS servers using this method and they work fine!).

  • nexusrainnexusrain Member
    edited February 2015

    @wych said:
    Rage4, LEB's & rsync > http://ldc.pw

    Up again. But that project doesn't seem very active, judging by the 2014 copyright in the footer and 3 of 5 nodes being down. Btw, 3 POPs are in the US, one is in France, and that's it. A bit weak, maybe.

    For 21€/year (+ GeoDNS hosting or servers for that job) you can build your own CDN with LES boxes and have nodes in Italy (for southern EU and maybe Africa, as it's the nearest LES location), Dallas (for the southern US & South America), Japan (for Asia), Australia (if needed), NC (for the central US), LA (western US) and Düsseldorf / Falkenstein (for central Europe; German DCs have good network / peering in most cases, and I'm not only writing this because I'm German :p ). It won't be a high-end CDN, but it will definitely be faster to many locations than going without one, I guess.

    Edit: oh, I didn't look at the date of the last post. Anyway, maybe it'll help someone :p

  • wychwych Member
    edited February 2015

    @nexusrain said:

    Why are you bumping such an old thread?

    "This is still in alpha stages."

    It's something I dabbled with and will go back to later; I have a few higher-priority projects right now.

    However, my original point about Rage4 (free) + LEBs stands.
