How to build a CDN?
Hi,
I'm slowly gathering some LEBs and I thought about building a small CDN for myself (for learning purposes).
What are important keywords that I have to look for?
Are there any articles worth reading?
The main purpose is to learn how a CDN works. Some questions that come into my mind are:
- How to balance the load?
- How to choose the nearest location to the client?
- How to achieve HA?
- How to synchronize the data?
- How to serve the files (nginx)?
Thanks in advance,
gehaxelt
Comments
Anycasted IPs would be your best bet, I think. And maybe a distributed filesystem? GlusterFS, for example.
Never built one myself, but shooting out some ideas.
GeoDNS and anycast IPs (Rage4, for example) should be good keywords.
Buy a few servers in different regions. A good example would be: LA, Dallas, Jacksonville, New York, Amsterdam, etc.
The best way to keep files in sync would be to set up an rsync cron job. I had one set up to run every 15 minutes that would copy new files from a master server.
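A rough sketch of such an rsync job as a small Python wrapper that cron could call on each edge node. The host name `master.example.com`, both paths and the script location are placeholders, not details from the setup described above; `-a`, `-z` and `--delete` are standard rsync options. By default the script only prints the command, and it runs rsync when called with `--run` (that flag is an invention of this sketch).

```python
#!/usr/bin/env python3
"""Pull new files from a master server; meant to be run from cron.

Example crontab entry (every 15 minutes; the script path is hypothetical):
    */15 * * * * /usr/local/bin/cdn_sync.py --run
"""
import shlex
import subprocess
import sys

# Placeholder values, adjust to your own setup.
MASTER = "cdn@master.example.com"
REMOTE_DIR = "/var/www/cdn/"
LOCAL_DIR = "/var/www/cdn/"


def build_rsync_command(master, remote_dir, local_dir, delete=True):
    """Return the rsync argument list for a pull from the master."""
    cmd = ["rsync", "-az"]  # archive mode, compress during transfer
    if delete:
        # Remove local files that no longer exist on the master.
        cmd.append("--delete")
    cmd += [f"{master}:{remote_dir}", local_dir]
    return cmd


if __name__ == "__main__":
    command = build_rsync_command(MASTER, REMOTE_DIR, LOCAL_DIR)
    if "--run" in sys.argv[1:]:
        # Pass rsync's exit code through so cron can report failures.
        sys.exit(subprocess.run(command).returncode)
    print(shlex.join(command))  # dry run: only show what would be executed
```

With a 15-minute interval, a new file can take up to 15 minutes to reach every node, so this fits content that changes rarely.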
Then set up some sort of geo redirect, either through GeoDNS (Rage4 is a good example) or some other way, such as PHP; here's an example.
GeoDNS and the PHP method should give you HA. Rage4 also lets you configure a failover, so if one server dies, it can be replaced with another one. I believe the PHP method has this covered too.
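To make the geo redirect idea a bit more concrete, here is a minimal sketch in Python (not the PHP example referred to above). It assumes you already have the client's approximate latitude/longitude from some GeoIP lookup, which is not shown. The PoP names and coordinates are made up for illustration, and straight-line distance is only a rough stand-in for network latency.

```python
import math

# Hypothetical PoPs: name -> (latitude, longitude), roughly city centres.
POPS = {
    "los-angeles": (34.05, -118.24),
    "dallas": (32.78, -96.80),
    "new-york": (40.71, -74.01),
    "amsterdam": (52.37, 4.90),
}


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_pop(client, healthy=None):
    """Return the nearest PoP name; if `healthy` is given, skip PoPs not in it."""
    if healthy is None:
        candidates = POPS
    else:
        candidates = {name: pos for name, pos in POPS.items() if name in healthy}
    if not candidates:
        raise RuntimeError("no healthy PoP available")
    return min(candidates, key=lambda name: haversine_km(client, candidates[name]))
```

For example, `pick_pop((52.52, 13.40))` (roughly Berlin) picks `amsterdam`, and passing `healthy={"dallas", "new-york"}` for the same client falls back to `new-york`, which is the failover part.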
In my experience, that's one of the easiest ways of doing it.
NSONE is another DNS solution which may be useful. They have GeoDNS as well but also have a data source method which can influence results. You can hook up some existing solutions (server monitoring for example) or use their API to push stuff in.
Here is something that you may wanna try.
http://www.scalescale.com/rolling-your-own-cdn-build-a-3-continent-cdn-for-25-in-1-hour/
@avayl is from nsone
I'm bringing up my own anycast network shortly and will be providing anycasted IP space, feel free to PM for more information.
For the GeoIP section:
If you have an OS with a new enough version of BIND, you can scrape by without using any third-party DNS services. Have a gander at https://kb.isc.org/article/AA-01149/0/Using-the-GeoIP-Features-in-BIND-9.10.html (the original patch that I used is at https://code.google.com/p/bind-geoip/ and is now merged into BIND 9.10).
BIND 9.10 has full city/region/country GeoIP support, with the ability to point individual US states etc. at a certain IP.
Now, most people wouldn't pay for the accurate non-free MaxMind GeoIP database, so I came up with a workaround to increase accuracy without having to pay more (this was actually discovered by accident).
Cloudflare happens to have this nifty little feature where it automatically resolves CNAMEs for you (see http://blog.cloudflare.com/introducing-cname-flattening-rfc-compliant-cnames-at-a-domains-root).
So basically, set up a few BIND servers and make sure they are all working. You can sync the configuration, zones, etc. using a mix of btsync/syncthing and iwatch to detect when a file is updated.
Then, set a subdomain in Cloudflare to point at the nameservers. For each address that you now want to be GeoIP-ed, just set the CNAME to the subdomain that's getting its IPs from the nameservers.
Cloudflare will do the rest and make your GeoIP reasonably accurate. Note that this also depends on Cloudflare's locations, so you might sometimes see a few requests be off. But then again, I have never noticed it going really off (e.g. requesting a subdomain while in the US gives an IP in the EU).
Note:
If you're running Ubuntu or some other distro and are using the GeoIP package from the repositories, make sure it is up to date. On Ubuntu, install geoip-database-contrib instead.
A few more thoughts:
Set up a monitoring system (e.g. UptimeRobot) to switch the IPs when they fail. Make sure you have a low TTL for DNS records.
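If you would rather script the check yourself, the core of it could look something like the sketch below. It only decides which node IPs should currently be published; how you push that list to your DNS provider depends on their API and is not shown. The function names and the "publish everything if all checks fail" rule are assumptions of this sketch, not something a particular monitoring service does.

```python
import socket


def is_up(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def records_to_publish(nodes, check=is_up):
    """Return the node IPs that pass the check; fall back to all nodes if none do."""
    alive = [ip for ip in nodes if check(ip)]
    # Publishing every node when all checks fail avoids an empty record set
    # caused by a problem on the monitoring side rather than on the nodes.
    return alive or list(nodes)
```

The low TTL matters here: resolvers keep handing out a dead node's IP until the cached record expires, so the TTL roughly bounds how long a failover takes to reach clients.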
Synchronizing the files now...
I've had two successful ways of doing this:
1. Use Varnish to pull from a central location (sync the Varnish configuration with btsync/syncthing + iwatch as indicated above).
2. Use btsync/syncthing + iwatch to sync the files, and run nginx.
Notably, if you're on a LEB VPS with low RAM and high disk space, ask Varnish to cache on disk instead. You may have to do some performance testing with each storage backend (file vs malloc) to see which one is best for you.
Depending on whether your VPSes are fast enough or not, you may want to benchmark both methods, as method #1 relies on the connection speed of the origin VPS to serve the initial file (before it goes into cache).
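To illustrate why the first request matters with the pull approach, here is a toy pull-through cache in Python. This is not Varnish and not how Varnish is implemented, just the idea in a few lines: a miss goes to the origin, later requests are served locally until the TTL runs out. The class name, the `fetch_from_origin` callable and the TTL value are all made up for the example.

```python
import time


class PullThroughCache:
    """Toy in-memory pull-through cache with a per-object TTL."""

    def __init__(self, fetch_from_origin, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch_from_origin  # callable: path -> bytes
        self._ttl = ttl
        self._clock = clock
        self._store = {}  # path -> (expires_at, body)
        self.origin_hits = 0  # how many requests had to go to the origin

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the origin is not contacted
        # Miss or expired entry: this request pays for the trip to the origin.
        body = self._fetch(path)
        self.origin_hits += 1
        self._store[path] = (now + self._ttl, body)
        return body
```

Only the misses depend on the origin's connection speed, which is why a slow origin hurts mostly for rarely requested files. With file sync, every node already has the file, at the cost of disk space and sync delay.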
Rage4, LEB's & rsync > http://ldc.pw
Weird. Not loading for me. Sounds like a cool project though, from what I can see via Google cache.
Strange, it seems to be up. Any errors or anything?
@wych Looks like your DNS. I use OpenDNS and this is what they're seeing (I'd probably be hitting the Seattle pop):
https://www.dropbox.com/s/a0ixeg115hsro1f/Screenshot 2014-09-01 01.59.56.png?dl=0
Thanks, will forward it on to Rage4.
If you wanna run your own GeoDNS servers, take a look at this: http://phix.me/geodns/
Pretty minimal instructions, but it's definitely possible to get this working (I built my DNS servers using this method and they work fine!).
Up again. But that project doesn't seem very active, judging by the 2014 copyright in the footer and 3 of 5 nodes being down. Btw, 3 POPs are in the US, one is in France, and that's it. A bit weak, maybe.
For 21€/year (+ GeoDNS hosting or servers for that job) you can build your own CDN with LES boxes and have nodes in Italy (for southern EU and maybe Africa, as it's the nearest LES location), Dallas (for the southern US and South America), Japan (for Asia), Australia (if needed), NC (for the central US), LA (for the western US) and Düsseldorf / Falkenstein (for central Europe; German DCs have a good network / peering in most cases, and I'm not only writing this because I'm German). It won't be a high-end CDN, but I guess it will definitely be faster to many locations than having none.
Edit: oh, I didn't look at the date of the last post. Anyway, maybe it'll help someone.
Why are you bumping such an old thread?
It's something I dabbled with and will go back to later; I have a few higher priority projects right now.
However, my original point of Rage4 (free) + LEBs stands.