Would this setup work?

I'm testing a failover setup that should work, and I'd like some input on possible issues that may arise. I was after an utterly cheap solution, so this is what I'm doing. I usually work with monitoring, failover IPs, etc., but here I want things to be as simple as possible, just for the fun of it.
I have 2 servers, and for the sake of argument I'll use mydomain.com for the domain.
Server A, IP 1.1.1.1, DNS a.mydomain.com
Server B, IP 2.2.2.2, DNS b.mydomain.com
Both are just serving 3 static pages replicated with rsync; no issues there. Both servers run a DNS server as well. They'll only serve the records for mydomain.com; no other domains will be used. And I'm using a TTL of 300 to minimise downtime because of DNS caching.
For DNS for mydomain.com I'll be using a.mydomain.com and b.mydomain.com. When a.mydomain.com gets a query for mydomain.com it returns the IP 1.1.1.1; when b.mydomain.com gets a query for that domain it returns the IP 2.2.2.2.
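For illustration, a minimal BIND-style sketch of the zone as served by Server A might look like this (the serial, hostmaster address and exact layout are placeholders I made up; Server B would serve the same zone but answer 2.2.2.2 for the apex):

    ; mydomain.com zone as served by Server A (a.mydomain.com)
    $TTL 300
    @   IN SOA a.mydomain.com. hostmaster.mydomain.com. (
            2024120101 ; serial
            3600       ; refresh
            600        ; retry
            86400      ; expire
            300 )      ; negative-caching TTL
    @   IN NS a.mydomain.com.
    @   IN NS b.mydomain.com.
    @   IN A  1.1.1.1   ; Server A only ever answers with its own IP
    a   IN A  1.1.1.1
    b   IN A  2.2.2.2   ; the other NS still needs a resolvable name

You can sanity-check that each box answers with its own IP by querying them directly:

    dig @1.1.1.1 mydomain.com A +short   # expect 1.1.1.1
    dig @2.2.2.2 mydomain.com A +short   # expect 2.2.2.2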
So basically, whenever the domain resolves at all, it will always resolve to a live server: a dead DNS server simply wouldn't respond, and the resolver would ask the next nameserver instead. Now I know this is nowhere near the best or ideal setup, but I just wanted to try something different. Any thoughts?
Comments
It should work.
I suspect this is not too different from DNS round-robin, with its attendant tradeoff of long failover time vs. a short TTL. I also seem to remember some clients (web browsers) starting to ignore TTLs and cache longer.
Any reason not to explore haproxy?
Disadvantage: you are relying on the domain name system (with, e.g., plenty of badly configured recursors, caches, etc. that don't give a fuck about your TTL).
I'd prefer IP switching (if it's not a VPS) or a firewall-based solution that sends traffic for both IPs to the live server (if one is dead). But granted, that might be difficult in a hosted situation where you have little to no control.
@Saragoldfarb Why not just use DNS round-robin itself, though? I.e. have both nameservers always return both webserver IPs, then optionally withdraw the record for the one that is down from both (or just from the remaining one, since if your downed webserver also doubled as an NS, you can't edit it anyway).
Even if you always keep all of the webserver IPs in DNS, it won't cause major issues during downtime: if one server is down, clients will go on to try the next one with only a minimal delay. And as a benefit you get a massively simpler setup.
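Concretely, both zones would then just carry both A records for the apex, something like this sketch:

    @   IN A 1.1.1.1
    @   IN A 2.2.2.2   ; both nameservers always serve both records

Resolvers get both addresses in one answer, and clients fall back to the second address if the first one refuses the connection or times out.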
Just use Amazon Route 53 with health checks on the two backing instances. You can have a record (www) tied to a health check URL, and the record is only handed to resolvers while that URL returns healthy. Keep your TTL reasonably short, e.g. 60 seconds. As long as everyone respects those TTLs, it works fairly well.
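For reference, here's a hedged sketch of what the failover record pair could look like as a Route 53 change batch (the record name, health check IDs and file name are placeholders):

    {
      "Changes": [
        {
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "www.mydomain.com.",
            "Type": "A",
            "SetIdentifier": "server-a",
            "Failover": "PRIMARY",
            "TTL": 60,
            "ResourceRecords": [{ "Value": "1.1.1.1" }],
            "HealthCheckId": "<health-check-id-for-server-a>"
          }
        },
        {
          "Action": "UPSERT",
          "ResourceRecordSet": {
            "Name": "www.mydomain.com.",
            "Type": "A",
            "SetIdentifier": "server-b",
            "Failover": "SECONDARY",
            "TTL": 60,
            "ResourceRecords": [{ "Value": "2.2.2.2" }],
            "HealthCheckId": "<health-check-id-for-server-b>"
          }
        }
      ]
    }

You'd create one health check per instance against the health check URL, then apply the batch with aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://failover.json.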
My man, from my experience: don't manage a DNS server. Use Cloudflare or something and point an A record at your server running nginx, that's all.
Edit: You don't even need load balancing for static sites, bro, even if you had millions of unique hits per day.
Should work, but plain DNS round-robin (as suggested above) is probably the best solution: no problem with caches, and the client will try the second server if the first one is down. It also acts as some kind of poor man's load balancer.
Does @rm_ work for you, or is he a slave, or your husband?
Or does all that "my man" and "bro" rather tell us something about you?
When you use DNS round-robin, nothing checks whether an IP is up and running. It'll just return IP 1.1.1.1 or IP 2.2.2.2 regardless.
Yes, I'm trying to avoid anything I don't really need. For production stuff I use haproxy, load balancers, failover IPs and the whole shebang, but where's the fun in that?
Good point, thanks! I'll Google a bit and see if I can come up with some statistics to check whether that might be an issue. Obviously I'm not after a real HA setup, just a poor man's failover.
Same here. That's why I usually go that route but wanted to explore something else. Thanks though.
Yeah, that would be a better solution, but I'd need to run a monitor, plus it doesn't solve any problems with cached results. Monitoring is not an issue in itself, but it would just add another point of failure.
Thanks. In theory TTLs should be respected, right? I know they're not always honoured in a real-world scenario, but for this particular case I'd settle for them being followed 90% of the time.
I use Cloudflare occasionally but would rather run my own DNS. I like having control. Ever since I started using our own cluster, nothing has ever gone down because of DNS issues. Cloudflare is great, but it causes me more issues than it solves.
I know. I just wanted to explore some things, and @davidgestiondbi made me buy VMs in all his locations on Black Friday, so time to put them to use and try an experimental setup.
If you have a site under any sort of load, you could quite easily test this (have two servers, leave server 1 in DNS for a week, then swap to server 2 and see what traffic still goes to server 1). In my experience it's fine for humans/browsers/etc.
In practice, it's mostly poorly written Java-based bots that don't respect TTLs (Java generally has very poor DNS caching behaviour).
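If you want to put numbers on that, here's a rough Python sketch for counting leftover hits on the old server after the DNS swap (the log path is an assumption, and it expects nginx/Apache-style timestamps):

    #!/usr/bin/env python3
    # Count requests per day in an access log, to see how much traffic
    # still hits the old server after it was removed from DNS.
    import re
    from collections import Counter
    from datetime import datetime

    per_day = Counter()
    with open("/var/log/nginx/access.log") as log:  # assumed path
        for line in log:
            # timestamps look like [10/Dec/2024:13:55:36 +0000]
            m = re.search(r"\[(\d{2}/\w{3}/\d{4}):", line)
            if m:
                day = datetime.strptime(m.group(1), "%d/%b/%Y").date()
                per_day[day] += 1

    for day, hits in sorted(per_day.items()):
        print(f"{day}: {hits} requests")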
"Should" is the magic word :) I should just give it a try, I guess, and see how it goes in a real-world scenario: check from multiple locations and see how that works out. I can just use a script to shut down or start either server at random.
Any thoughts on how to measure availability? I have a paid account with UptimeRobot, so monitoring the domain at a 1-minute interval should give some decent numbers, right?
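For the random stop/start part, a minimal sketch, assuming passwordless root SSH to both boxes and that the webserver and DNS run under systemd as nginx and named (both service names are assumptions):

    #!/usr/bin/env python3
    # Randomly take one of the two boxes down for a while, bring it back,
    # and let UptimeRobot's 1-minute checks record the availability.
    import random
    import subprocess
    import time

    HOSTS = ["a.mydomain.com", "b.mydomain.com"]
    SERVICES = ["nginx", "named"]  # webserver + DNS on each box

    def set_state(host, action):
        """Start or stop the services on one host via SSH."""
        for svc in SERVICES:
            subprocess.run(["ssh", f"root@{host}", "systemctl", action, svc],
                           check=True)

    while True:
        victim = random.choice(HOSTS)
        print(f"stopping {victim}")
        set_state(victim, "stop")
        time.sleep(random.randint(300, 1800))   # stay down 5-30 minutes
        print(f"starting {victim}")
        set_state(victim, "start")
        time.sleep(random.randint(600, 3600))   # run with both up for a while

Over a week or so, UptimeRobot's numbers should then give you a decent availability percentage for the domain as a whole.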
It's, like, the third thread on this topic in a few days; DNS round-robin and a solution similar to @rm_'s were linked here (just more unnecessarily convoluted, in an attempt to "auto-remove" the dead entries as soon as one of the servers dies).
EDIT: and TTL behaviour (for browsers, Java and standard browsing) was linked a few posts down.
EDIT #2: and here's yet another thread.
Good idea to just do some real-world testing. It wouldn't be a problem if things were inaccessible for a couple of minutes; if that were an issue, I'd go the failover/haproxy route.
I have no traffic as of yet, but if I do decide to take this setup live, it'll probably get 2,000 hits a day, so not that much. Might be worth a try.
Thanks for the info about Java. Wasn't aware of that.
I know, sorry about that. The difference is, I'm just looking for an experimental solution, not taking into account what best practice would be. Appreciate the input!
Also, this solution needs no monitoring of any kind, as dead records simply won't be served. No need to edit your zones, and no possible false negatives from your monitor. As long as your webserver and DNS server are functional, you should be set.