The low-end fail-over everyone should implement

dallascao Member
edited March 23 in Outages

Step one: Buy two VPSes in two locations, preferably from two providers.

Step two: set them up IDENTICALLY, both as "masters". Each can independently run your service.

Step three: use a smart DNS service like Azure Traffic Manager and add the IPs and ports of both machines to your domain name. Azure Traffic Manager will monitor your servers automatically, remove an IP when that server is down, and add it back when it comes up.
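
For anyone wondering what the monitoring half actually does, here is a rough sketch of the kind of probe such a service runs against both masters. The IPs and the /healthz path are placeholders, not anything Azure-specific:

```python
# Rough sketch of a smart-DNS style health check: probe each "master"
# and decide which IPs deserve to stay in the DNS answer.
# The IPs and the /healthz path below are placeholders.
import urllib.error
import urllib.request

MASTERS = {
    "203.0.113.10": "http://203.0.113.10/healthz",
    "198.51.100.20": "http://198.51.100.20/healthz",
}

def healthy(url: str) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

live = [ip for ip, url in MASTERS.items() if healthy(url)]
print("IPs that should remain in DNS:", live)
```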

If you run WordPress, you will need to implement a master/slave MySQL setup and install MaxScale on both machines to do read-write splitting and automatic fail-over.
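
From the application side the split is invisible: WordPress (or any client) talks to MaxScale's single read-write split listener and MaxScale routes each statement. A minimal sketch, assuming the listener runs on port 4006 and using placeholder credentials (match them to your own maxscale.cnf):

```python
# Minimal sketch of a client talking to MaxScale's readwritesplit listener.
# Port 4006 and the credentials are assumptions; match them to your maxscale.cnf.
import pymysql  # pip install pymysql

conn = pymysql.connect(host="127.0.0.1", port=4006,
                       user="wp", password="secret", database="wordpress")
try:
    with conn.cursor() as cur:
        # MaxScale routes reads to a healthy slave...
        cur.execute("SELECT option_value FROM wp_options WHERE option_name = 'blogname'")
        print(cur.fetchone())
        # ...and writes to whichever node is currently the master.
        cur.execute("UPDATE wp_options SET option_value = %s WHERE option_name = 'blogname'",
                    ("My failover blog",))
    conn.commit()
finally:
    conn.close()
```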

In short, keep two identical machines in different locations that can run your service independently of each other, and use smart DNS to achieve simple automatic fail-over.

This way, you achieve genuinely reliable high availability with minimal effort and expense.

You don't even need a backup plan.

Don't meddle with Docker. Don't install NetData. They will eat up your hard disk, RAM, and CPU (since you are on the low end).

Bottom line: never run production on a single machine. It will be down sooner or later, no matter how good or how expensive the provider is. If the description above sounds confusing, it's time to learn!


Comments

  • Doesn't Azure Traffic Manager cost something? Like you said, since we are on the low end I tend to prefer free options if there are any. Cloudflare has a load-balancing option too if you are willing to pay for it.

    Right now, I just run HTTP healthchecks from my home Internet connection to multiple low-end VPS servers and update my Cloudflare proxied records based on those checks. It isn't instant, since I'm only running the healthchecks every 30 seconds, but with Cloudflare caching, even during a server outage it is rarely noticeable.

    I've run HAProxy, Seesaw, Kemp, Zevenet, Neutrino, Pen, etc. (pick whatever HA solution you think is best) as well, but unless you can get multiple servers on the same network and run VRRP/CARP, the load balancer becomes the single point of failure, which defeats the purpose. With my home healthcheck solution, unless my home Internet is down while the VPS is also down, it is a pretty simple and solid setup that doesn't cost a penny extra.
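
    The record-update half of that healthcheck script is roughly this sketch. The endpoint is Cloudflare's real DNS-record API; the zone/record IDs, hostname, and token are placeholders:

    ```python
    # Rough sketch of re-pointing a proxied Cloudflare A record after a healthcheck.
    # The zone/record IDs, hostname, and token are placeholders.
    import requests  # pip install requests

    CF_API = "https://api.cloudflare.com/client/v4"
    HEADERS = {"Authorization": "Bearer <api-token>", "Content-Type": "application/json"}
    ZONE_ID, RECORD_ID = "<zone-id>", "<record-id>"

    def point_record_at(ip: str) -> None:
        """Overwrite the proxied A record so it points at the VPS that passed the check."""
        body = {"type": "A", "name": "www.example.com", "content": ip,
                "ttl": 1, "proxied": True}  # ttl=1 means "automatic" on Cloudflare
        r = requests.put(f"{CF_API}/zones/{ZONE_ID}/dns_records/{RECORD_ID}",
                         headers=HEADERS, json=body, timeout=10)
        r.raise_for_status()

    point_record_at("203.0.113.10")  # whichever VPS the healthcheck says is up
    ```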

  • You don't need "smart DNS"; browsers will automatically pick the DNS record that works if one of them doesn't.

sliix Member

    Real low end doesn't run two servers and doesn't make backups at all.

matey0 Member

    @ehhthing said:
    You don't need "smart DNS"; browsers will automatically pick the DNS record that works if one of them doesn't.

    Do you have any resources on this, and which browsers do it? I couldn't find much on the topic. It would also be interesting to know what the timeouts are set at.

    @dallascao said: You don't even need a backup plan.

    Doesn't hurt to have one in case your application corrupts itself or there's a security breach, for example.

  • @ehhthing said:
    You don't need "smart DNS"; browsers will automatically pick the DNS record that works if one of them doesn't.

    Not everything is HTTP(S), though. I run healthchecks for ICMP, TCP ports 25 and 587, and a few others, then fail over whenever I detect a problem so that there are no service interruptions for anyone.
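
    Roughly like this sketch (the target host is a placeholder; ICMP needs raw sockets or root, so it just shells out to the system ping):

    ```python
    # Sketch of the non-HTTP checks: plain TCP connects to 25/587 plus a ping.
    # The target host is a placeholder.
    import socket
    import subprocess

    HOST = "203.0.113.10"

    def tcp_ok(host: str, port: int, timeout: float = 5.0) -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def icmp_ok(host: str) -> bool:
        # -c 1: one echo request, -W 2: wait up to 2 seconds (Linux ping)
        return subprocess.run(["ping", "-c", "1", "-W", "2", host],
                              stdout=subprocess.DEVNULL).returncode == 0

    print({"icmp": icmp_ok(HOST), "smtp": tcp_ok(HOST, 25), "submission": tcp_ok(HOST, 587)})
    ```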

  • @matey0 said:

    @ehhthing said:
    You don't need "smart DNS"; browsers will automatically pick the DNS record that works if one of them doesn't.

    Do you have any resources on this, and which browsers do it? I couldn't find much on the topic. It would also be interesting to know what the timeouts are set at.

    As far as I can tell this works in Chrome, Firefox, and Safari. Unfortunately, I don't know of any real documentation on how it works.

    With Chrome, Firefox, and Safari all behaving this way, it should cover basically everyone.

    @FrankCastle said:

    @ehhthing said:
    You don't need "smart DNS"; browsers will automatically pick the DNS record that works if one of them doesn't.

    Not everything is HTTP(S), though. I run healthchecks for ICMP, TCP ports 25 and 587, and a few others, then fail over whenever I detect a problem so that there are no service interruptions for anyone.

    This is also how MX records work; it's baked into the spec. The OP only mentioned websites.

marian Member

    @dallascao said: Buy two VPSes in two locations, preferably from two providers.

    man, this is low end... it should be at least 5 to be on the safe side :D

  • I have this set up: MariaDB replication + Unison.

Neoon Community Contributor, Veteran

    Doesn't sound low end to me.
    gdnsd has integrated monitoring support, so if one of your servers goes blub, it will switch the records for you.

    It may not be as reliable, since it doesn't need or use any external confirmation, but it's as decentralized as it gets and it's free.

boot Member

    Thanks Dad.

  • @Neoon said:
    Doesn't sound low end to me.
    gdnsd has integrated monitoring support, so if one of your servers goes blub, it will switch the records for you.

    It may not be as reliable, since it doesn't need or use any external confirmation, but it's as decentralized as it gets and it's free.

    Love gdnsd but the configuration format is such a pain in the ass sometimes :neutral:

raindog308 Administrator, Veteran

    LEB will run a seven-part tutorial on setting up a highly available WordPress site with HTTPS, starting April 11.

    I wrote a previous tutorial that used DRBD + MySQL multi-master + RRDNS. The new one will use Gluster, Galera, and RRDNS.

bustersg Member
    edited March 24

    Low end means you must anticipate downtime; you get what you pay for. I use a bash script to dump backups and rscp them to another low-end box, so basically they back up to one another. Recently I started using a 3-2-1 backup strategy and dump to a remote cloud (IDrive) too.
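
    A rough Python sketch of that dump-and-copy idea, with paths, the remote host, and credentials as placeholders (run it from cron on both boxes so they back each other up):

    ```python
    # Rough sketch of the dump-and-copy idea described above. Paths, the remote
    # host, and credentials are placeholders; schedule it from cron on both boxes.
    import datetime
    import subprocess

    stamp = datetime.date.today().isoformat()
    dump = f"/root/backup/all-{stamp}.sql.gz"

    # Dump every database and compress it; pipefail so a failed mysqldump is not silently ignored.
    subprocess.run(["bash", "-c", f"set -o pipefail; mysqldump --all-databases | gzip > {dump}"],
                   check=True)

    # Copy it to the other low-end box so each node holds the other's backups.
    subprocess.run(["scp", dump, "backup@other-vps.example.com:/root/backup/"], check=True)
    ```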

raindog308 Administrator, Veteran

    @dallascao said: You don't even need a backup plan.

    That is wrong.

    Suppose someone hacks one of your nodes. Suppose you accidentally delete a post. Database corrupts on one node. A bad plugin goes ape. Someone hacks the VPS provider.

    All your HA will do in that case is make the damage more thorough.

    Backups are a different concept than high availability.

JasonM Member

    average low end joe: "what, you take backups?"

  • @raindog308 said:
    LEB will run a seven-part tutorial on setting up a highly available WordPress site with HTTPS, starting April 11.

    I wrote a previous tutorial that used DRBD + MySQL multi-master + RRDNS. The new one will use Gluster, Galera, and RRDNS.

    For two VPSes, the old master/slave setup may be a better choice than Galera.
    With a two-node Galera cluster, when one node is ungracefully shut down (which is exactly what we are preparing for), the other stops working too.

    For master/slave, MaxScale can do automatic switchover and auto-rejoin: if the master goes down, the slave becomes the new master. When the old master comes back up later, it rejoins the new master and becomes the new slave.
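
    One way to watch a switchover like that from the outside is to ask each node which role it currently holds; a small sketch with placeholder hosts and a placeholder monitoring user:

    ```python
    # Sketch: ask each node directly whether it is currently the master or the slave.
    # Hosts and the monitoring user's credentials are placeholders.
    import pymysql  # pip install pymysql

    NODES = ["10.0.0.1", "10.0.0.2"]

    for host in NODES:
        conn = pymysql.connect(host=host, user="monitor", password="secret")
        with conn.cursor() as cur:
            cur.execute("SELECT @@global.read_only")
            read_only = cur.fetchone()[0]
            cur.execute("SHOW SLAVE STATUS")
            replicating = cur.fetchone() is not None  # empty result set on the master
        conn.close()
        print(host, "slave" if (read_only or replicating) else "master")
    ```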

  • @dallascao said:
    Step three: use a smart DNS service like Azure Traffic Manager and add the IPs and ports of both machines to your domain name. Azure Traffic Manager will monitor your servers automatically, remove an IP when that server is down, and add it back when it comes up.

    Or skip the extra cost of the load balancer (unless there is a completely free plan?): have both your servers run authoritative DNS but only give out their own address, with a low TTL, in response to queries. If a host goes down, it'll stop responding to DNS queries, so it will stop being asked for HTTP/other connections until it comes back up. This relies on resolvers using whichever name server(s) are available, but (unlike relying on browsers to do this) the behaviour is baked into the DNS spec.

    One potential concern is DNS resolvers out there dealing badly with short TTLs, but while this wasn't uncommon in the '90s/'00s¹, I'm pretty sure it is not an issue these days. The other issue is if something takes down the web server on one of the boxes but not the related DNS daemon; this can be mitigated somewhat with simple monitoring that takes down the DNS service if the HTTP(S) service appears unavailable (see the sketch at the end of this comment).

    Of course, if modern browsers do try each address given, as suggested above², then even this isn't needed, and you can just have all your name servers report all the relevant addresses and let the UA do the extra work.

    In any case, the elephant in the room is the need to keep the content on each server in sync, which is easy if your service is static but potentially much less so otherwise.

    @dallascao said: You don't even need a backup plan.

    Yes, you do. HA and DR are different concepts. A high-availability arrangement can act as a partial backup, in that each machine has a copy of the content, but it won't protect you from key failure modes that a proper backup regime would. For instance, human error on your part accidentally removing some data and not noticing until the mistake has replicated to both servers.

    --
    [1] For instance: a popular daemon, I forget which, would assume anything less than 300s was an error and apply its own default instead, which was orders of magnitude longer.
    [2] I've not tested this.
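
    The monitoring mitigation mentioned above can be as small as this sketch, run from cron on each box. The URL and the systemd unit name are placeholders for whatever actually serves the site and the zone:

    ```python
    # Sketch: stop advertising this box in DNS if its web server stops answering.
    # The URL and the systemd unit name are placeholders.
    import subprocess
    import urllib.request

    SITE = "https://www.example.com/"
    DNS_UNIT = "nsd.service"  # or bind9/knot, whatever serves the zone on this box

    def site_up() -> bool:
        try:
            with urllib.request.urlopen(SITE, timeout=5) as resp:
                return resp.status == 200
        except Exception:
            return False

    action = "start" if site_up() else "stop"
    subprocess.run(["systemctl", action, DNS_UNIT], check=False)
    ```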
