High availability for SaaS: from day one or can it wait?
I am planning to relaunch my SaaS in around three months, but I haven't decided yet what to do about infrastructure. For the previous iteration of the project I set up a proper Kubernetes cluster with autoscaling, and everything, absolutely everything, was configured for high availability to minimise the risk of downtime.
At the moment, however, during development I am deploying the app to a single VPS with Dokku, and I just love the simplicity, and this is coming from a die-hard Kubernetes fan.
So the question I have for you is: do you think I should have a highly available setup from day one, considering that it's a SaaS for building and hosting static sites and blogs? Or would it be OK to keep this ridiculously simple single-server configuration for now, until I gain some traction?
To be honest I would like to keep the simpler setup for a while, but I am worried about what people might say if they experience downtime. I mean, I know that even massive companies suffer outages, but still...
What do you think?
Comments
We deployed HA from day one on our cloud, and it turns out it is not very popular. It also ended up breaking things when we had failures.
Since you would be running your own SaaS on top of that, I think you know best whether HA is worth it. IMO it isn't, but I'd suggest two setups: one for regular users and a premium one with HA.
Trust me, HA becomes a bigger and bigger headache as the footprint grows, so offering it separately, only to people who need it and agree to pay extra for it, is probably the better way.
Basically, the thing a dev/devops person should be least concerned about is what people say.
Having HA can definitely make the whole system more prone to downtime if the sysadmin in charge doesn't have a firm grip on all the moving parts.
A clustered app is more likely to rely on etcd (beyond the one used by the k8s control plane), Consul, Vault, a UDP-based VPN, complex deployment scripts/configs, etc. If any one of these is unhardened and exploited, the whole cluster goes to shit. And any one of them can glitch out, at which point troubleshooting gets exponentially harder because of all the moving parts.
Hardening all those parts is extra work and may not yield an ROI until a certain point.
Having a monolith and a DB on the same OS is unsexy these days, but it is still a legit way to get up and running quickly until you've reached maybe 100k+ users, depending on the workload. Basically it's a matter of what you're comfortable with.
Now I feel dirty talking about unhardened and exploited unsexy monolith getting up (& running).
I would choose a simple HAProxy setup with 2 backend servers, with the database replicated between the two. No need for Kubernetes. I know it is possible to run a database in K8s, but I think it is too complicated.
I think you can just use Dokku on each server. The database can be either on a separate VPS or a managed DB from PlanetScale. HAProxy can be dropped if you use Cloudflare's load balancer.
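A minimal sketch of what that HAProxy frontend could look like, assuming two hypothetical Dokku app servers at 10.0.0.11 and 10.0.0.12 and an app health endpoint at /healthz (all names and addresses here are placeholders, not from the thread):

```
# haproxy.cfg sketch: round-robin across two app servers with health checks
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend app

backend app
    balance roundrobin
    option httpchk GET /healthz      # assumes the app exposes a health check route
    server app1 10.0.0.11:80 check   # Dokku box 1 (placeholder address)
    server app2 10.0.0.12:80 check   # Dokku box 2 (placeholder address)
```

With `check` enabled, HAProxy stops routing to a backend that fails the health probe, which is most of the availability win without any of the cluster machinery.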
Just KISS it. What matters most is a fast launch. A SaaS can be unfinished, half-baked crap, but at launch you will see whether the product is in demand or you just wasted your time. 98% of the time you just waste time.
Keeping two different setups for different users makes things even more complicated, when I am trying to keep them simpler.
I think I will do this at least in the early days until I see that it's worth investing more time and money in it. You're right on the demand aspect.
Why do you need HA unless you have some good cash flow? Start simple.
Don't bother at first, but be transparent about it and about what you're planning to do to improve availability.
It can definitely wait. Better to spend the time making sure backups are done properly (they actually get taken, and they are valid and complete).
This. A proper backup and restore policy, for that grim day, is worth all the high availability x10.
Yeah, backups are definitely robust. I am not worried about those at all, since they are solid and I test restoring periodically.
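For anyone wanting to automate that kind of "valid and complete" check, here is a minimal shell sketch. The dump command and paths are placeholders; swap in your real pg_dump/mysqldump invocation. The cheap checks catch corrupt or empty archives, but the only real proof is a periodic test restore:

```shell
#!/bin/sh
# Sketch: take a backup, then verify the archive is intact and non-empty.
set -eu

BACKUP=/tmp/demo_backup.sql.gz

# Stand-in for a real dump, e.g.: pg_dump mydb | gzip > "$BACKUP"
printf 'CREATE TABLE posts (id int);\n' | gzip > "$BACKUP"

gzip -t "$BACKUP"    # integrity: exits non-zero if the archive is corrupt
[ -s "$BACKUP" ]     # completeness smoke test: file exists and is non-empty

# The real test is a periodic restore into a scratch database, e.g.:
#   gunzip -c "$BACKUP" | psql restore_test_db
echo "backup verified: $BACKUP"
```

Wire this into cron and alert on a non-zero exit, and a silently broken backup job gets noticed long before that grim day.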
Having to explain this so often gets tiring, but it's nice when you're not the only one who gets it. HA is so often most appreciated by people who have never set up anything complex in a highly available way. It's great if everything works exactly as theorized, but complex apps at scale rarely work as theorized.
I prefer alerting and manual failover for a lot of things. A human reviewing context often causes less downtime than an automated system that never quite seems to be programmable for every possible scenario.
But I mean if it's reasonable for the stack in question, why not.