Kubernetes self management

SplitIce Member, Host Rep
edited March 2023 in General

Does anyone here manage production-grade Kubernetes clusters? Would anyone share their experiences for someone trying to get a picture of the management challenges and costs?

We're talking clusters in the 32-64 CPU range.

Comments

  • Cpt_Ben Member
    edited March 2023

    My 2 cents:

    • Get a management layer on top of it; it helps with day-to-day administration and simplifies taking snapshots of the clusters. Rancher is a good start.
    • Have redundant and performant storage. rook-ceph is fairly nice; NFS is another option, but with lower performance, of course. VMware also has its own thing, but it's expensive. Hitachi has a turnkey solution too, well integrated with public clouds, and it also supports on-prem. Depending on size, you could also take a look at MinIO.
    • Set up an ELK stack and Prometheus+Grafana with a proper monitoring setup: Alertmanager, Alerta, etc.
    • Have your own Harbor with a proxy cache, as Docker Hub will rate-limit you if you pull too many images. It can easily be set up cluster-wide as the default registry (see the sketch after this list).
    • Have a clearly communicated plan for the lifecycle of the different K8S versions so that your customers know when a specific version goes EOL. Things change between versions, and upgrades will break customers' deployments.
    • Make absolutely sure that anyone buying services from you understands the core concepts of Kubernetes and what a Kubernetes-native application is. A lot of people just deploy a monolithic application in a single pod and expect it to run uninterrupted. That's not how K8S works: pods get killed and restarted, so the application's components need to be redundant.
    • Beware that backup solutions are few and far between, have quirks, and are expensive. If you have the money, look into Portworx Backup. They also have good storage, but it's out of scope for LET price ranges ;)
    • Automate EVERYTHING: node deployment (either bare metal or VMs), etc. Terraform is a good start, with some Ansible on top of it. Keep things up to date.
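
    As a sketch of that cluster-wide default registry idea: on k3s or RKE2 nodes you can point containerd at a Harbor proxy-cache project through /etc/rancher/k3s/registries.yaml. The hostname harbor.example.com, the project name dockerhub-proxy and the robot account are assumptions for illustration:

      # /etc/rancher/k3s/registries.yaml (RKE2 uses the same format under /etc/rancher/rke2/)
      # Redirects Docker Hub pulls through a Harbor proxy-cache project.
      mirrors:
        docker.io:
          endpoint:
            - "https://harbor.example.com"      # assumed Harbor hostname
          rewrite:
            "^(.*)$": "dockerhub-proxy/$1"      # assumed proxy-cache project
      configs:
        "harbor.example.com":
          auth:
            username: robot$dockerhub-proxy+pull   # hypothetical robot account
            password: <robot account token>

    Restart the k3s/RKE2 service on each node after changing the file.
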
  • ehab Member
    edited March 2023

    @Cpt_Ben thanks a lot for your points.

    May I ask what infrastructure/stack you recommend? You already mentioned:

    • storage: rook-ceph, NFS
    • monitoring & alerting: ELK, Prometheus+Grafana
    • image management: Harbor

    What about proxy, certs, secrets "vaults", repo, backups and restore?

    Anything else you can add would be great to know. Thanks in advance.

    @vitobotta is also invited to add his valuable input.

  • @ehab said:
    What about proxy, certs, secrets "vaults", repo, backups and restore?

    For the proxy, put an HAProxy load balancer in front of the cluster, then nginx-ingress behind it, and set up dynamic DNS configuration for a specific domain while also allowing customers to define their own domains in their ingresses. This requires engineering, of course; a hedged example follows.
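
    As a rough illustration of letting customers bring their own domains, one such Ingress with ingress-nginx could look like this; every name and the domain are hypothetical:

      apiVersion: networking.k8s.io/v1
      kind: Ingress
      metadata:
        name: customer-app                       # hypothetical
      spec:
        ingressClassName: nginx
        rules:
          - host: shop.customer-example.com      # customer-supplied domain
            http:
              paths:
                - path: /
                  pathType: Prefix
                  backend:
                    service:
                      name: customer-app         # hypothetical backend service
                      port:
                        number: 80
        tls:
          - hosts:
              - shop.customer-example.com
            secretName: customer-app-tls         # TLS cert stored as a secret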

    For certs, Rancher has a nice graphical interface for storing encrypted secrets (as of 1.6), as well as certs. For more info, see the Rancher documentation.

    For the repo, I prefer Harbor; it's fairly nice and easy to manage. I recommend an LDAP integration with whatever SSO (if any) you're using for user authentication. Both Harbor and Rancher are LDAP-ready.

    For backups, it's a bit complicated, as most storage-level backups only create snapshots of the PVs, not the actual application configuration. Portworx Backup supports both, but it was broken on Rancher 2.5.x the last time I tried it; they were aware of the issue and working on a fix. Not sure if it's fixed already. Worth a try though, as they offer a 30-day trial.

    Of course there are other storage drivers available; I only mentioned a handful.

    +1, a GitLab instance is nice for CI/CD deployments, and it's fairly easy to set up with K8S. A minimal pipeline sketch follows.
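
    A minimal sketch of such a pipeline, assuming cluster credentials come from the GitLab Kubernetes agent or a KUBECONFIG CI variable; the deployment name my-app is hypothetical:

      # .gitlab-ci.yml - build an image and roll it out to the cluster
      stages:
        - build
        - deploy

      build:
        stage: build
        image: docker:24
        services:
          - docker:24-dind
        script:
          - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
          - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
          - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"

      deploy:
        stage: deploy
        image: bitnami/kubectl:latest
        script:
          # assumes kubectl can already reach the cluster (agent or KUBECONFIG variable)
          - kubectl set image deployment/my-app app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"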

  • It's also worth taking a look at Portainer; not everyone needs Kubernetes.

  • tjn Member

    For secrets management, have a look at HashiCorp Vault or the up-and-coming Infisical. A hedged sketch of wiring Vault into the cluster follows.
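
    One common way to wire Vault into a cluster is the External Secrets Operator, which syncs Vault data into native Kubernetes Secrets. A hedged sketch; the Vault address, mount, role and secret paths are all assumptions:

      apiVersion: external-secrets.io/v1beta1
      kind: SecretStore
      metadata:
        name: vault-backend
      spec:
        provider:
          vault:
            server: "https://vault.example.com"   # assumed Vault address
            path: "secret"                        # assumed KV v2 mount
            version: "v2"
            auth:
              kubernetes:
                mountPath: "kubernetes"
                role: "external-secrets"          # assumed Vault role
      ---
      apiVersion: external-secrets.io/v1beta1
      kind: ExternalSecret
      metadata:
        name: app-credentials
      spec:
        refreshInterval: 1h
        secretStoreRef:
          name: vault-backend
          kind: SecretStore
        target:
          name: app-credentials                   # resulting Kubernetes Secret
        data:
          - secretKey: db-password
            remoteRef:
              key: app                            # assumed path under the mount
              property: password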

  • @ehab said:
    What about proxy, certs, secrets "vaults", repo, backups and restore?

    Hi, I have been working with k8s for 5 years and managed clusters on-prem before moving to GCP/GKE. I strongly recommend going with something like Rancher instead of doing everything with "vanilla" k8s. It's a solid management+distro combo (if you use RKE2 or k3s; RKE1 is old and will be abandoned at some point). Using something like Rancher makes many things easier, and if you get into trouble they offer paid support, so you can solve problems quickly with them if you don't know the fix yourself.

    At previous jobs I was managing everything myself and luckily I never needed external support, but it's good to know it's there if needed.

    Storage: there are a number of options, both open source/free and paid. Rook-Ceph, mentioned by @Cpt_Ben, is solid, but be aware that while orchestrating persistent volumes is easy with Rook because it's automated, you may need to intervene manually to fix issues if something serious happens to the cluster. If you are not an expert with this, I would go with something like Longhorn (also created by Rancher). Longhorn is a lot easier to install and use, much easier for managing and recovering volumes when some replicas are faulty (with a nice dashboard), and can be significantly faster than Ceph depending on which disks you use (Ceph was really designed in the HDD era).

    Longhorn also supports both snapshots and backups to off-site storage like S3-compatible stores or NFS, and the backups are "crash consistent" because they are automatically performed after taking a snapshot. So this can give you peace of mind that you can restore your volumes easily if something happens, even in another cluster. With Longhorn you can even configure disaster recovery: volumes on a primary cluster are continuously replicated to a "standby" cluster, and if something major happens to the primary cluster, you can instantly promote the volumes on the standby cluster and set them as primary, dramatically minimizing data loss in the event of a disaster.

    One advantage Ceph has over Longhorn, though, is that it stores data in chunks replicated across multiple nodes (usually 3), and you can even create volumes larger than the actual disks on a single node, because it automatically spreads the data across nodes. This is very powerful because you can really distribute data, but it comes at the expense of performance. With Longhorn, on the other hand, you can only create volumes as large as the disk space available on a node (minus some space reserved for the system). Both can do "thin provisioning", meaning that you can create, say, a 100GB volume, but the volume will not take 100GB initially; it grows as you add data to it. This allows you to provision larger volumes just to have more capacity in the future. Having said that, with both Ceph and Longhorn you have storage classes that allow volume expansion (this just requires a restart of the workload using the volume after updating the size, so not a big deal; see the StorageClass sketch below).

    There are other options too. In open source there's also OpenEBS, which supports different storage engines. Pretty easy to use, though it used to be the slowest. This may have changed recently with the new Mayastor engine, but I haven't tested it. Other options: Robin, StorageOS, Portworx and more. Some of these are paid but have free tiers as well. Generally speaking, if you can afford it, Portworx is the absolute best storage solution for on-prem Kubernetes clusters.
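
    For reference, a Longhorn StorageClass with three replicas and volume expansion enabled might look like this; it's a sketch, and the parameter values are just reasonable defaults:

      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: longhorn-replicated
      provisioner: driver.longhorn.io
      allowVolumeExpansion: true      # enables the resize-then-restart flow mentioned above
      reclaimPolicy: Delete
      parameters:
        numberOfReplicas: "3"         # replicas spread across nodes
        staleReplicaTimeout: "2880"   # minutes before a failed replica is cleaned up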

    Monitoring: I usually use the kube-prometheus-stack Helm chart in my clusters. It's a nice bundle of Prometheus, Alertmanager and Grafana. On top of this I also add Robusta.dev for better notifications. A minimal values sketch follows.
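
    A minimal values sketch for the chart; the Slack webhook and channel are placeholders:

      # values.yaml, used as:
      # helm install monitoring prometheus-community/kube-prometheus-stack -f values.yaml
      alertmanager:
        config:
          route:
            receiver: slack
          receivers:
            - name: slack
              slack_configs:
                - api_url: https://hooks.slack.com/services/...   # placeholder webhook
                  channel: "#alerts"
      prometheus:
        prometheusSpec:
          retention: 15d
      grafana:
        adminPassword: change-me   # use a secret for this in real deployments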

    Logging: I prefer Loki because it's more lightweight than the alternatives and it's easy to install and use (querying logs is very easy).

    Backups: as I mentioned, Longhorn can back up volumes, but you'll likely want to back up whole applications too. For this there's Velero, which is free and open source and works really well. It can also back up volumes, but those backups are not "crash consistent" out of the box because they use Restic, so they are file-level backups. You can use a trick with fsfreeze to make crash-consistent backups, which works quite well if the volumes don't see a very high write load. If you want a better backup solution, I recommend Kasten (there's a free tier). It can even create crash-consistent backups with supported storage drivers, and Longhorn is among them. A hedged schedule example follows.
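
    For example, a nightly Velero schedule that includes volume data via Restic might look like this; the namespace is hypothetical, and newer Velero releases rename the flag to defaultVolumesToFsBackup:

      apiVersion: velero.io/v1
      kind: Schedule
      metadata:
        name: daily-production
        namespace: velero
      spec:
        schedule: "0 2 * * *"            # every night at 02:00
        template:
          includedNamespaces:
            - production                 # hypothetical namespace
          defaultVolumesToRestic: true   # file-level volume backups via Restic
          ttl: 720h                      # keep backups for 30 days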

    Certificates: this one is super easy. Just use cert-manager. It can provision certificates from Let's Encrypt very quickly with both the HTTP and DNS challenge methods, as in the sketch below.
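
    For instance, a Let's Encrypt ClusterIssuer using the HTTP-01 challenge through ingress-nginx; the email is a placeholder:

      apiVersion: cert-manager.io/v1
      kind: ClusterIssuer
      metadata:
        name: letsencrypt-prod
      spec:
        acme:
          server: https://acme-v02.api.letsencrypt.org/directory
          email: admin@example.com          # placeholder contact address
          privateKeySecretRef:
            name: letsencrypt-account-key   # where the ACME account key is stored
          solvers:
            - http01:
                ingress:
                  class: nginx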

    Secrets: HashiCorp Vault is a solid option.

    By "proxy" you probably mean ingress controller. I recommend ingress-nginx because it's the most documented of all of them, and just works. To expose the ingress to the outside world you would normally use a load balancer, which is provisioned automatically in cloud environments. On prem it's a different story of course, but if you can use MetalLB then you can provision load balancers there as well. This is the best solution if you can make it work with your environment because Kubernetes automatically keeps the load balancer configuration up to date with the active endpoints without requiring intervention from you. So you can replace nodes, move apps around etc, it's automatic. If MetalLB cannot work in your environment, then the easiest option is to use the ingress controller with host ports, and configure DNS to round-robin load balance the nodes. But load balancing based on DNS is not a good idea usually. A better option, which requires additional effort, is to set up an external load balancer such as HAproxy in high availability mode using something like keepalived. The problem with both of these solutions is that you need to keep the config updated if the IPs of the nodes change etc.

    I don't know what else to add, perhaps it's easier if you ask direct questions on topics you are not sure about. Happy to help :)

  • I forgot to mention that I built a tool - which you can find at https://github.com/vitobotta/hetzner-k3s - that lets you very quickly and easily create production-grade clusters in Hetzner Cloud. It supports highly available node pools across different regions, a highly available control plane, and even autoscaling, among other things. The reason I mention it is that many companies use my tool for their clusters, because out of the box you get:

    • provisioning of load balancers
    • provisioning of persistent volumes using block storage
    • support for autoscaling

    just like with managed Kubernetes services, but you save a ton of money by using my tool and Hetzner instead. One guy told me that his company was saving 25K (yes, thousands) per month after switching from Google Kubernetes Engine to my tool + Hetzner. So if you can use Hetzner, it's an option worth considering :) A hedged config sketch follows.
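
    A rough sketch of a hetzner-k3s cluster config; the field names follow the project's README around this time and may have changed, so check the repo for the current schema:

      # cluster_config.yaml, used as: hetzner-k3s create --config cluster_config.yaml
      hetzner_token: <your Hetzner Cloud API token>
      cluster_name: demo
      kubeconfig_path: "./kubeconfig"
      k3s_version: v1.26.4+k3s1
      public_ssh_key_path: "~/.ssh/id_rsa.pub"
      private_ssh_key_path: "~/.ssh/id_rsa"
      masters_pool:
        instance_type: cpx21
        instance_count: 3        # HA control plane
        location: nbg1
      worker_node_pools:
        - name: small
          instance_type: cpx31
          instance_count: 2
          location: hel1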

  • BTW, I wrote a little comparison of some storage solutions at https://vitobotta.com/2019/08/06/kubernetes-storage-openebs-rook-longhorn-storageos-robin-portworx/ if you're interested.

  • @vitobotta I think you've just made me want to give k8s another go. Great information, thank you

  • SplitIce Member, Host Rep

    Clarification: my inquiry relates only to Kubernetes cluster self-management. At this point I have 3+ years of experience porting customer applications and designing Kubernetes-based applications.

    None with managing Kubernetes clusters themselves (sizing, upgrade plans, etc.) or with the design choices behind Kubernetes components (CSI, etc.).

  • Highly recommend Rancher as well.

  • A related question: how does one manage the billing of Kubernetes clusters?
    We have a test setup deployed and managed with Rancher, but I don't see any tool to manage the billing.

  • SplitIce Member, Host Rep

    @vitobotta Have you experienced higher overheads with k3s vs k8s (e.g. etcd vs SQL-backed)? What about compatibility?

