Building a new Cloud for Best Cost/Performance Ratio

randvegeta Member, Host Rep

Considering setting up another cloud, and looking for opinions on which option would be best in terms of cost and performance.

Generally speaking, the VMs running on this cloud will not be intended for the LET/LEB market, so the budget may be a little more flexible than if we were catering to such low-cost VMs.

We have an existing Virtuozzo cloud, mainly running on E5 v1 and v2 CPUs, with distributed storage over a 10G network and SSD caching. The performance is alright but not stellar. Looking at overall CPU usage, it doesn't seem to be a bottleneck for us; usage generally remains low. Performance of the cloud storage seems to be about as good as directly attached desktop HDDs on dedicated servers.

Does anyone have experience with Virtuozzo and other cloud platforms (with distributed storage)? How do they compare?

Are the new E5 v4 CPUs really worth it? They have more cores, sure, but given sub-20% average utilization, can they really provide any significant performance increase? In theory I could squeeze more VMs/CTs onto a single node with more CPU, but the bottleneck would then probably become the storage.

I imagine that faster disks would boost performance more than CPU or RAM. But we are already doing SSD read and write caching, so the only thing I can think of that would speed up the disks would be to go full SSD. Does anyone have experience running a pure SSD storage cluster? What kind of performance do you get over HDD + SSD cache? And what kind of SSDs do you use? Given the distributed nature of cloud storage, is it worth using all enterprise SSDs, or can larger-capacity consumer-grade disks be considered? Right now all our disks are enterprise grade, but HDDs plus a few caching SSDs cost a fraction of what an equivalently sized full-SSD cluster would. Even using consumer SSDs (Crucial MX300, for example) would still cost at least 7x more than using enterprise HDDs. Would the performance improvement be sufficient (bear in mind I'm already using SSDs for caching)?
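That 7x figure is easy to sanity-check with a quick back-of-envelope script. All per-TB prices below are illustrative assumptions, not real quotes, and the replication factor of 3 matches the cluster described here:

```python
# Back-of-envelope media cost for a replicated storage cluster.
# All per-TB prices are illustrative assumptions, not real quotes.

def cluster_cost(usable_tb, price_per_tb, replication=3):
    """Total media cost to provide `usable_tb` of sellable storage
    when every byte is stored `replication` times."""
    return usable_tb * replication * price_per_tb

usable = 100  # TB of storage actually offered to customers

# Assumed street prices per raw TB (hypothetical):
tiers = [
    ("HDD + SSD cache", 30),   # enterprise HDD, caching SSDs amortised in
    ("consumer SSD", 220),     # e.g. consumer SATA SSD class
    ("enterprise SSD", 450),   # enterprise SATA/SAS SSD class
]

baseline = tiers[0][1]
for label, price in tiers:
    cost = cluster_cost(usable, price)
    print(f"{label:16s} ~${cost:,.0f}  ({price / baseline:.1f}x the HDD baseline)")
```

With these assumed prices the consumer-SSD tier comes out around 7.3x the HDD baseline, which lines up with the "at least 7x" estimate above.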

Or is it really worth it just going all out and getting the top end E5s, loaded with SSDs and enormous amounts of DDR4 RAM?

Comments

  • doghouch Member
    edited May 2017

    @randvegeta If you truly want fast storage, follow @VMHaus' route. He uses NVMe storage (it is wicked fast, by the way) and decent hardware in terms of CPU and so on. The only issue is that NVMe storage is (really) not cheap (I mean it).

    You could grab a bunch of SSDs, throw them in a RAID10 array as a NAS, and connect servers to it for extra storage, while keeping smaller amounts of NVMe storage for the main server.

  • flatland_spider

    Why are you not thinking about upgrading your network?

  • randvegeta Member, Host Rep

    doghouch said: You could grab a bunch of SSDs, throw them in a RAID10 array as a NAS, and connect servers to it for extra storage, while keeping smaller amounts of NVMe storage for the main server.

    This is an option but I'm looking for distributed storage for redundancy. RAID isn't redundant enough unless I'm doing a single node simple VPS offering. Which I suppose I could do, but I would prefer high availability.

    At the moment, our cloud storage wastes tons of disk space. We have 3x replication, so for every 1GB of data we store, we use 3GB of storage in the cluster. NVMe sounds good, but for our cloud storage that would mean 3x the cost. That's a no-go then.
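    The replication overhead above is just a multiplier, and it is worth comparing what the same raw disk would yield under erasure coding, which some distributed stores offer as an alternative (whether it is available here depends on the platform). A minimal sketch of the arithmetic:

```python
# Usable vs raw capacity for n-way replication, and (for comparison)
# a k+m erasure-coding scheme. Whether EC is available depends on the
# storage platform; the arithmetic below is generic.

def usable_replicated(raw_gb, replicas=3):
    # Every logical GB consumes `replicas` GB of raw space.
    return raw_gb / replicas

def usable_erasure_coded(raw_gb, k=3, m=2):
    # k data fragments + m parity fragments per stripe:
    # the overhead factor is (k + m) / k instead of `replicas`.
    return raw_gb * k / (k + m)

raw = 3000  # GB of raw disk across the cluster
print(usable_replicated(raw))      # 1000.0 GB usable at 3x replication
print(usable_erasure_coded(raw))   # 1800.0 GB usable at 3+2 EC
```

    At the same raw cost, 3+2 erasure coding would nearly double the usable space versus 3x replication, at the price of extra CPU and rebuild traffic.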

  • randvegeta Member, Host Rep

    flatland_spider said: Why are you not thinking about upgrading your network?

    Upgrade to what?

  • raindog308 Administrator, Veteran

    By the time you finish coding expected cloud features - snapshots, resizable instances, hourly billing, movable block storage, floating IPs, load balancers, firewall appliances, and the rest of Amazon's API, E5s will have been replaced.

    OTOH if you're just going to be yet another OpenVZ/Virtuozzo VM peddler calling yourself "cloud"...

  • randvegeta Member, Host Rep

    raindog308 said: By the time you finish coding expected cloud features

    Not sure I'm following you here. I already have a 'cloud' platform. And by cloud, I mean distributed / high availability / scalable system.

    So right now our platform uses Virtuozzo VMs (we don't really do containers like OpenVZ because no one wants them). The storage is shared/distributed and there is high availability in place. If a node goes down, the VM is automatically migrated to another node and brought back online. Storage is distributed among all nodes with 3 copies throughout the cluster, allowing for up to 2/3 of your disks to fail without any data loss at all, or 1/3 of your disks to fail without any downtime.

    Such a system allows for very restful nights, so I'm not too keen on the idea of standalone, single-node virtualisation. I'm sure if I built a totally new system today, the hardware would probably go 5+ years without any issue, but if that node ever does fail, or if we need to migrate all the VMs off of it, it would likely be a huge pain in the arse. I base this on the fact that we have dozens of half-used Xen, Hyper-V, Proxmox and VMware nodes that were set up and put into production years ago. Consolidation and optimisation are hard to do on a live node, so it's a long-term concern.

  • flatland_spider

    @randvegeta said:
    Upgrade to what?

    Something greater than 10Gig on the storage side.

  • raindog308 Administrator, Veteran

    @randvegeta said:

    raindog308 said: By the time you finish coding expected cloud features

    Not sure I'm following you here. I already have a 'cloud' platform. And by cloud, I mean distributed / high availability / scalable system.

    So right now our platform uses Virtuozzo VMs (we don't really do containers like OpenVZ because no one wants them).

    You're doing Virtuozzo without containers...?

    The storage is shared/distributed and there is high availability in place. If a node goes down, the VM is automatically migrated to another node and brought back online. Storage is distributed among all nodes with 3 copies throughout the cluster, allowing for up to 2/3 of your disks to fail without any data loss at all, or 1/3 of your disks to fail without any downtime.

    So you have high availability clustering. There isn't a consensus definition of "cloud", but I wouldn't call what you're doing a cloud; I'd call it clustering. If you want to call it a cloud, no reason you can't, but if you market it that way, you'll invite comparisons to cloud providers.

  • randvegeta Member, Host Rep

    @flatland_spider said:

    @randvegeta said:
    Upgrade to what?


    Something greater than 10Gig on the storage side.

    I can upgrade to 20G quite easily. My 10G switch handles NIC teaming, and my nodes all have dual 10G. But with that being said, I never see the 10G network being maxed out... EVER. Not even during a rebuild. Heck, it would probably do fine on 1G most of the time. It could be upgraded, but I'm not sure it would be worthwhile.
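    For what it's worth, a 10G link that never saturates is consistent with a quick capacity estimate. The client write rate below is an illustrative assumption; with 3 replicas, each write a node accepts has to be shipped to 2 other nodes:

```python
# Sanity check: replication traffic generated by a write workload
# vs the capacity of a single 10G link. The client write rate is
# an illustrative assumption.

link_gbps = 10
link_mb_s = link_gbps * 1000 / 8          # ~1250 MB/s at most

write_mb_s = 200                          # assumed client writes on one node
replicas = 3
# Each locally accepted write is shipped to (replicas - 1) peers:
replication_traffic = write_mb_s * (replicas - 1)

print(f"link capacity:       {link_mb_s:.0f} MB/s")
print(f"replication traffic: {replication_traffic} MB/s "
      f"({replication_traffic / link_mb_s:.0%} of one 10G link)")
```

    Even a sustained 200 MB/s of client writes would use only about a third of one 10G link for replication, so the network only matters during rebuilds or much heavier write loads.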

  • teamacc Member

    My experience with Scaleway (which also uses some sort of SAN) is that storage latencies were way too high to be SSD-like. Ever since, I've tried to avoid anything that's not local SSD.

  • randvegeta Member, Host Rep

    teamacc said: My experience with Scaleway (which also uses some sort of SAN) is that storage latencies were way too high to be SSD-like. Ever since, I've tried to avoid anything that's not local SSD.

    The disks are also 'local', and there is an SSD cache. But the data gets replicated and distributed.

    So every VM node is also a storage node, and much of the data will be written to local disks anyway, but will also be copied to disks on other nodes. And each node holds copies of data from other machines too.
