Putting NVMe drive in raid 1 or 5?

For a web server, do I need to put NVMe drives in RAID? At minimum RAID 1? Or are they reliable enough not to need RAID at all?

Am I just asking for trouble by not using a RAID configuration?

Also, does software vs. hardware RAID matter for NVMe drives? Will a software RAID 1 of NVMe drives work just as well as hardware RAID? If yes, then what is the point of getting hardware RAID?

The NVMe drive would host the database files only.

I will do a separate daily backup, if that matters.

Comments

  • AC_Fan Member

    Software RAID 1/10, 5 if unavoidable.

  • PureVoltage Member, Patron Provider

    As said above, I would suggest RAID 1/10; RAID 5 isn't a good option. We did a 12x3.84TB NVMe setup for a client who wanted RAID 50, however performance would have been much better on RAID 10, granted with less usable disk space.

  • IonSwitch_Stan Member, Host Rep

    We did a 12x3.84TB NVMe setup

    Aaaaand that's how you saturate the PCIe bus. Well done.
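
    For readers wondering what "saturating the bus" means in numbers, here is a back-of-envelope sketch (Python, purely illustrative). The per-lane figure is the usual PCIe 3.0 number; the per-drive lane count and the platform lane budget are assumptions, not measurements of the setup above.

```python
# Rough lane/bandwidth math behind the "saturate the PCIe bus" remark.
# The platform lane budget below is an assumed example, shared with NICs/HBAs.
PCIE3_GBPS_PER_LANE = 0.985   # ~usable GB/s per PCIe 3.0 lane (8 GT/s, 128b/130b)
LANES_PER_NVME = 4            # each NVMe drive takes a x4 link
DRIVES = 12
PLATFORM_LANES = 48           # assumed CPU lane budget of a single socket

lanes_needed = DRIVES * LANES_PER_NVME
drive_ceiling = lanes_needed * PCIE3_GBPS_PER_LANE
platform_ceiling = PLATFORM_LANES * PCIE3_GBPS_PER_LANE

print(f"{DRIVES} drives want {lanes_needed} lanes (~{drive_ceiling:.0f} GB/s)")
print(f"the platform offers {PLATFORM_LANES} lanes (~{platform_ceiling:.0f} GB/s) "
      f"for everything, NICs included")
```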

  • NiFlex Member

    Raid 1 or 10 is the way to go.

  • Does anyone run no RAID at all, just daily backups or something?

  • rcxb Member

    @Bandai said:
    For a web server, do I need to put NVMe drives in RAID?

    If you don't want to suddenly lose all your data, yes.

    My last hard drive failure wasn't anything mechanical; the electronics just suddenly stopped working. How will you feel when that happens to your server SSD?

    The NVMe drive would host the database files only.

    You want RAID... or something similar, like continuous backups (every 5 minutes) or replicating all database updates in real-time to a second web server, etc. (a minimal backup-loop sketch follows at the end of this comment).

    Will a software RAID 1 of NVMe drives work just as well as hardware RAID?

    Software RAID is the higher-performance option with NVMe. Two NVMe drives are FASTER than any single (RAID) controller card. There aren't many hardware NVMe RAID cards out there. Look up Intel VROC... they're trying to sell CPU acceleration of software RAID for NVMe.

    If yes, then what is the point of getting hardware raid?

    People are accustomed to hardware RAID being better than software RAID for hard drives, and haven't caught up with the reality of NVMe.
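
    Purely to illustrate the "continuous backups" idea mentioned above, here is a minimal timed dump loop - a sketch, not a prescription. It assumes PostgreSQL with pg_dump on the PATH; the database name, target directory and interval are placeholders.

```python
# Minimal "dump every few minutes" loop, as one possible shape of the
# continuous-backup idea. Assumes PostgreSQL with pg_dump on the PATH;
# database name, directory and interval are placeholders.
import subprocess
import time
from datetime import datetime
from pathlib import Path

DB_NAME = "appdb"                      # placeholder database name
BACKUP_DIR = Path("/var/backups/db")   # placeholder destination
INTERVAL_SECONDS = 5 * 60              # "every 5 minutes"

def dump_once() -> Path:
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    target = BACKUP_DIR / f"{DB_NAME}-{datetime.now():%Y%m%d-%H%M%S}.dump"
    # -Fc writes the custom (compressed) format, restorable with pg_restore
    subprocess.run(["pg_dump", "-Fc", "-f", str(target), DB_NAME], check=True)
    return target

if __name__ == "__main__":
    while True:
        print("wrote", dump_once())
        time.sleep(INTERVAL_SECONDS)
```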

  • dedicatserver_ro Member, Host Rep
    edited July 2019

    rcxb said: There aren't many hardware NVMe RAID cards out there. Look up Intel VROC... they're trying to sell CPU acceleration of software RAID for NVMe

    With NVMe you can only do hardware RAID 0; anything else is a kind of software RAID, with no guarantee against data loss.
    Intel VROC is a mess.

  • rcxb Member

    @dedicatserver_ro said:
    With NVMe you can only do hardware RAID 0; anything else is a kind of software RAID, with no guarantee against data loss.

    RAID-0 on NVMe isn't a great idea either. It really doesn't double your performance like you'd expect, and it significantly increases latency, so RAID-0'd NVMe drives look worse for most tasks.

    Perhaps you meant RAID-1?

    I'm not recommending VROC; it's just a branded product that's easy to find and learn about for somebody who doesn't know the state of NVMe.

  • dedicatserver_ro Member, Host Rep

    @rcxb I meant RAID-0 is the only one you can do as "hardware RAID".
    I'm not recommending any kind of RAID on NVMe for now.

  • jsg Member, Resident Benchmarker

    Interesting things are being said here, e.g. that hardware RAID is not capable of handling NVMe, or that 5-minute backup cycles are about the same as RAID. Uhum. Plus, of course, all DBs are about the same, it seems ...
    Hint: a RAID controller using 8 PCIe lanes, and putting considerably less traffic on the bus, is virtually always faster and at the same time easier on the system than 2 NVMes each using 4 lanes.

    @Bandai

    First, understand what you want to achieve - and make sure that your premises are correct. Example: a database is quite different from a web server.

    As your focus seems to be the database, you might want to consider the difference between the data itself and the indexes; the latter are the smaller and by far the most relevant part wrt performance.
    Speaking of performance, keep in mind that RAM is far faster than even NVMe - which is why any good DB tries to keep at least a considerable part of its indexes in memory.
    So if your concern is performance, the best answer probably is "get lots of RAM" (a rough way to check whether your indexes fit in RAM is sketched at the end of this comment).

    The other point to know about and consider is RAID vs. backup. First rule: make backups. RAID does NOT replace backups! Those two are about different things. Except for RAID 0 (which usually is about speed), the purpose of RAID is usually to keep storage available (albeit often at reduced performance while a disk is failed). In a way, RAID is about keeping the system up and running and buying a bit of time to replace a defective or dead drive. One might say the headline for RAID is "continuity of operations".
    A backup, on the other hand, is about data, or more precisely about not losing any data no matter what.
    Those two things seem to overlap a lot, but to keep it simple, consider RAID a concern of the people in charge of the hardware, while data safety is a concern of the people using a server. "Holy rule": always make backups! Do not rely on RAID.

    Last but not least, keep your feet on the ground and ask yourself how much speed you really need and what the best way to achieve it is. Btw, whenever performance really is a concern, the first question to ask is whether you have enough RAM; in quite a few cases having more RAM is more important than having a faster processor.

    In case you insist on a simple answer to your basic question: RAID 1 is almost always faster than RAID 5. RAID 10 is even faster, but RAID 1 or 10 usually needs more disks for the same usable capacity, which can become quite a significant difference (and cost factor) for large storage volumes.
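
    A rough way to act on the "keep the indexes in RAM" advice above - a hedged sketch, assuming a local PostgreSQL instance reachable via psycopg2; the DSN and the 50% headroom threshold are placeholders, not recommendations.

```python
# Compare PostgreSQL's reported total index size with the host's physical RAM.
# DSN and the 50% headroom factor are placeholders for illustration only.
import os
import psycopg2

def total_index_bytes(dsn: str = "dbname=appdb") -> int:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # pg_indexes_size() reports the on-disk size of all indexes on a table
        cur.execute("""
            SELECT coalesce(sum(pg_indexes_size(c.oid)), 0)
            FROM pg_class c
            WHERE c.relkind = 'r'
        """)
        return int(cur.fetchone()[0])

def physical_ram_bytes() -> int:
    # Linux-only: page size times number of physical pages
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

if __name__ == "__main__":
    idx, ram = total_index_bytes(), physical_ram_bytes()
    print(f"indexes: {idx / 2**30:.1f} GiB, RAM: {ram / 2**30:.1f} GiB")
    print("indexes fit comfortably" if idx < ram * 0.5 else "consider more RAM")
```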

  • PureVoltage Member, Patron Provider

    IonSwitch_Stan said: Aaaaand that's how you saturate the PCIe bus. Well done.

    That's what the customer requested; we strongly suggested otherwise, but they're happy so far, pushing close to 10Gbps without any issues. :)

  • I bought three Threadripper 2920X's on Prime Day. Looking forward to NVMe Raid goodness.

  • Bandai Member

    @jsg very well written, thank you, I've learned a lot!

  • rcxb Member

    @jsg said:
    Interesting things are being said here, e.g. that hardware RAID is not capable of handling NVMe, or that 5-minute backup cycles are about the same as RAID. Uhum.

    High availability pairs of servers with continuous data replication are superior to RAID in every way.

  • beagle Member
    edited July 2019

    @rcxb said:
    High availability pairs of servers with continuous data replication are superior to RAID in every way.

    Geo-distributed setups are even better. You can keep adding as much redundancy as you want. Your budget is the limit. :wink:

  • Neoon Community Contributor, Veteran

    If you already went for NVMe, you decided for speed, so why would you want to cripple it with RAID 5?

  • jsg Member, Resident Benchmarker

    @rcxb said:
    High availability pairs of servers with continuous data replication are superior to RAID in every way.

    HA server pairs serve a different purpose. They are not "better" than Raid but in a way "Raid over whole systems". Fun fact: Usually their disks are Raid'ed.

  • FHR Member, Host Rep

    @jsg said:
    HA server pairs serve a different purpose. They are not "better" than Raid but in a way "Raid over whole systems". Fun fact: Usually their disks are Raid'ed.

    Exactly. They serve different purposes and very different markets.

    Not to mention that HA storage is almost never done in the app servers themselves, but rather in networked/clustered storage setups (think SANs, or dedicated storage servers running e.g. Ceph or Gluster). App servers then mount the storage over some kind of network (usually FC/Ethernet/IB), which is always slower than local RAID.

    Like everything, distributed storage has its benefits and tradeoffs - and it's certainly not some magical solution that will solve all your storage needs and problems.

  • rcxb Member
    edited July 2019

    @FHR said:
    Not to mention that HA storage is almost never done in the app servers themselves, but rather in networked/clustered storage setups (think SANs, or dedicated storage servers running e.g. Ceph or Gluster). App servers then mount the storage over some kind of network (usually FC/Ethernet/IB), which is always slower than local RAID.

    You have an astonishingly narrow view of the subject.

    I take it you've never heard of DRBD, vSAN, or Veeam CDP & Quick Migration?
    How about PostgreSQL Streaming Replication (the OP specifically said a database server, you know)... All of which can be as fast as RAID, depending on the configuration, with most workloads (a replication-lag check is sketched below).

    it's certainly not some magical solution that will solve all your storage needs and problems.

    Which of course I never said... It's a viable alternative to RAID, with additional advantages, in the case of highly reliable storage devices.
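
    Since PostgreSQL streaming replication came up and the OP's workload is a database, here is a hedged sketch of watching replica lag from the primary; it assumes PostgreSQL 10+ and psycopg2, and the connection string is a placeholder.

```python
# Report how far each streaming replica lags behind the primary, in bytes.
# pg_stat_replication and pg_wal_lsn_diff() exist in PostgreSQL 10+.
import psycopg2

def replication_lag(dsn: str = "dbname=appdb") -> list[tuple[str, int]]:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("""
            SELECT coalesce(client_addr::text, 'local'),
                   pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)::bigint
            FROM pg_stat_replication
        """)
        return cur.fetchall()

if __name__ == "__main__":
    for addr, lag_bytes in replication_lag():
        print(f"{addr}: {lag_bytes / 2**20:.1f} MiB behind")
```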

  • jsg Member, Resident Benchmarker
    edited July 2019

    @rcxb said:
    ... All of which can be as fast as RAID, depending on the configuration with most workloads.

    No, they can't. @FHR was right.

    Besides, you missed an important point. Raid is about device availability and managing an eventual sub-device failure (usually seen as redundancy by users). DRBD and the other items you mentioned are network based and hence come with higher latency. Plus they (except DRBD) are quite specifically tailored to some purpose (e.g. DB) or to some product (e.g. VMware).

    The best case (in your sense) is DRBD, which can be broken down (in terms of the time axis) into "same as local access" plus network latency - which is obviously slower than local access.

    Btw, I'm a big PostgreSQL fan, but I would not use its streaming replication, because handling shadowing in the application (core) is always better and offers more options; though I see that for some use cases (like hosting) PGSR is better suited. At the same time I'm wondering why you are putting database-centric HA into the same pot as DRBD and clustered FSes.

  • rcxb Member

    @jsg said:
    No, they can't. @FHR was right.

    Certainly they can. A rewrite of several blocks on disk can become a tiny undo log or similar, sent asynchronously over the network (so the latency is irrelevant). That can allow the primary/master to operate faster than RAID, at least in some cases.

    Besides, you missed an important point. Raid is about device availability

    I didn't miss it at all. OP said he just wants to protect his "database files". Depending on his exact situation, it can be done far better than just RAID, and just about as inexpensively.

  • ramnet Member, Host Rep
    edited July 2019

    Either RAID 1 or RAID 5 would be fine. A 3-drive RAID 5 will perform better than a 2-drive RAID 1 while yielding double the usable capacity with the same fault tolerance.

    If you have 4 drives, RAID 10 would perform much better than 1 or 5, with the same capacity and fault tolerance as the 3-drive RAID 5 (a quick comparison is sketched below).

    And yes, use software RAID.

    One thing to note with NVMe is that, unlike with SATA/SAS, hot-plugging devices is not universally supported yet. So if you are counting on using RAID with NVMe for high availability, you'll want to check that on your system.
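
    A small back-of-envelope helper for the capacity/fault-tolerance comparison above; the drive counts and the 2TB size are just the examples from this thread, and "fault tolerance" here means the minimum number of failures the array is guaranteed to survive.

```python
# Usable capacity and guaranteed fault tolerance for the RAID levels discussed.
def raid_summary(level: str, drives: int, size_tb: float) -> dict:
    if level == "raid1":                      # n-way mirror
        usable, survives = size_tb, drives - 1
    elif level == "raid5":                    # one drive's worth of parity
        usable, survives = (drives - 1) * size_tb, 1
    elif level == "raid10":                   # striped mirrors; worst case is
        usable, survives = (drives // 2) * size_tb, 1  # losing both halves of one pair
    else:
        raise ValueError(f"unsupported level: {level}")
    return {"level": level, "drives": drives,
            "usable_tb": usable, "min_failures_survived": survives}

for cfg in [("raid1", 2, 2.0), ("raid5", 3, 2.0), ("raid10", 4, 2.0)]:
    print(raid_summary(*cfg))
```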

  • jsg Member, Resident Benchmarker
    edited July 2019

    @rcxb

    It seems to me that you just have to be "right", no matter the facts. The facts, however, are that a local write is always faster than a remote write (given roughly the same machines, disks, etc.) because the latter can be considered a local write + a network transfer.
    But that may not even be the relevant point, because local storage is always a SPOF while storing data on two machines is not that kind of SPOF. Then again, that is relevant only from a certain perspective, because of the use case, the user's priorities, budget, etc.

    Plus, and more importantly, you are wrong (well, here in this case, not generally) because your perspective and your answer go far beyond what has been asked. RAID is a solution to the problem "what if the disk breaks down?" - and note that even there there are two perspectives, the provider's ("I need storage to be available") and the end user's ("I do not want my data to be destroyed") - and both are concerned with 1 system, which may be "the server" or "the storage unit", which may or may not be shared.

    Let me put it differently to make my point understood. There are companies explicitly providing some kind of remote store, be it for files or be it for DBs or whatever. And even they still need their storage hardware to stay available even if a disk breaks, so they use Raid themselves.

    But there are other aspects of availability too, and one shouldn't ignore them. When leading a software dev team I almost "terrorize" my people into avoiding any complexity that is not absolutely necessary. The simple reason: complexity is the worst enemy of safety and reliability, and one major factor in that equation is that complexity makes verifiability harder (as well as really and fully understanding what some mechanism does).

    Looking at users on VPSes and dedis, it's probably not unfair to say that the majority have plenty on their plate just properly configuring one system. Much of what you propose is far beyond what they understand and what they can sensibly and properly set up, configure, and maintain. So, does that add to availability, safety, redundancy, reliability? Certainly not.

  • Has anyone configured NVMe RAID on a Threadripper?

    I want to move my work PC over to a new Threadripper.

    I backed up my drive using Acronis.
    I added the RAID drivers to the Acronis MVP boot disk, and Disk Director sees the RAID drive. I restore the image, but it blue screens.

    So I put the original drive in, put the MB back into AHCI mode, boot up the original drive and install all the Threadripper MB drivers.

    I back up the original drive again and put the MB back into NVMe RAID mode. Again, Acronis can see the RAID volume and I can restore the image, but now when I go to boot, the BIOS won't let me select Windows Manager (AMD RAID) as a boot drive, even though it did the first time around. I can only assume it's a BIOS bug; tonight I'll reset it to defaults and try again. The RAID manager reports no issues with the array...

    Just curious if anyone else has played with Threadripper NVMe RAID and run into this as well.

  • cloudserver Member, Patron Provider

    @jsg I really appreciate all your input here; I think I am getting closer to the answers I am looking for.

    I am currently looking at this to hold 4 x 2TB NVMe drives, or two units to RAID 8 x 2TB NVMe drives: https://highpoint-tech.com/USA_new/series-ssd7101a-1-overview.htm

    What I have a hard time understanding is what my best option is, since NVMe is still kind of new. I am building a node to host KVM VPSes on it - do you suggest using a card like this? Or am I better off going a different direction? Thanks in advance for your input!

  • NDTN Member, Patron Provider, Top Host

    @cloudserver said:
    I am building a node to host KVM VPSes on it - do you suggest using a card like this? Or am I better off going a different direction?

    For NVMe drives you should use software RAID instead of hardware RAID. 4x2TB in software RAID will still give you very good performance.
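
    If the software-RAID route is taken, here is a minimal sketch of what that could look like on Linux with mdadm, wrapped in Python purely for illustration; the device names, array name and RAID level are assumptions - adjust for the real hardware and only run against blank disks.

```python
# Create a 4-drive software RAID 10 with mdadm and print its status.
# Device names and array name are placeholders; requires root and blank disks.
import subprocess

DEVICES = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]
ARRAY = "/dev/md0"

def create_raid10(array: str, devices: list[str]) -> None:
    # mdadm --create builds the array; --level=10 stripes across mirrored pairs
    subprocess.run(
        ["mdadm", "--create", array,
         "--level=10", f"--raid-devices={len(devices)}", *devices],
        check=True,
    )

def show_status() -> None:
    # /proc/mdstat reports member health and initial resync progress
    with open("/proc/mdstat") as f:
        print(f.read())

if __name__ == "__main__":
    create_raid10(ARRAY, DEVICES)
    show_status()
```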

  • jsg Member, Resident Benchmarker

    @cloudserver said:
    I am currently looking at this to hold 4 x 2TB NVMe drives, or two units to RAID 8 x 2TB NVMe drives: https://highpoint-tech.com/USA_new/series-ssd7101a-1-overview.htm

    What I have a hard time understanding is what my best option is, since NVMe is still kind of new. I am building a node to host KVM VPSes on it - do you suggest using a card like this? Or am I better off going a different direction? Thanks in advance for your input!

    I can't give a full and reliable answer because

    • I did not find any information on how that card does Raid
    • personal experience showed that Highpoint Raid controllers should be viewed with some mistrust. But the last time I encountered one was quite a few years ago; maybe things have changed.
    • I do not know how many VMs you are going to use that NVMe configuration for.
    • I do not know whether the KVMs will run a particular workload (e.g. DB) and if so what kind of workload.

    Also, it seems that that RAID card targets a certain clientele that has little to do with professional DC operations (Apple Mac, high-end video).
    Plus: except for some corner cases, the point in a DC is not to squeeze out the last drop of peak disk performance but rather to have a well-balanced system that performs well (but doesn't aim for speed records).

    Everything serves a purpose and nothing comes for free. Disks have a purpose too, and SerDes traffic (e.g. PCIe) doesn't come for free either. So I warn against going to extremes with one device group (e.g. disks), and particularly so when using devices I consider questionable (like some RAID controllers).
    Plus, and that's an important reminder, the speeds you are quoted are virtually always max speeds under certain (typically optimal) circumstances. Example: if I put a good RAID controller with, say, 2 GB of memory into a modern machine, then even old spindles can be said to achieve, say, 800 MB/s - IF my test size stays within 2 GB (the size of the cache memory). So one should look carefully and with some mistrust at numbers like "this NVMe does 4 GB/s" and especially at statements like "4 of those hence achieve 16 GB/s".
    The truth: NO they don't. For one, RAID 1 doesn't increase speed, and hence 4 of those NVMes in RAID 10 will do max 8 GB/s (the arithmetic is written out at the end of this comment). But even that's theory, because NVMes are not miracle devices.

    Basically one can look at NVMe as SSDs without the SATA 3 bottleneck (6 Gb/s). Explanation: flash storage can deliver a certain performance, something like 50 MB/s up to (ignoring exotics) 2 GB/s - and of course the faster, the higher the price. So disk/storage manufacturers like NVMe because, thanks to PCIe, they can get performance above ca. 550 MB/s out of fast flash devices.
    But: one should always assume that any SATA/SAS/PCIe flash device has some RAM cache built in, for two major reasons: (a) many use cases actually do profit from caches, and (b) it helps sales by making the devices look faster than they actually are under "continuous" (read: non-optimal) load.

    Short version: You might try to hunt a unicorn that quickly turns into an ordinary pig under load.

    You must start at the top, and your first question should be "what is the workload of those KVMs?". Your next thought should be IO, not throughput. Reason: VMs are actually worst-case scenarios for storage, because there usually is no pattern and no consistent usage type. One VM might do billions of small random-access writes all over the place while another might constantly thrash the cache with very large files, etc.
    So you must ask what real-world performance level you want to achieve across the workloads of all those KVMs. The answer in a professional setting is almost invariably to get devices with good, consistent performance - in other words, to care less about max performance numbers and more about good minimum performance. But of course getting SLC drives is much more costly than getting stacked TLC with some RAM cache thrown in.

    I'll close with a "secret": throw RAM at problems. My experience has often shown that throwing RAM at a problem is a cheaper and faster solution than faster processors, faster NVMes, etc. One caveat though: RAM is volatile, so you'll lose data if a machine crashes. Which leads us back to your controller question: I wouldn't buy it, because it doesn't have a battery-backed cache.
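
    The arithmetic behind the "max 8 GB/s" remark above, written out as a tiny sketch; the per-drive figure is an assumed marketing-style peak, and sustained numbers will be lower, as argued in the comment.

```python
# Write ceiling of a RAID 10 built from mirrored pairs: every write lands on
# both halves of a mirror, so only half the drives add unique write bandwidth.
PER_DRIVE_GBPS = 4.0   # assumed peak sequential speed of a single NVMe
DRIVES = 4

mirror_pairs = DRIVES // 2
write_ceiling = mirror_pairs * PER_DRIVE_GBPS   # 2 * 4 GB/s = 8 GB/s, at best

print(f"{DRIVES} x {PER_DRIVE_GBPS:.0f} GB/s NVMe in RAID 10: "
      f"write ceiling ~ {write_ceiling:.0f} GB/s (theoretical peak)")
```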

  • cloudserver Member, Patron Provider

    @jsg thank you very much for taking the time to post your response, it was certainly helpful!
