
Ways to improve performance with drives in RAID 10


Comments

  • @risharde said:

    @vitobotta said:

    @host_c said:
    Please give me an idea of the slow speed.

    A yabs, disk test only

    fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md3):
    ---------------------------------
    Block Size | 4k            (IOPS) | 64k           (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 849.00 KB/s    (212) | 12.25 MB/s     (191)
    Write      | 887.00 KB/s    (221) | 12.85 MB/s     (200)
    Total      | 1.73 MB/s      (433) | 25.11 MB/s     (391)
               |                      |
    Block Size | 512k          (IOPS) | 1m            (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 48.28 MB/s      (94) | 50.88 MB/s      (49)
    Write      | 50.75 MB/s      (99) | 54.51 MB/s      (53)
    Total      | 99.04 MB/s     (193) | 105.40 MB/s    (102)
    

    @HostEONS said:

    @vitobotta said:
    I have a server with 4 regular hard drives in a RAID 10 configuration. I chose RAID 10 because I thought I would get better performance than with a single drive, plus some redundancy at the same time, but write speeds are quite poor. Are there any settings or something in Linux that can help improve disk performance with this configuration? It's software RAID btw. Thanks!

    You can disable the bitmap:

    mdadm --grow --bitmap=none /dev/md

    If you disable it, rebuilding the RAID will be slower, but overall performance will improve.

    This is exactly the kind of thing I was looking for! There's some improvement in the numbers:

    fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md3):
    ---------------------------------
    Block Size | 4k            (IOPS) | 64k           (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 1.26 MB/s      (316) | 16.49 MB/s     (257)
    Write      | 1.30 MB/s      (325) | 17.01 MB/s     (265)
    Total      | 2.56 MB/s      (641) | 33.51 MB/s     (522)
               |                      |
    Block Size | 512k          (IOPS) | 1m            (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 67.92 MB/s     (132) | 74.54 MB/s      (72)
    Write      | 71.53 MB/s     (139) | 79.50 MB/s      (77)
    Total      | 139.46 MB/s    (271) | 154.04 MB/s    (149)
    

    And the app (Nextcloud) seems somewhat more responsive now.

    Are there any other magical settings like this that can help further?

    I experienced some slow performance as well on OVH SATA disks with software RAID 1, but I don't think it was as slow as what you are reporting. Sorry, I don't have the numbers on hand. If I get a chance I will run it so you can compare, but that server is going to expire in a few days and my kid is likely not going to allow me to use my computer to get you these numbers this evening. I moved to SSDs in my case, but I understand why you might not, since you're likely looking for space over bang for buck.

    Yeah it's just 21e VAT included for the whole server so it's nice price-wise.

    Would be curious to see your benchmark, although I have RAID 10 not 1.
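
    For reference, a minimal sketch of the bitmap tuning discussed above, assuming the array is /dev/md3 as in the yabs output (adjust the device name to match /proc/mdstat):

    # Check whether the array currently has a write-intent bitmap
    cat /proc/mdstat
    mdadm --detail /dev/md3 | grep -i bitmap

    # Option 1: drop the bitmap entirely (faster writes, slower resync after an unclean shutdown)
    mdadm --grow --bitmap=none /dev/md3

    # Option 2: keep an internal bitmap but with a much larger chunk, which cuts most of the
    # bitmap update overhead while still allowing fast resyncs after a crash
    mdadm --grow --bitmap=none /dev/md3
    mdadm --grow --bitmap=internal --bitmap-chunk=512M /dev/md3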

  • itsdeadjim Member
    edited December 2023

    zfs may help here, especially if you have some spare ram and data is compressible (compress/decompress is faster than I/O)

    https://wiki.debian.org/ZFS
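
    A rough sketch of what the ZFS equivalent of this RAID 10 could look like, assuming zfsutils-linux from the Debian wiki page above is installed and using hypothetical device names sda-sdd (this destroys whatever is on the drives):

    # Two mirrored pairs striped together = ZFS's equivalent of RAID 10
    zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

    # Enable zstd compression and, later, check how well the data compresses
    zfs set compression=zstd tank
    zfs get compressratio tank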

  • @vitobotta said:

    @host_c said:
    I really do not like your values, but I never used mdadm.

    It is like your HDD cache is disabled or you have 5400 RPM drives; I'm not sure I could get that low even on those.

    Not even raid 5/6 is that slow, and that has overhead for parity calculation.

    I would investigate this further.

    In a raid 10 of 4 drives:

    write speed is N-2
    read speed is N

    N = number of total drives.

    I wonder if the drives are oldish or something like that? This is an OVH dedicated server I bought for Nextcloud only due to the amount of storage for the price (21e/mo for 4 x 2TB of storage). The app seems to work "decently" well but it would be nice if I could improve the performance further without rebuilding the array. Is there any diagnostic tool or something like that I could use to investigate if there is a performance bottleneck?

    21e for 4tb usable storage isn't even that great honestly, https://www.hetzner.com/storage/storage-share is 17e for 5tb storage (while being fully managed)

  • @vitobotta said:

    @HostSlick said:

    Could only guess that it's some older HD, or one with many hours on it, that is going slow and causing a bottleneck for the whole array. Try checking with SMART.

    As it's OVH I guess it has enough hours. But yeah, not many ways to improve it with softraid.

    Ouch... I checked with smartmontools and all the drives have a lot of metrics flagged with "Old_age" and others "Pre-fail"

    These are just describing the category of the metric. It doesn't mean the metric is actually bad yet.

    What are the model numbers of the drives? smartctl should be able to tell you that. From that, you can find the cache size and whether it's 5400, 7200, or 15000 RPM.

    What I'd try is wipe all drives then benchmark each one individually.
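
    A non-destructive way to sanity-check the individual drives before wiping anything, assuming hypothetical member names sda-sdd (check /proc/mdstat for the real ones):

    # Model, RPM, cache size and SMART health per drive
    smartctl -i /dev/sda
    smartctl -H /dev/sda

    # Read-only benchmarks against the raw device (no writes, so safe on a live array member)
    fio --name=seqread --filename=/dev/sda --readonly --direct=1 --rw=read --bs=1M --runtime=30 --time_based
    fio --name=randread --filename=/dev/sda --readonly --direct=1 --rw=randread --bs=4k --iodepth=16 --ioengine=libaio --runtime=30 --time_based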

  • @fluffernutter said:

    @vitobotta said:

    @host_c said:
    I really do not like your values, but I never used mdadm.

    It is like your HDD cache is disabled or you have 5400 RPM drives; I'm not sure I could get that low even on those.

    Not even raid 5/6 is that slow, and that has overhead for parity calculation.

    I would investigate this further.

    In a raid 10 of 4 drives:

    write speed is N-2
    read speed is N

    N = number of total drives.

    I wonder if the drives are oldish or something like that? This is an OVH dedicated server I bought for Nextcloud only due to the amount of storage for the price (21e/mo for 4 x 2TB of storage). The app seems to work "decently" well but it would be nice if I could improve the performance further without rebuilding the array. Is there any diagnostic tool or something like that I could use to investigate if there is a performance bottleneck?

    21e for 4tb usable storage isn't even that great honestly, https://www.hetzner.com/storage/storage-share is 17e for 5tb storage (while being fully managed)

    Yeah but with this I have a server too, not just the storage, so it can run Nextcloud independently

    @Daniel15 said:

    @vitobotta said:

    @HostSlick said:

    Could only guess that it's some older HD, or one with many hours on it, that is going slow and causing a bottleneck for the whole array. Try checking with SMART.

    As it's OVH I guess it has enough hours. But yeah, not many ways to improve it with softraid.

    Ouch... I checked with smartmontools and all the drives have a lot of metrics flagged with "Old_age" and others "Pre-fail"

    These are just describing the category of the metric. It doesn't mean the metric is actually bad yet.

    What are the model numbers of the drives? smartctl should be able to tell you that. From that, you can find the cache size and whether it's 5400, 7200, or 15000 RPM.

    What I'd try is wipe all drives then benchmark each one individually.

    The drives are HGST Ultrastar 7K6000 with 7200 rpm

  • jar Patron Provider, Top Host, Veteran
    edited December 2023

    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    Seriously though I don’t think I’m going to do hardware RAID on any new server ever again. I’ll shout from the mountain tops for everyone else to avoid them where they can, performance be damned. They cause more damage when they fail, and you have to keep compatible spares on hand. I always thought HW RAID10 failures were a lie and everyone who claimed them was a liar covering for their own failures. Oh to be young again. Sorry that was unrelated 🤣

  • @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    I bought the server yesterday, so that's when the raid was set up with the OVH control panel and there is no syncing in progress

  • jar Patron Provider, Top Host, Veteran

    @vitobotta said:

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    I bought the server yesterday, so that's when the raid was set up with the OVH control panel and there is no syncing in progress

    Interesting. Well, anything above 4k blocks in your tests was within what I would call fine for a single-tenant server, at least. Not amazing, just “fine.”

  • For 4k read/write tests in fio, the disk itself is most probably not the bottleneck.

    So considering this is a Kimsufi KS-9, I do not think you can go faster than this with an mdadm array + ext4 on this machine. So yeah, these speeds are fine.

  • crunchbits Member, Patron Provider, Top Host
    edited December 2023

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    Seriously though I don’t think I’m going to do hardware RAID on any new server ever again. I’ll shout from the mountain tops for everyone else to avoid them where they can, performance be damned. They cause more damage when they fail, and you have to keep compatible spares on hand. I always thought HW RAID10 failures were a lie and everyone who claimed them was a liar covering for their own failures. Oh to be young again. Sorry that was unrelated 🤣

    Software RAID is the way to go in [current year]. Or a zfs/ceph solution--which we'll just chalk up to software. This is a relatively recent 'switch' in operations, though. HW won't necessarily be bad (performance-wise), but basically having to keep an exact replica of the controller on hand (and hope it works), plus the fact that they seem less recoverable/transplantable than mdadm (or similar), means I'd rather run mdadm and throw some SSDs at it as a dedicated cache layer to make up for/exceed the performance that way.

    @vitobotta said:

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    I bought the server yesterday, so that's when the raid was set up with the OVH control panel and there is no syncing in progress

    Yeah, your mdstat shows all synced. It's hard to say specifically with HDDs, because whether data is written further out on the platter or closer to the spindle also correlates strongly with synthetic benchmark speed. That being said, I'd expect a bit better out of those drives in RAID-10, but I'm not sure they're way out of line. Real-world experience probably won't change over 20-40 MB/s, so I'd just use it :). If I were running YABS on something like that, it wouldn't also be running OS/system overhead, so that sort of skews my expectations, though.

  • @jar said:

    @vitobotta said:

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    I bought the server yesterday, so that's when the raid was set up with the OVH control panel and there is no syncing in progress

    Interesting. Well, anything above 4k blocks in your tests was within what I would call fine for a single-tenant server, at least. Not amazing, just “fine.”

    I can live with it. I did some more tests and while not crazy fast, Nextcloud seems to work OK and syncing especially is working fine.

    @itsdeadjim said:
    For 4k read/write tests in fio, the disk itself is most probably not the bottleneck.

    So considering this is a Kimsufi KS-9, I do not think you can go faster than this with an mdadm array + ext4 on this machine. So yeah, these speeds are fine.

    Yeah I think it was the KS-9

    @crunchbits said:

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    Seriously though I don’t think I’m going to do hardware RAID on any new server ever again. I’ll shout from the mountain tops for everyone else to avoid them where they can, performance be damned. They cause more damage when they fail, and you have to keep compatible spares on hand. I always thought HW RAID10 failures were a lie and everyone who claimed them was a liar covering for their own failures. Oh to be young again. Sorry that was unrelated 🤣

    Software RAID is the way to go in [current year]. Or a zfs/ceph solution--which we'll just chalk up to software. This is a relatively recent 'switch' in operations, though. HW won't necessarily be bad (performance-wise), but basically having to keep an exact replica of the controller on hand (and hope it works), plus the fact that they seem less recoverable/transplantable than mdadm (or similar), means I'd rather run mdadm and throw some SSDs at it as a dedicated cache layer to make up for/exceed the performance that way.

    @vitobotta said:

    @jar said:
    How long have you had the array running? It may just be syncing still. Sorry if I missed someone saying that already.

    I bought the server yesterday, so that's when the raid was set up with the OVH control panel and there is no syncing in progress

    Yeah, your mdstat shows all synced. It's hard to say specifically with HDDs, because whether data is written further out on the platter or closer to the spindle also correlates strongly with synthetic benchmark speed. That being said, I'd expect a bit better out of those drives in RAID-10, but I'm not sure they're way out of line. Real-world experience probably won't change over 20-40 MB/s, so I'd just use it :). If I were running YABS on something like that, it wouldn't also be running OS/system overhead, so that sort of skews my expectations, though.

    Thanks, 20-40 MB/s doesn't sound like a big difference. Somehow I thought that these disks would be enterprise disks that could do more than 200 MB/s sequential and perform decently in random ops, but I guess I was expecting too much :p

  • Daniel15 Veteran
    edited December 2023

    You'll have a better experience if you can add two SSDs in a separate RAID1 or ZFS mirror (or a single one if you like living on the edge), and use that for the OS and all your apps. Keep the slow spinning disks only for large data. Given how cheap SSDs are at the moment, I doubt it'd cost much more. Your host might let you pay a once-off fee to add SSDs.

    Adding more RAM can help a lot too, especially if you use ZFS. Any RAM that is unused by apps on your server will be used for caching files, so more RAM = more caching. ZFS can also use an SSD for caching.
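
    If an SSD does get added later, the ZFS caching mentioned here is a one-liner; the pool name "tank" and the device name below are assumptions:

    # Add an SSD partition as an L2ARC read cache to an existing pool
    zpool add tank cache /dev/sde1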

  • jackb Member, Host Rep
    edited December 2023

    @vitobotta said:

    @darkimmortal said:
    Perf looks fine to me, don’t forget YABS tests read/write simultaneously so HDDs will appear significantly slower than when read/write are tested separately

    Any better benchmark?

    Well, if you want to...

    @jar said:
    But if you need help destroying a RAID10 array let me know.

    You can enable writeback cache without a BBU.

    hdparm -W 1 /dev/sdX

    Pros: good benchmark

    Cons: you lose the array when you lose power.

    If you like to live dangerously, you could alternatively use a ramdisk as a writeback cache with lvmcache or bcache.
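
    Purely as an illustration of that ramdisk idea with lvmcache (very much in the "you lose the array when you lose power" category), assuming the data already sits on an LVM logical volume vg0/data, which is not the case on this particular server:

    # Create a 4 GB ramdisk block device (rd_size is in KiB)
    modprobe brd rd_nr=1 rd_size=4194304

    # Turn the ramdisk into a PV, add it to the VG and carve out a cache volume on it
    pvcreate /dev/ram0
    vgextend vg0 /dev/ram0
    lvcreate -n datacache -L 3G vg0 /dev/ram0

    # Attach it as a writeback cache in front of the data LV
    lvconvert --type cache --cachevol datacache --cachemode writeback vg0/data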

  • Looks like writeback cache was already enabled

  • Short stroke. 2TB drives: make each partition 6.25% of the space (125GB partitions); if you have 2 drives that's 250GB. What happens to your IOPS then? Especially on the 1st partition?
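
    For anyone curious, the short-stroking experiment above would look something like this on a blank, non-array drive (destructive; the device name is hypothetical):

    # Put a single 125 GB partition at the very start of the disk, where the platters are fastest
    parted -s /dev/sdX mklabel gpt
    parted -s /dev/sdX mkpart primary 1MiB 6%

    # Then benchmark just that outer region
    fio --name=shortstroke --filename=/dev/sdX1 --direct=1 --rw=randread --bs=4k --iodepth=16 --ioengine=libaio --runtime=30 --time_based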

  • I found what was causing a slowdown with Nextcloud :D

    When I set up a server I use LVM if I can, so I can take consistent snapshots of the data directories (in this case Postgres and Redis) in addition to regular database dumps. That way, in case of recovery, I can restore the very latest data if possible (as opposed to backups, which are done periodically and so might not be as fresh).

    When I set up the OVH server I used their control panel to install the OS, and this didn't configure LVM. So rather than installing manually from rescue mode, I thought about using a loop device as a quick and dirty alternative, so that I could use fsfreeze to periodically freeze the loop device and take crash-consistent backups just like LVM snapshots, at the expense of some writes being delayed a little while doing this. I have done this before on servers with NVMe drives and it worked pretty well, with no performance issues at all. Freezing + rsyncing the data directories to another location + unfreezing the loop device was so fast that it seemed like nothing was happening.

    However, with this server that has HDDs instead of SSDs, it seems there is a significant performance penalty for using loop devices on top of the regular filesystem, slowing down writes especially.

    I just moved the data directories to the regular filesystem and removed the loop device, and after doing that Nextcloud now flies, with no difference at all compared to when it was running on the server with NVMe drives :D

    The actual benchmarks with yabs are not much better than before, but the I/O wait has gone down dramatically and there is indeed a difference in responsiveness. I only thought of removing the loop device when I was investigating the high I/O wait, was running out of ideas, and Google wasn't helping. I'm glad I tried it. What a difference!

    It seems that with the number of users I have (growing to around 20 I think) even a setup with HDDs in RAID is just fine.

    The question I have now is: why does a loop device cause such high I/O wait, with a noticeable write performance penalty? I am still Googling for a possible explanation, but if anyone knows I would love to hear it.
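
    For context, the freeze-and-copy approach described above boils down to something like this (paths are hypothetical; it assumes the loop-backed filesystem is mounted at /mnt/appdata):

    #!/bin/sh
    # Take a crash-consistent copy of the data directories without stopping the services
    set -e
    fsfreeze -f /mnt/appdata                      # block new writes, flush what's in flight
    trap 'fsfreeze -u /mnt/appdata' EXIT          # always unfreeze, even if rsync fails
    rsync -a --delete /mnt/appdata/ /backup/appdata/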

  • stefeman Member
    edited January 1

    ZFS is the way. Also enable zstd compression with it.

    You should see pretty nice performance even with spinning rust.

    zpools offer parity options too, so you can ditch the RAID10.
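
    For reference, a minimal sketch of one of those parity options (RAIDZ2, which survives two drive failures), using the same hypothetical four drives; with only four disks the usable capacity ends up the same as mirrors, but random IOPS will be lower:

    zpool create -o ashift=12 tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
    zfs set compression=zstd tank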

  • host_c Member, Patron Provider

    Just don’t do raid 5 or z1

    Google it; there is an almost two-decade-old paper on why.

  • @vitobotta said: The question I have now is: why does a loop device cause such high I/O wait, with a noticeable write performance penalty? I am still Googling for a possible explanation, but if anyone knows I would love to hear it.

    You will find many reasons on Google why a loop device won't play nice. The surprise would be if it didn't cause I/O wait.

    I guess in this case it's the synchronous writes and the inefficient use of the cache, which make the disk latency unbearable.
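
    One knob that is sometimes suggested for exactly that double-caching problem is opening the loop device with direct I/O, so data isn't buffered once for the backing file and again for the filesystem inside it; the file and device names below are assumptions:

    # Attach a backing file with direct I/O enabled
    losetup --direct-io=on --find --show /var/data.img

    # Or flip it on for an already attached loop device
    losetup --direct-io=on /dev/loop0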

  • @host_c said:
    Just don’t do raid 5 or z1

    Google it; there is an almost two-decade-old paper on why.

    That's mostly for big drives; for 1-2TB drives Z1 is fine since rebuilds aren't as brutal.

  • @fluffernutter said:

    @host_c said:
    Just don’t do raid 5 or z1

    Google it; there is an almost two-decade-old paper on why.

    That's mostly for big drives; for 1-2TB drives Z1 is fine since rebuilds aren't as brutal.

    This.

  • raza19 Veteran

    @CalmDown said:
    No offense, but maybe it's time to open one thread where you could ask your 101 questions and also share some goods?

    Please stop trolling vitobotta; he always adds knowledgeable insights to this forum. If you are so bothered by his threads then kindly don't open or comment on them. It's time to calm down, @CalmDown.

  • raza19 Veteran

    Seriously though I don’t think I’m going to do hardware RAID on any new server ever again. I’ll shout from the mountain tops for everyone else to avoid them where they can, performance be damned. They cause more damage when they fail, and you have to keep compatible spares on hand. I always thought HW RAID10 failures were a lie and everyone who claimed them was a liar covering for their own failures. Oh to be young again. Sorry that was unrelated 🤣

    @jar for a new setup I have been planning a RAID10 in production and a much cheaper RAID0 with a smaller number of drives as incremental backup: isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup? If RAID10 isn't a good option, then what should one go for?

  • @raza19 said:

    isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup?

    What do you mean? Do you mean failure of multiple disks or why would you need to restore from backups if it's just one drive that failed for example and you can just replace it?

  • @raza19 said:

    Seriously though I don’t think I’m going to do hardware RAID on any new server ever again. I’ll shout from the mountain tops for everyone else to avoid them where they can, performance be damned. They cause more damage when they fail, and you have to keep compatible spares on hand. I always thought HW RAID10 failures were a lie and everyone who claimed them was a liar covering for their own failures. Oh to be young again. Sorry that was unrelated 🤣

    @jar for a new setup I have been planning a RAID10 in production and a much cheaper RAID0 with a smaller number of drives as incremental backup: isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup? If RAID10 isn't a good option, then what should one go for?

    Now instead of just having to worry about disk failure, you also need to worry about controller failure. Adds an extra point of failure. Properly tuned ZFS is "fast enough" so the hardware raid's performance advantage is also pretty much gone. Not a huge reason to run HW raid anymore in 2024.

  • raza19 Veteran

    @vitobotta said:

    @raza19 said:

    isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup?

    What do you mean? Do you mean failure of multiple disks or why would you need to restore from backups if it's just one drive that failed for example and you can just replace it?

    So, I am basically working on a brand's store where the data doesn't change much, and we plan to have incremental 30-minute backups. The idea was that in case of a disaster we could just switch from the RAID10 to our active backup's RAID0, in a way where we would take a performance hit but have almost negligible downtime, giving us enough time to fix the RAID10 issues. But it's still just a plan; I don't know how workable it could be.

  • @raza19 said:
    @jar for a new setup I have been planning a RAID10 in production and a much cheaper RAID0 with a smaller number of drives as incremental backup: isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup? If RAID10 isn't a good option, then what should one go for?

    If you have important data, don't go YOLO RAID on the backup. Or have a second backup location.
    It's almost 100% certain that it will fail too, right when you need it.

  • tentor Member, Host Rep

    @raza19 said:

    @vitobotta said:

    @raza19 said:

    isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup?

    What do you mean? Do you mean failure of multiple disks or why would you need to restore from backups if it's just one drive that failed for example and you can just replace it?

    So, I am basically working on a brand's store where the data doesn't change much, and we plan to have incremental 30-minute backups. The idea was that in case of a disaster we could just switch from the RAID10 to our active backup's RAID0, in a way where we would take a performance hit but have almost negligible downtime, giving us enough time to fix the RAID10 issues. But it's still just a plan; I don't know how workable it could be.

    I think the best active backup is having two servers behind a load balancer, each with its own copy of the website and database.
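
    A minimal sketch of what sits in front in that setup, assuming HAProxy and two hypothetical backend nodes (the database replication is the hard part and is not covered here):

    # /etc/haproxy/haproxy.cfg (fragment)
    frontend www
        bind *:80
        default_backend web_nodes

    backend web_nodes
        balance roundrobin
        server web1 10.0.0.11:80 check
        server web2 10.0.0.12:80 check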

  • raza19 Veteran

    @tentor said:

    @raza19 said:

    @vitobotta said:

    @raza19 said:

    isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup?

    What do you mean? Do you mean failure of multiple disks or why would you need to restore from backups if it's just one drive that failed for example and you can just replace it?

    So, I am basically working on a brand's store where the data doesn't change much, and we plan to have incremental 30-minute backups. The idea was that in case of a disaster we could just switch from the RAID10 to our active backup's RAID0, in a way where we would take a performance hit but have almost negligible downtime, giving us enough time to fix the RAID10 issues. But it's still just a plan; I don't know how workable it could be.

    I think the best active backup is having two servers behind a load balancer, each with its own copy of the website and database.

    But isn't that just complex, with MySQL replication issues and other stuff? Granted, it would be an ideal solution. I wish there were an easy turnkey high-availability solution like that catering to simple LAMP stacks.

  • tentor Member, Host Rep

    @raza19 said:

    @tentor said:

    @raza19 said:

    @vitobotta said:

    @raza19 said:

    isn't that an ideal setup, with the caveat that in case of RAID10 failure you go straight to the backup?

    What do you mean? Do you mean failure of multiple disks or why would you need to restore from backups if it's just one drive that failed for example and you can just replace it?

    So, I am basically working on a brand's store where the data doesn't change much, and we plan to have incremental 30-minute backups. The idea was that in case of a disaster we could just switch from the RAID10 to our active backup's RAID0, in a way where we would take a performance hit but have almost negligible downtime, giving us enough time to fix the RAID10 issues. But it's still just a plan; I don't know how workable it could be.

    I think the best active backup is having two servers behind a load balancer, each with its own copy of the website and database.

    But isn't that just complex, with MySQL replication issues and other stuff? Granted, it would be an ideal solution. I wish there were an easy turnkey high-availability solution like that catering to simple LAMP stacks.

    High availability was never meant to be an easy one-click solution, unfortunately. It is all about availability and reliability requirements.
