ZFS with HDDs and SSDs - what's your experience

Anayx Member
edited July 2022 in General

Please share your experience when using ZFS in the following cases:
a. Using all SSDs
b. Using all HDDs
c. Using SSDs as cache for HDDs
d. Using SSDs with RAID 10 vs RAID 1

I read that ZFS was designed with HDDs in mind, hence the RAM-based ARC, so with SSDs it is not very beneficial? What is your experience with all SSDs: does it perform better than a regular SSD setup like an mdraid array in RAID 1 or RAID 10? Or does it perform relatively better with HDDs because of its RAM-based ARC?

Or why bother with ZFS at all and just go for mdraid + XFS + LVM cache?

Do share your real-life experiences.

Comments

  • The only downside of ZFS is no native kernel module for Linux distros, and no HDD pool expansion yet. Other than that it performs better than any other software RAID solution/filesystem. I'm sure there are some caveats, but it's been damn reliable over the years.

  • Erisa Member

    @Anayx said: a. Using all SSDs

    Works great, though sometimes the memory usage fails to deflate in time when memory pressure spikes. I recommend capping the ARC to a lower size if that is a concern, since it doesn't provide as much benefit here as in other situations.
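
    If capping the ARC sounds relevant, here is a minimal sketch for Linux/OpenZFS (the 4 GiB value is only an example, not from this post):

    # cap the ARC at runtime, in bytes (4 GiB here); takes effect immediately
    echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
    # make the cap persistent across reboots
    echo "options zfs zfs_arc_max=4294967296" >> /etc/modprobe.d/zfs.conf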

    @Anayx said: b. Using all HDDs

    If you have a high amount of ARC allocated it's very comfortable to use, but if you have a low amount of ARC or L2ARC you may run into issues with I/O speed and ARC starvation, depending on how you have it set up.

    @Anayx said: c. Using SSDs as cache for HDDs

    I've always had positive experiences with this, but keep in mind it isn't bulletproof. You can and will still run into I/O throttling on the HDDs fairly often no matter how much cache you have, especially when reading or writing sequentially.
    Make sure to have a SLOG configured as well as a cache device to make writes more bearable.
    The ARC tries to be smart, so if you read something frequently it should stay in cache for a long while. But it also frequently evicts things just before you start to need them again.
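
    For reference, a minimal sketch of adding an SSD log and cache device to an existing pool (pool and device names are placeholders, not from this thread):

    # SLOG: a separate log device, only speeds up synchronous writes
    zpool add tank log /dev/nvme0n1p1
    # L2ARC: a read cache that extends the RAM-based ARC onto the SSD
    zpool add tank cache /dev/nvme0n1p2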

    @Anayx said: d. Using SSDs with RAID 10 vs RAID 1

    Haven't tried 10 vs 1, only 0 vs 1. And as expected, using 0 doubled the write performance compared to 1. That's about it.

    @Anayx said: I read that ZFS was designed with HDDs in mind, hence the RAM-based ARC, so with SSDs it is not very beneficial?

    I suppose this is sort of true: you get more benefit out of the ARC when your pool is on HDDs. But that doesn't mean ZFS is bad on SSDs, the gain is just smaller than with HDDs.

    @Anayx said: with all SSDs, does it perform better than a regular SSD setup like an mdraid array in RAID 1 or RAID 10?

    From my anecdotal experience, with no numbers to back it up, I did find better performance using ZFS with SSDs compared to other methods. Some part of this may be the real-time compression (modern CPUs can take it) decreasing the I/O needed for some workloads. A PostgreSQL database I have, for example, gets around 2x compression and thus needs much less I/O to read from.
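
    A small sketch of what that looks like in practice (the dataset name is a placeholder):

    # enable lz4 compression; only blocks written after this are compressed
    zfs set compression=lz4 tank/pgdata
    # check the achieved compression ratio
    zfs get compressratio tank/pgdata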

    @Anayx said: Or does it perform relatively better with HDDs because of its RAM-based ARC?

    HDDs on ZFS, when the cache is being utilised effectively, are significantly faster than HDDs on their own. SSDs on ZFS gain less, so yes, you lose one of the benefits, but there are other benefits (compression, snapshots, dataset organisation, zvols).

    @Anayx said: Or why bother with ZFS at all and just go for mdraid + XFS + LVM cache?

    For me, I use ZFS for the sheer flexibility it gives. If any of these points overlap with LVM, I apologise, because I have not used it extensively.

    • Compression saves me disk space and I/O (at the expense of some CPU, but the algorithms are pretty good these days).
    • Snapshots help me quickly roll back mistakes, or simply prepare for migrations by creating an instant snapshot in case things go wrong. This should be in addition to a backup plan.
    • Sending and receiving snapshots works at the block level, avoiding the slow "let's scan and read every small file in this folder individually" problem that tools like rsync and traditional backup solutions have. I can also create incremental snapshots and send those (a short sketch follows this list). Tools like zrepl and zfs_uploader are great for this form of backup, along with whatever custom solution you deploy.
    • When working with HDDs, an L2ARC can increase performance considerably in some cases. It should not always be relied on as free fast storage (that's not what it is), but if you need HDDs for a project it can help offset the downsides a little.
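
    A rough sketch of the snapshot and incremental-send workflow from the list above (pool, dataset, and host names are placeholders):

    # instant snapshot before a risky change
    zfs snapshot tank/data@pre-migration
    # roll back if things go wrong (discards anything written after the snapshot)
    zfs rollback tank/data@pre-migration
    # send only the blocks that changed between two snapshots to another machine
    zfs send -i tank/data@daily-01 tank/data@daily-02 | ssh backuphost zfs receive backup/data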

    I'm sorry I don't have any real numbers for you, but that's roughly my experience using ZFS for servers over the past ~2 years. I wouldn't have it any other way, because the experience has been so good once I understood how to work with it.

    You should always go with the solution that feels the best for you and your use-case. Try things out, experiment, and stick with whatever works for you.

  • Anayx Member

    @Erisa
    It is really nice that you took the time to share your experience. It is exactly what I was asking for. I will try out these setups in my leisure time to reach my own conclusions too.

  • tjn Member

    I've used ZFS on both SSDs and HDDs - all super positive experiences.

  • Anayx Member

    I was doing some reading and wanted to play with it.
    I have understood that disks or disk partitions are used to create vdevs, pools are built on top of vdevs, and on a pool we have datasets which we can mount. I want to understand what I am seeing in the zfs list output below:

    zp0 is my pool, and it is showing its size as 1.73T, and zp0/zd0 is a dataset inside it? Am I reading it right?

    root@mysSystem:~# zfs list
    NAME      USED  AVAIL     REFER  MOUNTPOINT
    zp0       740K  1.73T       96K  none
    zp0/zd0   104K  1.73T      104K  legacy
    
    root@mySystem:~# zpool status
      pool: zp0
     state: ONLINE
    status: Some supported and requested features are not enabled on the pool.
            The pool can still be used, but some features are unavailable.
    action: Enable all features using 'zpool upgrade'. Once this is done,
            the pool may no longer be accessible by software that does not support
            the features. See zpool-features(7) for details.
    config:
    
            NAME                              STATE     READ WRITE CKSUM
            zp0                               ONLINE       0     0     0
              mirror-0                        ONLINE       0     0     0
                wwn-0x5000cca24bc94ff2-part5  ONLINE       0     0     0
                wwn-0x5000cca22de24991-part5  ONLINE       0     0     0
    
    errors: No known data errors
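
    For context, a mirror pool and dataset like the one shown would have been created with commands roughly like these (device paths are placeholders; the exact options used may have differed):

    # two partitions form one mirror vdev, which backs the pool zp0
    zpool create zp0 mirror /dev/sda5 /dev/sdb5
    # create a dataset inside the pool
    zfs create zp0/zd0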
    
  • Erisa Member

    @Anayx said:
    zp0 is my pool, and it is showing its size as 1.73T, and zp0/zd0 is a dataset inside it? Am I reading it right?

    Correct, zfs list shows you the datasets, and in that case zp0/zd0 is a dataset inside the zp0 pool.

  • Anayx Member

    So why is zp0 in "zfs list"?

  • letlover Member
    edited July 2022

    I remember that more than 10 years ago I bought several Sun UltraSPARC IV servers and a free Solaris 10 CD. At the time I spent less than a grand for all of them, but in the early 2000s they might have cost about $100k each.
    I still remember that I had to use the serial port to install the OS and do all the management. Very inconvenient for me, as I was used to desktop-style servers.
    The IT hardware industry has been fiercely competitive.

  • We run raidz2 on full SSD pools and it works wonders, but beware of write amplification on some consumer drives; we had huge issues with the Crucial MX500. Otherwise it is blazing fast and reliable on 8+2 pools.
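
    For reference, an 8+2 raidz2 pool is created roughly like this (device names are placeholders; ashift=12 matches 4K sectors and is a common choice for SSDs):

    zpool create -o ashift=12 tank raidz2 \
        /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
        /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj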

  • darkimmortal Member
    edited July 2022

    How it holds up when things go wrong is what scares me away from ZFS. It's one thing to checksum data on read, but it is another to keep working when those checksums report errors. Based on Linus Tech Tips' experience, ZFS is no better than md or hardware RAID when shit hits the fan.

    For me, I will only use Btrfs for important data. It doesn't shy away from continuing when a disk or controller is flapping: no limit to the number of errors, no automatic eviction of 'failed' disks, and it carries on without needing an immediate resync if a disk disappears and returns, while even safely using that disk for reads.

    I'm sure ZFS is a fine choice for performance or low space overhead (raidz) use cases.

  • Erisa Member

    @Anayx said:
    So why is zp0 in "zfs list"?

    Because the pool itself is also a dataset. Think of zp0 as the equivalent of / in a traditional folder hierarchy; it's the root mount and typically holds other mounts, but it can also hold data within itself.
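
    A tiny sketch of that idea (the mountpoint path is just an example):

    # the pool root dataset can be mounted and hold files directly
    zfs set mountpoint=/zp0 zp0
    # child datasets inherit the mountpoint unless you override it
    zfs create zp0/data
    # both the root dataset and its children show up in the listing
    zfs list -r zp0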

  • YKM Member

    No issues on SSD/NVMe or spinning disks. The ARC will eat RAM if you don't constrain it, but it's super quick with rock-solid performance, obviously quicker on SSD/NVMe.

    When I started I ran it for a month on old kit I had lying about, massive reads/writes/deletes etc. 24x7, and it didn't even blink. ZFS1

    It took some time to get my head around it, and yes, things will fail, but that's what clustering and backups are for.

    The dedupe is crazy good too. I tried compression expecting a big overhead but didn't notice it. I use it for Proxmox with about 20 VMs, a mix of WinX and Debian.
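
    A minimal sketch of turning those on (the dataset name is a placeholder; dedup keeps a table in RAM, so test before enabling it broadly):

    # per-dataset settings
    zfs set dedup=on tank/vmstore
    zfs set compression=lz4 tank/vmstore
    # pool-wide deduplication ratio
    zpool get dedupratio tank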

    I can't see myself going away from it, best move I have made :)

  • Anayx Member
    edited July 2022

    @letlover said:
    I remember that more than 10 years ago I bought several Sun UltraSPARC IV servers and a free Solaris 10 CD. At the time I spent less than a grand for all of them, but in the early 2000s they might have cost about $100k each.
    I still remember that I had to use the serial port to install the OS and do all the management. Very inconvenient for me, as I was used to desktop-style servers.
    The IT hardware industry has been fiercely competitive.

    How is this relevant here?

    @YKM said:
    No issues on SSD/NVMe or spinning disks. The ARC will eat RAM if you don't constrain it, but it's super quick with rock-solid performance, obviously quicker on SSD/NVMe.

    When I started I ran it for a month on old kit I had lying about, massive reads/writes/deletes etc. 24x7, and it didn't even blink. ZFS1

    It took some time to get my head around it, and yes, things will fail, but that's what clustering and backups are for.

    The dedupe is crazy good too. I tried compression expecting a big overhead but didn't notice it. I use it for Proxmox with about 20 VMs, a mix of WinX and Debian.

    I can't see myself going away from it, best move I have made :)

    Yes, I went through many articles and I think I only started to understand it once I was doing it myself. Though some articles say that ZFS adds unwanted complexity for the sysadmin and that you should use it only if you are ready for it. How true is that?
