"Objective review" - can you trust it? The truth, a *real* comparison and news
Recently there was an "objective review" of my benchmark program which was actually intended to discredit vpsbench (my software), my work for the community, and me (as basically admitted by the OP himself).
So, let's look again and this time properly. What do I mean by 'properly'?
- Said OP explained that AWS can be trusted because they would not lie to large corporate customers, and that therefore his "review", based on some testing on AWS, was somehow fair and proper. Unfortunately, even a quick search turns up quite a few cases where Amazon/AWS is accused of lying; some even provide evidence. More importantly, we have no reason to presume that said OP is a 'large corporate customer', so his argument fails anyway. Plus, how can he verify what AWS says? Can he (or we) really know the hardware and other relevant details of what his VM was running on? In short, his whole approach, basis, and argument were flaky.
- Said OP did not even make the effort to use vpsbench properly, let alone do an even halfway fair comparison. Unlike him, I did use fio in a way that leads to comparable tests.
- I used real physical hardware which I can actually and fully control. No one else using the node, no who-knows-what software running on it.
- My main goal is not even to attack said OP (which is why I do not name him); my main goal is to do properly what he did on shaky grounds, sloppily, and obviously biased.
So, let's go. The hardware is a system built and installed only and exclusively for this test: an Asus mainboard with a Ryzen 1700 and 16 GB memory (DDR4-3000, CL16) and two physical drives which are not used for anything other than the tests. One of those is an old OCZ Vertex SSD (don't care, just think "some SSD") and the other one is a Seagate FireCuda, which I intentionally chose because it's a spindle ("HDD"), but one with a built-in flash cache.
The OS is Devuan (basically Debian 10 without systemd), freshly installed on a new and unused M.2 SSD. At any point in time only the test candidates, and only one of them at a time, were running, and the system was a plain default (server) install.
In short, a really fair and unbiased test setting.
fio was a default (apt) install too, version 3.21. Both vpsbench and fio were tasked to do 2048 writes and reads of 64 KiB blocks for a total test size of 128 MiB in direct/sync mode, and so was 'dd'.
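As a quick sanity check of those parameters (plain arithmetic, nothing tool-specific): 128 MiB split into 64 KiB blocks gives exactly 2048 operations, and at a given throughput the matching IOPS is simply MiB/s × 1024 / 64:

```shell
# 128 MiB split into 64 KiB blocks -> number of I/O operations
ops=$(( 128 * 1024 / 64 ))
echo "$ops"                                          # 2048

# IOPS implied by a throughput of 129.6 MiB/s at 64 KiB per op
awk 'BEGIN { printf "%.0f\n", 129.6 * 1024 / 64 }'   # 2074
```

So a fio result line of roughly 130 MiB/s with roughly 2050 IOPS is at least internally consistent.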
Here are the 4 calls of fio, similar to what my program does, with '$1' being the path to the root of the tested device:
fio --name=wr_seq_64K --ioengine=sync --fdatasync=1 --rw=write --bs=64K --iodepth=1 \
    --numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting

fio --name=wr_rnd_64K --ioengine=sync --fdatasync=1 --rw=randwrite --bs=64K --iodepth=1 \
    --numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting

fio --name=rd_seq_64K --ioengine=sync --rw=read --bs=64K --iodepth=1 --numjobs=1 \
    --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting

fio --name=rd_rnd_64K --ioengine=sync --rw=randread --bs=64K --iodepth=1 \
    --numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting
followed by vpsbench, then hdparm ('hdparm -tT [device]'), and finally dd ('dd if=/dev/zero of=[tested device's root dir] bs=64k count=2048 oflag=sync').
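For reference, the whole run order can be sketched as a small script. This is my reconstruction, not the exact script used; DEV and DIR are placeholders, the vpsbench invocation is hypothetical (check its own docs), and run() only echoes each command so nothing gets executed by accident (drop the echo to really run it, as root):

```shell
#!/bin/sh
# Sketch of the test sequence. DEV and DIR are placeholders; run() only
# echoes each command so you can review the plan before executing anything.
DEV=/dev/sdX          # tested device (placeholder)
DIR=/mnt/test         # root dir of the tested device (placeholder)
run() { echo "$@"; }

run fio --name=wr_seq_64K --ioengine=sync --fdatasync=1 --rw=write --bs=64K \
    --iodepth=1 --numjobs=1 --size=128M --direct=1 --filename="$DIR/test.fio"
run vpsbench "$DIR"   # hypothetical invocation
run hdparm -tT "$DEV"
run dd if=/dev/zero of="$DIR/test.dd" bs=64k count=2048 oflag=sync
```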
Before showing the result I have to mention two things:
- vpsbench is a new and enhanced version (2.4.0) with more realistic read results (more on that later)
- 'hdparm -tT' tests the device directly (rather than via a file) and does reads only, but I found it interesting and potentially valuable because it works on a deeper level.
Here are the results:
             |       SGF (FireCuda)          |       OZV (Vertex)
             | MiB/s   Iops  MiB/s%  Iops%   | MiB/s   Iops  MiB/s%  Iops%
----- *cache ON* -------------------------------------------------------------
Fio WrSeq    | 129,6   2056                  | 122,6   1846
Fio WrRnd    | 129,3   2052                  | 158,1   3038
Fio RdSeq    | 131,7   2089                  | 157,9   3038
Fio RdRnd    |  16,0  31,51                  | 141,4   2388
vpsb WrSeq   |   4,9                         |  98,2
vpsb WrRnd   |   4,9                         | 120,2
vpsb RdSeq   | 129,4                         | 138,2
vpsb RdRnd   | 550,9                         | 1125,9
hdparm C     | 470,0                         | 200,0
hdparm NC    | 129,8                         | 204,4
dd sync      |   6,2                         | 120,0
----- *cache OFF* ------------------------------------------------------------
Fio WrSeq    | 129,2   1977   99,7    96,2   | 100,4   2043   81,9   110,7
Fio WrRnd    | 130,7   2068  101,1   100,8   | 161,4   3129  102,1   103,0
Fio RdSeq    | 128,9   2048   97,9    98,0   | 164,0   3200  103,9   105,3
Fio RdRnd    |  17,1  35,89  106,9   113,9   | 144,5   2494  102,2   104,5
vpsb WrSeq   |   0,9          18,4           |  96,2          98,0
vpsb WrRnd   |   0,8          16,3           | 119,9          99,8
vpsb RdSeq   | 122,6          94,7           | 132,4          95,8
vpsb RdRnd   | 467,2          84,8           | 1120,5         99,5
hdparm C     |  61,0          13,0           | 200,1         100,1
hdparm NC    | 129,3          99,6           | 203,3          99,5
dd sync      |   0,91         14,68          | 80-120         ??
First an apology in case I f_cked up formatting (in my editor it looks fine).
What you see are two result sets. The top half shows the results with the OS cache enabled, i.e. what most users work with. The lower half is with Linux's disk caching disabled ('hdparm -W 0 [device]'). On the left ("SGF") are the Seagate FireCuda results; on the right ("OZV") are the results of the OCZ Vertex SSD.
'hdparm -tT [device]' does two tests, one with cache ("C") and one with cache disabled ("NC"), so you see two result lines.
Only the fio results use all columns because (so far) only fio shows IOPS.
Note the additional columns in the lower half: they show each cache-disabled result as a percentage of the corresponding cache-enabled result from the upper half. A value below 100 means the result is lower than with the cache enabled, and a value above 100 means it is higher.
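To make that concrete (plain arithmetic, numbers taken from the table): the vpsb WrSeq line on the FireCuda drops from 4,9 MiB/s with cache on to 0,9 MiB/s with cache off, which gives the 18,4% you see in the percent column:

```shell
# cache-off result / cache-on result * 100 = the percent column
awk 'BEGIN { printf "%.1f\n", 0.9 / 4.9 * 100 }'   # 18.4
```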
Whether Linux's disk caching is enabled or disabled, fio largely shows quite similar results, which clearly indicates that something weird is going on. Note that not only vpsbench but also hdparm and dd show quite different results depending on whether the Linux disk cache is active. Funnily, dd delivers varying results on the SSD, but that's no problem because I used it just as a basic orientation and checkpoint.
Only fio delivers pretty much the same results, which is not credible. All other tests deliver drastically lower results when the OS cache is disabled, with only one exception on only one disk.
But there is more. fio also tells us that random reading is slower on the drive with a built-in cache than random writing, no matter whether the OS cache is enabled or disabled. And we are not talking about a small difference: random reads don't even achieve 20% of write speed! With the SSD, it's the sequential writes that drop significantly with the OS cache disabled; random writes, which are far more in need of caching, just keep their speed. To be fair, though, none of the other tests lose speed with the cache disabled either, which leads me to the assumption that that SSD actually has some cache built in too (DRAM, I guess) without telling (or me having forgotten it).
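The "not even 20%" claim is easy to check against the table (fio on the FireCuda, cache off: 17,1 MiB/s random read vs 130,7 MiB/s random write):

```shell
# random read as a percentage of random write (fio, FireCuda, cache off)
awk 'BEGIN { printf "%.1f\n", 17.1 / 130.7 * 100 }'   # 13.1
```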
But one thing is clear: no, fio does not seem to be "the reference". It rather is the outlier. In fact, I'm quite content with at least the new version of vpsbench. Spindles (at least consumer HDDs) are very much slower than SSDs, and the still high but (now) reasonable read numbers from vpsbench match my experience with the tested disks quite well. Turn the cache off and read speed goes down significantly while write speed goes down brutally.
But maybe the most important lesson is that there is no reference. Use 4 benchmarks and you get 4 different results, simple as that. Second point: every benchmark shows a few weird results; that's not even surprising considering the different approaches, parameters, and flags used, as well as the complexity of modern systems.
And well noted, we are talking about a dedi here. On a VPS, yet more layers of caching weirdness are added on top, and one usually cannot change anything at the node level, even less so if a RAID or disk controller with its own cache enters the game, as is often the case with node hardware.
But here is the good news: I have carefully checked the vpsbench source code many times, and the really important point is this: it's quite meaningless whether vpsbench (or any other benchmark) shows read or write results higher or lower than another benchmark. The relevant point is that it does consistently whatever it does, so that the result sets between providers and VPSs are fair and comparable.