"Objective review" - can you trust it? The truth, a *real* comparison and news
Recently there was an "objective review" of my benchmark program which actually was intended to discredit vpsbench (my software), my work for the community, and me (as basically admitted by the OP himself).
So, let's look again and this time properly. What do I mean by 'properly'?
- Said OP explained that AWS can be trusted because they would not lie to large corporate customers, and that his "review", based on some testing on AWS, was therefore somehow fair and proper. Unfortunately, even a quick search turns up quite a few cases where Amazon/AWS is accused of lying; some even provide evidence. More importantly, we have no reason to presume that said OP is a 'large corporate customer', so his argument fails anyway. Plus, how can he verify what AWS says? Can he (or we) even really know the hardware and other relevant details of what his VM was running on? In short, his whole approach, basis, and argument were flaky.
- Said OP did not even make the effort to use vpsbench properly, let alone do an even halfway fair comparison. Unlike him, I used fio in a way that leads to comparable tests.
- I used real physical hardware which I can actually control. No one else using the node, no who-knows-what other software running on it.
- My main goal is not even to attack said OP (which is why I do not name him); my main goal is to do properly what he did on shaky grounds, sloppily, and with obvious bias.
So, let's go. The hardware is a system built and installed only and exclusively for this test: an Asus mainboard with a Ryzen 1700 and 16 GB of memory (DDR4-3000, CL16) and two physical drives which are not used for anything other than the tests. One of those is an old OZ Vertex SSD (don't care, just think "some SSD") and the other one is a Seagate Firecuda, which I intentionally chose because it's a spindle ("HDD"), but one with a built-in flash cache.
The OS is Devuan (basically Debian 10 without systemd), freshly installed on a new and unused M.2 SSD. At any point in time only one test candidate was running, and the system was a plain default (server) install.
In short, a really fair and unbiased test setting.
fio was a default (apt) install too, v. 3.21. vpsbench, fio, and 'dd' were all tasked to do 2048 writes and reads of 64 KB blocks, for a total test size of 128 MB, in direct/sync mode.
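(Quick check: 2048 * 64 KB = 128 MB, so block count, block size, and total size are consistent.)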
Here are the 4 calls of fio, similar to what my program does, with '$1' being the path to the root of the tested device:
fio --name=wr_seq_64K --ioengine=sync --fdatasync=1 --rw=write --bs=64K --iodepth=1 \
--numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting
echo
fio --name=wr_rnd_64K --ioengine=sync --fdatasync=1 --rw=randwrite --bs=64K --iodepth=1 \
--numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting
echo
fio --name=rd_seq_64K --ioengine=sync --rw=read --bs=64K --iodepth=1 --numjobs=1 \
--size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting
echo
fio --name=read_rnd_64K --ioengine=sync --rw=randread --bs=64K --iodepth=1 \
--numjobs=1 --size=128M --gtod_reduce=1 --direct=1 --filename=$1/test.fio --group_reporting
followed by vpsbench, then hdparm ('hdparm -tT [device]') and finally dd ('dd if=/dev/zero of=[test file on the tested device] bs=64k count=2048 oflag=sync').
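For anyone who wants to replay the non-fio part, here is a minimal sketch; the device name and mount point are just examples, adjust to your box:
# example: assume the tested drive is /dev/sdb, mounted at /mnt/test
hdparm -tT /dev/sdb                                                  # the two read timings (cached / non-cached)
dd if=/dev/zero of=/mnt/test/test.dd bs=64k count=2048 oflag=sync    # 2048 x 64 KB = 128 MB synchronous write
rm /mnt/test/test.dd
hdparm -W 0 /dev/sdb                                                 # turn the drive's write cache off for the second pass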
Before showing the results I have to mention two things:
- vpsbench is a new and enhanced version (2.4.0) with more realistic read results (more on that below)
- 'hdparm -tT' tests the device directly (rather than going through a file) and it does reads only, but I found it interesting and potentially valuable because it works on a deeper level.
Here are the results:
             |           SGF                 |           OZV
             | MiB/s   IOPS  MiB/s%  IOPS%   | MiB/s   IOPS  MiB/s%  IOPS%
----- *cache ON* ------------------------------------------------------------
Fio WrSeq    | 129,6   2056                  | 122,6   1846
Fio WrRnd    | 129,3   2052                  | 158,1   3038
Fio RdSeq    | 131,7   2089                  | 157,9   3038
Fio RdRnd    |  16,0  31,51                  | 141,4   2388
vpsb WrSeq   |   4,9                         |   98,2
vpsb WrRnd   |   4,9                         |  120,2
vpsb RdSeq   | 129,4                         |  138,2
vpsb RdRnd   | 550,9                         | 1125,9
hdparm C     | 470,0                         |  200,0
hdparm NC    | 129,8                         |  204,4
dd sync      |   6,2                         |  120,0
----- *cache OFF* -----------------------------------------------------------
Fio WrSeq    | 129,2   1977    99,7   96,2   | 100,4   2043    81,9  110,7
Fio WrRnd    | 130,7   2068   101,1  100,8   | 161,4   3129   102,1  103,0
Fio RdSeq    | 128,9   2048    97,9   98,0   | 164,0   3200   103,9  105,3
Fio RdRnd    |  17,1  35,89   106,9  113,9   | 144,5   2494   102,2  104,5
vpsb WrSeq   |   0,9           18,4          |   96,2          98,0
vpsb WrRnd   |   0,8           16,3          |  119,9          99,8
vpsb RdSeq   | 122,6           94,7          |  132,4          95,8
vpsb RdRnd   | 467,2           84,8          | 1120,5          99,5
hdparm C     |  61,0           13,0          |  200,1         100,1
hdparm NC    | 129,3           99,6          |  203,3          99,5
dd sync      |  0.91          14.68          | 80 - 120         ??
First an apology in case I f_cked up formatting (in my editor it looks fine).
What you see are two result sets. The top half shows the results with the OS cache not disabled, that is, what most users work with. The lower half is with disk caching disabled via 'hdparm -W 0 [device]' (which turns off the drive's write cache). On the left ("SGF") are the Seagate Firecuda results, on the right ("OZV") are the results of the OZ Vertex SSD.
'hdparm -tT [device]' does two tests, one with cache ("C") and one ("NC") with the cache bypassed, so you see 2 result lines.
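For orientation, hdparm -tT prints two timing lines per device, roughly of this shape (numbers left out on purpose, the labels are what matters):
 Timing cached reads:        ... MB in ... seconds = ... MB/sec    ("C" row)
 Timing buffered disk reads: ... MB in ... seconds = ... MB/sec    ("NC" row)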
Only the fio results use all columns because (so far) only fio shows IOPS.
Note the additional columns in the lower half: they show each result as a percentage of the corresponding result in the upper half, i.e. cache-disabled vs. cache-enabled. A value below 100 means the result is lower than with the cache enabled; a value above 100 means it is higher.
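Example, taking the vpsb WrSeq line on the SGF: 0,9 MiB/s (cache off) vs. 4,9 MiB/s (cache on) gives 0,9 / 4,9 ≈ 18,4%.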
Interpretation:
Whether the disk cache is enabled or disabled, fio largely shows quite similar results, which clearly indicates that something weird is going on. Note that not only vpsbench but also hdparm and dd show quite different results depending on whether the disk cache is active or not. dd funnily delivers varying results on the SSD, but that's no problem because I used it just as a basic orientation and checkpoint.
Only fio delivers pretty much the same results, which is not credible. All other tests deliver drastically lower results when the cache is disabled, with only one exception on only one disk.
But there is more. fio also tells us that random reading is slower than random writing on the drive with a built-in cache, no matter whether the OS cache is enabled or disabled. And we are not talking about a small difference: random reads don't even reach 20% of the write speed! With the SSD it's the sequential writes that drop significantly with the cache disabled; random writes, which are far more in need of caching, just keep their speed. To be fair, all the other tests don't lose speed on the SSD either with the cache disabled, which leads me to the assumption that that SSD actually has some (DRAM, I guess) cache built in too, without telling (or me having forgotten it).
But one thing is clear: no, fio does not seem to be "the reference". It rather is the outlier. In fact I'm quite content with at least the new version of vpsbench. Spindles (at least consumer HDDs) are bloody much slower than SSDs, and the read numbers from vpsbench, still high but now in a reasonable region, match my experience with the tested disks quite well. Turn the cache off and read speed goes down significantly while write speed goes down brutally.
But the maybe most important lesson is that there is no reference. Use 4 benchmarks and you get 4 different results, simple as that. Second point: every benchmark shows a few weird results; that's not even surprising considering the different approaches, parameters, and flags used, as well as the complexity of modern systems.
And well noted, we are talking about a dedi here. On a VPS yet more layers of caching weirdness are added on top, and one can usually not change anything at the node level, even less so if a RAID or disk controller with its own cache enters the game, as is often the case with node hardware.
But here is the good news: I have carefully checked the vpsbench source code many times, and the really important point is this: it's quite meaningless whether vpsbench (or any other benchmark) shows read or write results higher or lower than another benchmark. The relevant point is that it does whatever it does consistently, so that the result sets between providers and VPSs are fair and comparable.
Comments
Summary: You don't like vpsbench? Fine, no problem, simply don't use it; I wrote it mostly for myself anyway. You think that tool XYZ is much better than vpsbench? No problem, just use that tool. You want to hit at me because you think that tool XYZ is much better than mine? My advice: don't. For one it's not nice, nor is it needed. But also, you should be well prepared; a quick and dirty hit won't do and risks leading to a reaction you might not like.
You don't like my reviews? Fine, no problem, just don't read them and find some you like.
And rest assured that I do care about vpsbench evolving into an ever more reliable, good tool - in fact I care more than virtually anyone here, and I invest the effort and work.
Finally a word of advice: do NOT take any benchmark's numbers as absolute or as "reference"! Rather take the results as "compared to other systems". And always look at the spread, too (when available, like in my reviews)!
Isn't fio's
--direct=1
parameter used to make sure the test will not use the cache? Your summary for fio is wrong because you used the wrong parameter. CMIIW
No. Your nonsensical, ignorant "conclusion" is wrong. I did use '--direct=1', as can be seen in the calls which I provided.
But thanks for yet another example of "I can't be bothered to actually read; just willy-nilly asserting something has to suffice".
where is yabs
No we neither trust you nor your shady experiments.
Oh I don't think I'm going to be reading an essay...
Dethroned king, lost credibility, shattered authority. Steve the evil has achieved his goal.
Basically the King admitted the previous version of vpsbench was indeed generating erroneous results, so he dethroned himself and quickly fixed the bug.
Will you wait for rethroning?
Uhm, the "King" himself asked for having his tag replaced by something more modest. No dethroning, sorry. I know some would have loved that.
I'm not at all surprised that the bottom dwellers of the 85% crowd fail to understand that, but a PhD (or PhD in spe)? Strange.
Actually the "King" already "admitted" - and even explained the reason - not erroneous but too high read results quite a while ago. But now he has found a way to solve that problem.
But even assuming you were right: how despicable must a creature be to turn against someone who solved a problem they complained about! Or in other words, thanks for clearly and unmistakably demonstrating your true attitude and intention, which is: no matter what I do, no matter how I act, you will find a way to turn it against me.
yabs or gtfo!
You got wrecked on the other thread. It's time for you to move on.
So, you get to decide what software I use? I don't think so.
Try to impose your rules on someone else.
And now, to use your words, gtfo!
The end is nigh.
tldr pls?
All of his "reviews" were false and heavily favored hosts he reviewed.
Nobody on this forum was asking for dd's or hdparm test results, just yabs. I'm curious whether yabs would reproduce the fio result or show a more reasonable speed.
So what? Am I your servant? I can use whatever tools I please, whether you ask for them or not, and I wanted more than just vpsbench and fio.
As for yabs, that uses fio anyway, albeit in a way I don't care about. If you think that yabs is great, you are free to use it - and unlike the way my "opponents" act towards me, I do not try to bash, attack, smear, or discredit yabs users.
@jbiloh the cloudflare pop up block is happening again...
There's several things to note:
Of course the Vertex SSD has a 64 MB cache. You probably should find that out, as well as whether the Firecuda has 4 GB, 8 GB or 16 GB of cache.
You're concerned about the formatting but not the period vs commas? I can't make sense of the units.
This is epic fail. You don't understand what results to expect.
Long live the king!
That said, JSG asked that the title be changed to something more modest, and I happily complied. I was the one who came up with "Server Review King" originally.
JSG spends a lot of time in service to the community with all of his contributions. There should be some appreciation and respect for that even if you disagree with the methods.
Seems like @jsg is working to improve his systems further which is great to see.
@TimboJones please PM the details of what is happening to you regarding CF and I will do my best to fix it. Sorry for the inconvenience caused. I really try quite hard to make things as best possible when it comes to our security without getting in the way more than needed.
I guessed and said as much myself, but whatever cache it has or doesn't have, it has it for all candidates: vpsbench, fio, hdparm, dd.
Well, I happen to live in a region where it's done the other way round. Simple solution: just do what you expect half of the world to do; mentally swap commas and dots.
But it is, just like with/very similar to dd.
How pathetic ("millionth time"). And: you are wrong. Because a read call does not necessarily go to the device/via Sata.
And of bloody course you are totally biased (again) and mention only what fits your agenda. fio showing some reads to be much, much slower than writes.
THAT is what really happens " for the millionth time". Whenever I address any of your "criticism" all I get is more of the same.
That is true in the same way that someone can say that you are wrong and refuse to give proof because they are "[not] your servant". If you want to properly prove that you are right, you have to meet the burden of proof of the people you're trying to convince, and right now you're refusing to use what this forum sees as the "gold standard" for benchmarks.
I think there's a bigger issue right now though. @stevewatson301's "hit piece" provided a lot of strong evidence that you have failed to disprove which is why it garnered so much attention, and you're only countering his points by responding in bad faith - you aren't being sneaky and everyone can see through your gatekeeping and gish galloping. I think you're too defensive right now to come to a proper conclusion to this issue, and I'm tired of watching you dirty your own reputation further. Could I propose a moderated debate or something (@jbiloh I'm sure it would be great content too) where you find an impartial moderator to allay the concerns of bad faith on both sides?
Too f'kin obvious for those blinkered people who just like to attack others, for "shits & giggles".
[Fifths of an inch and millimetres, anyone?]
I think you understood him wrong. he saw that you indeed used that flag in all fio calls. therefore he (and me too) thinks your cached results are actually not cached for fio anyway. simply because fio bypasses the cache, no matter if you deactivated it via hdparm or not ;-)
so that would be a reason, why the fio result are that consistent - the direct=1 makes it that way.
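(for reference, the fio docs describe direct=1 as non-buffered I/O, usually O_DIRECT - so fio would talk past the linux page cache in both of your runs, which would also explain why its numbers barely move.)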
would you provide your newer version of vpsbench? I'd be interested in running it again on my dedi with the regular HDDs. any details on what you changed?
on another note, the IOps shown for the spindle are too high anyway, it should be max 200 even for a very fast one. this suggests that the overall size of 128MB is too small for a reliable measurement and is indeed hitting the HDD's cache only.
the write speeds seem quite slow but could actually make sense considering that throughput is a product of iops*bs, so 4.9MB/s at 64k blocksize results in somewhere around 80 IO per second (which again seems reasonable for a spindle).
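(quick check of that relation: 4.9 MiB/s at 64 KiB blocks is 4.9 * 1024 / 64 ≈ 78 IO per second.)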
that still leaves me wondering why the difference between write and read speed is that large. and especially that random reads regularly achieve higher speeds than sequential ones even on HDD will never make sense to me. it feels a bit like if your read routine suffers from parallelism or something like that, while writes do not. just sayin' ... unprofessional personal opinion that is.
Kudos @Falzo for sticking to figures, in a dignified fashion.
[Needs to find that shift key though! ]
thanks for the flowers. I am simply interested in any kind of benchmarking, especially disks as this has a long tradition and a few years back you'd only see dd everywhere and even that could be done wrong depending on some few options.
there are a few (physical) limitations and relations between a few factors. the rest is pure math, so yes I am questioning numbers (not only vpsbench, but often others too) and usually develop some curiosity about the reason.
that does not necessarily mean any benchmark is doing anything wrong. could simply be the intention to measure something else than what I would expect when seeing the result. yet could also be some logical flaw that nobody saw before 🤷♂️
tl;dr; I like riddles.
Nope, not comparable. If I just make some statement and someone says that I'm wrong without providing proof, that's OK. Not nice, but OK. But here one side (me) has invested massive work, effort, and time, and to just say that I'm wrong is not enough - just like it wouldn't be enough if I did that to someone who had invested massive work, effort, and time.
Wrong again, because this isn't about me claiming, out of the blue, to be right. It's not even an acceptable defense situation where I have to defend e.g. a dissertation and where both sides must meet certain standards. This is a situation where some just throw assertions at me, meeting no standards whatsoever, but hold me to strict standards and, as a default position, reject whatever I submit.
Plus, and that's important: In a dissertation defense or say a job interview there is something to be gained if one is successful. Here however I have nothing to gain, I want nothing, I'm simply sharing some work I did. If one likes it, one takes it, if one does not like it, one doesn't take it.
I have shown that not only is it not the "gold standard" but actually there is no "gold standard" at all (incl. my program). And btw, "we like/prefer XYZ" doesn't make XYZ a gold standard.
How can something that does not even stand on a solid basis provide "strong evidence"? You simply call it that because you want to see it as "strong evidence". That however doesn't make it so.
Thanks for your confession. You should keep it private when you talk to yourself.
Yet another case of "It is so because I say so, but you, jsg, must prove everything you say - and to my 'standards'!". Sorry, that doesn't work with me.
Do I, really? Nope. I "dirty my reputation" only with a few who would accept nothing I do or say and who want to hold a kangaroo court session with the verdict ready before the session began.
Thank you for your more detailed explanation. I hope jsg can understand my concern above.
Guys, DC and AC both have their use. They are simultaneously both the right and wrong approach for the task.
Said OP is me, in case anyone was wondering.
I guess Docker and the Linux kernel were also lying? And did you forget about the fact that the kernel limits wouldn't consider the cache as the operating system would slow down all I/O requests from that process?
Maybe. I'm human so that can happen.
As for the matter: I ran all of them, vpsbench, fio, and dd, in sync mode. One time (top half) with the linux cache not disabled, and one time with the disk cache disabled (lower half).
One major reason for that was to find out whether linux really disables caching when asked, and whether it disables it completely.
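If someone wants to cross-check the cache state on their own box, a minimal sketch (the device name is just an example):
hdparm -W /dev/sdb                        # query whether the drive's write cache is currently on or off
sync; echo 3 > /proc/sys/vm/drop_caches   # drop linux' page/dentry/inode caches before a read test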
Right now my answer is "highly likely not". And highly likely I won't publish/share any newer versions. I'll probably be willing, though, to provide it later to a select few on the condition that they do not give it to others. (Hit someone who only had good intentions and invested a ton of work and time for a community often enough and nastily enough, and it'll make him think ...) You are, btw, among the few whom I would consider providing newer versions to.
Yes and no. You see, I tasked vpsbench, fio, and dd to do the same job on the same drive, and that's not somehow bent or unfair. But there are practical factors too. The Firecuda's on-board cache is 8 GB and, trust me, doing all the tests I did with a test size of, say, 10 GB would have taken an eternity.
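If one wanted to (mostly) take that 8 GB on-board cache out of the picture, the obvious knob would be the test size of the fio calls above, at the price of a much longer run, e.g.:
--size=16G    # roughly twice the Firecuda's on-board cache, instead of --size=128M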
Whatever, the result values I show above are exactly what the tests achieved.
Oh, I'm wondering about quite a few things too, but I can't offer more than an educated guess without investing a lot more work and doing much more specific, elaborate, and extensive testing. It seems that disabling this or that cache (there are quite a few involved, at different levels) actually just disables write caching. I've also seen built-in caches that are entirely different again and never cache one or the other, either reads or writes.
I'm afraid that, no matter my good will, I have to stop at some point trying to find answers to those riddles. Simple reason: vpsbench is not specifically a disk benchmark but a VPS benchmark; the disks are just one element of the mix. At the moment, for example, I'm working on bringing in some OpenSSL routines because many are interested in that.
But I'm willing to help you to some degree, albeit sadly a limited one.