jsg, the "Server Review King": can you trust him? Is YABS misleading you?
Introduction
@jsg has been publishing a number of reviews using his own benchmarking tool, "vpsbench", and was recently given the title of "Server Review King" to honor his contributions. He was also quoted by Contabo on their homepage with a positive statement about their NVMe VPSes.
Having said that, the conclusions in some of his threads are particularly egregious and didn't match up with tests performed by other members. This was followed by claims from jsg that YABS, the script most LET members use to benchmark VPSes, was wrong.
This prompted me and another forum member to take a deeper look at vpsbench, which should hopefully put some questions to rest: Can you trust vpsbench, and jsg's reviews in general, to accurately benchmark systems? Is YABS inaccurate, and has @MasonR been misleading LowEnders on this forum and elsewhere?
Prerequisites
In the interest of transparency and to help other people reproduce what I did, I'll go a bit into what I used to test YABS and vpsbench. I used the latest versions of YABS and vpsbench available:
As for the testing VPS, I used an AWS EC2 c5.large instance with a 10 GB gp3 disk, which is limited to 3000 IOPS or 125 MB/s, whichever is lower.
Shocked? Isn't this LowEndTalk, where I'd get much more value by using a cheap VPS picked at BF instead of the pricey enterprise-y BS that is AWS?
I agree, but AWS gives you guarantees of specific levels of performance that are clearly documented. c5.* instances are dedicated-core instances, and gp3 volumes with those settings guarantee exactly those performance characteristics. This choice eliminates noisy neighbours and any sudden drops in performance caused by the provider throttling CPU or IOPS.
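To see why both numbers in the gp3 limit matter, here's a quick back-of-the-envelope sketch. The 4k/1m block sizes mirror the usual fio test sizes; that pairing is my assumption for illustration, not something AWS documents:

```shell
# Which gp3 limit binds depends on the I/O block size (POSIX shell arithmetic).
iops_limit=3000   # gp3 baseline IOPS
mbps_limit=125    # gp3 baseline throughput in MB/s

# At 4 KiB blocks, 3000 IOPS moves only ~12 MB/s, so the IOPS cap binds:
echo "4k blocks: $(( iops_limit * 4 / 1024 )) MB/s (IOPS-bound, cap $mbps_limit MB/s)"

# At 1 MiB blocks, 3000 IOPS would be ~3000 MB/s, so the 125 MB/s cap binds:
echo "1m blocks: capped at $mbps_limit MB/s (throughput-bound)"
```

So a correct direct-I/O benchmark should show the 4k test pinned at ~3000 IOPS and the 1m test pinned near 125 MB/s.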
Also, none of the EC2 instances (apart from those named *.metal) support nested virtualization, which means I can also eliminate configuration differences between nodes. With typical LET providers, you can sometimes have inconsistently configured nodes.
vpsbench runs on both FreeBSD and Linux, while YABS is Linux-only. I'll use FreeBSD 12.2-RELEASE-p7 (ami-04d776585c8aa9c80 in us-east-1) and CentOS 8.4.2105 (ami-04d776585c8aa9c80 in us-east-1). AMIs are AWS's term for disk images; if you're following along at home, you can use those images to verify they're the official ones and that I didn't pull any shenanigans to unfairly favor one benchmark tool over the other.
The tests
First, let's put vpsbench to the test on FreeBSD:
I immediately see an issue with the "Std. Flags" results, which suggest that nested virtualization is allowed -- as mentioned earlier, this is just not allowed with AWS. Looking at the processor flags shows that AWS doesn't enable nested virtualization on the node:
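For anyone checking along at home: the flags are easy to inspect yourself (on Linux they're in /proc/cpuinfo; FreeBSD prints them at boot). The helper below is a hypothetical sketch of mine, not part of either benchmark; feed it a flags line and it reports whether hardware virtualization (Intel vmx / AMD svm) is exposed to the guest:

```shell
# Hypothetical helper: report whether a CPU-flags string exposes
# hardware virtualization (Intel "vmx" or AMD "svm") to the guest.
has_virt_flags() {
  case " $1 " in
    *" vmx "*|*" svm "*) echo "nested virtualization available" ;;
    *)                   echo "no vmx/svm: nested virtualization unavailable" ;;
  esac
}

# On a real Linux guest you would feed it the live flags, e.g.:
#   has_virt_flags "$(grep -m1 '^flags' /proc/cpuinfo)"
# Example with a flag set typical of an EC2 c5 guest (no vmx/svm exposed):
has_virt_flags "fpu vme de pse tsc msr pae sse sse2 ssse3 avx avx2 aes"
```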
Moving on to the disk tests, one of the read values seems about right and is close to the advertised 125 MB/s, but the "Wr. Seq"/"Wr. Rnd" numbers seem a bit low for a volume that's supposed to deliver consistent performance. What is really interesting, though, is "Rd. Rnd": a sky-high value of > 4000 MB/s that can't represent any real disk performance. More likely, caching or a similar optimization is at play here.
Even after running these tests multiple times, I see values similar to the above and no reflection of the advertised disk performance. Now, AWS could be lying, but that would mean they're lying to Fortune 500 companies -- companies with plenty of money to sue AWS for false advertising if this were the case.
So before I can reasonably make that allegation, I'll switch over to Linux to see if things are any better over there:
The "Wr. Seq" and "Wr. Rnd" on Linux are close to the advertised disk performance, which is good. But, the "Rd. Rnd" and "Rd. Seq" are way off, at 7000 and 8000 MB/s! I just don't have a reasonable explanation for these values.
Let's now compare it with YABS. I used the `-gi` flags, since I don't discuss CPU and network performance here.
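For reference, this is the shape of that invocation. To the best of my knowledge, `-g` skips the Geekbench CPU test and `-i` skips the iperf network tests, leaving only the fio disk benchmarks; the command is built as a string here so it can be shown without network access:

```shell
# Sketch of the YABS invocation used here: skip Geekbench (-g) and the
# iperf network tests (-i), leaving only the fio disk benchmarks.
yabs_cmd='curl -sL yabs.sh | bash -s -- -gi'
echo "$yabs_cmd"
```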
These tests are very representative of what AWS advertises. Nested virtualization is correctly detected as disabled, and the disk limits are detected almost perfectly: the 4k test caps out at 3000 IOPS, and the 1m test shows a throughput of 129.5 MB/s, far closer to the 125 MB/s limit than what vpsbench reports on Linux.
Again, this could be a fluke, so I tried running YABS many times over, and there is no significant deviation in these values.
Conclusion
At this time, you can't trust vpsbench (especially on FreeBSD) to give an accurate representation of CPU capabilities and disk performance. This, unfortunately, means that his reviews, by extension, cannot be trusted.
I generally don't like to assume ill intentions, but the fact that @jsg is highly defensive, doesn't take well to criticism, and often makes unverified claims about other tools casts a negative light on how much you can trust his reviews. There are other issues with his reviews too, such as a lack of controls -- comparing three different providers with different CPU and performance caps/characteristics -- which leads to claims such as "AMD is not really light years ahead in performance [over Intel]". But that's a conversation for another day.
YABS, for all intents and purposes, by virtue of being a script that relies on other well-tested software like fio, geekbench, and the Linux kernel, simply does a better job of representing performance and CPU capabilities accurately.
Comments
If you're not satisfied with their replies on this thread, that's your own view. With @redcat you acted like a nazi. Why do you need to start blaming him?
https://www.lowendtalk.com/discussion/173667/nationalization-is-coming-to-chinas-data-centers#latest
Shots fired!
From an asshole. A dirty job.
Your research methodology is flawed
Your link to the vpsbench is to version 1 - I believe this is v2 - https://disk.yandex.com/d/yfbjWW1Oudar6w
this was all i needed to know. you're fucking awesome!
Fair enough.
I wonder what motivated you to compile this post. I don't really care what jsg has to say; mostly he jumps into every other discussion with his supposedly radical outlook on the topic, making that discussion interesting and entertaining. He is an entertaining guy.
And you don't like his entertaining posts ?
Is this a revenge thread?
I have no interest in exacting revenge. I only care about the truth, and the problem is that jsg throws BS that only a few can accurately detect. This is just to warn people. If jsg corrects his issues, stops throwing BS, and stops making accusations about other people (when they're right), I won't have to make posts like this one. I even said as much: "I generally don't like to assume ill intentions".
Also @jenkki, I'm not wrong in calling communism what it is -- communism is, by definition, a system of wealth redistribution by way of eliminating private ownership. If reiterating the definition makes me a Nazi, so be it.
They do an innocent nazi's job.
Why go through all the trouble to suggest that the CPU flags on the script could be wrong when you can just check them yourself to see?
Don't take this personally, but this doesn't give me confidence that you're able to critique the script.
jsg is typing....
Fair enough, I still have the instance running so here is the output showing that vmx/svm is unavailable:
The original post is now updated as well.
I see an issue too. The results do not show the 'vmx' flag.
Well, you didn't. What you did was compare 4 tests against 2 tests, which are quite different. For example, it makes no sense to compare sequential writes against random writes.
That is no 'conclusion'; it's what your goal was all along.
Tough luck that, of all things, you focused on disk testing. Because in fact @MasonR (with whom I have no problem or quarrel), the author of yabs, himself told me, publicly and here on LET, something like ('like' because I do not remember his exact wording) that he'd be interested in getting my disk testing code as a "library" (usable from bash), because he isn't really an expert on disk testing.
Uhum, that's why I enhanced vpsbench about a dozen times and changed certain things in both vpsbench and the result-set compiler, based on (real and constructive) criticism and user wishes.
In fact, I'm working right now on further vpsbench enhancements based on criticism. So if I, as accused, 'don't take well to criticism', I must have a really strange way of showing it. We obviously live in different universes, but let me tell you that in mine, 'listening to constructive criticism and user wishes and actually investing effort and work in designing and implementing them' != 'doesn't take well to criticism'.
I am indeed not taking well to a few users who occasionally even gang up on me and try to throw sh_t at me. Oh, what an evil guy I am; certainly you would politely smile if some people again and again threw lies and insults at you and went to great lengths to bash you and your work ...
Well, the truth is that I in fact tried not to name names and to stay vague w.r.t. the specific other tools. Simple reason: I did and do value it when someone does work for our community and provides services to us, like e.g. MasonR. I always vaguely spoke of "scripts" or "other benchmarks" because my point wasn't to paint anyone (or any benchmark) as bad, but rather to explain why I wrote mine.
But I admit that at some point I slipped and did point at a weakness of yabs.
Btw, I also apologized whenever I made an error or said something wrong, especially when I was mistaken about other benchmarks.
Now, let's look at your methodology. Basically you are saying that vpsbench must be wrong because Amazon certainly wouldn't lie. Well, that's one way to look at things. But it's not mine. A benchmark is about getting the data, the facts, not about trusting company statements.
Besides, what exactly does Amazon's statement mean? What disk performance, precisely? From what I see, they could make their statement based on any number, any test. If only 1 out of the 4 or 8 numbers my benchmark delivers is equal to or above their number, they could say that they are right.
There are also quite a few technical details but I won't discuss those as an honest look and test of vpsbench wasn't your topic and goal anyway.
It looks heavily like your read tests are hitting RAM to me. There's no way Amazon are giving people 8GB/s disk read (sync).
The solution to this problem is simply to host jsg's tool(s) on GitHub, so those who see flaws can commit changes. As it stands, this is just a public rebuke of the poor king.
I simply don't believe any reviews from any kings.
As a normal person, I don't have the same angle of view as the kings.
I trust YABS; normal people use it, and it correctly shows what I can calculate from my own usage statistics.
she/he: "I am right and everything else is shit, don't even challenge me!"
tons of people say: "hell no".
she: "this is cyber bullying *sob*"
Evidently, yes. But testing VMs necessarily means testing a virtual machine which translates to e.g. diverse sorts and levels of caches beneath the VM.
@some others
Of course you are back at going against the person ...
Re. "King": As I've already said, "Server Review King" was not a tag chosen or desired by me. I was a bit shocked myself when I saw it the first time. But I didn't and will not complain to @jbiloh, because I'm certain that his intentions were friendly and good and because after all it's just a tag. As long as there is "reviewer" in it I'm OK with it.
Then why did you not deny this being the case on your previous thread?
https://www.lowendtalk.com/discussion/173348/contabo-new-raw-nvme-speed-product-line-benchmark-review/p2
As pointed out in the Hostsolutions thread, in the Intel-vs-AMD comparison with no controls, and elsewhere.
What's unclear about "3000 IOPS or 125 MB/s, whichever is lower"? Did you read the AWS documentation?
And I hope I didn't have to waste hours and money (none of these instances are free, and they cost a lot -- unlike, *cough*, the special treatment someone gets here, *cough*) just to perform free QA for your app and to correct your BS.
I cannot confirm the conversations that you and MasonR may have had, but I'll give you the benefit of the doubt and assume they're true.
edited to update: it should be lower, not greater.
Any performance penalty or improvement is already included within the IOPS performance as AWS advertises the IOPS that the user can see, as already evidenced through the use of fio (by YABS).
relevant Simpsons prediction
Read or write performance? Buffered or direct/sync? Sequential or random? There are other factors too like e.g. what data (well cacheable or random and if the latter then what quality).
And that's not just me. Everyone in the field worth his salt knows that one can - and in marketing usually does - bend the numbers. Typical example: of course one publishes the best result out of diverse tests.
Surely a benchmark tool testing a cache should be listed as such? Fio is giving what looks like the correct result and your tool is giving a cached result. You keep evading this point - which I've raised before - rather than taking the feedback and fixing your tool.
If this was down to VM or host config both fio and your tool should be equally impacted.
Oh man, please don't post a thesis or dissertation. Keep it simple.
WHO?
Buffering at the OS level is, well, an OS-level concern, and I'm not sure why you suggest that AWS would make guarantees about the OS. Of course it's possible to have buffering at the hardware level too, but that is part of the hardware, and all IOPS are limited there.
I'll post findings below this reply that shows that this is the case.
I'm not sure why AWS wouldn't want to advertise the 4000 MB/s or 7000 MB/s over their measly 125 MB/s. Wouldn't that make them look better?
Regardless of sequential or random, read or write, all IO is limited to 3000 IOPS and 125 MB/s on gp3 disks, as I said. Of course there's a slight deviation to 130-135 MB/s, and some level of error is expected, but otherwise it closely follows the advertised specs.
And of course I'm not going to test buffering; as explained earlier, it's an OS-level concern that AWS has no control over.
Random reads and writes in direct mode (this is what yabs implements)
Random reads only in direct mode
Random writes only in direct mode
Sequential reads only in direct mode
Sequential writes only in direct mode
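To make the list above concrete, here's a hedged sketch of the corresponding fio invocations. The file name, size, runtime, and 4k block size are my assumptions for illustration, not the exact parameters used above; the generator just prints the commands rather than running them:

```shell
# Print a fio command for each test mode listed above, all in direct
# (O_DIRECT, page-cache-bypassing) mode. Parameters are illustrative.
gen_fio() {  # $1 = fio --rw mode
  echo "fio --name=$1 --filename=/tmp/fio.test --size=1G --runtime=30" \
       "--direct=1 --rw=$1 --bs=4k --ioengine=libaio"
}

gen_fio randrw     # random reads+writes (roughly what yabs does)
gen_fio randread   # random reads only
gen_fio randwrite  # random writes only
gen_fio read       # sequential reads only
gen_fio write      # sequential writes only
```

The key switch is `--direct=1`: it bypasses the OS page cache, which is exactly what separates these numbers from the cached multi-GB/s readings discussed earlier.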