
Upcoming LET/LEB Review Series


Comments

  • jbiloh Administrator, Veteran
    edited April 2020

    Falzo said: will watch and wait, meanwhile let's see what @jsg has to offer in terms of reviews...maybe that makes me look at LEB and actually read something after long time again - if it's gonna be readable at all that is :-P

    April is going to surprise a lot of people - I've got tons of content already written and scheduled starting in a few days. I've actually got a huge amount of content (tutorials, interviews, polls, Op-ed, etc) written all the way through end of August now.

    Thanked by 1Falzo
  • Please review my VPS scam service. I will give you 50% of every dollar I steal from customers referred. SparkVPS.com thanks

  • angstrom Moderator

    @jbiloh said:

    Falzo said: will watch and wait, meanwhile let's see what @jsg has to offer in terms of reviews...

    maybe that makes me look at LEB and actually read something after long time again - if it's gonna be readable at all that is :-P

    April is going to surprise a lot of people - I've got tons of content already written and scheduled starting in a few days. I've actually got a huge amount of content (tutorials, interviews, polls, Op-ed, etc) written all the way through end of August now.

    Alas, the quoting mechanism is still broken (and has been for years)

    I guess that you didn't even notice when you quoted @Falzo?

  • jbiloh Administrator, Veteran

    @angstrom said:

    @jbiloh said:

    Falzo said: will watch and wait, meanwhile let's see what @jsg has to offer in terms of reviews...

    maybe that makes me look at LEB and actually read something after long time again - if it's gonna be readable at all that is :-P

    April is going to surprise a lot of people - I've got tons of content already written and scheduled starting in a few days. I've actually got a huge amount of content (tutorials, interviews, polls, Op-ed, etc) written all the way through end of August now.

    Alas, the quoting mechanism is still broken (and has been for years)

    I guess that you didn't even notice when you quoted @Falzo?

    Yeah, certain formatting buggers up the quote plug-in. Not sure how to fix it, though. I quoted him intentionally. :)

  • @Lee said:

    seriesn said: An idea/suggestion would be, once again, just a suggestion, purchase the VPS/Dedi anonymously, with "LEB expense" account, test it out for a month. At the end of the month, reach out to the provider privately (after the review has been posted) asking if they could refund the fee. Pretty sure between the mods/jsg/you, there's plenty of users that can signup without causing suspicion/getting flagged as "Oh the review cop is here".

    Indeed, the way it is being done is totally nonsensical: none of the reviews will result in any sort of accurate representation of the support, performance, etc. when they know who is using the service and for what purpose. PERIOD.

    So much can and will be manipulated to ensure only positive results.

    The results will be totally unreliable and unreproducible if jsg keeps using his own VPS benchmark app that outputs nonsensical numbers. He never QA'd it or verified it does what he says it does. I provided numerous results from a number of shitty VPSes, and they would show GB/s on just shit hardware, and the results were not consistent at all. One result was over 13 GB/s on a Cloud at Cost VPS(!), higher than RAM speeds, FFS. No one could be bothered to even run his VPS benchmark (maybe because it's a hassle to wget from Yandex instead of properly hosting it for scripting usage), instead sticking with YABS or previous scripts. You have a free Oracle VPS? The one with 50MB/s disk speed limits and a CPU load that reports 70 in top? Yeah, that reported 4.7GB/s random read speed using jsg's VPS benchmark app.

    Having mainly tested it on CentOS 7, and since jsg mainly uses FreeBSD, I even went and tried it on FreeBSD (that the numbers could be so unrealistic and he hadn't noticed was mind-blowing to me) and still got silly results. 1.67 GB/s sequential read on a Cloud at Cost VPS... I provided the syntax I used to run the test, just in case I was doing it wrong (spoiler alert: jsg never said shit about the commands used).

    I nearly fell out of my chair when I read that he thinks he'd know when a provider was playing tricks on him and that he's sure it's never happened (just another sign he isn't aware of basic test conditions; he tests a sample of ONE, making it impossible for him to be certain he isn't getting a special VPS). He hasn't a f'n clue, as evidenced by his single benchmark measurement (never stating what time of day he took it) and his "analysis" of the poor and always questionable results (e.g. major differences between sequential and random results). Useful review content, like the reasons he needed to contact support, is never detailed. He intentionally barely uses the VPS, so how well it performs at various real-world tasks, or even over a small period of time, is an unknown. He wouldn't be aware of issues if he doesn't use it! Running into VPS limits or anything like that would yield insight that benefits the user. I mean, a good measure of provider quality is how they interact with you (with or without issues occurring), and if none of that is included, a good portion of the review's value is thrown out.

    I appear to be in the minority in caring that these "reviews" not be useless puff pieces, but I strongly urge people who actually give a shit to try his VPS benchmark app on their VPS and see if the results are useful and reflect their experience, or if they're just nonsensical and useless, as I have found them to be.

    @poisson does a MUCH better job of providing a useful review for the everyday person. Consistently tested, documents what he does, useful and reproducible results, etc. My only issue, in stark contrast with jsg's reviews, was poisson reporting bandwidth results in MByte/s instead of the standard MBit/s that interface speeds are always defined in. Not the end of the world.
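
    For reference, the conversion is a plain factor of 8, so any reported figure is easy to double-check; a quick sketch (the 47.8 is a made-up example value, not a measurement):

        /* Interface speeds are quoted in Mbit/s; transfer tools often
           report MByte/s. The conversion is a plain factor of 8. */
        #include <stdio.h>

        int main(void) {
            double mbyte_per_s = 47.8;              /* example reported value */
            double mbit_per_s = mbyte_per_s * 8.0;  /* 1 byte = 8 bits */
            printf("%.1f MByte/s = %.1f Mbit/s\n", mbyte_per_s, mbit_per_s);
            return 0;
        }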

    On another note, this place really sticks to the "Low End" when < $10 needs to be refunded after several hours of (free) human labour and content creation for LET. The idea of requesting/demanding a refund for an anonymous review is completely silly, if not douchey. Imagine a food critic going back to the owner and asking for a refund after they just reviewed them in their paper/website/magazine, etc. The price of fair testing is usually MUCH higher than $10.

    Thanked by 2Lee angstrom
  • Lee Veteran

    TimboJones said: The results will be totally unreliable and unreproducible if jsg keeps using his own VPS benchmark app that outputs nonsensical numbers. He never QA'd it or verified it does what he says it does.

    Interesting. Not that I would have known, as I never run these benchmarks, but those are some crazy results; surely he would have known they were wrong...

  • BSDGUY DID NOTHING WRONG! :scream:

    Thanked by 3Lee angstrom skorous
  • jbiloh Administrator, Veteran

    @TimboJones certainly appreciate your input and suggestions here. We want to make the reviews as meaningful as possible and the critiques are understood. I'll keep the issues you mentioned regarding the testing irregularities in mind and we'll do a sanity check on the data.

    I really enjoy and appreciate seeing community members willing to step up and help make sure that content is as valuable as possible.

    Regarding the payment/budget of LEB/LET, I've been spending more than the website takes in recently on development, hosting, DDoS protection, content creation, the list goes on, so I am just trying to be mindful of expenses.

    Overall let's thank @jsg for being willing to dedicate time for the betterment of the community and help support him in his endeavor to provide the most detailed and reliable data possible.

    Thanks everyone :)

  • @Lee said:

    TimboJones said: The results will be totally unreliable and unreproducible if jsg keeps using his own VPS benchmark app that outputs nonsensical numbers. He never QA'd it or verified it does what he says it does.

    This should be entertaining. We all know how well he takes criticism.

    Thanked by 2Falzo Lee
  • Lee Veteran

    @jsg - any comment on what @TimboJones has said?

  • jsg Member, Resident Benchmarker

    @Lee

    Of course. And also of course you try to turn the heat up ...

    @TimboJones

    I will not waste my time showing once more that you do not even really know what you are talking about. I will also not discuss who actually wanted a library of my new benchmark version (and said so publicly); you'd be quite surprised though. Nor will I discuss the - very clear - ratio of positive feedback on my work vs. your ranting.

    Instead I will bow before your wisdom and experience and suggest that we change roles; that is, that not I but you do the benchmarks and reviews while I have a close look at your work and results.

    Of course, I do not know whether the community and the providers will respect you and offer VPSs for testing after they have seen your first one or two benchmarks and reviews, nor do I know whether @jbiloh would accept you, but well, I guess we'll find out.

    I for one get two big advantages out of that: (a) a lot of work that I need not do, and (b) getting rid of the suspicion that I buttered up "looking for opportunities" while in reality I accepted a request to do a lot of work for our community.

    Good luck, Mr. KnowsEverythingBetter

  • Lee Veteran
    edited April 2020

    jsg said: Of course. And also of course you try to turn the heat up ...

    Not at all, I will be the first to tell him he was wrong if that is the case. That said, his comments about the script appear concerning, and I know he is knowledgeable in these sorts of things. But if he is wrong then so be it.

  • "Rage quit" was used by another member on this site; let's all simmer down. Anyone who does reviews is a sitting duck for criticism. Too long, too short, no benchmarks, but the server was empty, can you run this special benchmark, please? And the best one: that's not a review. It's your first post, you're a shill. We are just seeing the expected replies. I still have no idea how one person, however thorough, nice, polite, and well documented, can do a review that will go unquestioned. Even the Great Bake Off and Consumer Reports have more than one reviewer.

  • @jsg said:
    @Lee

    Of course. And also of course you try to turn the heat up ...

    @TimboJones

    I will not waste my time to show once more that you do not even really know what you are talking about.

    You mean, not bother to look into how your app doesn't work as you intended? You don't have to know anything about anything to run your script and get nonsense results.

    I will also not discuss who actually wanted a library of my new benchmark version (and said so publicly); you'd be quite surprised though.

    What relevance does that have? What was the point you were trying to make? Wanting to see your library might be about wondering wtf snake oil you're peddling. That was the thought that crossed my mind, wondering how in the blue hell you can get 13 GB/s results on a clearly limited VPS. But then, when no one gave 2 shits about your benchmark app, I let it go.

    Nor will I discuss the - very clear - ratio of positive feedback on my work vs. your ranting.

    What is the ratio of 0:infinity? Holy shit, you just boasted about positive feedback when THERE IS NO ONE HERE BUT ME who gave your app a test in almost two years. And all my feedback was negative. I posted about all the strange results I had, and how many people stepped in and posted their results showing how well it reflected their VPS? Right, zero people. I wouldn't be surprised if I've run your benchmark app more than all the other people combined.

    Instead I will bow before your wisdom and experience and suggest that we change roles; that is, that not I but you do the benchmarks and reviews while I have a close look at your work and results.

    "If you don't like how I'm doing it, do it yourself" rather than accept critical feedback for doing a subpar job. This is what I call the "dirty dishes excuse". Just because you do a shitty job of washing the dishes doesn't mean (a) we should eat from your dirty dishes and act like dishes are normally dirty, or (b) that I want, or should have, to wash the dishes rather than have YOU wash the dishes properly in the first place.

    I don't have the time or energy to be the Mayor, Premier, Prime Minister, etc., but I'm still going to point out when they are doing a poor job of things. I don't necessarily want them to stop, just to do a better job at what they signed up for. But in your case, you really can feel free to stop if you're always going to be so ignorant of your faults instead of trying to do a better job. You can go on any review site and in the comments you'll find criticism of methodology. It's to be expected. Sometimes it educates the user, sometimes it leads to improved testing methods. Other times it's perfectly fine to say that something is not within time or budget, but don't misrepresent what testing was done and the conclusion of that testing.

    Thanked by 1Abd
  • jsg Member, Resident Benchmarker
    edited April 2020

    @TimboJones said:
    [lots of the usual "I know everything better and you are stupid and incapable" bla bla bla]

    "If you don't like how I'm doing it, do it yourself" rather than accept critical feedback for doing a subpar job. This is what I call the "dirty dishes excuse". Just because you do a shitty job of washing the dishes doesn't mean (a) we should eat from your dirty dishes and act like dishes are normally dirty, or (b) that I want, or should have, to wash the dishes rather than have YOU wash the dishes properly in the first place.

    Short version: No, you won't do the benchmarking and reviews.

    No surprise there ...

    [bark, bark] I'm still going to point out when they are doing a poor job of things. I don't necessarily want them to stop, just to do a better job at what they signed up for.

    As you just don't get it: it is utterly irrelevant whether you want me to stop or to continue, and so is your "feedback". Simple reason: your "criticism" is based on your lack of understanding, knowledge, and expertise (in the relevant field) - not on some fault of mine.

    With one exception: I got it, and I even understand that you (and maybe even some others who didn't speak up) don't like the way I do my disk tests. I understand - and have learned to take seriously - the need of some people to get what is "commonly" felt a disk test should look like.

    Had you ever put that in the form of constructive criticism, I would have let you know that I'm about to extend the disk tests to also offer something akin to the "dd" type test one sees everywhere. But you chose to bark at me and to amuse me with "criticism" that obviously lacked even basic understanding, let alone the capability to show me where my code was wrong. THAT is why I treated your "criticism" (and you) the way I did and why I asked you to take over the job.

    You should learn that screaming at people and getting ever more angry != convincing. And even convincing wouldn't have been needed, because I'm a friendly guy who often does small things for people. Simply asking me in a friendly way would have led to a friendly answer, very simple. But that's not TimboJones. TimboJones must(?) tear down, attack, bully, paternalize, etc. Just yesterday someone else told you something similar; iirc he said that you behave like an a__hole. You might want to think about that. And maybe, just maybe, you might also want to think about seeing that even tests you consider shitty are the outcome of an effort, an effort not even paid for btw. Trust me, "thank you, I appreciate your work, but ..." is a much, much better way to get what you want.

    Have a good week.

  • need fully managed benchmark

    up to 20

  • Falzo Member

    @uptime said:
    need fully managed benchmark

    up to 20

    PHP SELEKTOR okay?

    Thanked by 1uptime
  • jsg Member, Resident Benchmarker

    @uptime said:
    need fully managed benchmark

    up to 20

    Have debian and $7?

    @Falzo said:
    PHP SELEKTOR okay?

    Tsk, you are not a pro. A pro would know the correct term. It's PHP SELEKTER!

  • Falzo Member
    edited April 2020

    @jsg said:

    You should learn that screaming at people and getting ever more angry != convincing.

    man, you're full of advice for whoever criticises you. how about you don't get triggered every time someone writes something, and instead think about your way of dealing with things first.

    cut to the core and bring on your reviews. acceptance and understanding amongst the audience will tell if it's worth a read or not, simple as that. I'd recommend this as the more appropriate way to silence critics.

    Thanked by 2TimboJones skorous
  • jsg said: debian and $7

    spammer! off topic.

  • jsg Member, Resident Benchmarker

    @Falzo said:

    @jsg said:

    You should learn that screaming at people and getting ever more angry != convincing.

    Thanks for coming out of the shadows. Feel free to offer a factual counter-argument to what you quoted.

    Btw, it's still PHP SELEKTER.

    @uptime said:

    jsg said: debian and $7

    spamer! off topic.

    Hell, you got me.

    Thanked by 1uptime
  • @jsg said:
    Trust me, "thank you, I appreciate your work, but ..." is a much, much better way to get what you want.

    How would that go?

    Me: "thank you, I appreciate your work, but your benchmark app reports nonsensical numbers. You test performance for mere seconds and do not do any typical user activity or provide useful comparative results for the reader to take away and have some added value as a result. Interactions with support are glossed over routinely so the user doesn't get a sense of the provider interaction or attitude. Your idea of lightly testing the VPS and not putting it under any typical or higher load is like doing a bake-off and judging how good the chef is at running a restaurant by eating one bite of cake."

    jsg: your lack of understanding, knowledge, and expertise (in the relevant field) - not in some fault of mine.

    But I'm not thankful, I don't appreciate your work, and it's clear you need some bullshit fluffing up of your ego to recognize your issues. I don't expect you to instrument the thing up the wazoo and provide infinite metrics, but there really isn't anything of value from your testing and, worse, it's misleading and wrong because of what the app reports. Just have a look here and here to get an idea of how you can provide the user with some usable takeaway from your reviews.

  • jsg Member, Resident Benchmarker
    edited April 2020

    OK, I'll try it again. Maybe you are reachable by facts and logic.

    @TimboJones said:
    Me: "thank you, I appreciate your work,

    Yes. You see, either you are interested in benchmarks or not. If yes then you have reason to appreciate the fact that someone makes some efforts, even if they are not to your taste. If no, then why are you commenting on something you don't care about?

    but your benchmark app reports nonsensical numbers.

    No. You THINK it does, but it does not. Maybe you think the numbers are unreasonable because you have a certain expectation and/or because you don't know or understand what they actually report.
    Actually I have multiple times provided hints, e.g. by explaining why I wrote my own benchmark software. I do not want to report what someone used to "dd"-based numbers expects. My goal is a different one and I explained that in quite some detail. So your whole rage is off-track because it boils down to "you don't do it the way I'm used to!".

    You test performance for mere seconds

    Wrong. What I report in my reviews is based on a series of benchmark tests over some time (typ. 2 or 3 days).

    do not do any typical user activity

    Almost no benchmark does. Simple reason: what is a "typical user activity"? That's different for almost every user, and benchmarks are synthetic by their very nature.
    I do however provide more and better info than most VPS benchmarks, so that e.g. a DB-heavy user can get an impression useful for him, just as a streaming-heavy user can. In fact, that was one thing I wanted and one of the reasons I wrote that software.

    [not] provide useful comparative results for the reader

    Comparative to what? And btw., again, I wanted a different benchmark. That was not a secret and I even explained it in quite some detail.

    Interactions with support are glossed over routinely so the user doesn't get a sense of the provider interaction or attitude.

    Yes, that is a justified point of criticism. And unfortunately a point I can't do a whole lot about, because you see, if I always ask the same questions or ask for the same kind of help, the results quickly become irrelevant. If I do different support tests with each provider then I will (justifiably) be accused of providing results that are not comparable. Plus, I happen to be a quite experienced user with a strong technical/academic background. How am I supposed to emulate a clueless and inexperienced person?
    So, I largely stick to some quantifiable points in that department, things like whether support response is fast, average, or slow (which btw. is also a subjective issue: person A might find a 3 hr response time unbearably slow, while person B might find it pleasantly quick).

    Your idea of lightly testing the VPS and not putting it under any typical or higher load is like doing a bake-off and judging how good the chef is at running a restaurant by eating one bite of cake."

    (a) that is what pretty much every benchmark does, partly due to the nature of a VPS
    (b) in fact, my benchmark is testing more and harder than most others. Plus, again, my reviews are based on many runs and at different times of the day.

    But I'm not thankful, don't appreciate your work

    That would even be OK. Not really great (socially) but OK. The problems begin when you throw rants at me that, frankly, over and over again show that you don't know enough about computers, hosting, and benchmarking and basically throw dirt at me because my benchmark doesn't meet your personal expectations.

    Just have a look here and here to get an idea of how you can provide the user with some usable takeaway from your reviews.

    Funny that you say that, because that author respected and liked me and my expertise in benchmarking (he seems to not be here at LET anymore). I'll leave it at that because I like and respect that person and I do not want to pull someone else into your furor-mill. Btw. he (and another quite well-known benchmark author) agreed with me wrt. my view on benchmarking and why, for example, "dd"-based benchmarks are of little use. They both wanted my disk tests as a library for their own tests.

  • @jsg said:
    OK, I'll try it again. Maybe you are reachable by facts and logic.

    @TimboJones said:
    Me: "thank you, I appreciate your work,

    Yes. You see, either you are interested in benchmarks or not. If yes then you have reason to appreciate the fact that someone makes some efforts, even if they are not to your taste. If no, then why are you commenting on something you don't care about?

    I thought you were going to talk about facts and logic, not make illogical conclusions. I visit the threads because I am interested. So much so that I am extremely disappointed by the "reviews" and took the time to comment and provide feedback on my exact problems with them. I don't go attacking you for no reason, I always have a point I'm making. Appreciating efforts is just more of you needing your ego stroked, and not about providing useful info.

    but your benchmark app reports nonsensical numbers.

    No. You THINK it does, but it does not. Maybe you think the numbers are unreasonable because you have a certain expectation and/or because you don't know or understand what they actually report.

    You don't get it. It can report HIGHER than physically possible, not just higher than what I'm expecting. Higher than I'm expecting just makes me question the numbers; higher than physically possible makes me realize they are garbage.

    The issue is storage speeds, but it's like saying you got 3 Gbps on a 1 Gbps NIC. I'm going to call shenanigans. It doesn't represent the performance or capabilities whatsoever. I don't know how else to explain this for you to understand.
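
    The kind of sanity check I mean is trivial to write down; a toy sketch, where the ceilings are rough assumed physical limits (a 1 Gbps NIC carries at most ~125 MB/s of payload, a SATA SSD manages roughly 550-600 MB/s), not measured values:

        /* Flags any measured throughput above a known physical ceiling. */
        #include <stdio.h>

        static const char *verdict(double measured_mbs, double ceiling_mbs) {
            return measured_mbs <= ceiling_mbs ? "plausible"
                                               : "IMPOSSIBLE - suspect the tool";
        }

        int main(void) {
            printf("NIC  (3 Gbps on a 1 Gbps link): %s\n", verdict(375.0, 125.0));
            printf("disk (13 GB/s random read):     %s\n", verdict(13000.0, 600.0));
            return 0;
        }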

    Here is the output on an Oracle VM I had last year. Go and get yourself one (they are free) to test with. They are way, way, way oversubscribed and performance is shit. Here is what your benchmark reported last year (including the syntax the benchmark was run with; feel free to tell me I'm wrong and what it should have been), followed by a YABS benchmark, where 50MB/s is in the ballpark of tested performance and a much better indicator. Anyone here with a free Oracle VM will attest to the performance being 50MB/s or worse:

    2019-11-18 10:42:23 (2.10 MB/s) - ‘vpsbench.tar.gz’ saved [2235143/2235143]
    [root@pi-unicorns-com ~]# tar zxvf vpsbench.tar.gz
    vpsbench/
    vpsbench/vpsb.linux-x64
    vpsbench/manual.txt
    vpsbench/vpsb.obsd-x64
    vpsbench/vpsb.fbsd-386
    vpsbench/vpsb.linux-386
    vpsbench/my.targets
    vpsbench/vpsb.obsd-386
    vpsbench/vpsb.fbsd-x64
    vpsbench/README
    vpsbench/source.tar.gz
    vpsbench/license.txt
    [root@pi-unicorns-com ~]# cd vpsbench/
    [root@pi-unicorns-com vpsbench]# ./vpsb.linux-x64 my.targets
    Using "my.targets" as target list
    Machine: amd64, Arch.: x86_64, Model: AMD EPYC 7551 32-Core Processor
    OS, version: Linux 3.10.0, Mem.: 987 MB
    CPU - Cores: 2, Family/Model/Stepping: 23/1/2
    Cache: 64K/64K L1d/L1i, 512K L2, 16M L3
    Std. Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
              pse36 cflsh mmx fxsr sse sse2 htt sse3 pclmulqdq ssse3 fma cx16
              sse4_1 sse4_2 popcnt aes xsave osxsave avx f16c rdrnd hypervisor
    Ext. Flags: fsgsbase bmi1 avx2 smep bmi2 syscall nx mmxext fxsr_opt pdpe1gb
              rdtscp lm lahf_lm cmp_legacy cr8_legacy lzcnt sse4a misalignsse
              3dnowprefetch osvw topoext perfctr_core
    
    --- proc/mem/performance test single core ---
    ................................................................
    64 rounds~ 1.00 GB ->  137.28 MB/s
    --- proc/mem/performance test multi-core ---
    ................
    4 times 64 rounds ~ 4.00 GB ->  141.26 MB/s
    --- disk test ---
    Sequential writing .................................................................................................................................
    769.36 MB/s
    Random writing     .................................................................................................................................
    1.133 GB/s
    Sequential reading .................................................................................................................................
    206.67 MB/s
    Random reading     .................................................................................................................................
    4.772 GB/s
    --- network test - target       100KB  1MB  10MB   -> 64 MB ---
    http://speedtest.fra02.softlayer.com/downloads/test100.zip      DE,FRA: .......
            2.2 Mb/s   6.2 Mb/s   27.9 Mb/s    -> 27.7 Mb/s
    http://speedtest.par01.softlayer.com/downloads/test100.zip      FR,PAR: .......
            2.2 Mb/s   7.0 Mb/s   32.4 Mb/s    -> 28.2 Mb/s
    http://speedtest.ams01.softlayer.com/downloads/test500.zip      NL,AMS: .......
            2.3 Mb/s   7.2 Mb/s   33.4 Mb/s    -> 29.2 Mb/s
    [root@pi-unicorns-com vpsbench]# curl -s https://raw.githubusercontent.com/masonr/yet-another-bench-script/master/yabs.sh | bash
    # ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## #
    #              Yet-Another-Bench-Script              #
    #                     v2019-10-08                    #
    # https://github.com/masonr/yet-another-bench-script #
    # ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## #
    
    Mon Nov 18 10:44:41 GMT 2019
    
    Basic System Information:
    ---------------------------------
    Processor  : AMD EPYC 7551 32-Core Processor
    CPU cores  : 2 @ 1996.246 MHz
    AES-NI     : ✔ Enabled
    VM-x/AMD-V : ❌ Disabled
    RAM        : 987M
    Swap       : 8.0G
    Disk       : 39G
    
    Disk Speed Tests:
    ---------------------------------
           | Test 1      | Test 2      | Test 3      | Avg        
           |             |             |             |            
    Write  | 55.70  MB/s | 51.30  MB/s | 51.30  MB/s | 52.77  MB/s
    Read   | 51.27  MB/s | 51.06  MB/s | 51.15  MB/s | 51.16  MB/s
    
    iperf3 Network Speed Tests (IPv4):
    ---------------------------------
    Provider                  | Location (Link)           | Send Speed      | Recv Speed     
                              |                           |                 |                
    Bouygues Telecom          | Paris, FR (10G)           | 47.4 Mbits/sec  | 45.9 Mbits/sec 
    Online.net                | Paris, FR (10G)           | 47.8 Mbits/sec  | 42.2 Mbits/sec 
    Severius                  | The Netherlands (10G)     | 47.8 Mbits/sec  | 44.3 Mbits/sec 
    Worldstream               | The Netherlands (10G)     | 48.2 Mbits/sec  | 47.9 Mbits/sec 
    wilhelm.tel               | Hamburg, DE (10G)         | 47.7 Mbits/sec  | 0.00 bits/sec  
    Biznet                    | Bogor, Indonesia (1G)     | 0.00 bits/sec   | 0.00 bits/sec  
    Hostkey                   | Moscow, RU (1G)           | 46.8 Mbits/sec  | busy           
    Velocity Online           | Tallahassee, FL, US (10G) | 48.0 Mbits/sec  | 49.0 Mbits/sec 
    Airstream Communications  | Eau Claire, WI, US (10G)  | 49.2 Mbits/sec  | 49.6 Mbits/sec 
    Hurricane Electric        | Fremont, CA, US (10G)     | 47.3 Mbits/sec  | busy           
    
    Geekbench 4 Benchmark Test:
    ---------------------------------
    Test            | Value                         
                    |                               
    Single Core     | 1472                          
    Multi Core      | 1582                          
    Full Test       | https://browser.geekbench.com/v4/cpu/14941852
    

    Look at the sequential and random reads. That is a substantial difference, and they should be reversed: sequential should be higher than random, and if it isn't, it's suspect. Background processes can affect results near maximum performance, but the test is useless because it doesn't run for any length of time to show anything useful.
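
    My guess, and it is only a guess since I can't see inside vpsb, is the OS page cache: a file written moments earlier gets read back from RAM, not from disk. A minimal Linux sketch of the effect (a generic illustration, not vpsb's code):

        /* Writes a file, then times a cache-hot read against a cache-cold
           read (after asking the kernel to drop the cached pages). */
        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>
        #include <unistd.h>

        static double now(void) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec + ts.tv_nsec / 1e9;
        }

        int main(void) {
            const size_t SZ = 64 * 1024 * 1024;
            char *buf = malloc(SZ);
            memset(buf, 0xA5, SZ);

            int fd = open("bench.tmp", O_CREAT | O_RDWR | O_TRUNC, 0600);
            ssize_t n = write(fd, buf, SZ);
            fsync(fd);              /* data reaches disk but STAYS in the page cache */

            lseek(fd, 0, SEEK_SET);
            double t0 = now();
            n = read(fd, buf, SZ);  /* cache-hot: served from RAM */
            printf("hot  read: %.0f MB/s\n", n / (now() - t0) / 1e6);

            posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);  /* drop cached pages */
            lseek(fd, 0, SEEK_SET);
            t0 = now();
            n = read(fd, buf, SZ);  /* cache-cold: closer to the device */
            printf("cold read: %.0f MB/s\n", n / (now() - t0) / 1e6);

            close(fd);
            unlink("bench.tmp");
            free(buf);
            return 0;
        }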

  • TimboJones Member
    edited April 2020

    Actually I have multiple times provided hints, e.g. by explaining why I wrote my own benchmark software. I do not want to report what someone used to "dd"-based numbers expects. My goal is a different one and I explained that in quite some detail. So your whole rage is off-track because it boils down to "you don't do it the way I'm used to!".

    Having a plan and an idea is different from executing the plan and idea. I've never once spoken about your idea of how the application runs in terms of all those cache and buffer bullshit rants you go on; my concern is that it doesn't work the way you intended it to work. Sometimes the app runs for like 1-2 seconds with huge numbers; other times the test runs at what looks like a steady pace, then pauses for seconds, then continues. I have no insight into wtf it's doing when it finishes fast or when it pauses. The runs are not consistent.

    You test performance for mere seconds

    Wrong. What I report in my reviews is based on a series of benchmark tests over some time (typ. 2 or 3 days).

    Here is a previous "review" where, to me, the disk speeds are a huge red flag and highly suspect of a major bottleneck. I replied asking you to clarify your assessment that 10MB/s was good for the SSD in a dedi when the VPS had 14MB/s for the same metric and you reported "Meh, not so nice". Your reply was useless in that you didn't respond to my post in terms of your analysis, and sidestepped by making it sound like I needed to read harder. This is a repetitive theme with you.

    do not do any typical user activity

    Almost no benchmark does. Simple reason: what is a "typical user activity"? That's different for almost every user, and benchmarks are synthetic by their very nature.
    I do however provide more and better info than most VPS benchmarks, so that e.g. a DB-heavy user can get an impression useful for him, just as a streaming-heavy user can. In fact, that was one thing I wanted and one of the reasons I wrote that software.

    That's not true. Professional testing sites measure the time to run various synthetic and real-world activities all the time. You should check a few out. Compiling software, rendering stuff, installing LAMP stacks, webserver loading, latency testing to various services, etc. I've told you several times in the past that centminmod is the golden example to follow.
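
    Timing an actual task end to end is a one-liner anywhere; a toy sketch (the tar command is an arbitrary stand-in for a workload):

        /* Times one "real world" task; swap in any workload
           (a compile, a LAMP install, a webserver load test, ...). */
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>

        int main(void) {
            struct timespec t0, t1;
            clock_gettime(CLOCK_MONOTONIC, &t0);
            int rc = system("tar -czf /tmp/src.tgz /usr/include 2>/dev/null");
            clock_gettime(CLOCK_MONOTONIC, &t1);
            double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
            printf("exit %d, %.2f s\n", rc, secs);
            return 0;
        }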

    [not] provide useful comparative results for the reader

    Comparative to what? And btw., again, I wanted a different benchmark. That was not a secret and I even explained it in quite some detail.

    Comparative to any of your previous reviews, or to other benchmarks by you or others. Compare it to RPis; it matters less what the reference is, as long as it's understandable to most. But you don't use a popular benchmark, no one wants to use yours, and your analysis is pretty much unhelpful. Different is not supposed to mean pointless and useless; it's supposed to offer a specific improvement over the typical, and it sounds like you think yours does, but it doesn't.

    Interactions with support are glossed over routinely so the user doesn't get a sense of the provider interaction or attitude.

    Yes, that is a justified point of criticism. And unfortunately a point I can't do a whole lot about, because you see, if I always ask the same questions or ask for the same kind of help, the results quickly become irrelevant. If I do different support tests with each provider then I will (justifiably) be accused of providing results that are not comparable. Plus, I happen to be a quite experienced user with a strong technical/academic background. How am I supposed to emulate a clueless and inexperienced person?

    I swear you troll me with your replies. If a "quite experienced user with a strong technical/academic background" STILL needs to contact Support, IT'S EVEN MORE IMPORTANT THAT YOU WRITE ABOUT IT. I'm just flabbergasted at your attitude.

    So, I largely stick to some quantifiable points in that department, things like whether support response is fast, average, or slow (which btw. is also a subjective issue: person A might find a 3 hr response time unbearably slow, while person B might find it pleasantly quick).

    Response times are good, but without knowing the reason and the level of effort, they're not as helpful as you're thinking. You don't need to state your opinion of how fast or slow the response was, that is up to the reader to decide, but knowing what the issue was about and what the response was makes all the difference.

    Your idea of lightly testing the VPS and not putting it under any typical or higher load is like doing a bake-off and judging how good the chef is at running a restaurant by eating one bite of cake."

    (a) that is what pretty much every benchmark does, partly due to the nature of a VPS
    (b) in fact, my benchmark is testing more and harder than most others. Plus, again, my reviews are based on many runs and at different times of the day.

    Yes, but a single user posting a popular benchmark result is not the same as a "review". I thought you understood that, and that that was the point of making your own thread and calling it a review. How many runs are averaged? What times of the day?

    But I'm not thankful, don't appreciate your work

    That would even be OK. Not really great (socially) but OK. The problems begin when you throw rants at me that, frankly, over and over again show that you don't know enough about computers, hosting, and benchmarking and basically throw dirt at me because my benchmark doesn't meet your personal expectations.

    To clarify, I'm thankful when people make an honest effort and are interested in improving and getting better. Not sticking their head in the sand. You don't actually know what experience I have, what industry I'm in, or what I do for a living (cough I have more paid experience than you in testing cough). You've stated you're a "security developer", which is not professional testing/QA. You've demonstrated you don't verify your results, so you'd be very bad at testing. Your "reviews" babble about shit that doesn't matter but don't include test methodology, test procedure, or test conditions, so we know you don't provide professional test results to colleagues or superiors. On top of everything, you test on BSD, which will perform differently than the various Linux distributions with older or newer kernels, so the typical user will see even less relevance in your results if they were to purchase the same service.

    Just have a look here and here to get an idea of how you can provide the user with some usable takeaway from your reviews.

    Funny that you say that, because that author respected and liked me and my expertise in benchmarking (he seems to not be here at LET anymore). I'll leave it at that because I like and respect that person and I do not want to pull someone else into your furor-mill. Btw. he (and another quite well-known benchmark author) agreed with me wrt. my view on benchmarking and why, for example, "dd"-based benchmarks are of little use. They both wanted my disk tests as a library for their own tests.

    Respected? As in, no longer respects you? That wouldn't surprise me. But let's review: you were doing "reviews" before him, and yet he found the need for a better review standard. And how many of those interested actually got your "library", implemented it, and released it, so that we're all enjoying how accurate and useful benchmarks are now compared to before?

  • Falzo Member

    TimboJones said:

    Sequential writing
    769.36 MB/s
    ...

    now this got me interested and I had a look at the source code in diskb.pas
    @jsg please bear with me as my skills in Pascal and even C are only basic, but maybe you can answer a few genuine questions or jump in for everything that I am missing...

    I looked at the sequential writing only so far, and I get that you run 16k iterations (SLICECOUNT) of writing a 16k block (SLICESIZE*32) to a file, measuring the time for each run.
    for writing to that file you first fill a buffer via some C fanciness that creates random numbers and puts them into your tbuf array, so far so good.

    • is fileWrite able to handle that array as a buffer correctly, and/or could possible internal converting/unnesting lead to overhead?
    • how do you make sure the file actually gets written to the disk directly and not cached, or even only constructed in memory, before actually being written?

    also about the getHRtime() that you created for measuring the time: afaics you're using a small C function to pull timestamps in µs. if we now look at a result like 1GB/s for writing a total of 256MB and calculate backwards, that leads to something around 15µs for one iteration of your loop. so a deviation of +/- 1µs (rounding in the calculations) is already about 6% tolerance... (see the sketch after my questions)

    • wouldn't it be better to stick to nanoseconds in the first place?
    • did you check on the time single iterations take, at some point in creating this, to see if the values could be too volatile?
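
    to make the arithmetic concrete, here is a sketch of the pattern as I read it; the constants are my reading of the source, and the actual write is stubbed out:

        /* Per-slice timing with 1 µs timestamps: each of the 16384 samples is
           quantized to +/- 1 µs, which at ~15 µs per 16 KB slice is a 6-7%
           error per sample; summing over all slices dampens, but does not
           remove, systematic rounding. */
        #include <stdint.h>
        #include <stdio.h>
        #include <time.h>

        #define SLICECOUNT 16384
        #define SLICEBYTES (16 * 1024)     /* 16 KB per slice, 256 MB total */

        static uint64_t now_us(void) {     /* µs timestamps, like getHRtime() */
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
        }

        int main(void) {
            uint64_t total_us = 0;
            for (int i = 0; i < SLICECOUNT; i++) {
                uint64_t t0 = now_us();
                /* ... fileWrite() of one SLICEBYTES slice would go here ... */
                total_us += now_us() - t0; /* each sample rounded to whole µs */
            }
            if (total_us == 0)             /* guard: the stubbed write takes ~0 time */
                total_us = 1;
            double mbytes = (double)SLICECOUNT * SLICEBYTES / 1e6;
            printf("%.0f MB in %llu us -> %.2f MB/s\n", mbytes,
                   (unsigned long long)total_us, mbytes / (total_us / 1e6));
            return 0;
        }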
    Thanked by 1TimboJones
  • jsg Member, Resident Benchmarker

    @TimboJones

    OK, I got it, your opinion is set in stone and you are not even willing to question your basis and point of view. So, I'll keep this very short:

    No, this is not about getting my ego stroked. My ego gets stroked when someone who created a programming language asks me for advice on how to implement static verification. Or when I find a mathematically provable bug in an AEAD (crypto) finalist reference implementation. In other words: I get stroked in my field of expertise - not by someone here appreciating my efforts.
    When I speak about appreciating some effort here then it's not about me but about the community. Without appreciation for the efforts made by contributors a community will get very shallow quickly.

    As for the values: You don't get it, do you? The values are what they are. If writing x amount of data takes y time then that's the result, period.
    Of course it's interesting to look at unexpected results and to research what causes them. Possible candidates typically are various caching layers, the way an OS does disk IO, etc. But to discuss those possible causes one must have a level of specific knowledge that you simply don't have, obviously. And no, that's not an insult, that's simply normal. I myself for example am quite clueless wrt Windows and a lot of other areas - but then I wouldn't be stupid enough to try to "teach the teacher" in those areas.

  • jsg Member, Resident Benchmarker
    edited April 2020

    @Falzo said:
    now this got me interested and I had a look at the source code in diskb.pas
    @jsg please bare with me as my skills in pascal and even C are only basic, but maybe you can answer a few genuine questions or jump in for everything that I am missing...

    You got it about right. But the source you are looking at is not the (meanwhile advanced) version I use for testing. One example of a change is that I overhauled diskb.pas quite a lot, and both SliceCount and SliceSize aren't const anymore but variables that can be set via two command-line parameters. One major reason for that change was that I sometimes needed to do far more extensive (and shell-scriptable) disk tests. Funnily enough, the trigger in that case was a provider (at LET) who liked my program but needed quite extensive and more elaborate testing to make decisions on how to build (for his use case) better nodes.
    So, short version: For the reviews a considerably enhanced version of the software was, is, and will be used.

    • is fileWrite able to handle that array as a buffer correctly, and/or could possible internal converting/unnesting lead to overhead?

    Well, I didn't look at the asm that my source creates, but I have good reason to assume that FreePascal creates quite good code. Moreover, the real difference is the size of the array elements. Every disk write/read works with arrays (on the OS level), but usually the elements are bytes, while in my code they are 8 bytes (64 bits, the x86-64 word size); plus, it doesn't actually write 64-bit words but bytes. In C (which you seem to know better) it's something like 'foo((uint8_t *) testArray);' where testArray was declared as 'uint64_t testArray[SOME_SIZE];'. That is, all that's needed is to tell the compiler to consider 'testArray' to be a byte array (of 8 times its uint64 size). So, the answer is NO, it doesn't lead to overhead.
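
    Spelled out as a compilable fragment (foo and SOME_SIZE are just the placeholders from above, not my actual code):

        /* The same memory is handed to the byte-oriented write() by
           reinterpreting the pointer; no copy or conversion happens,
           hence no overhead. */
        #include <stdint.h>
        #include <unistd.h>

        #define SOME_SIZE 4096

        ssize_t write_test_array(int fd) {
            static uint64_t testArray[SOME_SIZE];   /* filled with random data elsewhere */
            const uint8_t *bytes = (const uint8_t *)testArray;  /* one cast, zero data movement */
            return write(fd, bytes, sizeof testArray);          /* SOME_SIZE * 8 bytes */
        }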

    • how do you make sure the file actually gets written to the disk directly and not cached, or even only constructed in memory, before actually being written?

    (a) in the end: I can't, because (at the level of a normal user program) I can't look into what the OS really does.
    (b) In the next test I read those values back from disk.
    (c) I give the OS a file that kind of blinks saying "do not try to compress me, it's not worth the effort" (by handing over purely random data).
    (d) If I get mistrustful I can follow up closely via OS call tracing. What I always found so far is that the OS does do quite some tricks but in the end does write out the data.

    Also keep in mind that in part that's not a problem but a feature. If a given VPS performs very well within a reasonable spec frame, then who cares whether that's due to a fast NVMe or due to extensive and smart caching? Either way the user gets good performance within the spec frame.
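
    For completeness, the standard POSIX levers one could pull here look like this; a generic sketch, not what my software does:

        /* fsync() blocks until buffered data reaches the device; O_DIRECT
           (Linux) bypasses the page cache, at the price of alignment
           requirements on buffers and sizes. */
        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <unistd.h>

        int write_durably(int fd, const void *buf, size_t len) {
            if (write(fd, buf, len) != (ssize_t)len)
                return -1;        /* write() may return straight from the cache */
            return fsync(fd);     /* fsync() is what waits for the media */
        }

        int open_uncached(const char *path) {
            /* buffers for O_DIRECT I/O must be aligned (e.g. via
               posix_memalign), typically to 512 or 4096 bytes */
            return open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
        }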

    also about the getHRtime() that you created for measuring the time: afaics you're using a small C function to pull timestamps in µs. if we now look at a result like 1GB/s for writing a total of 256MB and calculate backwards, that leads to something around 15µs for one iteration of your loop. so a deviation of +/- 1µs (rounding in the calculations) is already about 6% tolerance...

    • wouldn't it be better to stick to nanoseconds in the first place?
    • did you check on the time single iterations take, at some point in creating this to see if the values could be too volatile?

    First, yes, you got that right too. As for your questions, yes and no. Yes, it would be nice to work on a nanosecond basis. But no, mainly due to three reasons: (a) I wanted my benchmark program to also work on old 586 architectures; (b) look at the context! We are talking about VPSs on a node with quite a few VPSs. In that context granularity is severely limited anyway. (c) I take µs-based timing on each slice, so when finally computing the results any tiny bumps in either direction get evened out anyway.
    Btw: Even if I did my timing with millisecond granularity I would still be way more precise than virtually all the benchmarking scripts, because those are usually based on very poor timing sources (like time calls, or OS timing (usr, real, sys)), etc.

    Thanked by 1Falzo
  • Falzo Member

    @jsg said:

    thanks for taking the time to answer and more or less confirming my thoughts ;-)

    I still think the high count of measurements of small timesteps is prone to error, as a small missed thing in a single iteration could be amplified a lot; therefore results can deviate much more, I assume.
    measuring one overall time for a single run, as other tools do, might not suffer from that even if the timing source is worse ;-)

    don't get me wrong, I don't disagree that doing iterations to even out bumps makes sense... it just seems an awfully small thing multiplied by a lot.

    to my naive eyes, in the end you are measuring just the time it takes to complete the fileWrite() call - but I really mistrust that this writes anything directly to the disk during that call, rather than just passing your buffer data up to the given size to the handle and returning without waiting for an actual flush to happen.

    doesn't freepascal have something like (internal) file buffers? and shouldn't these be flushed every time to make sure the data gets written to the disk during the period that gets measured?

    from plainly looking at the dots during the benchmark run (on a knowingly slow system), one can tell that a few iterations run through quickly with about the same time between them, then it gets stuck for a moment, a few more iterations pass by, it stops again for a moment, and so on.

    this behaviour makes me guess it eventually is forced to flush a buffer to disk and then wait for it - however, it appears to be rather random when this occurs, and it especially does not necessarily end exactly with a flush like that, so I see quite different results with each run.

    as written before, I was just curious and wanted to try and understand the visible difference in numbers between tools. your approach is interesting, however I'd rather stick with fio then, or measure time on real-world tasks like copying dirs/files ;-)

    Thanked by 1TimboJones
  • jsg Member, Resident Benchmarker

    @Falzo said:
    thanks for taking the time to answer and more or less confirming my thoughts ;-)

    No problem, my pleasure. I actually *value* concrete and well-founded questions and criticisms.

    As for your two main points:

    First, again, the source you are looking at has been changed quite considerably. But I get your worries, and in fact I share some of them myself. The problem is that there is no "right" way. If the OS caches a write request, plus in many cases the hardware does too (think e.g. a buffered RAID controller), plus even the disk itself often has a cache, then you get an almost instantaneous return and OK from the OS.
    If you go another route and bundle a lot of writes, or even all of them, then you necessarily include a whole lot of other operations in your timing too, which introduces "bent" results as well.

    I have chosen the route I took after *a lot* of testing and experimenting with different OSs and setups (e.g. buffered controllers, diverse disk types, etc.) because it seemed to be the best one, and frankly, I'm still quite content with that route, although I'm working on some more options and enhancing some details.
    Also keep in mind what my goal was and is: it is *not* to be the ultimate disk tester but to be a much better general VPS and dedi benchmarking tool.

    But you are not wrong, and one of the things I'm working on for v.2 is indeed more disk test types, e.g. with OS caching enabled or disabled, etc. (although btw. the OSs are often not entirely honest either, and for example may or may not honor some flag you set).

    As for the other point, the "eventually flushing out to disk", you are right, but that's largely the OS's decision. One may also approach that positively and say "in the end you get what has been configured and what the OS does as set up", which boils down to telling the user something about the node.