weird MariaDB slowness and high IOwait on htop output
Have you ever tried the EX101?
It was on my radar, but I'm not very convinced about that efficient/performance core design. I also heard somewhere on Reddit that the 13900 isn't stable, something to do with power delivery or so...
All I can say is that the EX44 works wonders for me; it runs databases, among other things.
That one also has consumer grade disks. You're lucky that you don't have too many write transactions.
Unfortunately, Hetzner decided not to have anything smaller than 1.92TB enterprise NVMe in their current lineup, and then also to put it only in more expensive server models. That's why Server Auction is still a good place to pick something performant, if you can tolerate older hardware.
Maybe I should post in the Requests section first to see if I can find someone who resells Hetzner. The AX102 is overkill for me; I probably only need about half of those CPU cores.
I mean, it writes what I'd call "a lot", mainly to two different databases, though clearly not enough to warrant an enterprise disk. It definitely reads much more than it writes, and A LOT is cached in RAM.
I'm still surprised though. Had no idea the difference could be that big. Are reading speeds not as different as writing?
Yes, reading is not very different between consumer and datacenter drives; in fact, newer consumer devices probably read faster than older enterprise ones, thanks to various improvements in the meantime (Gen3, 4, 5...).
It's not even writing per se, if you compare sequential writes of big files. Probably once again not much difference if you copy, say, a DVD image to the NVMe. Again, newer and bigger drives would be faster, but there's not much difference between consumer and datacenter.
It's only when you make a lot of small write transactions where you ask the disk to report back only once the data is safely on the disk - that is what fsync() does. Then the difference is quite big, an order or even two orders of magnitude. But, as I already said, write-intensive databases do this all the time, 24/7. Read-intensive ones are much more common, though; that's how you survive on the EX44.
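If you want to see this pattern on your own box, here's a minimal Python sketch of it (file name and iteration count are arbitrary; it just imitates what a commit-heavy database does on every transaction, and the fio command posted further down in the thread measures the same thing more rigorously):

import os, time

fd = os.open("__fsync_test__", os.O_CREAT | os.O_WRONLY, 0o600)
payload = b"x" * 4096                 # one 4 KiB "transaction"
t0 = time.time()
for _ in range(1000):
    os.write(fd, payload)
    os.fsync(fd)                      # don't return until the data is on stable storage
elapsed = time.time() - t0
os.close(fd)
os.unlink("__fsync_test__")
print(f"{1000 / elapsed:.0f} fsync'd writes per second")

On a typical consumer NVMe that loop reports a few hundred per second; drives with power loss protection report thousands, which lines up with the fio numbers quoted later in this thread.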
Very interesting. I thought that the high end gaming/productivity drives were actually faster than enterprise (at least in bursts), had no idea that that’s a read thing and that enterprise is as you said orders of magnitude faster when it comes to lots of small writes.
Good to know in case writes ever become a problem on my EX44 in the future.
For the sake of fairness: a (tiny super) capacitor and some DRAM cost pocket money - but add hundreds of $$ to the profit.
Just wait until Chinese companies don't just manufacture enterprise SSDs (for western brands) but also sell them under their own brands. At first they'll be laughed at, but after a while and with a decent track record they'll sell, sell, sell, and the price difference will shrink quickly.
Btw. the supercap is only needed because the DRAM cache is volatile, plus, of bloody course, for marketing (sakkurity always sells well and with very decent margins).
Frankly, I myself would rather go for MLC without DRAM cache or, if available for a realistic price, even for SLC instead of "enterprise" (mostly) marketing BS.
Besides, if speeeeed is so super important, simply buy more RAM. That solves many DB performance problems and works way better than a small-ish cache on an SSD.
Is DRAM the actual difference though? I'm not at my computer so I can't check which consumer NVMe model my EX44 has, I believe it's some Samsung NVMe, but there are plenty of consumer-grade drives with DRAM, right?
Just for example: https://www.gigabyte.com/SSD/AORUS-Gen4-SSD-1TB#kf
The purpose of the RAM on SSDs is to store the FTL and also to act as a write cache, so the drive avoids writing less than one cell at a time. That cache has to be flushed on fsync unless the drive has physical power loss protection.
You can see where these comically low IOPS figures come from: each 4 KB sync write gets amplified to a full cell (8 MB, for example), so 500 IOPS * 8 MB = 4000 MB/s of internal write traffic for barely 2 MB/s of useful data - that's the ridiculous amplification.
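To make that arithmetic explicit (the 8 MB block size and 500 IOPS are only the example figures from the paragraph above, not the specs of any particular drive):

fsync_iops = 500                          # example: consumer drive doing 4k fsync'd writes
block_mb = 8                              # assumed NAND region rewritten per small sync write
useful_mb_s = fsync_iops * 4 / 1024       # ~2 MB/s of actual 4 KiB payload
internal_mb_s = fsync_iops * block_mb     # ~4000 MB/s of internal NAND traffic
print(useful_mb_s, internal_mb_s)

So the drive is internally saturated while delivering only a couple of MB/s of useful synced writes.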
SSDs truly without RAM (e.g. the BX500) are almost e-waste; you don't even need a database workload to hit this issue. They don't have a reputation as worse than HDDs for nothing. HMB NVMes are fine.
No amount of system RAM or old SLC/MLC NAND will avoid the fsync bottleneck. All you can do is turn off the safety at the application and/or filesystem level and pray you never lose power.
Yes, albeit likely slower.
Uhm, "old" is the wrong term; they are only old because ever newer - and denser - flash became available. But there's an ugly 'but': denser means fewer cycles (shorter life) and, oopsie, lower speed. A good deal for sales pushers, because more capacity for less cost. A really shitty deal for customers though, because what they get is crappier albeit cheaper (per GB of capacity). So "much better and faster" would be the correct term.
So it's not by coincidence that one sees an SLC (and nowadays probably even MLC) cache with TLC and especially QLC drives (the latter in particular to be avoided!). Similar with DRAM cache: DRAM is way cheaper than SRAM, simple as that, but guess what (CPU) cache is. Yep, it's basically SRAM.
Not that long ago I had two, oh so great and fast and reliable, Samsung SSDs suddenly die on me: first one, the next day the other. Those were TLC, and the really shitty one was QLC.
No problem, I'll let you have your "modern" SSD and you'll let me be happy with my "old" SLCs and a few MLCs.
You're completely right! I just checked my cheap auction server with Datacenter SSD and it gives
iops : min= 8794, max= 9980, avg=9285.21, stdev=252.08, samples=28
@maverick I'm not 100 % sure, but I do think my EX44 comes with 2x PM9A1 MZVL2512HCJQ
Can't find any specs from Samsung but found this: https://www.techpowerup.com/ssd-specs/samsung-pm9a1-512-gb.d787
DRAM Cache
Type: LPDDR4-1866
Name: SAMSUNG
Capacity: 512 MB (1x 512 MB)
Organization: 4Gx32
Is the enterprise disk's DRAM much faster or what's the actual difference that could lead to over 10x the import time on the consumer grade one?
Although I might have gotten it wrong, or, his AX52 came with a worse drive.
DRAM is not enough; it must also be protected in case of power loss. That being a cheap consumer drive, no, you don't have that.
Check this:
fio --name=fsync --rw=write --bs=4k --ioengine=posixaio --iodepth=1 --fsync=1 --size=32M --filename=__test__ --unlink=1
2 x SAMSUNG MZQL21T9HCJR-00A07 (RAID-1) as found in an AX102: 32604 KiB/s write bandwidth
2 x SAMSUNG MZVL2512HCJQ-00B00 (RAID-1) as found in an AX42: 658 KiB/s write bandwidth
The latter one should be exactly your model, right? Can you see the difference?
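To put those bandwidth figures back into commits per second, just divide them by the 4 KiB block size the fio command uses (nothing new measured here, only the numbers above):

for name, kib_s in [("MZQL2 datacenter", 32604), ("MZVL2 consumer", 658)]:
    print(name, round(kib_s / 4), "fsync'd 4k writes per second")
# roughly 8150/s vs. 165/s, i.e. about a 50x gap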
Indeed, it's not the speed of the RAM cache that matters, but whether it can be used at all. You either have to opt out of power loss safety, or have physical power loss protection.
@maverick @darkimmortal Ah, yes, I think I get it, thanks
@darkimmortal and @maverick
Thanks for the technical background info. It's super interesting to know the reasons behind the performance differences, especially with database workloads.
What terms, other than the label "datacenter," should someone look for in SSD specs to find out whether this feature is available? Is it possible some of the more premium, "prosumer" products offer it as well?
It needs to have a fat bank of capacitors in it (you won't miss them if you can see the board; they look very different from the controller chip and NAND chips). Some old prosumer drives took that approach before firmware got smarter and started using journalling to protect the FTL, since the alternative without journalling is a chance of a total wipe on power loss.
As for marketing, it's generally called out as "power loss protection". Crucial, for example, instead uses "power loss immunity" in consumer NVMes to refer to the journalling approach. But less scrupulous vendors will use "power loss protection" to refer to either one.
If it's labeled datacenter or enterprise, you should be fine. Vendors also like to talk about end-to-end data protection, which includes power loss protection plus data path protection (peeking at a Micron product brief). The idea is that data won't be lost under any circumstances - all good stuff.
One final thing is drive endurance, i.e. how much you can write to it before you finally wear out the cells. Consumer devices are usually rated at 0.3 DWPD (drive writes per day), meaning you can write up to 30% of the drive's capacity every day and the drive should still survive its warranty period, which is usually 3 years for consumer devices.
Datacenter/enterprise devices always come with a 5-year warranty. Good devices start at 1 DWPD (called read intensive), go through 3 DWPD (mixed use), and reach 10 DWPD or more when you expect a really heavy write workload (called write intensive).
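A handy way to compare those ratings is the total amount you're allowed to write over the warranty period. A rough sketch (the capacities and ratings below are just example values; always check the actual datasheet):

def tbw(capacity_tb, dwpd, warranty_years):
    # total TB you may write before the endurance rating is exhausted
    return capacity_tb * dwpd * 365 * warranty_years

print(tbw(0.5, 0.3, 3))    # ~164 TB for a typical 512 GB consumer drive
print(tbw(1.92, 1.0, 5))   # ~3504 TB for a 1.92 TB read-intensive datacenter drive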
Now, for the sake of fairness: it's not as if drives without "power loss protection" will die, let alone suffer a "total wipe on power loss", especially in a DC!
Re "will die" - no, that's pretty much BS and "selling more expensive stuff via fear mongering". Fact is that a proper DC has multiple layers of equipment to avoid any infrastructure (e.g. routers) or hosted server ever losing power.
Fact also is that servers don't just "drop dead" within microseconds on power loss. They have pretty beefy capacitors in the PSU(s), with way, way larger capacity than the tiny thingies on an SSD.
FACT also is that there even are official specs for how long a PSU must hold up on partial or total power loss, and that spec (in the ballpark of 10 ms iirc) is like "half an eternity" for a computer.
Finally, FACT also is that I've yet to (credibly) hear about a "total wipe (of an SSD) on power loss". Makes no sense at all, because flash memory simply doesn't work like that. The maximum thinkable loss would be the outstanding writes in the DRAM cache. That said, you do not want to experience that!
My guess is that what one really gets in expensive "enterprise" SSDs is faster flash and a lower 'x' in "xLC", lower also meaning faster and, often more important/desired, a longer lifetime.
So the "trick" is to go for low-x "xLC" - if available at a decent price - and for an SLC cache (vs. a DRAM cache) in non-DC settings (like home offices); that's the cheapest route to speed/performance, lifetime, and safety (of data). Unfortunately there is a big fat 'but': low-x SSDs are increasingly hard to find and/or more expensive, sorrie.
Last point: "cost". A private user and even many prosumers look at that in terms of "price, now", while businesses tend to think in terms of "cost per unit of time", e.g. over 4 years. So enterprise disks look expensive to a private user but actually tend to be cheaper for businesses. Keep that in mind when comparing prices.
Side note: a large and professional provider need not care about "DRAM cache" vs SLC because they know their DC very well.
Doesn't matter; the SSD is not signalled to start flushing data. You can actually get add-on capacitors to keep SSDs powered, but again that's pretty useless as there is no signal telling the drive to start flushing.
If you lose or corrupt the FTL data, then all data on the drive is hosed. Without journalling in the firmware or capacitor-based power loss protection, there's no guarantee the FTL remains intact on power loss. You would have to look back a very long time to find drives old and low-end enough for this to be a concern.
Only in the 'write intensive' drives
The main differences are fewer background operations / more consistent latency (which translates to a lower rating for retaining data while powered off, as that is not seen as needed in 24/7 workloads), significantly higher idle power consumption, and an assload of capacitors.
Uhm, that's not how it works. That signal is an OS thing which tells any caches to flush out to the device. And the difference between synchronous and buffered writing boils down to an OS write call dumping its payload into some cache vs. waiting for a "write completed (no problems)" return value.
Whatever write data arrives at the target device(!) is written out, period; there's no "OK, data received, now waiting for a flush signal". The only caveat is that the device itself can cache, so as to be able to report "write done" sooner. That whole SSD cache discussion only revolves around the short phase between "written to (target device) cache" and "cache written to long-term memory/storage (flash, platter, whatever)", and it's only relevant if the cache is volatile (e.g. DRAM) as opposed to non-volatile (e.g. SLC).
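For what it's worth, the buffered vs. synchronous distinction is easy to observe yourself; a tiny Python sketch (timings will vary wildly between drives, it only shows that a plain write() returns once the data sits in the OS page cache while fsync() is the call that waits for the device):

import os, time

fd = os.open("__sync_demo__", os.O_CREAT | os.O_WRONLY, 0o600)
t0 = time.time()
os.write(fd, b"x" * 4096)     # buffered: lands in the OS page cache and returns
t1 = time.time()
os.fsync(fd)                  # forces the data down to the device and waits for it
t2 = time.time()
print(f"write: {(t1 - t0) * 1e6:.0f} us, fsync: {(t2 - t1) * 1e6:.0f} us")
os.close(fd)
os.unlink("__sync_demo__")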
And btw, it's not with flash memory that (device) caching came up. We already had platters with caches in the stone age, that is, decades ago, and there it actually was way more needed and offered a really significant advantage, with spinning-rust access times in the low milliseconds actually considered very fast. Back then caching technically made sense. Nowadays, with flash, where times are measured in microseconds, caching has largely become marketing BS and a way to cover up crappy high-x speed, and even to cover up crappy flash lifetime/cycles by trying to do most of the work in large caches to save flash writes.
Sorry, I'm not interested in a "who knows better" pissing contest, and I'm particularly uninterested in fearmongering (aka "you'll lose all your data").
I'm interested in enabling people to make the best choice for their specific situation. And btw, I happen to also have a decent understanding of electronics, which sometimes helps a lot in understanding what really happens and how it really works, as in "down on the PCB and all those components".
P.S. Unless I see major BS or nonsense propagated here, I'm out of this echo chamber here ...
It's eminently clear from reading the posts who has knowledge about the topic at hand. And on that account, you're not in the contest. You're a bystander, like me.
We have helpful people offering advice here, and you're in the corner getting red-faced and looking for a fight.
Nobody suggested that is a concern with modern SSDs.
And that specific situation is a database server workload, for which enterprise flash storage has a very tangible performance advantage that is easily observable. That's the only type of workload that anyone in the thread is discussing.
So, please zip up your pants and stop pissing all over the floor. People are starting to stare.
It's always nice to watch people discussing stuff and to learn from it, but we don't really need to go in the wrong direction...
I appreciate everybody who spoke up. Since the original issue is solved and things seem to be getting tense here, for whatever reason, I'd like to ask a moderator to close the thread to de-escalate things...
So the only question is... how do I tag a mod?
Don't underestimate the value of decent firmware, though. Having seen the firmware of many high-value products, anything you can do to get firmware that has some reasonable level of care put into it might pay off.
Not all firmware is created equal, and in some cases having some DRAM cache makes up for the general lack of care put in.
So far: BS! And factually wrong (someone, for example, did say that under certain circumstances one can "wipe one's [SSD] drive").
I appreciate your attempt to admonish yourself, I just don't get why you write your posts in front of a mirror. But hey, you do you ...
FULL ACK and agreement.
And yes, while I don't want to put a whole nation into one box, I have quite often noticed a certain sloppiness in Chinese code; I don't know the reason though, as I've also encountered quite a few exceptionally intelligent Chinese people.
The best strategy is to flag your opening post, briefly saying why you want your thread closed.
Thread closed (as requested)