weird MariaDB slowness and high IOwait on htop output

Comments

  • @rdes said:

    @rdes said:
    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

    I am not exactly familiar with this command , could you please give a quick example ?

    fstrim --verbose --all

    In my case I had around 500 GB of unused blocks and it really slowed a drive during intensive tasks like backup.

    I'm reinstalling, but do you really think that's my case here? The disk is always newly formatted.

  • rdes Member
    edited October 2024

    I suggested this because it helped me once. Quick formatting (installing a new system via rescue, etc.) is rather a placebo; a real deep format takes hours, not minutes. Run fstrim and you will see how much has been trimmed.

  • @rdes said:
    I gave this idea because it helped me once. Quick formatting (installing new system via rescue etc) is rather a placebo, real deep format takes hour not minute. Run fstrim and you will get info how much has been trimmed.

    Okay, thanks for the hint, let me try it out.

  • @rdes said:
    I gave this idea because it helped me once. Quick formatting (installing new system via rescue etc) is rather a placebo, real deep format takes hour not minute. Run fstrim and you will get info how much has been trimmed.

    [root@Alma-8-latest-amd64-base ~]# fstrim --verbose --all
    /boot: 136.3 MiB (142946304 bytes) trimmed
    /: 381.8 MiB (400359424 bytes) trimmed
    

    Okay, I guess that's not the case...

  • Yeah, not really :(

  • emgh Member, Megathread Squad

    I’m out, but it’ll be interesting tomorrow to see the results if you try without RAID on both drives.

    If it’s a drive error, it would be unlikely that both drives are just as bad.

  • @rdes said:
    Yeah, not really :(

    thanks for trying :)

    @emgh said:
    Can you skip RAID fully and try the two drives separately?

    For whatever reason, probably due to the disk partitioning, I cannot boot up or SSH in when using a single disk :(

    Thanked by 1emgh
  • SplitIce Member, Host Rep
    edited October 2024

    If you have the option and the features match your requirements, seriously consider PostgreSQL. Of course, this depends on where your application is in its development curve and on your resources and ability.

    After years of running Maria/MySQL as the base for projects, I've been moving over to PostgreSQL (largely since 16 & 17, where replication has gotten a lot better).

    MariaDB has often ended up having rather inconsistent performance for me. For the applications we still have on MariaDB, we run a database server per application these days and restart them yearly (if they don't get updated). It's quite fast, until sometimes it's not.

    Thanked by 1jsg
  • @SplitIce said:
    If you have the option and the features match your requirements seriously consider Postgresql. Of course this depends on where your application is in its development curve and your resources and ability.

    After years running Maria/MySQL as the base for projects I've been moving over to Postgresql (largely since 16 & 17 where replication has gotten alot better).

    MariaDB has often ended up having rather inconsistent performance for me. For the applications we still have on MariaDB we do a database per application these days, and restart them yearly (if they don't get updated). Its quite fast, until sometimes its not.

    Well, sadly WordPress doesn't support PostgreSQL...

  • jsg Member, Resident Benchmarker

    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

  • darkimmortal Member
    edited October 2024

    @jsg said:
    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

    It's par for the course with non-enterprise NVMe - having physical power loss protection rather than relying on journalling in firmware to emulate it makes a gigantic (1-2 orders of magnitude) difference to certain write workloads

  • emgh Member, Megathread Squad

    @jsg said:
    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

    It would be better to know what drive is causing issues.

  • jsg Member, Resident Benchmarker

    @emgh said:

    @jsg said:
    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

    It would be better to know what drive is causing issues.

    Yes, but it seems OP has problems using the drives without RAID.

    Thanked by 1emgh
  • @jsg said:
    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

    @emgh said:
    It would be better to know what drive is causing issues.

    Well, thanks for the hint. I contacted support and they replaced the server, yet this one behaves the same: the same database import takes about 15 seconds, which really bugs me.

    @darkimmortal said:
    It's par for the course with non-enterprise NVMe - having physical power loss protection rather than relying on journalling in firmware to emulate it makes a gigantic (1-2 orders of magnitude) difference to certain write workloads

    I have been googling around but couldn't find anything useful. Is there any tuning I can test?

  • emgh Member, Megathread Squad

    @qtwrk said: well , thanks for the hint , I contact support, they replaced server , yet on this one is still same , same database import takes about 15 seconds which really bugs me

    Is the database dump private? If not, could you share exactly how you set up the environment and how you restore the file, as well as the file itself?

  • emgh Member, Megathread Squad
    edited October 2024

    Also: I'm still very interested to see how things change with the two drives without RAID. It should definitely be possible to do.

  • jsg Member, Resident Benchmarker

    @qtwrk said:

    @jsg said:
    IMO the problem clearly is with the disk. But before hastily cancelling, why not simply bring the problem to their attention? Chances are (I guess but have zero experience with Hetzner) that they'll simply replace the problem drive.

    @emgh said:
    It would be better to know what drive is causing issues.

    well , thanks for the hint , I contact support, they replaced server , yet on this one is still same , same database import takes about 15 seconds which really bugs me

    Then, sorry, chances are that the cause is somewhere in your config(s). And I strongly agree with @emgh that to properly analyze the problem you should, if at all possible, un-RAID at least one partition, so as to allow for separate testing of each drive.
    Not doing that IMO boils down to continuing to stumble about in the dark.
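
    Even with the array assembled, the Linux kernel keeps separate I/O counters for each member device, so a rough per-drive comparison is possible before un-RAIDing anything. The following is only a sketch under that assumption: it samples /proc/diskstats twice and prints write counts, average write latency and utilization per device. The function names and the INTERVAL variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare per-device write latency from two /proc/diskstats samples.
# Assumes Linux; snapshot, disk_report and INTERVAL are illustrative names.
INTERVAL="${INTERVAL:-2}"

snapshot() {
    # Columns kept: device name, writes completed, ms spent writing, ms spent doing I/O
    awk '{ print $3, $8, $11, $13 }' /proc/diskstats
}

disk_report() {
    before=$(snapshot)
    sleep "$INTERVAL"
    after=$(snapshot)
    # Feed both samples to awk, separated by a marker line.
    printf '%s\n---\n%s\n' "$before" "$after" | awk -v secs="$INTERVAL" '
        $1 == "---" { second = 1; next }
        !second     { w[$1] = $2; wms[$1] = $3; busy[$1] = $4; next }
        ($1 in w) {
            dw    = $2 - w[$1]                          # writes completed in the interval
            await = dw > 0 ? ($3 - wms[$1]) / dw : 0    # average ms per write
            util  = ($4 - busy[$1]) / (secs * 10)       # busy ms as a percentage of the interval
            printf "%-12s writes=%d w_await_ms=%.2f util_pct=%.1f\n", $1, dw, await, util
        }'
}

disk_report
```

    Run it while the import is in progress. If one member drive shows a much higher w_await_ms than the other, that points at a single drive; if both look equally slow, the drive model or the workload is the more likely cause.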

    Thanked by 1emgh
  • @qtwrk said:

    @darkimmortal said:
    It's par for the course with non-enterprise NVMe - having physical power loss protection rather than relying on journalling in firmware to emulate it makes a gigantic (1-2 orders of magnitude) difference to certain write workloads

    I have been trying to googling around , but seems couldn't find anything useful , is there any tuning I can test a bit ?

    I didn't find much online about this when I had the issue, either. But what @darkimmortal writes was my experience: On otherwise identical hardware, substituting a non-enterprise SSD for an enterprise one was the solution.

    I was running MariaDB using mdraid, just like you, and no amount of tuning or other software fixes could help. The SSD simply could not give reasonable performance when doing sustained writes. Any write-heavy operation, whether it be a database dump or running a benchmark, would cause other operations to pause.

    Unfortunately, I don't think you'll gain much by doing any further tuning. My bet is on @darkimmortal's suggestion. If you get better performance with different hardware, then save your time and go with that hardware, even if it costs more. You may not save money, but you'll save your sanity. :)

  • qtwrk Member
    edited October 2024

    @emgh said:
    Is the database dump private? If not, could you share exactly how you setup the environment and how you restore the file as well as the file?

    It's kind of private, but it's simply a WordPress database with a few dozen posts, among other things.

    @emgh said:
    Also: I'm still very interested to see how things change with the two drives without RAID. It should definitely be possible to do.

    I couldn't make it work with a single-disk system, but I created a Docker container database and mounted its storage on /dev/shm:

    root@host ~ # time mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql
    
    real    0m0.681s
    user    0m0.097s
    sys     0m0.018s
    

    Okay, so I guess it's about the disk or I/O.

    I also tried RAID 0 (which in theory should maybe double the write/read speed?), with pretty much the same result: 15-20 seconds to import a 35 MB DB.

    I don't know much about databases. I must say that for a long time I thought a database was more of a CPU-intensive task, especially with today's big RAM and all the buffer and cache techniques, etc. I never thought NVMe drives differed so much and could cause things like this. New thing learnt :)

    @aj_potc said:
    Unfortunately, I don't think you'll gain much by doing any further tuning. My bet is on @darkimmortal's suggestion. If you get better performance with different hardware, then save your time and go with that hardware, even if it costs more. You may not save money, but you'll save your sanity. :)

    Yeah, understood. It was a happy lesson to play around a bit and get to know more about databases and disks.

  • egoror Member
    edited October 2024

    @qtwrk said: I couldn't make it work with single disk system , but I create a docker container database and mount it to /dev/shm

    root@host ~ # time mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql

    real 0m0.681s
    user 0m0.097s
    sys 0m0.018s

    Is it the same machine that does 15 seconds, but db is in docker container?

  • @egoror said:

    @qtwrk said: I couldn't make it work with single disk system , but I create a docker container database and mount it to /dev/shm

    root@host ~ # time mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql

    real 0m0.681s
    user 0m0.097s
    sys 0m0.018s

    Is it the same machine that does 15 seconds, but db is in docker container?

    Yes, same machine, with the DB storage in /dev/shm. Since shm is a RAM disk, I guess that's why it's super fast.

  • @qtwrk said: yes , DB storage in /dev/shm

    Ah, oh, I've missed the /shm part, sorry.

  • maverick Member
    edited October 2024

    @qtwrk said: root@host ~ # time mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql
    real 0m0.681s
    user 0m0.097s
    sys 0m0.018s

    okay , so I guess it's about the disk or IO

    Yes, client (non-enterprise) SSDs are terribly slow when you do frequent fsync() calls, which databases do a lot to keep data safely on the disk.

    What you can try is this:

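    apt-get install eatmydata
    # fsync() is issued by the server (mysqld), so the server must run under eatmydata for this to help;
    # wrapping only the client, as below, leaves the server's fsync() calls untouched.
    eatmydata mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql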
    

    Make sure to read eatmydata manual page and understand what it does and why it's much faster. Don't come crying here if you lose data with improper usage. :wink:

    I don't know much about the database, I must say for long time I thought Database is more like CPU-intensive task , specially with nowadays big RAM , and all buffer, cache techniques...etc , never thought NVMe has so many difference and cause things like this , new thing learnt

    One thing that no CPU or RAM can improve is fsync(). That still depends heavily on the disk, and databases use such calls a lot. That's why using only enterprise NVMe disks is a good idea: they come with a small RAM or SLC flash buffer that is protected from power loss. This allows them to accept your small fsync() quickly and immediately return success to your application, then slowly flush it to persistent storage in the background. Client SSDs are cheaper precisely because they don't have anything similar.
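
    The cost of synchronous writes can be measured without any database involved. As a rough sketch, assuming GNU dd (coreutils), the script below writes the same small amount of data twice: once buffered with a single flush at the end, and once with every 4 KiB block forced to stable storage. TESTFILE, COUNT and write_probe are illustrative names, and the numbers depend entirely on the drive.

```shell
#!/bin/sh
# Sketch: buffered writes vs. per-block synchronous writes, using GNU dd.
# TESTFILE and COUNT are hypothetical defaults; put TESTFILE on the filesystem to test.
TESTFILE="${TESTFILE:-./fsync-probe.bin}"
COUNT="${COUNT:-500}"

write_probe() {
    # buffered: blocks go to the page cache, with one flush at the end (conv=fsync)
    # sync:     every 4 KiB block must be durable before the next one starts (oflag=dsync)
    case "$1" in
        buffered) flags="conv=fsync" ;;
        sync)     flags="oflag=dsync" ;;
        *)        echo "usage: write_probe buffered|sync" >&2; return 2 ;;
    esac
    # dd reports elapsed time and throughput on stderr; keep only that summary line.
    LC_ALL=C dd if=/dev/zero of="$TESTFILE" bs=4k count="$COUNT" $flags 2>&1 | tail -n 1
    rm -f "$TESTFILE"
}

echo "buffered: $(write_probe buffered)"
echo "sync:     $(write_probe sync)"
```

    On a drive with a power-loss-protected buffer the two lines tend to be close; on a client SSD the sync line is usually much slower, which is the pattern a database import with many commits would run into.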

    You might also experiment with --single-transaction during mysqldump, and then check if it improves the speed of import. Once again, read mysqldump manual and understand the implications. But, it could help...
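
    A related experiment on the import side, offered only as a sketch: assuming InnoDB tables and default durability settings, each commit forces a flush to disk, so grouping the INSERT statements into fewer commits should reduce the number of flushes. The helper below (wrap_in_transaction is a made-up name) prepends SET autocommit=0 to a dump and appends COMMIT. DDL statements such as CREATE TABLE still commit implicitly, so this groups the rows per table rather than the whole file. mysqldump's --no-autocommit option produces a similar per-table wrapping at dump time.

```shell
#!/bin/sh
# Sketch: emit an SQL dump with autocommit disabled and a final COMMIT.
# wrap_in_transaction is a hypothetical helper, not part of any MySQL/MariaDB tool.
wrap_in_transaction() {
    printf 'SET autocommit=0;\n'
    cat "$1"
    printf '\nCOMMIT;\n'
}

# Usage (assumes the same database and port as the import shown in this thread):
#   wrap_in_transaction wp_db.sql | mysql -uroot -p --port 3307 wp_db
```

    Timing the import with and without the wrapper shows whether commit frequency is what makes the difference on a given drive.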

    Thanked by 1egoror
  • @maverick said:

    @qtwrk said: root@host ~ # time mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql
    real 0m0.681s
    user 0m0.097s
    sys 0m0.018s

    okay , so I guess it's about the disk or IO

    Yes, client SSD (non-enterprise) are terribly slow when you do frequent fsync() to disk, which databases like to do a lot, to keep data safely on the disk.

    What you can try is this:

    apt-get install eatmydata
    eatmydata mysql -uroot -pxxxx --port 3307 wp_db < wp_db.sql
    

    Make sure to read eatmydata manual page and understand what it does and why it's much faster. Don't come crying here if you lose data with improper usage. :wink:

    I don't know much about the database, I must say for long time I thought Database is more like CPU-intensive task , specially with nowadays big RAM , and all buffer, cache techniques...etc , never thought NVMe has so many difference and cause things like this , new thing learnt

    One thing that no CPU or RAM can improve is fsync(). That still depends heavily on the disk, and databases use such calls a lot. That's why using only enterprise NVMe disks is a good idea, they come with small RAM or SLC flash buffer, which is protected from power loss. This allows them to take your small fsync() quickly and immediately return success to your application, then slowly flush it to the persistent storage in the background. Client SSD's are cheaper exactly because they don't have anything similar.

    You might also experiment with --single-transaction during mysqldump, and then check if it improves the speed of import. Once again, read mysqldump manual and understand the implications. But, it could help...

    Thanks mav, I have just cancelled it and am going to order an AX102 instead...

  • maverick Member
    edited October 2024

    @qtwrk said: thanks mav , I have just cancelled it and going to order an AX102 instead ...

    And that is of course the ultimately good solution. :smile:

  • emgh Member, Megathread Squad

    @maverick said:

    @qtwrk said: thanks mav , I have just cancelled it and going to order an AX102 instead ...

    And that is of course the ultimately good solution. :smile:

    What drives do AX52 and AX102 normally come with?

  • @emgh said:

    @maverick said:

    @qtwrk said: thanks mav , I have just cancelled it and going to order an AX102 instead ...

    And that is of course the ultimately good solution. :smile:

    What drives do AX52 and AX102 normally come with?

    According to the Hetzner page, the AX52 has a "normal" disk and the AX102 has a "datacenter grade" disk.

  • emgh Member, Megathread Squad

    @qtwrk said:

    @emgh said:

    @maverick said:

    @qtwrk said: thanks mav , I have just cancelled it and going to order an AX102 instead ...

    And that is of course the ultimately good solution. :smile:

    What drives do AX52 and AX102 normally come with?

    according to hetzner page , AX52 is "normal" disk , AX102 is "datacenter grade" disk

    Ah, same with EX series, EX44 says: 2 x 512 GB NVMe SSD (Gen4)
    While EX101 says: 2 x 1.92 TB NVMe SSD Datacenter Edition (Gen 4)

  • emgh Member, Megathread Squad
    edited October 2024

    Still, can that difference really explain the 1,2 sec > 15-20 sec? I had no idea if that's the case.

  • On my AX101, it's a "datacenter edition" disk, which does it in 1,2 seconds, so I guess that was the case. According to CPU benchmarks, the AMD 7700 is better on single-core performance than the 5950X.

This discussion has been closed.