
weird MariaDB slowness and high IOwait on htop output

qtwrk Member

Hi,

I have a weird issue I don't understand; hoping I can get some wisdom and enlightenment.

I have a Hetzner AX102 running, and yesterday I grabbed an AX52.


And here comes the problem:

On the AX102, when I do mysqldump and mysql < dump.sql (or mydumper and myloader) to export/import a sample 35 MB database, it takes like 1-2 seconds.

On the AX52, the same operation takes like 15-20 seconds to import. I would expect it to be a little slower given the hardware difference, but this is way off.

I also have a Netcup vServer 2000; on that server, with the same setup as the AX52, the import takes like 4 seconds, which is about what I'd expect.

When I run htop with IO shown for the CPU meters, I can see at least one core at like 80-90% IOWait during the import.

YABS IO on AX102:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 506.13 MB/s (126.5k) | 2.22 GB/s    (34.6k)
Write      | 507.47 MB/s (126.8k) | 2.23 GB/s    (34.8k)
Total      | 1.01 GB/s   (253.4k) | 4.45 GB/s    (69.5k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 2.50 GB/s     (4.8k) | 2.47 GB/s     (2.4k)
Write      | 2.63 GB/s     (5.1k) | 2.63 GB/s     (2.5k)
Total      | 5.14 GB/s    (10.0k) | 5.11 GB/s     (4.9k)

YABS IO on AX52:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/md2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 644.62 MB/s (161.1k) | 2.01 GB/s    (31.4k)
Write      | 646.32 MB/s (161.5k) | 2.02 GB/s    (31.6k)
Total      | 1.29 GB/s   (322.7k) | 4.04 GB/s    (63.1k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 2.50 GB/s     (4.8k) | 2.58 GB/s     (2.5k)
Write      | 2.63 GB/s     (5.1k) | 2.75 GB/s     (2.6k)
Total      | 5.14 GB/s    (10.0k) | 5.33 GB/s     (5.2k)

Both are on software RAID 1.

I tried applying the exact same conf, copy-pasted from my old server to the new one: no change.

Also tried a reinstall; nothing changed. Just yum update, then install the database and run the import/export test: it all takes 15-20 seconds.

Tried different versions of MariaDB like 10.3, 10.6, 11.4, etc.; still the same.

I thought it might be the RAID sync after the initial system installation, but I waited overnight for it to complete; still the same.

That high IOWait is just so weird on NVMe for a 35 MB database import though...

Maybe a faulty disk? But the YABS output looks alright-ish.

nvme-cli shows like 7-8% life used, so the drives should still be considered healthy enough.

What else can I try? Or what could cause this?

Thanks in advance


Comments

  • egoror Member
    edited October 27

    Just my five cents off the top of my head.
    Try something like iostat -x 1 (yum install sysstat) to monitor I/O, and iotop (yum install iotop) or the second tab of htop to see what is loading the I/O, then compare the two servers.
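
    For example (hypothetical invocation; package names assume a RHEL/Alma-style box):

    yum install -y sysstat iotop
    iostat -x 1       # extended per-device stats, including utilisation and latency
    iotop -oPa        # only show processes actually doing I/O, with accumulated totals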

  • emgh Member, BF Ambassador

    @qtwrk said: what else can I try ?

    May not provide anything, but I'd try importing a 35 MB PostgreSQL database on both machines.

    That way, you can isolate if it's hardware/OS-related or directly related to a combination of the hardware/OS with MariaDB.
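
    Something like this would do for a quick comparison (untested sketch; assumes PostgreSQL is installed and your user can create databases, all names are made up):

    createdb bench
    pgbench -i -s 2 bench                 # scale 2 is roughly 30 MB of test data
    pg_dump bench > bench.sql
    createdb bench_restore
    time psql bench_restore < bench.sql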

  • qtwrk Member

    Thanks for the hint, iotop puts me in the right direction: it shows high IO on jbd2/md2-8, now let me google what's up with it...

  • emgh Member, BF Ambassador
    edited October 27

    @qtwrk said:
    Thanks for the hint, iotop puts me in the right direction: it shows high IO on jbd2/md2-8, now let me google what's up with it...

    TIL: jbd2 is "a kernel thread that updates the filesystem journal", and the "md2" part refers to the Linux software RAID device ("md" = "multiple device", originally "Mirror Disk").

    Source: https://forums.sabnzbd.org/viewtopic.php?t=24179

    How did you install the OS on the machines? Hetzner’s ’auto installer’, rescue or some other way? Did you do it differently on the different servers? What filesystem are they running?
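
    A quick way to check what you're actually running (informational commands only; assuming the data sits on /dev/md2 as your iotop output suggests):

    findmnt -no FSTYPE,OPTIONS /                          # filesystem type and mount options
    cat /proc/mdstat                                      # state of the md arrays
    dumpe2fs -h /dev/md2 | grep -iE 'features|journal'    # ext4 superblock/journal info, if it is ext4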

  • qtwrk Member

    The system is installed via Hetzner rescue -> installimage. I only changed the hostname and the partition sizes (commented out the swap since this is a 64 GB server, no real need for SWAP), then installed it; nothing else was touched.

    And I absolutely have no idea about filesystem stuff, I've never touched anything there.

    Let me try the automated Alma 8 installation, see how it goes...

  • emgh Member, BF Ambassador
    edited October 27

    @qtwrk said:
    The system is installed via Hetzner rescue -> installimage. I only changed the hostname and the partition sizes (commented out the swap since this is a 64 GB server, no real need for SWAP), then installed it; nothing else was touched.

    And I absolutely have no idea about filesystem stuff, I've never touched anything there.

    Let me try the automated Alma 8 installation, see how it goes...

    I don’t know about RHEL stuff, but you can change filesystem in the InstallImage config. I’d try XFS on the slow server since it’s probably EXT4 and see if that changes anything

    Edit: But yeah, try the auto installer as well
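
    For reference, the filesystem is picked in the installimage partition lines, roughly like this (from memory, double-check against Hetzner's docs):

    SWRAID 1
    SWRAIDLEVEL 1
    PART /boot ext3 1024M
    PART /     xfs  all        # change ext4 -> xfs here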

  • qtwrk Member

    Thanks for the hint, let me play around a bit with both the autoinstall and XFS, see how it goes.

  • BackboneDirect Member, Host Rep

    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

  • qtwrk Member

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

  • emgh Member, BF Ambassador

    @qtwrk said:

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

    While you’re waiting on that, ever considered the EX-series? I’m very happy with the EX44, the price/performance is amazing.

  • qtwrk Member

    @emgh said:

    @qtwrk said:

    @BackboneDirect said:
    If that still doesn't help, check whether there's a resync or anything else running that could be bottlenecking you as well:

    cat /proc/mdstat

    Also, log into mariadb using 'mysql' and then do

    show full processlist

    I've checked mdstat; it behaves the same both while it was syncing and now that it has finished syncing.

    For the 2nd one... I just reinstalled the system, so I need a little time to reinstall the database to test.

    While you’re waiting on that, ever considered the EX-series? I’m very happy with the EX44, the price/performance is amazing.

    I need at least about 1 TB of disk... so if I order 2x 1 TB, it will be like 80-ish, pretty much the same price as the AX52...

  • darkimmortal Member
    edited October 27

    SSD models? Non-enterprise SSDs can be very slow for database workloads / synced writes. YABS does not test for that.

  • qtwrk Member

    @darkimmortal said:
    SSD models? Non-enterprise SSDs can be very slow for database workloads / synced writes

    Based on the model number, it should be a Samsung PM9A1.

    I know it may not be as good as a "Datacenter Edition", but this is really way off, I think.

  • @darkimmortal said:
    SSD models? Non enterprise SSDs can be very slow for database workloads / synced writes. YABS does not test for that

    I've wasted countless hours troubleshooting a very similar issue to the OP's, and this was the answer. Apparently, the SSDs used in the server I was troubleshooting were consumer-grade Microns. They were well-rated desktop SSDs at the time, but an absolute disaster for database server performance. When dealing with concurrent operations that needed IO, they were useless. Moving to Intel datacenter SSDs fixed the problem.

    I don't think Hetzner uses consumer-grade SSDs, but I would still focus on the storage as the most likely culprit here.

  • qtwrk Member
    edited October 27

    @emgh said:
    I don’t know about RHEL stuff, but you can change filesystem in the InstallImage config. I’d try XFS on the slow server since it’s probably EXT4 and see if that changes anything

    Well, things still don't add up on an XFS system:

    [root@alma8 ~]# time myloader -u ... -p "..." --purge-mode DROP -e -B xxx -o --directory /xxx
    
    real    0m18.383s
    user    0m0.062s
    sys 0m0.065s
    

    Still taking way too long. Or:

    [root@alma8 ~]# time mysql -u ... -p"..." xxx < xxx.sql
    
    real    0m18.412s
    user    0m0.064s
    sys 0m0.014s
    

    Now I don't see any weird kernel thread, only mariadb itself showing up in the iotop output...

  • darkimmortal Member
    edited October 27

    You can try testing worst-case synchronised 4k writes with fio:

    fio --name=x --ioengine=posixaio --rw=write --bs=4k --iodepth=1 --size=512MB --fsync=1 --filename=test.img

    Generally, good consumer NVMes such as the Samsung model you mentioned will be around 300-500 IOPS; enterprise drives are around 10x faster.
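
    Back-of-the-envelope: ~400 fsynced 4 KiB writes per second is only about 1.6 MB/s of durable write throughput, so an import that syncs on every statement or transaction becomes fsync-bound long before raw bandwidth matters.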

    Once you take parallelism and queuing out of the equation, and expect every write to be coherent across power loss, consumer NVMes aren't much better than 7200rpm hard drives. You have to pay the extra for an enterprise NVMe to see an improvement in this kind of worst-case workload

    Of course you can trade power loss safety for better performance - plenty of knobs to tweak for that

  • egoror Member

    Will mount -o remount,commit=60 /dev/md2 help?
    If innodb_flush_log_at_trx_commit is set to 1 in mariadb config file, try setting it to 2.
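
    On the MariaDB side that would be something like the following in whichever my.cnf / conf.d file actually gets read (sketch; the trade-off is that you can lose up to about a second of committed transactions on a crash or power loss):

    [mysqld]
    # flush the InnoDB redo log to disk roughly once per second
    # instead of on every commit
    innodb_flush_log_at_trx_commit = 2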

  • qtwrk Member

    @darkimmortal said:
    You can try testing worst-case synchronised 4k writes with fio:

    fio --name=x --ioengine=posixaio --rw=write --bs=4k --iodepth=1 --size=512MB --fsync=1 --filename=test.img

    Generally, good consumer NVMes such as the Samsung model you mentioned will be around 300-500 IOPS; enterprise drives are around 10x faster.

    Once you take parallelism and queuing out of the equation, and expect every write to be coherent across power loss, consumer NVMes aren't much better than 7200rpm hard drives. You have to pay the extra for an enterprise NVMe to see an improvement in this kind of worst-case workload

    Of course you can trade power loss safety for better performance - plenty of knobs to tweak for that

    on AX102:

    iops : min=14428, max=16140, avg=15414.06, stdev=556.25, samples=16

    on AX52:

    iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    Okay, this really solves the whole puzzle.

  • egoror Member

    @qtwrk said: iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    That's too low, you might be running with write cache disabled.

  • qtwrk Member

    @egoror said:

    @qtwrk said: iops : min= 56, max= 164, avg=118.71, stdev=24.77, samples=2205

    That's too low, you might be running with write cache disabled.

    [root@Alma-8-latest-amd64-base ~]# cat /sys/block/nvme0n1/queue/write_cache
    write back
    [root@Alma-8-latest-amd64-base ~]# cat /sys/block/nvme1n1/queue/write_cache
    write back
    

    emmm???

  • @qtwrk said: emmm???

    Interesting.
    Try this stuff then

    @egoror said:
    Will mount -o remount,commit=60 /dev/md2 help?
    If innodb_flush_log_at_trx_commit is set to 1 in mariadb config file, try setting it to 2.

  • qtwrk Member

    mount -o remount,commit=60 /dev/md3
    innodb_flush_log_at_trx_commit = 2

    The mount option makes it a bit slower; the innodb param makes it a little faster.

    But still nowhere close to AX102 performance. I guess it's fundamentally a disk issue...

  • lowprofile Member

    Check the disk wearout - you will have to install some packages.

    sudo apt-get install nvme-cli
    sudo nvme smart-log /dev/nvme0n1

  • qtwrk Member

    @lowprofile said:
    Check the disk wearout - you will have to install some packages.

    sudo apt-get install nvme-cli
    sudo nvme smart-log /dev/nvme0n1

    Already checked; life used was like 7-8% on both.

  • lowprofile Member

    Your my.cnf is the same on both servers?

  • qtwrk Member

    @lowprofile said:
    Your my.cnf is the same on both servers?

    Yeah, tried the same conf, the default conf, but ultimately I guess it's a disk thing...

  • rdes Member

    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

  • emgh Member, BF Ambassador
    edited October 27

    Can you skip RAID fully and try the two drives separately?
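
    If you reinstall through installimage, disabling the software RAID should be enough to get the two drives as separate devices (from memory, double-check the config):

    SWRAID 0        # don't build an md array; installimage then installs to the first drive only

    Then run the same fsync fio test on each drive and compare.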

  • qtwrk Member

    @rdes said:
    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

    I am not exactly familiar with this command, could you please give a quick example?

    @emgh said:
    Can you skip RAID fully and try the two drives separately?

    You were about 1 second early; I was just about to click the cancellation button, lol.

    Let me try that scenario.

  • rdes Member
    edited October 27

    @rdes said:
    Maybe try fstrim?
    Had a similar problem with high iowait on NVMe some time ago and this helped.

    @qtwrk said:
    I am not exactly familiar with this command, could you please give a quick example?

    fstrim --verbose --all

    In my case I had around 500 GB of unused blocks and it really slowed the drive down during intensive tasks like backups.

    More about that: https://www.blunix.com/blog/increasing-ssd-write-performance-on-linux-with-fstrim.html
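
    If it does help, you probably want trimming to happen regularly; on systemd-based distros that is usually just:

    systemctl enable --now fstrim.timer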

This discussion has been closed.