is this acceptable for nvme?

I bought the machine yesterday and used it, but it sometimes slows down during installation. The machine is freshly reinstalled and the NVMe is in AHCI mode in the BIOS. Is this acceptable for an NVMe drive, or could it be faulty?


/dev/nvme0n1  SN204508909101  GIGABYTE GP-GSM2NE3128GNTD  1  128.04 GB / 128.04 GB  512 B + 0 B  EDFM00.5
yabs:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/nvme0n1p4):

Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 157.10 MB/s  (39.2k) | 246.88 MB/s   (3.8k)
Write      | 157.52 MB/s  (39.3k) | 248.18 MB/s   (3.8k)
Total      | 314.62 MB/s  (78.6k) | 495.07 MB/s   (7.7k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 339.29 MB/s    (662) | 371.23 MB/s    (362)
Write      | 357.32 MB/s    (697) | 395.96 MB/s    (386)
Total      | 696.61 MB/s   (1.3k) | 767.20 MB/s    (748)

How can I check what the error is? Sometimes it's very fast, sometimes it's slow.

Comments

  • Use smartctl -a /dev/nvme0n1 (or whatever device) and look at the data. The most important values are "Percentage Used", which shows how much of the stated lifetime write capacity has been consumed (note that drives can survive way past 100%), and "Available Spare", which is normally 100% and drops as the memory cells start degrading and the drive uses some of its spare (unadvertised) capacity to replace the dead cells. You'd have to look at the specs for the drive to know how much of an issue this is, but I'd definitely be monitoring it regularly if it was below 100%, and making sure my backup strategy was in place or replacing the drive, depending on how quickly the number was decreasing.

    Thanked by 2nszerver vr10
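
A minimal sketch of how those two values could be pulled out of the text that smartctl -a prints, assuming the field labels match the output posted later in this thread ("Available Spare", "Available Spare Threshold", "Percentage Used"). The function name `parse_nvme_health` and the sample string are illustrative and not part of smartmontools.

```python
import re

# Sample lines copied from the smartctl output posted in this thread.
SAMPLE = """\
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 5%
"""

LABELS = ("Available Spare", "Available Spare Threshold", "Percentage Used")


def parse_nvme_health(text):
    """Return the wear-related percentages found in smartctl -a text output."""
    fields = {}
    for label in LABELS:
        # The colon right after the label keeps "Available Spare" from
        # matching the "Available Spare Threshold" line.
        match = re.search(rf"^\s*{re.escape(label)}:\s*(\d+)%", text, re.MULTILINE)
        if match:
            fields[label] = int(match.group(1))
    return fields


health = parse_nvme_health(SAMPLE)
spare_ok = health["Available Spare"] > health["Available Spare Threshold"]
print(health, "spare above threshold:", spare_ok)
```

Running this on a saved copy of the output every week or so would make it easy to see whether either number is moving.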
  • MikeA Member, Host Rep
    edited December 2024

    Looks normal since that NVMe drive is very low end/old.
    https://www.gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf
    Sequential read speeds up to 1550 MB/s.
    Sequential write speeds up to 550 MB/s.

    Nobody should advertise NVMe and deliver a dedicated server with that drive in it.

    Thanked by 1nszerver
  • Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
    critical_warning : 0
    temperature : 26 C
    available_spare : 100%
    available_spare_threshold : 5%
    percentage_used : 5%
    data_units_read : 1,904,474
    data_units_written : 3,190,542
    host_read_commands : 26,572,446
    host_write_commands : 70,195,517
    controller_busy_time : 640
    power_cycles : 694
    power_on_hours : 3,610
    unsafe_shutdowns : 632
    media_errors : 0
    num_err_log_entries : 1,298
    Warning Temperature Time : 0
    Critical Composite Temperature Time : 0
    Temperature Sensor 1 : 52 C
    Thermal Management T1 Trans Count : 0
    Thermal Management T2 Trans Count : 0
    Thermal Management T1 Total Time : 0
    Thermal Management T2 Total Time : 0

  • looks totally fine. I suggest to rather not look at benchmarks, when you have no idea how to read them ;-)

    smartctl (smartmontools) can give you additional data, but I don't think that there will be anything wrong with it.

    that it sometimes slows down is often related to its internal cache: once this is filled, rates drop quickly. could also be due to throttling if it gets too hot - which totally depends on the environment and the case it has been built into.

    Thanked by 1nszerver
  • As for the speed, look at the specs here: https://gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf

    Sequential write speed tops out at 550MB/s and sequential read speed at 1550MB/s. Your write speed isn't far from that for 1MB blocks, but the reads are quite slow. It might just be that the drive has a RAM cache and the quoted read speeds are the ideal case where the data is already in the cache.

    Thanked by 1nszerver
  • @ralf said:
    Use smartctl -a /dev/nvme0n1 (or whatever device) and look at the data. The most important values are "Percentage Used", which shows how much of the stated lifetime write capacity has been consumed (note that drives can survive way past 100%), and "Available Spare", which is normally 100% and drops as the memory cells start degrading and the drive uses some of its spare (unadvertised) capacity to replace the dead cells. You'd have to look at the specs for the drive to know how much of an issue this is, but I'd definitely be monitoring it regularly if it was below 100%, and making sure my backup strategy was in place or replacing the drive, depending on how quickly the number was decreasing.

    Error:
    === START OF INFORMATION SECTION ===
    Model Number: GIGABYTE GP-GSM2NE3128GNTD
    Serial Number: SN204508909101
    Firmware Version: EDFM00.5
    PCI Vendor/Subsystem ID: 0x1987
    IEEE OUI Identifier: 0x6479a7
    Total NVM Capacity: 128,035,676,160 [128 GB]
    Unallocated NVM Capacity: 0
    Controller ID: 1
    Number of Namespaces: 1
    Namespace 1 Size/Capacity: 128,035,676,160 [128 GB]
    Namespace 1 Formatted LBA Size: 512
    Local Time is: Mon Dec 9 12:23:37 2024 CET
    Firmware Updates (0x12): 1 Slot, no Reset required
    Optional Admin Commands (0x0017): Security Format Frmw_DL Other
    Optional NVM Commands (0x005e): Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Other
    Maximum Data Transfer Size: 64 Pages
    Warning Comp. Temp. Threshold: 85 Celsius
    Critical Comp. Temp. Threshold: 95 Celsius
    Supported Power States
    St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
    0 + 4.50W - - 0 0 0 0 0 0
    1 + 2.70W - - 1 1 1 1 0 0
    2 + 2.16W - - 2 2 2 2 0 0
    3 - 0.0700W - - 3 3 3 3 1000 1000
    4 - 0.0020W - - 4 4 4 4 5000 45000
    Supported LBA Sizes (NSID 0x1)
    Id Fmt Data Metadt Rel_Perf
    0 + 512 0 1
    1 - 4096 0 0
    === START OF SMART DATA SECTION ===
    Read NVMe SMART/Health Information failed: NVMe Status 0x2002

  • which provider is giving out such an nvme?

    Thanked by 1nszerver
  • @nszerver said:

    @ralf said:
    Use smartctl -a /dev/nvme0n1

    Error:
    === START OF SMART DATA SECTION ===
    Read NVMe SMART/Health Information failed: NVMe Status 0x2002

    Are you passing a partition device to smartctl or the actual drive? If it ends in something like p1 then drop the p1 part.

    Thanked by 1nszerver
  • @cybertech said:
    which provider is giving out such an nvme?

    No, I bought the machine used. It's at home.

  • @ralf said:

    @nszerver said:

    @ralf said:
    Use smartctl -a /dev/nvme0n1

    Error:
    === START OF SMART DATA SECTION ===
    Read NVMe SMART/Health Information failed: NVMe Status 0x2002

    Are you passing a partition device to smartctl or the actual drive? If it ends in something like p1 then drop the p1 part.

    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
    Critical Warning: 0x00
    Temperature: 23 Celsius
    Available Spare: 100%
    Available Spare Threshold: 5%
    Percentage Used: 5%
    Data Units Read: 1,904,481 [975 GB]
    Data Units Written: 3,190,836 [1.63 TB]
    Host Read Commands: 26,572,642
    Host Write Commands: 70,206,548
    Controller Busy Time: 640
    Power Cycles: 694
    Power On Hours: 3,610
    Unsafe Shutdowns: 632
    Media and Data Integrity Errors: 0
    Error Information Log Entries: 1,299
    Warning Comp. Temperature Time: 0
    Critical Comp. Temperature Time: 0
    Temperature Sensor 1: 49 Celsius
    Error Information (NVMe Log 0x01, max 16 entries)
    Num ErrCount SQId CmdId Status PELoc LBA NSID VS
    0 1299 0 0x001d 0x4004 0x004

  • layer7 Member, Host Rep, LIR

    @nszerver said:
    I bought the machine yesterday and used it, but it sometimes slows down during installation. The machine is freshly reinstalled and the NVMe is in AHCI mode in the BIOS. Is this acceptable for an NVMe drive, or could it be faulty?


    /dev/nvme0n1  SN204508909101  GIGABYTE GP-GSM2NE3128GNTD  1  128.04 GB / 128.04 GB  512 B + 0 B  EDFM00.5
    yabs:

    fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/nvme0n1p4):

    Block Size | 4k            (IOPS) | 64k           (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 157.10 MB/s  (39.2k) | 246.88 MB/s   (3.8k)
    Write      | 157.52 MB/s  (39.3k) | 248.18 MB/s   (3.8k)
    Total      | 314.62 MB/s  (78.6k) | 495.07 MB/s   (7.7k)
               |                      |
    Block Size | 512k          (IOPS) | 1m            (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 339.29 MB/s    (662) | 371.23 MB/s    (362)
    Write      | 357.32 MB/s    (697) | 395.96 MB/s    (386)
    Total      | 696.61 MB/s   (1.3k) | 767.20 MB/s    (748)

    How can I check what the error is? Sometimes it's very fast, sometimes it's slow.

    Hi,

    deeeep consumer grade hardware:

    https://www.gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf

    => Warranty: Limited 5-year or 110TBW

    That's all normal for this kind of hardware...

    Thanked by 1nszerver
  • Falzo Member
    edited December 2024

    as said before. totally fine. might not be the fastest drive on the planet for sure. but fine for what it is.

    maybe try monitoring the temperature and if the slowing down happens when it becomes hot. if that is what happens here, check if you can vent it better.

    also might wanna check, how it actually is connected. directly on the mainboard? or with a slot adapter etc.

    still nothing really wrong with it I would say.

    Thanked by 1nszerver
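
One way to watch for that is to compare the hottest reading against the drive's own warning threshold. A rough sketch, assuming the smartctl text format shown in this thread ("label: N Celsius" lines); the helper names `read_celsius` and `headroom` are made up for illustration.

```python
import re

# Lines copied from the smartctl output in this thread.
SAMPLE = """\
Warning Comp. Temp. Threshold: 85 Celsius
Critical Comp. Temp. Threshold: 95 Celsius
Temperature: 35 Celsius
Temperature Sensor 1: 61 Celsius
"""


def read_celsius(text):
    """Map each 'label: N Celsius' line to its integer temperature."""
    pairs = re.findall(r"^(.+?):\s*(\d+) Celsius[ \t]*$", text, re.MULTILINE)
    return {label.strip(): int(value) for label, value in pairs}


def headroom(text):
    """Degrees left between the hottest reading and the warning threshold."""
    temps = read_celsius(text)
    warning = temps.pop("Warning Comp. Temp. Threshold")
    temps.pop("Critical Comp. Temp. Threshold", None)
    return warning - max(temps.values())


print("degrees below warning threshold:", headroom(SAMPLE))
```

Logging this value while a benchmark or an install is running should show whether the slow phases coincide with the drive getting close to its threshold.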
  • Yeah, if you believe the numbers, that looks like a basically new drive, although the "Power On Hours" might have overflowed, as it's unlikely that a drive that was only 150 days old would have had 632 unsafe shutdowns and 694 power cycles. The units read/write numbers also look reasonable, at around 10x the drive capacity, which would suggest it's not been used for heavy load.

    The only thing that looks suspicious to me is the 1299 error log entries. That might suggest a fault, I'm not sure. Keep running the test over the next few days and see if it increases. It's possible it's related to the high number of power cycles, and could be reporting on the same single bad block each time.

    Personally, I wouldn't care too much about drive speed from yabs, unless you notice it being an actual problem with your use case. It's quite possible there are other factors limiting performance, e.g. if you have a slow CPU as well as a slow drive. Just run smartctl every week or so and just worry if the numbers are changing too much.

    Thanked by 1nszerver
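
The figures in that estimate can be reproduced from the counters posted above. A rough sketch, assuming the NVMe convention that one data unit is 1000 blocks of 512 bytes (which agrees with the [1.63 TB] that smartctl printed); the variable names are illustrative.

```python
# Counters copied from the smartctl output earlier in the thread.
DATA_UNITS_READ = 1_904_481
DATA_UNITS_WRITTEN = 3_190_836
CAPACITY_BYTES = 128_035_676_160
POWER_ON_HOURS = 3_610

# Assumption: one NVMe data unit is 1000 * 512 bytes.
BYTES_PER_DATA_UNIT = 1000 * 512

written_bytes = DATA_UNITS_WRITTEN * BYTES_PER_DATA_UNIT
read_bytes = DATA_UNITS_READ * BYTES_PER_DATA_UNIT

drive_writes = written_bytes / CAPACITY_BYTES  # full-drive writes so far
drive_reads = read_bytes / CAPACITY_BYTES      # full-drive reads so far
power_on_days = POWER_ON_HOURS / 24

print(f"written: {written_bytes / 1e12:.2f} TB ({drive_writes:.1f}x capacity)")
print(f"read:    {read_bytes / 1e9:.0f} GB ({drive_reads:.1f}x capacity)")
print(f"powered on for about {power_on_days:.0f} days")
```

Both ratios come out on the order of ten times the drive capacity, which is small next to the 110 TBW warranty figure mentioned earlier in the thread.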
  • @ralf could be that it was in some external case and often plugged/unplugged or the like. each unsafe shutdown might have produced at least one entry in the error log and so on... I wouldn't worry too much. maybe spend 50 bucks to replace it with a shiny new drive with much more capacity on top.

    Thanked by 1nszerver
  • Motherboard:
    Manufacturer: Gigabyte Technology Co., Ltd.
    Product Name: A320M-S2H-CF
    It is in the M2 slot on the motherboard.

  • @layer7 said:

    @nszerver said:
    I bought the machine yesterday and used it, but it sometimes slows down during installation. The machine is freshly reinstalled and the NVMe is in AHCI mode in the BIOS. Is this acceptable for an NVMe drive, or could it be faulty?


    /dev/nvme0n1  SN204508909101  GIGABYTE GP-GSM2NE3128GNTD  1  128.04 GB / 128.04 GB  512 B + 0 B  EDFM00.5
    yabs:

    fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/nvme0n1p4):

    Block Size | 4k            (IOPS) | 64k           (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 157.10 MB/s  (39.2k) | 246.88 MB/s   (3.8k)
    Write      | 157.52 MB/s  (39.3k) | 248.18 MB/s   (3.8k)
    Total      | 314.62 MB/s  (78.6k) | 495.07 MB/s   (7.7k)
               |                      |
    Block Size | 512k          (IOPS) | 1m            (IOPS)
      ------   | ---            ----  | ----           ----
    Read       | 339.29 MB/s    (662) | 371.23 MB/s    (362)
    Write      | 357.32 MB/s    (697) | 395.96 MB/s    (386)
    Total      | 696.61 MB/s   (1.3k) | 767.20 MB/s    (748)

    How can I check what the error is? Sometimes it's very fast, sometimes it's slow.

    Hi,

    deeeep consumer grade hardware:

    https://www.gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf

    => Warranty: Limited 5-year or 110TBW

    That's all normal for this kind of hardware...

    Then I think I should get a new NVMe SSD soon.
    Which one would be the best and fastest at a medium price? M.2 2280 form factor.

  • entry level motherboard with entry grade NVMe.

    looks normal.

    try to allocate memory for HMB and see if it improves.

  • layer7 Member, Host Rep, LIR
    edited December 2024

    @nszerver said:
    Then I think I should get a new NVMe SSD soon.
    Which one would be the best and fastest at a medium price? M.2 2280 form factor.

    Hi,

    we use Seagate Firecuda and WD SN700 mostly.

    Corsair MP510 or higher could also be an option.

    I am not sure if that's the price range you are looking for, but they all have quite high TBW ratings and they are definitely not slow.

    But you should definitely check what is causing this:

    Controller Busy Time: 640
    Power Cycles: 694
    Power On Hours: 3,610
    Unsafe Shutdowns: 632
    

    3600 power-on hours with 700 power cycles and 600 unsafe shutdowns? So an unclean shutdown with a power cycle roughly every 5 to 6 hours? Are you hard-resetting your server every few hours? ^^;

    This, together with the controller busy time which is according to Intel:

    "
    Controller Busy Time (in minutes)

    Contains the amount of time the controller is busy with I/O commands. The controller is busy when there is a command outstanding to an I/O Queue. (Specifically, a command was issued by way of an I/O Submission Queue Tail doorbell write and the corresponding completion queue entry has not been posted yet to the associated I/O Completion Queue.) This value is reported in minutes.
    "

    looks like your server has some very strange problem.

    Either your NVMe drive or whatever holds your M.2 drive (PCIe card or onboard) seems to be not OK.
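
The interval behind that question can be worked out directly from the three counters quoted in this post. A small sketch with illustrative variable names:

```python
# Counters copied from the SMART output quoted above.
POWER_ON_HOURS = 3_610
POWER_CYCLES = 694
UNSAFE_SHUTDOWNS = 632

# Average running time between power cycles and between unsafe shutdowns.
hours_per_cycle = POWER_ON_HOURS / POWER_CYCLES
hours_per_unsafe_shutdown = POWER_ON_HOURS / UNSAFE_SHUTDOWNS

# Share of all power cycles that ended without a clean shutdown.
unsafe_share = UNSAFE_SHUTDOWNS / POWER_CYCLES

print(f"{hours_per_cycle:.1f} h per power cycle")
print(f"{hours_per_unsafe_shutdown:.1f} h per unsafe shutdown")
print(f"{unsafe_share:.0%} of power cycles were unsafe")
```

These are averages over the drive's whole life, so they say nothing about when the unsafe shutdowns happened or which machine the drive was in at the time.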

  • @ralf said:
    As for the speed, look at the specs here: https://gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf

    Sequential write speed tops out at 550MB/s and sequential read speed at 1550MB/s. Your write speed isn't far from that for 1MB blocks, but the reads are quite slow. It might just be that the drive has a RAM cache and the quoted read speeds are the ideal case where the data is already in the cache.

    YABS tests mixed read/write simultaneously so the numbers will always be lower than specs

  • @nszerver said:
    is this acceptable in the case of nvme or could it be faulty?
    Product Name: A320M-S2H-CF

    Yes, for the drives like this one, such speeds are rather typical.

    Even the "major" brands have budget NVMe SSDs, which aren't much faster than the SATA ones. E.g. see Intel 600p.

  • @layer7 said: looks like your server has some very strange problem.

    it's not a server from what he wrote. at least nothing constantly powered on in some datacenter, but just some machine with used parts that now runs at his home.

    one can probably only speculate, what the usage scenario of that drive has been before it ended up in that box.

  • I'll put in a new SSD and see how it works.

  • nszerver Member
    edited December 2024

    New line in the error log.
    Error Information (NVMe Log 0x01, max 16 entries)
    Num ErrCount SQId CmdId Status PELoc LBA NSID VS
    0 1300 0 0x0013 0x4004 0x004 0 1 -
    1 1299 0 0x001d 0x4004 0x004 0 1
    nvme error-log /dev/nvme0
    Error Log Entries for device:nvme0 entries:16
    .................
    Entry[ 0]
    .................
    error_count : 1300
    sqid : 0
    cmdid : 0x13
    status_field : 0x4004(INVALID_FIELD)
    parm_err_loc : 0x4
    lba : 0
    nsid : 0x1
    vs : 0
    .................
    Entry[ 1]
    .................
    error_count : 1299
    sqid : 0
    cmdid : 0x1d
    status_field : 0x4004(INVALID_FIELD)
    parm_err_loc : 0x4
    lba : 0
    nsid : 0x1
    vs : 0
    .................
    Entry[ 2]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 3]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 4]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 5]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 6]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 7]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 8]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[ 9]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[10]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[11]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[12]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[13]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[14]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................
    Entry[15]
    .................
    error_count : 0
    sqid : 0
    cmdid : 0
    status_field : 0(SUCCESS)
    parm_err_loc : 0
    lba : 0
    nsid : 0
    vs : 0
    .................

  • @nszerver said:
    New line in the error log.
    Error Information (NVMe Log 0x01, max 16 entries)
    Num ErrCount SQId CmdId Status PELoc LBA NSID VS
    0 1300 0 0x0013 0x4004 0x004 0 1 -
    1 1299 0 0x001d 0x4004 0x004 0 1

  • Finally found the error.
    I added these parameters to the GRUB default command line:
    nvme_core.default_ps_max_latency_us=0 pcie_aspm=off
    and now there is no error.

    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
    Critical Warning: 0x00
    Temperature: 35 Celsius
    Available Spare: 100%
    Available Spare Threshold: 5%
    Percentage Used: 5%
    Data Units Read: 1,912,417 [979 GB]
    Data Units Written: 3,195,359 [1.63 TB]
    Host Read Commands: 26,606,154
    Host Write Commands: 70,588,430
    Controller Busy Time: 641
    Power Cycles: 694
    Power On Hours: 3,617
    Unsafe Shutdowns: 632
    Media and Data Integrity Errors: 0
    Error Information Log Entries: 1,300
    Warning Comp. Temperature Time: 0
    Critical Comp. Temperature Time: 0
    Temperature Sensor 1: 61 Celsius

    Error Information (NVMe Log 0x01, max 16 entries)
    No Errors Logged

  • Have you just rebooted? From googling, 0x4004 might be unrecognised data in some field, so probably the computer trying to speak a newer version of the NVMe protocol than your drive understands. If that's the case, and the computer is doing that twice during every boot, it neatly explains why the error count was about double the power cycle count.

    If so, try rebooting a few more times and see if it goes up by 2 every time. If so, I'd say, you don't have anything to worry about.
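
For reference, the two hex values from the error log can be unpacked by hand. A hedged sketch, assuming the bit layout as I read it in the NVMe base specification's error log entry (bit 0 phase tag, bits 8:1 status code, bits 11:9 status code type, bit 14 "more", bit 15 "do not retry"; parameter error location with the byte offset in bits 7:0 and the bit offset in bits 10:8). The function names are made up for illustration.

```python
def decode_status_field(status):
    """Split a 16-bit error-log status field into its parts (assumed layout)."""
    return {
        "phase_tag": status & 0x1,
        "status_code": (status >> 1) & 0xFF,
        "status_code_type": (status >> 9) & 0x7,
        "more": (status >> 14) & 0x1,
        "do_not_retry": (status >> 15) & 0x1,
    }


def decode_parm_err_loc(location):
    """Split the parameter error location into byte and bit offsets."""
    return {"byte": location & 0xFF, "bit": (location >> 8) & 0x7}


status = decode_status_field(0x4004)
location = decode_parm_err_loc(0x004)

# Error entries per power cycle, from the counters posted in the thread.
errors_per_cycle = 1299 / 694

print(status)
print(location)
print(f"{errors_per_cycle:.2f} error entries per power cycle")
```

Under that layout the status code is 0x02 in the generic type, which fits the INVALID_FIELD label nvme-cli printed, and the error location points at byte 4 of the command, which in the common command layout is where the namespace ID starts. The ratio of just under two entries per power cycle fits the "twice per boot" guess, but it does not prove it.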

  • @ralf said:
    Have you just rebooted? From googling, 0x4004 might be unrecognised data in some field, so probably the computer trying to speak a newer version of the NVMe protocol than your drive understands. If that's the case, and the computer is doing that twice during every boot, it neatly explains why the error count was about double the power cycle count.

    If so, try rebooting a few more times and see if it goes up by 2 every time. If so, I'd say, you don't have anything to worry about.

    Thank you. I added nvme_core.default_ps_max_latency_us=0 pcie_aspm=off to GRUB,
    ran update-grub,
    and rebooted.
    No errors now, and it is fast again.

    Thanked by 1ralf
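
For anyone repeating this, the edit can be sketched as a text transformation. This assumes a Debian-style setup where update-grub reads the GRUB_CMDLINE_LINUX_DEFAULT line from /etc/default/grub; the thread itself only mentions the "grub default cmd" and update-grub, so the file path and variable name are assumptions. The function `add_kernel_params` is hypothetical and works on a string only, it never touches the real file.

```python
import re

# Parameters reported as the fix in this thread.
PARAMS = ["nvme_core.default_ps_max_latency_us=0", "pcie_aspm=off"]


def add_kernel_params(grub_text, params):
    """Append missing params to GRUB_CMDLINE_LINUX_DEFAULT in grub config text."""

    def patch(match):
        current = match.group(2).split()
        # Skip parameters that are already present so reruns change nothing.
        missing = [p for p in params if p not in current]
        return f'{match.group(1)}"{" ".join(current + missing)}"'

    return re.sub(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"',
        patch,
        grub_text,
        flags=re.MULTILINE,
    )


# Made-up sample config, not the poster's actual file.
sample = 'GRUB_TIMEOUT=5\nGRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
patched = add_kernel_params(sample, PARAMS)
print(patched)
```

The first parameter turns off the drive's autonomous power state transitions and the second turns off PCIe link power management, so the likely trade-off is slightly higher idle power. After changing the real file, update-grub and a reboot are still needed, as described above.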
  • layer7 Member, Host Rep, LIR

    @nszerver said:

    @ralf said:
    Have you just rebooted? From googling, 0x4004 might be unrecognised data in some field, so probably the computer trying to speak a newer version of the NVMe protocol than your drive understands. If that's the case, and the computer is doing that twice during every boot, it neatly explains why the error count was about double the power cycle count.

    If so, try rebooting a few more times and see if it goes up by 2 every time. If so, I'd say, you don't have anything to worry about.

    Thank you. I added nvme_core.default_ps_max_latency_us=0 pcie_aspm=off to GRUB,
    ran update-grub,
    and rebooted.
    No errors now, and it is fast again.

    Hi,

    with that you should have actually seen something in the kernel log / dmesg... just for the future.

  • @MikeA said:
    Looks normal since that NVMe drive is very low end/old.
    https://www.gigabyte.com/SSD/GIGABYTE-NVMe-SSD-128GB#kf
    Sequential read speeds up to 1550 MB/s.
    Sequential write speeds up to 550 MB/s.

    Nobody should advertise NVMe and deliver a dedicated server with that drive in it.

    If they have to replace under warranty, this is bad for provider.
    If they charge remote hands to fix this, it'll be a cash maker and provider is an asshole.

  • chargeback yesterday

    Thanked by 1zGato