Should I ask for a new drive in my Hetzner slot?

Hi.
Lately I have been having several problems with my Hetzner server (they may be due to my inexperience and unrelated to this), so I started testing the system and found these SMART results:

https://pastebin.com/r73sJzaQ

https://pastebin.com/FVfAYJpi

I can't tell how bad these results are. Since I plan to do a clean install of Debian anyway because of dependency errors I cannot resolve, should I take the opportunity to request a drive replacement, or should I leave it as it is?
Thanks!

Comments

  • deank Member, Troll
    edited May 2019

    Your drive looks fine to me.

    But, sure, ask for a replacement. What have you got to lose? A "No" is the only bad result.
    Be careful what you wish for though cuz there is a chance that you will get an older drive.

  • @deank said:
    Your drive looks fine to me.

    But, sure, ask for a replacement. What have you got to lose? A "No" is the only bad result.
    Be careful what you wish for though cuz there is a chance that you will get an older drive.

    Thanks for answering.
    What do those 2151 errors that SMART detected mean?

  • Yura Member

    @deank said:
    But, sure, ask for a replacement. What have you got to lose? A "No" is the only bad result.

    also

    @deank said: there is a chance that you will get an older drive.

  • deank Member, Troll
    edited May 2019

    It could simply be a count of the errors the drive has encountered, which can happen due to various factors.

    @Redondit0 said:
    Thanks for answering.
    What do those 2151 errors that SMART detected mean?

  • willie Member

    They typically won't replace drives unless they are really failing or failed.

  • Neoon Community Contributor, Veteran
    edited May 2019

    @willie said:
    They typically won't replace drives unless they are really failing or failed.

    You can replace all hardware anytime at Hetzner for a fee.
    But if the hardware is clearly fucked, it gets replaced for free.

    Well, Reallocated_Sector_Ct is at 0, so Hetzner most likely won't replace it.
    If you get bad sectors on a disk, they will surely replace it.

  • @Neoon said:

    @willie said:
    They typically won't replace drives unless they are really failing or failed.

    You can replace all hardware anytime at Hetzner for a fee.
    But if the hardware is clearly fucked, it gets replaced for free.

    Well, Reallocated_Sector_Ct is at 0, so Hetzner most likely won't replace it.
    If you get bad sectors on a disk, they will surely replace it.

    Um, if it can't pass a short offline test, I think it can be replaced for free... That drive has 60k hours and is clearly showing read errors. I would make a backup and ask for a replacement.

    The end is nigh!

  • deank Member, Troll
    edited May 2019

    I will take a 7-year-old HDD over a brand new HDD.

    Errors can and will happen due to various factors. I wouldn't care for the error count.

  • willie Member

    Oh ok if it's failing self test then that's not good. Yeah open a ticket.

  • Neoon Community Contributor, Veteran

    @TheLinuxBug said:
    Um, if it can't pass a short offline test, I think it can be replaced for free... That drive has 60k hours and is clearly showing read errors. I would make a backup and ask for a replacement.

    Of course, if the machine dies, 60k hours is a decent age; nothing wrong with that.
    But yeah, some values look a bit suspicious if you look at them more closely.

  • rm_ IPv6 Advocate, Veteran
    edited May 2019

    deank said: Your drive looks fine to me.

    No

    deank said: It could simply be a count of the errors the drive has encountered, which can happen due to various factors.

    No

    Reported_Uncorrect (187) is a very bad symptom, e.g. Backblaze replaces the disk immediately in their DC if this goes above zero.

    It's not some abstract "errors" due to "various factors"; it's when you asked the disk for data and it could no longer read the data it had stored on the platters. Irrecoverable data loss. You're really fine with a disk doing that?

  • deank Member, Troll

    Probably because the disks on my storage rig have higher numbers.

  • Thanks for the answers.
    Tonight I'll run a long test and leave it running until tomorrow to see if anything else comes out.
    I will most likely ask for a disk change. After all, as deank said, the worst I can get is a no, and even if they give me a disk with more hours but without those errors, I think I come out ahead.

  • deank Member, Troll
    edited May 2019

    Indeed, nothing to lose by asking, but all disks will have errors. That has been my experience. Considering the hours on yours (60k) and the number of read errors (2k), I'd say it's actually pretty good.
    I've seen far worse cases.

  • TheLinuxBug Member
    edited May 2019

    @deank said:
    Indeed, nothing to lose by asking, but all disks will have errors. That has been my experience. Considering the hours on yours (60k) and the number of read errors (2k), I'd say it's actually pretty good.
    I've seen far worse cases.

    You have some weird standards for what you think is safe for disks, even in a raid.
    If I see:
    A. Uncorrectable CRC error (Reported Uncorrected) > 1
    B. Failed short offline read test
    C. Failed long offline read test (Especially)
    D. UDMA_CRC_Error_Count > 1
    E. Current_Pending_Sector > 1

    Then I am changing the disk in my own array ASAP.
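
    A minimal Python sketch of the attribute part of that checklist (items A, D and E), assuming the attribute table printed by smartctl -A /dev/sdX has already been captured as text. The sample rows and their values are invented for illustration and are not taken from the pastebin output above; parse_attributes and flag_attributes are hypothetical helper names, and the limits simply mirror the "> 1" written in the list. The self-test checks (B and C) are not covered here.

```python
# Limits mirror the "> 1" written in the checklist; set them to 0 for a stricter policy.
WATCH_LIMITS = {
    "Reported_Uncorrect": 1,      # item A
    "UDMA_CRC_Error_Count": 1,    # item D
    "Current_Pending_Sector": 1,  # item E
}

# Invented sample rows in the layout of the smartctl attribute table (not real data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       123456789
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   090   090   000    Old_age   Always       -       10
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""


def parse_attributes(text):
    """Return {attribute_name: raw_value} from attribute-table text."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        try:
            attrs[parts[1]] = int(parts[9])
        except ValueError:
            continue  # raw value is not a plain integer; skipped in this sketch
    return attrs


def flag_attributes(attrs, limits=WATCH_LIMITS):
    """Return the sorted names of watched attributes whose raw value exceeds its limit."""
    return sorted(name for name, limit in limits.items() if attrs.get(name, 0) > limit)


if __name__ == "__main__":
    print(flag_attributes(parse_attributes(SAMPLE)))
```

    With the invented sample, Reported_Uncorrect and Current_Pending_Sector are flagged and UDMA_CRC_Error_Count is not.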

    Raw_Read_Error_Rate on Seagate drives, however, can generally be ignored because they often use this value for diagnostics and it will randomly change to different numbers, even between times running 'smartctl'.

    If you see that on a Hitachi drive though, then it's starting to die.

    I also prefer to run Hitachi Enterprise drives over anything Seagate because of the weird crap their SMART will report (such as Raw_Read_Error_Rate) and generally longer lifetime (generally live to be about 70-80k hours or sometimes more with regular wear).

    My 2 cents.

    Cheers!

  • levnode Member
    edited May 2019

    Neoon said: You can replace all hardware anytime at Hetzner for a fee. But if the hardware is clearly fucked, it gets replaced for free.

    Well, Reallocated_Sector_Ct is at 0, so Hetzner most likely won't replace it. If you get bad sectors on a disk, they will surely replace it.

    Reallocated_Sector_Ct at 0 does not always mean the disk is OK, and Reallocated_Sector_Ct > 0 does not mean the disk is bad. This is an HDD, so there may be some issue with the head.

    From my point of view, this disk is seriously damaged and I am pretty sure that Hetzner will replace it for you for free.

  • rm_ IPv6 Advocate, Veteran
    edited May 2019

    TheLinuxBug said: UDMA_CRC_Error_Count > 1

    These can be fine and just show the disk had a bad cable connection some time in the past.

    levnode said: Reallocated_Sector_Ct > 0 does not mean the disk is bad.

    Yes, if there are just a few and they do not increase. But that's for a drive you own; in a DC setting there's arguably little reason to tolerate even that, if the DC is known to accept that as grounds for replacement.
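
    To make "they do not increase" concrete, here is a minimal Python sketch that compares two snapshots of raw attribute values taken at different times and reports the counters that grew. The dictionaries and the helper name grown_counters are hypothetical, and the numbers are made up for illustration.

```python
def grown_counters(previous, current,
                   names=("Reallocated_Sector_Ct", "Current_Pending_Sector")):
    """Return {name: (old, new)} for watched counters whose raw value increased."""
    grown = {}
    for name in names:
        old = previous.get(name, 0)
        new = current.get(name, 0)
        if new > old:
            grown[name] = (old, new)
    return grown


# Made-up snapshots, e.g. raw values noted a week apart.
last_week = {"Reallocated_Sector_Ct": 8, "Current_Pending_Sector": 0}
today = {"Reallocated_Sector_Ct": 24, "Current_Pending_Sector": 0}

if __name__ == "__main__":
    print(grown_counters(last_week, today))
```

    An empty result means none of the watched counters moved between the two snapshots.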

  • Hetzner_OL Member, Top Host

    willie said: They typically won't replace drives unless they are really failing or failed.
    Neoon said: You can replace all hardware anytime at Hetzner for a fee.

    Hi everyone, it's true. We exchange hardware components if the hardware is broken or doesn't perform well. Customers can request this in a support ticket. They should have log files from their test results available, because our technicians may insist on log files as proof, especially when the hardware in question is a drive. Before we replace the hardware, our technicians ask the customer whether they would like a new drive with a maximum of 1000 hours of running time. However, this option is only available for a fee.

    Redondit0 said: Tonight I'll run a long test and leave it running until tomorrow to see if anything else comes out.

    Redondit0 said: I will most likely ask for a disk change. After all, as deank said, the worst I can get is a no, and even if they give me a disk with more hours but without those errors, I think I come out ahead.

    Make sure you send the results of your long test together with your hardware change request. Our technicians will then decide whether to replace your drive. I hope this clarifies our disk change process. If you have further questions, let me know. :)
    --Julia, Marketing

  • deank Member, Troll

    What happened to Katie?

  • The long test has finished, but now I have a question.
    Where do I see the results?
    When I started the test, this was all the output I got:

    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
    Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 448 minutes for test to complete.
    Test will complete after Fri May 3 04:17:59 2019

    Use smartctl -X to abort test.

    I left it running, expecting it to point me to the complete log at the end, but it did not.

  • deank Member, Troll

    Well, that's not a result. That's for sure.

    Just send the SMART output to support. That's generally enough.

  • rm_ IPv6 Advocate, Veteran
    edited May 2019

    Redondit0 said: Where do I see the results?

    In the output of smartctl -a /dev/sdX, look for "Self-test execution status"; once the test completes, the results appear in the section below the line that says "SMART Self-test log structure".
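
    As a rough illustration of what that section looks like and how it could be read programmatically, here is a small Python sketch. It assumes the usual column layout of the self-test log printed by smartctl -a; the sample text is invented for illustration and is not the output of the drive discussed in this thread, and parse_selftest_log is a hypothetical helper name.

```python
import re

# Invented sample in the layout smartctl prints for the self-test log (not real data).
SAMPLE = """\
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     60012         123456789
# 2  Short offline       Completed without error       00%     59990         -
"""

# One log row: "# N", test name, status text, remaining %, power-on hours, first bad LBA.
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>.+?)\s{2,}(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)\s*$"
)


def parse_selftest_log(text):
    """Return one dict per self-test entry found after the log header line."""
    entries = []
    in_log = False
    for line in text.splitlines():
        if line.startswith("SMART Self-test log structure"):
            in_log = True
            continue
        if not in_log:
            continue
        match = ROW.match(line)
        if match:
            entries.append({
                "num": int(match["num"]),
                "test": match["test"],
                "status": match["status"],
                "remaining_pct": int(match["remaining"]),
                "lifetime_hours": int(match["hours"]),
                "lba_of_first_error": match["lba"],
            })
    return entries


if __name__ == "__main__":
    for entry in parse_selftest_log(SAMPLE):
        print(entry["num"], entry["test"], "->", entry["status"])
```

    Any status other than "Completed without error" on a finished test, together with a value in the LBA_of_first_error column, is the kind of line worth attaching to a support ticket.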

  • @rm_ said:

    Redondit0 said: Where do I see the results?

    In the output of smartctl -a /dev/sdX, look for "Self-test execution status"; once the test completes, the results appear in the section below the line that says "SMART Self-test log structure".

    Thanks!

    I thank everyone for the advice, opinions and experience you have shared with me.
    Given that I was planning to reinstall the OS anyway, and that rclone takes forever to upload the 5 TB to Google, I decided to request a new server and transfer the data directly from server to server, taking the opportunity to look for a slightly better processor. What do you recommend? An i7-3770, an i7-2600 or a Xeon E3-1245?
    It is for Plex (3 simultaneous streams at most), torrents and rar/unrar.

  • rm_ said: These can be fine and just show the disk had a bad cable connection some time in the past.

    This is true, but hopefully, if it is your own setup, you will know what event coincided with that counter going up, such as a power outage, or maintenance after which the SATA cable wasn't connected well on the drive's first boot. Yes, there are reasons it could go up that are unrelated to the drive going bad. However, in a DC environment, if you're seeing this go up, then at minimum you should request that they replace the SATA cable first; if it continues, I would definitely ask for a replacement.

    Good point though, as I didn't define that very well.

    Cheers!

  • akhfa Member
    edited May 2019

    I had a bad disk with them once on my auction server, and they replaced it with a new SSD about six hours after my inquiry :)

  • Bayu Member
    edited May 2019

    @Redondit0 said:
    What do you recommend? An i7-3770, an i7-2600 or a Xeon E3-1245?
    It is for Plex (3 simultaneous streams at most), torrents and rar/unrar.

    In terms of video encoding speed (x264), the i7-3770 is better than the E3-1245. That also holds with GPU-accelerated encoding (Intel Quick Sync) for Plex.

  • Milon Member

    Maybe somebody can help me with advice. On a home Ubuntu server I use an HDD from an old notebook. From time to time it makes a "ding" sound (maybe head parking or something similar). hdparm shows that Advanced Power Management is off.

    Smart:
    https://pastebin.com/5JrJaU6p
    P.S. It looks like Seek_Error_Rate and Multi_Zone_Error_Rate are increasing quickly.
    Is the patient dying? :)

  • Edding Member

    I would ask for a replacement... that HDD is dying.

  • Falzo Member
    edited May 2019

    It does not look close to dying to me. The sectors are all good and it just passed an extended offline test without problems.

    Rapidly changing numbers might depend on the firmware. Some vendors use these fields for diagnostic stuff as has been mentioned above.

  • deank Member, Troll
    edited May 2019

    When an HDD is dying, you or your client(s) will know. Trust WSS.
