Comments
Generally, from my experience, they will ask you for the serial number of the failing drive.
Good question, I thought it was an automated approach for @Hetzner_OL.
They will ask you for the serial number of the affected disk.
It's always a good idea to keep a current copy of the smartctl -i /dev/sdX output for every drive in the server.
Because if the drive fails, you probably won't get any SMART information from it.
But you can also do it through the process of elimination.
If you have a 4-drive RAID10 and one of the drives drops out, you can still get the SMART information for the other 3. So the person swapping out the drive would know it's the drive that doesn't match any of the 3 serial numbers provided.
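A minimal sketch of keeping that serial inventory, assuming smartmontools is installed and the script runs as root. The function names and the UNREADABLE placeholder are made up for illustration; the parsing only relies on the "Serial Number:" (or "Serial number:") line that smartctl -i prints.

```shell
#!/bin/sh
# Print the serial number found in `smartctl -i` output read from stdin.
# Accepts both "Serial Number:" (ATA/NVMe) and "Serial number:" (SCSI/SAS).
serial_from_smartctl() {
    sed -n 's/^Serial [Nn]umber:[[:space:]]*//p'
}

# Print one "device serial" line for every device passed as an argument.
# A drive that no longer answers is listed as UNREADABLE (hypothetical marker).
list_serials() {
    for dev in "$@"; do
        serial=$(smartctl -i "$dev" 2>/dev/null | serial_from_smartctl)
        printf '%s %s\n' "$dev" "${serial:-UNREADABLE}"
    done
}
```

For example, list_serials /dev/sd? > drive-serials.txt (the file name is arbitrary) saves a snapshot while every drive is still healthy, so the serial numbers are on hand when a drive later stops responding.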
But how do you get notified when one drive in a RAID 10 fails, since the filesystem continues to work as normal?
There’s actually a video on YouTube shot in the Hetzner datacenter that mentions they have some kind of special tools just to test the drives 24x7.
Write a script to periodically check
cat /proc/mdstat
If a drive drops, one of the U's in the status string will be replaced with a _ (for example, [UUUU] becomes [UUU_]).
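A rough sketch of that check, assuming the usual /proc/mdstat layout where each array has a status string such as [UUUU]. The function name is made up, and the optional file argument exists only so the check can be tried against a saved copy.

```shell
#!/bin/sh
# Succeed (exit status 0) if any array status string contains a "_",
# e.g. [UUU_], which marks a missing or failed member.
# $1 is an optional path to read instead of /proc/mdstat.
mdstat_degraded() {
    grep -Eq '\[[U_]*_[U_]*\]' "${1:-/proc/mdstat}"
}
```

Run from cron, something like mdstat_degraded && echo "RAID degraded on $(hostname)" would produce output only when a member is missing, and cron mails that output if mail delivery is set up on the box.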
Or run a script periodically that checks for changes:
grep 'blocks' /proc/mdstat
Store a copy of that output in a file. Then, on the next periodic check, compare the new output to what's in the file. If it has changed, you may have had a drive drop out. It's probably not an exact science, but it can give you enough notice to log into the server and check things out first hand.
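The snapshot-and-compare idea could look roughly like this. The function name and arguments are hypothetical, and the state file location is whatever you choose. The first run only records a baseline.

```shell
#!/bin/sh
# Compare the 'blocks' lines of mdstat against a stored snapshot.
#   $1 = state file that holds the previous snapshot
#   $2 = optional path to read instead of /proc/mdstat
# Exit status 0 means the output changed since the last check.
mdstat_changed() {
    state_file=$1
    mdstat_file=${2:-/proc/mdstat}
    current=$(grep 'blocks' "$mdstat_file")

    # First run: nothing to compare against, so just store the baseline.
    if [ ! -f "$state_file" ]; then
        printf '%s\n' "$current" > "$state_file"
        return 1
    fi

    previous=$(cat "$state_file")
    if [ "$current" != "$previous" ]; then
        # Remember the new state so the same change is reported only once.
        printf '%s\n' "$current" > "$state_file"
        return 0
    fi
    return 1
}
```

Updating the snapshot after a change is a design choice: it means one alert per change rather than one alert on every run until the array is repaired. Drop that line if repeated reminders are preferred.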
Add your email address to /etc/mdadm.conf (as a MAILADDR line; on some distros the file is /etc/mdadm/mdadm.conf) and mdadm's monitor will email you when there is a failure.
You can either:
1. Give them the serial number of the dead drive
2. Give them the serial numbers of the drives that are still working, and they'll remove the ones that are not included in the list
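If you saved a list of all serial numbers while the array was healthy, working out the dead drive by elimination is a one-liner. This is only a sketch: the function name is made up, and both files are assumed to hold one serial number per line.

```shell
#!/bin/sh
# Print every serial from the full inventory that is NOT among the
# serials still readable, i.e. the drive(s) that dropped out.
#   $1 = file with all serials recorded while the array was healthy
#   $2 = file with the serials that still respond now
missing_serials() {
    grep -vxFf "$2" "$1"
}
```

With four serials in the inventory and three still responding, the single line printed is the serial to put in the support ticket.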