Dedicated Server - Hardware Monitoring ?
Hello all,
I used to be a VPS/Cloud user, but I now have one unmanaged dedicated server - a pretty old Xeon:
- Xeon L5640 @ 2.27GHz (Westmere)
- RAM 48 GB ECC
- 2x SSD Samsung MZHPU256HC (maybe M.2 SSD) - RAID 1
- 4x HDD 3TB Hitachi HUS724030ALE641 - RAID 10
I installed Proxmox and split it into a couple of KVM guests, and it works very well.
What worries me most is that I don't have much knowledge of hardware issues. The provider doesn't offer hardware health monitoring in their panel. Again, forgive my ignorance: I was only a VPS user before, so I never learned how to maintain or monitor my own hardware.
My questions:
1. What should I check regularly to make sure my server is in healthy condition? (Hardware monitoring and troubleshooting)
2. Is there any dedicated server provider that offers easy hardware monitoring (automatic notification if something goes wrong, hardware health information, etc.)? I would prefer a low-end provider if possible.
Thank you guys
Comments
Been using Pinguzo.com and it works like a charm. It's free in beta.
I have 30+ servers there being monitored for load, RAM, CPU, HDD and network usage.
A lot of hardware failure is random. Certain things, like drives, are easier to predict. That's the appeal of cloud over bare metal: in a worst-case scenario, transferring a KVM guest on Proxmox could be done by a 5th grader in about 3 minutes.
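On the point that drive failures are easier to predict: a minimal Python sketch, assuming ATA drives and the attribute table printed by `smartctl -A /dev/sdX`. The function name and the watch list are illustrative assumptions, and vendors do not all report the same attributes, so treat a non-zero value as a hint to look closer rather than a verdict.

```python
# Attribute names commonly seen on ATA drives. Vendors differ, so this
# watch list is an assumption, not a standard.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def failing_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value.

    Expects the text printed by `smartctl -A`.
    """
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found
```

You could save the `smartctl -A` output to a file from cron and pass its contents to `failing_attributes`, mailing yourself whenever the result is not empty.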
Hardware is still the provider's responsibility: if anything dies and the server goes offline, you just create a ticket and wait. I doubt low-end servers have any redundancy, so just monitoring that the server is online will be enough.
The only thing that is IMO worth doing is setting up smartd/mdmon (or whatever controller tool applies in the case of HW RAID) to send emails and, as usual, having offsite backups for the case when something goes really, really wrong.
By definition, bare metal is bare metal. You don’t get any redundancy; you’re free to build your own or pay for a solution that creates a resilient environment for you.
SMART monitoring + email is the way to go. Looking through the messages/dmesg logs from time to time wouldn’t hurt either.
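Reading logs by hand gets old quickly, so here is a small Python sketch of filtering kernel log lines for hardware-looking trouble. The keyword list is my own assumption and far from exhaustive, and the function name is hypothetical. Tune both to what your own kernel actually prints.

```python
import re

# Fragments that often appear in kernel messages about disk, memory or CPU
# trouble. This list is an assumption, not a complete catalogue.
PATTERNS = re.compile(
    r"I/O error|Hardware Error|Machine check|EDAC|medium error",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the log lines that match any of the hardware-related patterns."""
    return [line.rstrip("\n") for line in lines if PATTERNS.search(line)]
```

Feed it the lines of your messages log or the output of `dmesg`, for example from a daily cron job that mails the result only when it is not empty.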
What I meant are things like redundant PSUs, redundant fans, etc., which may be worth monitoring just to make sure the provider does not miss a failure and let the server run degraded until it fails.
But I highly doubt anything called a "low-end server" will have this.
My KS1, from 2013, had 2-3 network issues in 5 years, no hardware failure, nothing.
If you put your stuff on a dedi, you'll be fine with backups.
If something fails, you have a few hours of downtime once in 2-5 years, restore the backup and you are fine.
When this crap is not mission critical, it's fine.
If you need something mission critical, where downtime means you lose billions of dollars, do not rely on a single machine. Simple.
I had 3 hardware failures in about 5 years across my dedis; I have about 10 of them.
To monitor, I would recommend: https://github.com/firehol/netdata
If the server is dead, they check it; you cannot do anything.
The 'low-end' E3-1270V5/V6 we sell on here have both N+1 PSU and N+1 Fans so they are quite resilient for the standard.
I presume the ancient Delimiter E55XX blades also have similar resiliency levels on PSUs and fans (which cannot be said about their DC, however...). I don't think HP was selling them without an N+1 config back then.
So always worth checking with the supplier.
And this reminds me, we should add it to our marketing.
This.
Zabbix is also useful for tracking historical stats, setting notifications for a given alert, and viewing load/usage information.
That's good to know.
And probably true for most blade/"cloud"/whatever systems where there is a single chassis that provides all the power, cooling, etc. to all the systems inserted. But in this case the customer also has no access to this chassis and no way to monitor PSUs and fans. And I bet it is monitored and fixed as needed without the customer ever noticing a thing...
What I was thinking about are cheap 1-2U servers, which often come with a single PSU (with redundancy/hotplug being optional), and desktop-grade systems, which are used in a lot of cheap offers. And what made me think of a 1-2U server is the 4 3.5" HDDs + 2 SSDs in the OP...
In the MicroCloud or MicroBlade products, the hardware statuses for shared components are fed directly to each blade, so one can still watch them if they like. Customers also get IPMI access, which presents this diagnostic information in a clear and easy form. But I agree: good operators would monitor these chassis remotely and likely respond before the customer ever notices.
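For anyone who does have IPMI access, the sensor list can be checked from a script as well. This is a hedged Python sketch that assumes the three-column `name | reading | status` layout printed by `ipmitool sdr`. The function name is hypothetical, and the exact status codes and sensor names vary by board, so check what your own BMC reports first.

```python
def unhealthy_sensors(sdr_output):
    """Return (sensor_name, reading, status) tuples for sensors not reporting 'ok'.

    Expects lines shaped like "FAN1 | 4500 RPM | ok".
    """
    problems = []
    for line in sdr_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue
        name, reading, status = fields
        # 'ns' means no reading (for example an unpopulated fan header), so skip it.
        if status not in ("ok", "ns"):
            problems.append((name, reading, status))
    return problems
```

Anything this returns (fans, temperatures, PSU sensors) is worth a ticket to the provider before the part fails completely.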
Saw it a long time ago, but never got a chance to try it. Another Softaculous product.
Yes, and it's getting better day by day. I had some issues with it earlier but they fixed them.
I really expect its cost to be affordable for small VPSes.
The cheapest and easiest option is SNMP monitoring.
Just install an SNMP agent on the host and your VMs.
Most brand-name servers have additional SNMP agents and/or remote management such as HPE iLO or Dell iDRAC. These let you get more detailed info about the hardware.
Create another VM and install some free monitoring software. All the GPL-licensed ones demand some knowledge, experience and, most importantly, time. There is a really nice piece of software called ManageEngine OpManager. It's free for 10 devices, setup is really easy and the documentation is good for a beginner.
After that you are able to see and monitor pretty much everything that is going on with your host hardware, VMs, host OS and applications. You can also set a wide variety of notifications.
On the second question - IMO there are a lot of providers that give access to server remote management such as iLO, iDRAC, etc. With the better brands there are notification options.
But such options are really basic compared to a monitoring system. And notifications are quite a new feature with most brands, so I guess they are not present on your server.
I'm sorry for the late response. I really appreciate the help from you guys.
Thank you.
Thank you, I think I will try SNMP monitoring first.
Thank you for your suggestion. I will surely try SNMP, netdata and Pinguzo.
It's a Supermicro server - I already set the "Warning and Above" alert level to go to my email. Do you think that's sufficient?