Dedicated Server - Hardware Monitoring ?

nvidian Member
edited April 2018 in General

Hello all,

I used to be a VPS/cloud user, but I now have one unmanaged dedicated server, a pretty old Xeon:

  • Xeon L5640 @ 2.27GHz (Westmere)
  • 48 GB ECC RAM
  • 2x SSD Samsung MZHPU256HC (possibly M.2 SSDs) - RAID 1
  • 4x HDD 3TB Hitachi HUS724030ALE641 - RAID 10

I installed Proxmox and split it into a couple of KVM guests, and it works very well.

What worries me most is that I don't have much knowledge of hardware issues. The provider doesn't offer hardware health monitoring in their panel. Again, forgive my ignorance; I was only a VPS user before, so I never learned how to maintain or monitor my own hardware.


My questions:


1. What should I check regularly to make sure my server stays in healthy condition? (Hardware monitoring and troubleshooting)

2. Is there any dedicated server provider that offers easy hardware monitoring (automatic notifications if something goes wrong, hardware health information, etc.)? I would prefer a low-end provider if possible.



Thank you guys

Comments

  • BlaZe Member, Host Rep

    Been using Pinguzo.com and it works like a charm. It's free in beta.

    I have more than 30 servers there being monitored for load, RAM, CPU, HDD, and network usage.

    Thanked by 1 nvidian
  • A lot of hardware failure is random. Certain things, like drives, are easier to predict. That's the appeal of cloud over bare metal: in a worst-case scenario, transferring a KVM guest on Proxmox could be done by a 5th grader in about 3 minutes.

  • Hardware is still the provider's responsibility; if anything dies and the server goes offline, you just create a ticket and wait. I doubt low-end servers have any redundancy, so just monitoring that the server is online will be enough.

    The only thing that is IMO worth doing is setting up smartd/mdmon (or whatever controller tool applies in the case of HW RAID) to send emails and, as usual, having offsite backups for the case when something goes really, really wrong.

    Thanked by 1 beagle
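
For Linux software RAID, the degraded-array condition that a monitoring daemon would email about can be sketched in a few lines of Python. This is a minimal sketch, not the daemon's own logic: it assumes the arrays are md devices listed in /proc/mdstat (hardware RAID needs the controller's own tool instead), and the function name and sample text are made up for illustration.

```python
import re

# An array header line looks like "md0 : active raid1 sdb1[1] sda1[0]".
_ARRAY = re.compile(r"^(md\d+)\s*:\s*(\S+)")
# The following line carries "[expected/active] [UU]"; "_" marks a missing member.
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that are running with missing members."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = _ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        status = _STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Illustrative sample only: a healthy RAID 1 and a RAID 10 with one member gone.
SAMPLE = """\
Personalities : [raid1] [raid10]
md0 : active raid1 sdb1[1] sda1[0]
      249928704 blocks super 1.2 [2/2] [UU]

md1 : active raid10 sdf1[3] sde1[2] sdc1[0]
      5860268032 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a real host the function could be given the contents of /proc/mdstat from a cron job, with a mail sent whenever the returned list is not empty.
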
  • Clouvider Member, Patron Provider

    @Gamma17 said:
    Hardware is still the provider's responsibility; if anything dies and the server goes offline, you just create a ticket and wait. I doubt low-end servers have any redundancy, so just monitoring that the server is online will be enough.

    By definition, bare metal is bare metal. You don’t get any redundancy; you’re free to build your own or pay for a solution that creates a resilient environment for you.

    The only thing that is IMO worth doing is setting up smartd/mdmon (or whatever controller tool applies in the case of HW RAID) to send emails and, as usual, having offsite backups for the case when something goes really, really wrong.

    SMART monitoring + email is the way to go. Looking through the messages/dmesg logs from time to time wouldn’t hurt either.
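
Reading through the kernel log by hand can be partly automated. The sketch below filters log text for a few phrases that commonly accompany disk or CPU/memory trouble. The pattern list is an assumption and only a starting point, and the function name and sample lines are illustrative; it is meant to be fed the output of dmesg or the contents of the messages log.

```python
import re

# Hypothetical starting list of phrases; extend it with what your hardware logs.
HARDWARE_PATTERNS = [
    r"I/O error",
    r"Hardware Error",
    r"Machine check",
    r"failed command",
    r"Medium Error",
]
_MATCHER = re.compile(
    "|".join(f"(?:{p})" for p in HARDWARE_PATTERNS), re.IGNORECASE
)


def suspicious_lines(log_text):
    """Return the log lines that contain any of the hardware-related phrases."""
    return [line for line in log_text.splitlines() if _MATCHER.search(line)]


# Illustrative sample only: two harmless lines and two that deserve a look.
SAMPLE_LOG = """\
[   12.345678] EXT4-fs (sda1): mounted filesystem with ordered data mode
[ 8123.000001] blk_update_request: I/O error, dev sdc, sector 194392064
[ 8200.500000] mce: [Hardware Error]: Machine check events logged
[ 9000.000000] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex
"""

if __name__ == "__main__":
    for line in suspicious_lines(SAMPLE_LOG):
        print(line)
```

A keyword filter like this will miss anything it has no pattern for, so it complements an occasional manual read of the log rather than replacing it.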

  • @Clouvider said:

    By definition, bare metal is bare metal. You don’t get any redundancy; you’re free to build your own or pay for a solution that creates a resilient environment for you.

    What I meant are things like redundant PSUs, redundant fans, etc., which may be worth monitoring just to make sure the provider does not miss a failure and let the server run degraded until it fails.
    But I highly doubt anything called a "low-end server" will have this.

    Thanked by 1 Clouvider
  • Neoon Community Contributor, Veteran
    edited April 2018

    My KS1, from 2013, had 2-3 network issues in 5 years, no hardware failure, nothing.
    If you put your stuff on a dedi, you'll be fine with backups.

    If something fails, you have a few hours of downtime once in 2-5 years, restore the backup, and you are fine.

    When this crap is not mission critical, it's fine.

    If you need something mission critical, where you're gonna lose billions of dullahs, do not use a single machine. Simple.

    I had 3 hardware failures in about 5 years across my dedis; I have about 10 dedis.

    To monitor, I would recommend: https://github.com/firehol/netdata

    If the server is dead, they check it; you cannot do anything.

    Thanked by 1 nvidian
  • Clouvider Member, Patron Provider

    @Gamma17 said:

    @Clouvider said:

    By definition, bare metal is bare metal. You don’t get any redundancy; you’re free to build your own or pay for a solution that creates a resilient environment for you.

    What I meant are things like redundant PSUs, redundant fans, etc., which may be worth monitoring just to make sure the provider does not miss a failure and let the server run degraded until it fails.
    But I highly doubt anything called a "low-end server" will have this.

    The 'low-end' E3-1270V5/V6 servers we sell on here have both N+1 PSUs and N+1 fans, so they are quite resilient for the standard.

    I presume the ancient Delimiter E55XX blades also have similar resiliency levels on PSUs and fans (which cannot be said about their DC, however...); I don't think HP was selling them without an N+1 config back then.

    So it's always worth checking with the supplier.

    And this reminds me, we should add it to our marketing :)

  • quadhost Member
    edited April 2018

    @Clouvider said:

    SMART monitoring + email is the way to go. Looking through the messages/dmesg logs from time to time wouldn’t hurt either.

    This.

    Zabbix is also useful for tracking historical stats, setting notifications for specific alerts, and viewing load/usage information.
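
As an illustration of what a SMART check looks at, here is a minimal sketch that assumes the ATA-style text printed by `smartctl -H -A /dev/sdX` (SAS and NVMe drives print different fields). The watched attribute names and the "any non-zero raw value" rule are assumptions chosen for illustration, not vendor thresholds; smartd or Zabbix would do the real alerting.

```python
import re

_HEALTH = re.compile(r"SMART overall-health self-assessment test result:\s*(\S+)")

# Assumed set of attributes worth watching; adjust for your drives.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def parse_smart(report):
    """Return (health verdict or None, {watched attribute name: raw value})."""
    health = None
    raw = {}
    for line in report.splitlines():
        verdict = _HEALTH.search(line)
        if verdict:
            health = verdict.group(1)
            continue
        fields = line.split()
        # Attribute rows have ten columns: ID, name, flag, value, worst,
        # thresh, type, updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            if fields[9].isdigit():
                raw[fields[1]] = int(fields[9])
    return health, raw


def needs_attention(report):
    """True if the verdict is missing or not PASSED, or a watched counter is non-zero."""
    health, raw = parse_smart(report)
    return health != "PASSED" or any(value > 0 for value in raw.values())


# Illustrative sample only: the drive passes, but has pending sectors.
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   097   097   000    Old_age   Always       -       21540
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       8
"""

if __name__ == "__main__":
    print(parse_smart(SAMPLE))
    print(needs_attention(SAMPLE))
```

The sample shows why the overall verdict alone is a weak signal: the drive still reports PASSED while a watched counter has already started to climb.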

  • Gamma17 Member
    edited April 2018

    @Clouvider said:

    The 'low-end' E3-1270V5/V6 servers we sell on here have both N+1 PSUs and N+1 fans, so they are quite resilient for the standard.

    I presume the ancient Delimiter E55XX blades also have similar resiliency levels on PSUs and fans (which cannot be said about their DC, however...); I don't think HP was selling them without an N+1 config back then.

    So it's always worth checking with the supplier.

    And this reminds me, we should add it to our marketing :)

    That's good to know.

    And probably true for most blade/"cloud"/whatever systems where there is a single chassis that provides all the power, cooling, etc. to all the systems inserted. But in this case the customer also has no access to this chassis and no way to monitor PSUs and fans. And I bet it is monitored and fixed as needed without the customer ever noticing a thing...

    What I was thinking about are cheap 1-2U servers, which often come with a single PSU (with redundancy/hotplug being optional), and desktop-grade systems, which are used in a lot of cheap offers. And what made me think of a 1-2U server is the 4 3.5" HDDs + 2 SSDs in the OP...

    Thanked by 1 nvidian
  • Clouvider Member, Patron Provider

    In the MicroCloud or MicroBlade products, the hardware statuses for shared components are fed directly to each blade, so one can still watch them if they like. They also get IPMI access, which presents this diagnostic information in a clear and easy form. But I agree: good operators would monitor these chassis remotely and likely respond before the customer ever notices.

    Thanked by 1 nvidian
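
Where IPMI is available, sensor states can also be read from the OS and checked by a script. The sketch below assumes the three-column `name | reading | status` text that `ipmitool sdr` prints, and treats any status other than ok or ns (no reading) as a problem. The function name and sample readings are illustrative only.

```python
def failing_sensors(sdr_text, ignore=("ok", "ns")):
    """Return (name, reading, status) for every sensor whose status is not ignored."""
    problems = []
    for line in sdr_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, reading, status = parts
        if status.lower() not in ignore:
            problems.append((name, reading, status))
    return problems


# Illustrative sample only: one fan reports a critical state.
SAMPLE_SDR = """\
CPU Temp         | 41 degrees C      | ok
FAN1             | 5100 RPM          | ok
FAN2             | 300 RPM           | cr
FAN3             | no reading        | ns
PS1 Status       | 0x01              | ok
"""

if __name__ == "__main__":
    for name, reading, status in failing_sensors(SAMPLE_SDR):
        print(f"{name}: {reading} ({status})")
```

Ignoring ns keeps unpopulated fan headers from raising alarms, at the cost of also hiding a sensor that has stopped reporting.
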
  • @BlaZe said:
    Been using Pinguzo.com and it works like a charm. It's free in beta.

    I have more than 30 servers there being monitored for load, RAM, CPU, HDD, and network usage.

    Saw it a long time ago, but never got a chance to try it. Another Softaculous product.

  • BlaZe Member, Host Rep

    @andiklive said:
    Saw it a long time ago, but never got a chance to try it. Another Softaculous product.

    Yes, and it's getting better day by day. I had some issues with it earlier, but they fixed them.

  • @BlaZe said:

    @andiklive said:
    Saw it a long time ago, but never got a chance to try it. Another Softaculous product.

    Yes, and it's getting better day by day. I had some issues with it earlier, but they fixed them.

    I really expect its cost to be affordable for small VPSes.

  • wavecomas Member, Host Rep

    The cheapest and easiest option is SNMP monitoring.

    Just install an SNMP agent on the host and your VMs.
    Most brand-name servers have additional SNMP agents and/or remote management like HPE iLO or Dell iDRAC. These let you get more detailed info about the hardware.

    Make another VM and install some free monitoring software. All the GPL-licensed ones demand some knowledge, experience and, most importantly, time. There is a really nice piece of software called ManageEngine OpManager. It's free for 10 devices, setup is really easy, and the documentation is good for a beginner.

    After that you are able to see and monitor pretty much everything that is going on with your host hardware, the VMs, the host OS, and the applications. You can also set many different notifications.

    On the second question: IMO there are a lot of providers who give access to server remote management like iLO, iDRAC, etc. With the better brands there are notification options.
    But such options are really basic compared to a monitoring system. And notifications are quite a new feature with most brands, so I guess they are not present on your server.

    Thanked by 1 nvidian
  • I'm sorry for the late response. I really appreciate the help from you guys.

    @BlaZe said:
    Been using Pinguzo.com and it works like a charm. It's free in beta.

    I have more than 30 servers there being monitored for load, RAM, CPU, HDD, and network usage.

    Thank you.

    @wavecomas said:
    The cheapest and easiest option is SNMP monitoring.

    Just install an SNMP agent on the host and your VMs.
    Most brand-name servers have additional SNMP agents and/or remote management like HPE iLO or Dell iDRAC. These let you get more detailed info about the hardware.

    Thank you, I think I will try SNMP monitoring first.

    On the second question: IMO there are a lot of providers who give access to server remote management like iLO, iDRAC, etc. With the better brands there are notification options.
    But such options are really basic compared to a monitoring system. And notifications are quite a new feature with most brands, so I guess they are not present on your server.

    @Neoon said:
    To Monitor I would recommend: https://github.com/firehol/netdata

    if the server is dead, they check it, you cannot do anything.

    Thank you for your suggestion. I will surely try SNMP, netdata and Pinguzo.

    @Clouvider said:
    In the MicroCloud or MicroBlade products, the hardware statuses for shared components are fed directly to each blade, so one can still watch them if they like. They also get IPMI access, which presents this diagnostic information in a clear and easy form. But I agree: good operators would monitor these chassis remotely and likely respond before the customer ever notices.

    It's a Supermicro server, and I have already set the "Warning and Above" alert level to send to my email. Do you think that's sufficient?
