OVH Quality!

AlbaHost Member, Host Rep

Strange, the node keeps going offline. It is still pingable, but SSH, IPMI, KVM and everything else are inaccessible. I hired a sysadmin who could not find any errors in the system logs. I contacted OVH support and they told me to run a hardware check in rescue mode, which I did; it found no errors. Before the node crashed I got this:

[root@server ~]# top -c
top - 19:19:24 up 4:58, 2 users, load average: 27.10, 25.56, 19.41
Tasks: 371 total, 5 running, 366 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.8%us, 0.3%sy, 0.0%ni, 26.5%id, 65.7%wa, 0.0%hi, 0.1%si, 0.6%st
Mem: 1650984k total, 1006608k used, 644376k free, 302324k buffers
Swap: 6288372k total, 0k used, 6288372k free, 51952k cached


Comments

  • Do you have a dedicated server with OVH?

    So what you're saying is that your dedicated server randomly locks up?

  • rm_ IPv6 Advocate, Veteran

    Why does your 'top' show only 1.65 GB of RAM?

  • GoodHosting Member
    edited August 2014

    That 70% wait is because you have too much I/O going on.

    Have you checked that your hard drives aren't failing?

    for d in /dev/sd?; do smartctl -a "$d"; done | less

    Thanked by 1: netomx
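For reference, the `65.7%wa` field in the top output at the start of the thread is the share of CPU time spent idle while waiting on I/O, which is what the "70% wait" remark refers to. A minimal shell sketch for pulling that number out of a saved `Cpu(s):` line follows. The function name `iowait_pct` and the threshold of 50 are illustrative assumptions, and the pattern assumes the older procps layout shown in this thread (`65.7%wa`), not the newer `65.7 wa` layout.

```shell
#!/bin/sh
# Hypothetical helper: print the I/O-wait percentage from one "Cpu(s):" line
# of top output. Assumes the older "NN.N%wa" field layout seen in this thread.
iowait_pct() {
    printf '%s\n' "$1" | sed -n 's/.*[ ,]\([0-9.][0-9.]*\)%wa.*/\1/p'
}

# The Cpu(s) line posted in this thread.
line='Cpu(s): 6.8%us, 0.3%sy, 0.0%ni, 26.5%id, 65.7%wa, 0.0%hi, 0.1%si, 0.6%st'
wa=$(iowait_pct "$line")

# The threshold of 50 is an arbitrary illustration, not an official limit.
if awk -v w="$wa" 'BEGIN { exit !(w > 50) }'; then
    echo "high I/O wait: ${wa}%"
fi
```

A sustained high value only says that something is waiting on storage; it does not say whether the cause is a failing disk, a degraded RAID array, or plain overload.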
  • @GoodHosting said:
    That 70% wait is because you have too much I/O going on.

    Have you checked that your hard drives aren't failing?

    for d in /dev/sd?; do smartctl -a "$d"; done | less

    Indeed, it looks like your server is crashing because of a corrupt, damaged or otherwise failing drive.

    Thanked by 1: netomx
  • Show us the SMART output for sda/sdb/sdc. If you find out which one is faulty, you can take it offline and show the smartctl output to OVH, who should replace the disk. If no SMART errors show, try powering it off and on again a few times, and if there is still nothing, take the disk out of the RAID and write zeros to it until something shows!

    Good luck trying to get OVH to replace disks showing fine in SMART...
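The SMART check suggested above can be narrowed to the counters that most often point at a dying disk. A rough shell sketch, assuming ATA/SATA disks and the usual ten-column `smartctl -A` attribute table with the raw value in the last column. The helper name `smart_suspects`, the choice of three attributes, and the `/dev/sd?` device pattern are illustrative assumptions, and a clean result does not prove a disk is healthy.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print selected
# attributes whose raw value (10th column) is non-zero.
smart_suspects() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Run against every whole-disk device, one device per smartctl call.
# Needs root and smartmontools; skipped when smartctl is not installed.
if command -v smartctl >/dev/null 2>&1; then
    for d in /dev/sd?; do
        [ -b "$d" ] || continue
        echo "== $d"
        smartctl -A "$d" | smart_suspects
    done
fi
```

Any non-zero line printed here is the kind of evidence worth attaching to a support ticket, along with the full `smartctl -a` output for that disk.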

  • Shoaib_A Member
    edited August 2014

    @linuxthefish said:
    Good luck trying to get OVH to replace disks showing fine in SMART...

    Either that or the output of some MegaCli commands (if you've got hardware RAID). Otherwise they will never replace them. Hetzner does replace them on request for a one-time fee of 49 euros, though.

  • AlbaHost Member, Host Rep
    edited August 2014

    In the end they decided to replace the power supply:
    Date: 2014-08-05 15:34:00

    An operation was added for ns323814.ip-37*********.eu to the
    list of interventions at 2014-08-05 15:30:00.

    Here are the details of this operation: Power Suply
    replacement

    You will be notified when work will begin shortly before
    the intervention.

  • Is it your dedicated server?

  • AlbaHost Member, Host Rep
    edited August 2014

    @FirstVM_com said:
    Is it your dedicated server?

    Yes.

  • Profforg Member
    edited August 2014

    AlbaHost said: I hired a sysadmin who could not find any errors in the system logs,

    Hi.

    Looks like you hired a bad sysadmin :) I can look at it if you want. My guess is CPU overheating plus critically heavy disk I/O; combine the two and the system stalls. Your server looks like it's from OVH's "Hosting" range, so most probably there is no hardware issue (because it's not cheap).

  • AlbaHost Member, Host Rep

    @Profforg said:
    Looks like you hired a bad sysadmin :) I can look at it if you want. My guess is CPU overheating plus critically heavy disk I/O; combine the two and the system stalls. Your server looks like it's from OVH's "Hosting" range, so most probably there is no hardware issue (because it's not cheap).

    No, not really. The problem is solved; there was a power supply issue:

    The intervention on ns323814.ip-3********5.eu has been
    completed.

    This operation was closed at 2014-08-05 16:43:43

    Here are the details of this operation:
    Power Suply replacement
    Date 2014-08-05 15:39:45, gregory.dubois made Power Suply
    replacement:
    Diagnosis:
    HS power

    Actions:
    Replacing the power supply. Server restart.

    result:
    Boot OK. Server on login screen. Ping OK, services started

  • Good.

  • linuxthefish Member
    edited August 2014

    AlbaHost said: gregory.dubois

    The name of the guy who replaced it? Sounds dubious (and French)!

  • linuxthefish said: The name of the guy who replaced it? Sounds dubious (and French)!

    You thought OVH hires Indians?

    Thanked by 3: tux, Maounique, Fidde
  • @Profforg said:
    You thought OVH hires Indians?

    Everyone knows the French hate foreigners ;-)

  • @0xdragon said:
    Everyone knows the French hate foreigners ;-)

    Well, we know they like to surrender a lot. "Oi! Lightsabeur!" (Crappy South Park reference)

    Thanked by 2: 0xdragon, orak
  • AlbaHost Member, Host Rep
    edited August 2014

    Well, I decided to reopen this thread about OVH's shitty service. After the component change the node started crashing again, and nothing has come from their side. Incident support moved the ticket to commercial support because they can't find or fix the issue, and commercial support is pushing me to order a new server, which means paying the one-time fees again plus the server price. Some logs below:

    Date: 2014-08-16 04:50:26
    Dear customer,
    You request is related to the ticket #1751955 Which was transferred to our commercial team As already informed the commercial will assist you as a solution we proposed you to subscribe to a new server the commercial team will then apply a commercial gesture This is not on the incident level. Please continue with my colleague from the support team.
    Kind regards,
    Naoufel.

    From commercial support:

    15/08/2014 10:28
    From: support
    Hello, Have you ordered the new dedicated server yet? As soon as you do we will transfer the time for you. Please don't hesitate to contact us if you have any other queries regarding this or any other issue. Regards, Tom OVH.ie Support Team
    And
    For 2 weeks my server has crashed every 2 days, so I have decided to cancel it, and I need a full refund or compensation! Because of your low-quality server I have hired a sysadmin 3 times and called your help center many times, and a lot of my customers have left my company. I am cancelling, I want my money refunded, and I will leave your company because it's TERRIBLE!
    06/08/2014 18:38
    From: support
    Dear customer I see that you have been corresponding with our technicians via the incident ticket 1751955. Is the issue with your server still persisting since the intervention on your server was completed? We would not be able to refund the server but you could be given a commercial gesture for the downtime of your server. Could you state the periods of server downtime so we can calculate the commercial gesture. If you have any other queries don't hesitate to contact us. Kind Regards David OVH Team

    Why the hell should I buy another server instead of them giving me a new one or fixing the current one? What a shame.
    12/08/2014 11:13
    From: support
    Hello We require that you send us a log of the times the server has gone down or been inaccessible, this is so that we can calculate the commercial gesture you can receive. Once we have this log we can then calculate the how much of a commercial gesture you will receive. Kind Regards OVH.ie Support Team

    Does anyone have a suggestion for what to do about them?

  • wych Member

    Again, what do the logs show before the server started failing again?

  • AlbaHost Member, Host Rep
    edited August 2014

    @wych said:
    Again, what do the logs show before the server started failing again?

    Nothing. Even their tech support didn't find anything.

  • AlbaHost Member, Host Rep
    edited August 2014

    I decided to open an incident ticket again because it crashed again, and look:

    Dear customer, Please keep in touch with our commercial support. Your request can not be handled at the incident level, we can't provide you with further asisstance. Kind regards, Zied.

  • You bought an unmanaged server, so you must understand that they can only fix hardware issues. You have to demonstrate the issue yourself, and then OVH fixes it.

    It makes me sad that people do not understand what an unmanaged service (dedicated server) means.

  • AlbaHost Member, Host Rep
    edited August 2014

    @fileMEDIA said:
    You bought an unmanaged server, so you must understand that they can only fix hardware issues. You have to demonstrate the issue yourself, and then OVH fixes it.

    It makes me sad that people do not understand what an unmanaged service (dedicated server) means.

    Thank you for your smart words; we all know what UNMANAGED means and what DEDICATED SERVER means!
    This server has had problems since the day I bought it. They found a power fault and changed the power supply; after that they found another fault and decided to change the motherboard, then the RAM, and the problem still persists. So something seems to be wrong on their end, not mine!
    Oh, and glad that you UNDERSTAND what UNMANAGED means.

  • AlbaHost said: Thank you for your smart words; we all know what UNMANAGED means and what DEDICATED SERVER means! This server has had problems since the day I bought it. They found a power fault and changed the power supply; after that they found another fault and decided to change the motherboard, then the RAM, and the problem still persists. So something seems to be wrong on their end, not mine! Oh, and glad that you UNDERSTAND what UNMANAGED and DEDICATED SERVER mean.

    You are doing something wrong. I personally know OVH users, and they have no issues with the hardware. There is a tiny chance that OVH has problems with AMD servers: since they usually work with Intel, they may fail to analyze problems with AMD hardware correctly. So, since you are asking for help again, I'll suggest 2 things:

    1. Let someone experienced monitor your server to see what the core problem is. Load control, resource usage limits or some system tweaks may help solve the issue (by decreasing the load on the hardware).
    2. Migrate all your clients from this server to another server with Intel CPUs.
  • When they have replaced nearly all of the hardware, a software problem is not unrealistic. There isn't much more hardware they can change. Most of these "hangs" are power supply issues, disk issues or software problems.

  • @AlbaHost said:
    This server has had problems since the day I bought it. They found a power fault and changed the power supply; after that they found another fault and decided to change the motherboard, then the RAM, and the problem still persists.

    Hardware problems happen to everyone. Manufacturers do not have separate production lines for OVH equipment that pump out lower quality parts.

    Secondly, humans have not yet evolved to be clairvoyant, so sometimes it takes a few iterations to figure out what is wrong.

    Thirdly, what information do you have that proves beyond a doubt that you are experiencing a hardware problem, other than you hired a "sysadmin"?

  • what error did they find with the PSU, ram and motherboard? a willingness to swap parts is not necessarily an indication that there was actually a detectable problem - it's just the only thing they can really offer to do from their side. i can say from experience we do see that with power supplies often. place them on a meter and they test just fine. but in the system even just a little variation and they can do funky things (particularly with HDDs). sometimes swapping out a PSU (or a stick of ram, or a motherboard) is a fix even though it 'tested' right. that just comes from experience. however if they've swapped every component at this point, it seems time to look elsewhere for causes...

    Thanked by 2: Maounique, Fidde
  • AlbaHost Member, Host Rep

    @Microlinux said:
    Thirdly, what information do you have that proves beyond a doubt that you are experiencing a hardware problem, other than you hired a "sysadmin"?

    What more proof do you need when the server crashes every 2-3 days, and after hiring many sysadmins?
    And when the tech admins from OVH decided to change the server and moved the ticket to commercial support?

  • AlbaHost Member, Host Rep

    @datarealm said:
    what error did they find with the PSU, ram and motherboard? a willingness to swap parts is not necessarily an indication that there was actually a detectable problem - it's just the only thing they can really offer to do from their side. i can say from experience we do see that with power supplies often. place them on a meter and they test just fine. but in the system even just a little variation and they can do funky things (particularly with HDDs). sometimes swapping out a PSU (or a stick of ram, or a motherboard) is a fix even though it 'tested' right. that just comes from experience. however if they've swapped every component at this point, it seems time to look elsewhere for causes...

    Well, they didn't mention it in the ticket; they only notified me that they had replaced the components mentioned, nothing more.

  • @Profforg said:
    since they usually work with Intel, they may fail to correctly analyze problems with AMD hardware

    That is not true, they use only AMD processors in their VPS & cloud lineup.

  • the server crashes every 2-3 days

    That means nothing other than . . . . the server is crashing every 2-3 days.

    and hiring many sysadmins?

    I thought it was "a" sysadmin? You can hire 50 "sysadmins", but at least one of them has to be competent.

    the tech admins from OVH decided to change the server

    Who knows, maybe they were trying it for your sake.

    If your server is crashing on a consistent interval, that's unlikely to be a hardware problem (though not impossible).

    It's not impossible this is a hardware problem, in which case I encourage you to re-read my first two points.

    Thanked by 1: Maounique
This discussion has been closed.