Help with OVH/Kimsufi Server Issues (Randomly restarting)

My OVH server, which I've had for 4 years, keeps rebooting at random. Well, I say rebooting, but when I check the logs I can only see it starting up, never shutting down, unless I'm missing something obvious.
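
For anyone following along, here is a minimal sketch of one way to check for that "boots but no shutdowns" pattern. It assumes a Linux box with util-linux `last` and a wtmp file that has not been rotated away. The function name `count_boot_records` is made up for illustration. If there are more boot records than shutdown records, the machine is probably being reset or losing power rather than shutting down normally.

```shell
#!/bin/sh
# Count boot records vs clean-shutdown records in `last -x` style output.
# Reads the listing on stdin and prints a one-line summary.
# (count_boot_records is a hypothetical helper name, not a standard tool.)
count_boot_records() {
    awk '
        $1 == "reboot"   { boots++ }   # one record per system boot
        $1 == "shutdown" { downs++ }   # one record per clean shutdown
        END { printf "boots=%d shutdowns=%d\n", boots, downs }
    '
}

# Typical use on the live system:
#   last -x reboot shutdown | count_boot_records
# With a persistent systemd journal, the end of the previous boot can be read too:
#   journalctl --list-boots
#   journalctl -b -1 -n 50
```

If the last lines of the previous boot in the journal just stop with no shutdown messages, that also points to a hard reset rather than an orderly reboot.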

At one stage a few months back the server went offline and wouldn't start up, so I had to rebuild it. The issue has continued since the rebuild, which makes me think it isn't software related.

I reported the issue to OVH. They checked the power and said that they had replaced the power supply and also the 'cooling system'. Unfortunately, the problem persists.

They said that they would run 'extensive diagnostics' which would result in the server being down for 'several hours'. For whatever reason they completed this test in just under 6 minutes. I'm not convinced they did any real diagnostics in this short period of time.

I'm looking for any guidance, logs to check, or suggestions on what may be causing this and what I can do to narrow down the issue. I've been looking at it on and off for so long now that I may just be missing something obvious.

Normally I would just take out a new OVH server and move everything across, but despite having the server for 4 years and renewing monthly, I recently paid upfront for 12 months when the price increase came along to secure the lower price. Talk about bad timing!!

Any suggestions would be greatly appreciated.

Thanks in advance.

Comments

  • rm_ IPv6 Advocate, Veteran
    edited April 2023

    Which server model, and did you check the temperatures yourself?

  • atul Member

    OVH has been the most pathetic provider for the last 2 years.
    Read their reviews for ovh.it: https://www.trustpilot.com/review/ovh.it

    They removed their site ovh.com from Trustpilot because it had the worst reviews there.

  • @rm_ said:
    Which server model, and did you check the temperatures yourself?

    It is the Kimsufi KS-10 with an Intel i5-2400 @ 3.10GHz, 16GB RAM and 2TB HDD.

    Yes, I checked the temperatures and everything seemed OK. Here are the results using lm_sensors:

    coretemp-isa-0000
    Adapter: ISA adapter
    Package id 0: +40.0°C (high = +80.0°C, crit = +99.0°C)
    Core 0: +30.0°C (high = +80.0°C, crit = +99.0°C)
    Core 1: +33.0°C (high = +80.0°C, crit = +99.0°C)
    Core 2: +40.0°C (high = +80.0°C, crit = +99.0°C)
    Core 3: +35.0°C (high = +80.0°C, crit = +99.0°C)

    nct6775-isa-0290
    Adapter: ISA adapter
    Vcore: +1.21 V (min = +0.00 V, max = +1.74 V)
    in1: +1.10 V (min = +0.00 V, max = +0.00 V) ALARM
    AVCC: +3.30 V (min = +2.98 V, max = +3.63 V)
    +3.3V: +3.30 V (min = +2.98 V, max = +3.63 V)
    in4: +1.02 V (min = +0.00 V, max = +0.00 V) ALARM
    in5: +1.03 V (min = +0.00 V, max = +0.00 V) ALARM
    in6: +1.06 V (min = +0.00 V, max = +0.00 V) ALARM
    3VSB: +3.36 V (min = +2.98 V, max = +3.63 V)
    Vbat: +3.31 V (min = +2.70 V, max = +3.63 V)
    fan1: 0 RPM (min = 0 RPM, div = 128)
    fan2: 0 RPM (min = 0 RPM, div = 128)
    fan3: 0 RPM (min = 0 RPM, div = 128)
    fan4: 0 RPM (div = 128)
    SYSTIN: +31.0°C (high = +0.0°C, hyst = +0.0°C) ALARM sensor = CPU diode
    CPUTIN: +29.0°C (high = +80.0°C, hyst = +75.0°C) sensor = CPU diode
    AUXTIN: -128.0°C (high = +80.0°C, hyst = +75.0°C) sensor = CPU diode
    PCH_CHIP_TEMP: +49.0°C
    PECI Agent 1: +0.0°C (high = +80.0°C, hyst = +75.0°C)
    PECI Agent 0: +36.0°C (high = +80.0°C, hyst = +75.0°C)
    cpu0_vid: +0.000 V
    intrusion0: OK

  • If it passes the rescue mode diagnostics then ask nicely for a replacement motherboard

  • adns Member
    edited April 2023

    @darkimmortal said:
    If it passes the rescue mode diagnostics then ask nicely for a replacement motherboard

    +1, as I read the thread it was my first thought too.
    @happyman How much do you pay for this KS-10 service? Is traffic in all directions capped at 100 Mbps?

  • @darkimmortal said:
    If it passes the rescue mode diagnostics then ask nicely for a replacement motherboard

    Thanks, I'll give that a go. We'll see if they are helpful and willing to do that without me having to prove that it is the cause.

  • @adns said:

    @darkimmortal said:
    If it passes the rescue mode diagnostics then ask nicely for a replacement motherboard

    +1, as I read the thread it was my first thought too.
    @happyman How much do you pay for this KS-10 service? Is traffic in all directions capped at 100 Mbps?

    €15.59 per month including tax.

  • rm_ IPv6 Advocate, Veteran
    edited June 2023

    @happyman you still around?

    On my server (same specs as yours) I suddenly got all sorts of failures in dmesg (USB, SATA, Ethernet). Then it developed high packet loss and stopped responding to ping entirely. They did an intervention and, according to their report, replaced the motherboard. That fixed all of it.

    Except that with the new motherboard it now suddenly reboots, so far about once a day.

    How did the issue end for you? I feel this might be a comical situation: they replaced your motherboard, put it through some shallow testing, "hum seems fine", and added it to the stockpile of spare motherboards to install into other servers. And maybe I now have yours :smiley:

  • @rm_

    I struggled to get them to replace the motherboard, but they did in the end (only about a week ago). Unfortunately the system has rebooted a few times since then, so I'm not sure what else it could be apart from the actual power cabling or supply coming into the server.

  • rm_ IPv6 Advocate, Veteran
    edited June 2023

    Did you try to correlate it with CPU load (or idling)?

    One guess I had is that the VRM, at some point in a power-saving mode, could be supplying too low a voltage to the CPU, causing it to reboot (putting aside the root causes of that).

    If it happens often enough for you, first you can try disabling the frequency switching and see if that helps:

    for I in {0..3}; do cpufreq-set -g performance -c $I; done

    If not, then add some actual CPU load at the lowest priority and see if that affects things:

    nice -n 19 dd if=/dev/zero | nice -n 19 md5sum & disown

    On mine, after two reboots a day apart, I have not had any more so far (so I have not tried these workarounds yet).
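
To make the "correlate it with CPU load" idea above a bit more concrete, here is a rough sketch of a sampler that records load and CPU frequency, so the last line written before a reboot shows what the machine was doing. It assumes Linux with `/proc/loadavg` and the cpufreq sysfs file for cpu0, and falls back to `n/a` where either is missing. The function name `log_sample`, the log path and the 10-second interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# Append one sample (UTC timestamp, 1-minute load average, cpu0 frequency in kHz)
# to the given log file, then flush to disk so the line survives a hard reset.
# (log_sample is a hypothetical helper name.)
log_sample() {
    logfile=$1
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # The first field of /proc/loadavg is the 1-minute load average.
    load=$(cut -d ' ' -f 1 /proc/loadavg 2>/dev/null || echo n/a)
    freq_file=/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
    if [ -r "$freq_file" ]; then
        freq=$(cat "$freq_file")
    else
        freq=n/a   # no cpufreq driver exposed (e.g. inside some VMs)
    fi
    printf '%s load1=%s cpu0_khz=%s\n' "$ts" "$load" "$freq" >> "$logfile"
    sync   # force the write out; otherwise the last samples may be lost
}

# Example loop (path and interval are assumptions, adjust to taste):
#   while true; do log_sample /var/log/reboot-watch.log; sleep 10; done &
```

After the next unexpected reboot, `tail -n 5` on that file shows whether the box was idle or busy, and at what frequency, in the seconds before it went down.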

  • This issue is still occurring. It doesn't seem to be related to CPU load or idling.

    I'm at a loss as to what else to try and what else to suggest OVH swap out.

    I've asked if I can get a new server from them and transfer the prepaid amount over, but their billing department says it isn't possible.

  • MikeA Member, Patron Provider

    When I had issues similar to this in the past (many, many years ago, using the old i7s), I had the motherboards replaced and it worked fine afterwards. I only had that happen with their old SYS Intel Core model boards.
