Anyone had their hosthatch server down? - Page 3
Comments

  • remy Member
    edited November 2024
    Transferred:        1.563 TiB / 1.563 TiB
    Elapsed time:   5h57m12.4s
    

    The problem seems solved. No reboots / crashes during transfer
    Hope it lasts.
    Thanks for your answers

    Thanked by 1 OsirisBlack
  • ralf Member
    edited December 2024

    My VM in Stockholm hard-locked again this morning. The last log entry was at 03:22:52 UTC. It would be interesting to know if this correlates with anyone else's outage.
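
Since a hard lock leaves no shutdown entry, the last timestamp written to disk is the best clue for comparing incidents between users. A minimal sketch of a heartbeat logger in shell follows. The function names, log path, and interval are hypothetical, and `sync FILE` assumes GNU coreutils 8.24 or later. The fallback is a plain `sync`, which may block if the attached storage is stalled, so the log is best kept on the system disk.

```shell
#!/bin/sh
# Hypothetical helper: append one UTC timestamp to a log file and flush it,
# so the last line is likely to survive a hard lock.
heartbeat_once() {
    log_file="$1"
    date -u '+%Y-%m-%d %H:%M:%S UTC' >> "$log_file"
    # Flush only this file where supported (GNU coreutils >= 8.24);
    # otherwise fall back to a global sync.
    sync "$log_file" 2>/dev/null || sync
}

# Print the last timestamp recorded before the VM stopped responding.
last_heartbeat() {
    tail -n 1 "$1"
}

# Usage (runs until interrupted); the path and the 10 s interval are assumptions:
#   while true; do heartbeat_once /var/log/heartbeat.log; sleep 10; done
# After the next manual reboot:
#   last_heartbeat /var/log/heartbeat.log
```

Comparing the value printed by `last_heartbeat` with the times other users report would show whether the lock-ups line up across VMs.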

  • @qtwrk said:

    @GUZELSHOP said:

    @kmm996 said:
    My server is offline today. It's been almost 12 hours. I submitted a ticket and still haven't received a response after 9 hours.

    Is your server in Sweden?
    Seems like there are some issues with some SE node and with one of my servers there :(

    Seems so. I have 2 servers in SE: one has been fine all along, but since last night the second one keeps shutting down. After I start it manually, it shuts down within a few minutes, and today it doesn't even start anymore.

    This is basically what I encountered the other day, three days in a row, and every day it would shut down and require a manual reboot.

    But when I opened a ticket, the staff said they hadn't heard similar feedback.

    I tried to catch this via uptime and unfortunately the downtime didn't happen again.

    Seeing you guys, I think it's definitely not an isolated case, they just solved it quietly and didn't publicize it.

  • @danblaze said:
    This is basically what I encountered the other day, three days in a row, and every day it would shut down and require a manual reboot.

    But when I opened a ticket, the staff said they hadn't heard similar feedback.

    I tried to catch this via uptime and unfortunately the downtime didn't happen again.

    Seeing you guys, I think it's definitely not an isolated case, they just solved it quietly and didn't publicize it.

    My SE storage server has been down since yesterday. When I boot it manually from the dashboard, it comes up and stays alive, but as soon as I mount the storage disk, it freezes and dies...

  • @danblaze said:

    @qtwrk said:

    @GUZELSHOP said:

    @kmm996 said:
    My server is offline today. It's been almost 12 hours. I submitted a ticket and still haven't received a response after 9 hours.

    Is your server in Sweden?
    Seems like there are some issues with some SE node and with one of my servers there :(

    Seems so. I have 2 servers in SE: one has been fine all along, but since last night the second one keeps shutting down. After I start it manually, it shuts down within a few minutes, and today it doesn't even start anymore.

    This is basically what I encountered the other day, three days in a row, and every day it would shut down and require a manual reboot.

    But when I opened a ticket, the staff said they hadn't heard similar feedback.

    I tried to catch this via uptime and unfortunately the downtime didn't happen again.

    Seeing you guys, I think it's definitely not an isolated case, they just solved it quietly and didn't publicize it.

    It's not solved. It's now happened to me 5 times in a row, and it happens repeatedly within seconds of writing a reasonable amount of data to the attached storage. Just touching a new file and running sync wasn't enough to trigger it, but split -b1G /dev/random /home/borg/crashtest/ (where /home/borg/ is the mounted /dev/vdb) crashes it within seconds. The same test on the NVMe drive is completely fine.
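
A variant of the reproduction above that records its progress can make it easier to report how much data was written before the lock-up. This is only a sketch, assuming GNU `dd` (for `bs=1M` and `conv=fsync`). It reads from /dev/urandom rather than /dev/random, and the function name and progress-log path are hypothetical. The progress log should live on a disk that is not under test, such as the NVMe system disk.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT files of SIZE_MB MiB each into TARGET_DIR,
# forcing each file to disk and logging a timestamped line after every file.
# Usage: write_chunks TARGET_DIR COUNT SIZE_MB PROGRESS_LOG
write_chunks() {
    target_dir="$1"
    count="$2"
    size_mb="$3"
    progress_log="$4"

    mkdir -p "$target_dir" || return 1

    i=1
    while [ "$i" -le "$count" ]; do
        # conv=fsync makes dd flush the file to disk before it exits (GNU dd).
        dd if=/dev/urandom of="$target_dir/chunk_$i" \
            bs=1M count="$size_mb" conv=fsync 2>/dev/null || return 1
        echo "$(date -u '+%Y-%m-%d %H:%M:%S UTC') chunk $i of $count written" \
            >> "$progress_log"
        i=$((i + 1))
    done
}

# Example: eight 1 GiB files on the attached storage, with the progress log
# kept elsewhere (the log path is an assumption):
#   write_chunks /home/borg/crashtest 8 1024 /root/crashtest-progress.log
```

If the VM locks up mid-run, the last line of the progress log gives both the time and the amount of data that had been written successfully.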

  • Same observation on the Sweden storage node. It has been freezing every morning. It stays alive (for some time) if I'm not doing any R/W to the attached storage.

  • @ralf said:

    @danblaze said:

    @qtwrk said:

    @GUZELSHOP said:

    @kmm996 said:
    My server is offline today. It's been almost 12 hours. I submitted a ticket and still haven't received a response after 9 hours.

    Is your server in Sweden?
    Seems like there are some issues with some SE node and with one of my servers there :(

    Seems so. I have 2 servers in SE: one has been fine all along, but since last night the second one keeps shutting down. After I start it manually, it shuts down within a few minutes, and today it doesn't even start anymore.

    This is basically what I encountered the other day, three days in a row, and every day it would shut down and require a manual reboot.

    But when I opened a ticket, the staff said they hadn't heard similar feedback.

    I tried to catch this via uptime and unfortunately the downtime didn't happen again.

    Seeing you guys, I think it's definitely not an isolated case, they just solved it quietly and didn't publicize it.

    It's not solved. It's now happened to me 5 times in a row, and it happens repeatedly within seconds of writing a reasonable amount of data to the attached storage. Just touching a new file and running sync wasn't enough to trigger it, but split -b1G /dev/random /home/borg/crashtest/ (where /home/borg/ is the mounted /dev/vdb) crashes it within seconds. The same test on the NVMe drive is completely fine.

    You're right, it's not resolved at all; I only thought it was. It happened again yesterday.

    I wonder if the @hosthatch guys have any plans to fix this?

    I'm sure more than one user has opened a ticket about this.

    It's not unacceptable for a host to have a hardware failure; I know it happens from time to time. It just needs to be fixed so everyone can move on.

    Just make sure your staff is really aware of the issue, and let users know you've started fixing it to give them some peace of mind.

  • ralf Member
    edited December 2024

    Interesting. I've had no response to my ticket (from a week ago), but just tested today and I seem to be getting successful writes again. Hopefully whatever the issue was has been properly resolved, rather than just being an intermittent fault.

    I'm assuming they're using ceph for the attached storage, and wondering if one of the OSDs failed and the rest were all overloaded trying to rebalance to other nodes.

  • remy Member
    edited December 2024

    I confirm, it's not solved.
    The last time this happened to me was 2024-12-05 03:03:14.
    But I only rebooted manually yesterday, and it hasn't happened since. It will most likely happen again if no action is taken...
    Several times now I have thought the problem was solved.

    Thanked by 1 ralf
  • @ralf said:
    Interesting. I've had no response to my ticket (from a week ago), but just tested today and I seem to be getting successful writes again. Hopefully whatever the issue was has been properly resolved, rather than just being an intermittent fault.

    I'm assuming they're using ceph for the attached storage, and wondering if one of the OSDs failed and the rest were all overloaded trying to rebalance to other nodes.

    I don't think they are using Ceph, just RAID10 or a ZFS equivalent. Maybe the array failed and is rebuilding? I'm not sure that's what happened.
