Comments
Yeah, the example above is from RHEL (tested on actual Red Hat, CentOS, and CloudLinux).
I've asked a bunch of other sysadmins working with RHEL-based systems, who see the same behaviour - so either we're all updating things incorrectly, or there's yet to be a new microcode release.
Online.net, which keeps its list up to date, also has most of them in "Pending" because they're waiting.
Sure - but those microcodes aren't really fixing Spectre; they don't enable IBPB and IBRS, which are required.
Correct - but if the microcode is not there yet, you'll still have to reboot to get it applied once it is available.
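Once a microcode does land, you can check which revision the CPU is actually running. A minimal sketch, assuming an x86 Linux box - the sample text below is stand-in data; on a real machine, point awk at /proc/cpuinfo itself:

```shell
# Stand-in /proc/cpuinfo excerpt (on a real box, read /proc/cpuinfo directly).
sample='processor       : 0
vendor_id       : GenuineIntel
microcode       : 0x3a
cpu MHz         : 2400.000'

# Extract the loaded microcode revision from the first matching line.
rev=$(printf '%s\n' "$sample" | awk -F' *: *' '/^microcode/ {print $2; exit}')
echo "microcode revision: $rev"   # -> microcode revision: 0x3a

# Kernels patched for Meltdown/Spectre (4.15+ and distro backports) also
# report per-vulnerability mitigation status under sysfs:
#   grep -r . /sys/devices/system/cpu/vulnerabilities/
```

Note the revision only changes after the new microcode is actually loaded - via reboot, or an early-load initramfs rebuild.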
Even when they reboot, it takes from 5 min to 1h+. Why the hell do they need so long?
GestionDBI, Virtmach & BandwagonHost needed a long time to bring the servers back up.
I just stay with dedis, too much hassle.
Intel's press release says most of the microcode updates are coming next week, AFAIK.
Maybe people run fsck as well :-D
Yeap, that's what I'm counting on as well - so we all just have to sit tight :-D
But the notion that everyone is 100% safe now is fake news.
D*uq. No nodes have been down for that long. I know you don't like us, but there's no need to bash us and post bulls**t on forums...
"Server Unity GestionDBI England went offline. Detected: 06.01.2018 19:50:10"
Just checked whether my monitoring went haywire - it did not:
All nodes are up and running, excluding LAX-03 that is currently rebooting for the last 5 minutes.
So you go reboot the nodes and check if they are back up, but you do not care if the customer VMs are back up? Well, OK then.
I had 10 restarts today. This question was asked in general; I did list just one provider, which was maybe a bit unfair, but I have updated it.
At least half of them went down for about 1 hour, so should I open a ticket for each of these? No. I expect a provider to bring the VMs back up, so I don't have to log in to each panel and reboot them by hand.
Everyone got it working, except gestionDBI.
Oh, look, it's time for some @Neoon rage. Which project are you going to abandon now?
I guess to safely switch off all VMs?
How many containers do you need on a single node that you need 60min+ to reboot it?
They're OVZ tho. If they're simfs, just reboot now and it'll work itself out if you've got a journaling filesystem.
1 Windows KVM that refuses to budge. And then you have a choice. Downtime or potential loss of data?
Well, I said containers; most of them were OVZ boxes.
Take a snapshot, shutdown, restart using snapshot?
Always an option. It depends on how many of those stubborn VMs you have; at scale, you may still reach the 60 minutes mentioned.
Funny fact, longest reboot time was MTL-02, with ~35min.
@davidgestiondbi Well, this is an outrage - I could (probably) be done pooping by then!
uh... ... ...... .. sudo (Holy shit that's half of it) apt-get (common sense part) update (omg that simple?) (hit enter) (only if I bought softlayer or theplanet!!)
Funny fact: Smokeping just sends emails to one address, yet I still got my VPS suspended for 20 monitored servers.
Well, last I checked neither Debian nor Ubuntu has kernel updates yet, and other updates (e.g. qemu) are likely missing too. Also, apt-get update only fetches the package index, heh.
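Right - and even after `apt-get upgrade` (or `dist-upgrade`, which new kernel packages usually need) installs something, the old kernel keeps running until you reboot. A rough check, assuming the usual Debian/Ubuntu layout where each installed kernel gets a directory under /lib/modules:

```shell
# Compare the running kernel to the newest installed one; a mismatch means
# an update is installed on disk but will not be active until a reboot.
running=$(uname -r)
newest=$(ls /lib/modules 2>/dev/null | sort -V | tail -n 1)
echo "running kernel:   $running"
echo "newest installed: ${newest:-unknown}"
if [ -n "$newest" ] && [ "$running" != "$newest" ]; then
    echo "reboot needed to run $newest"
fi
```

Just a sketch - version sorting of kernel directory names with `sort -V` is a heuristic, not something apt guarantees.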
You should really work on this rage mate. It’s not healthy...
I am calm, it's fine. I just like to bash something sometimes, like you do with OVH.
We had long planned to retire XenPower in its PV form and replace it with HVM; we thought this bug was a good opportunity, but it seems we may not be ready in time.
We are now rebooting the OVZ nodes one by one, so it will take 24 hours, possibly more. Expected downtime for every node is below 30 minutes if everything goes well.
Some will take as little as 10 minutes (the E3 ones); the largest, some 20, up to 30. Containers may take longer to come up in some cases, though. If you have been down for more than 30 minutes, please check the announcement, and if there is nothing about your node, please open a ticket.
We do not expect problems, but this is not an exact science.
One of my providers did live migrations of VMs and updated the hosts.
Eg, move running VMs to a new host live, patch/reboot the other host, then cycle machines back while patching the previous host.
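The rolling pattern above can be sketched as a loop - the host names here are hypothetical, and the echoes stand in for the real commands (e.g. libvirt's `virsh migrate --live`):

```shell
# Drain each host onto a freshly patched spare, patch it, then reuse it as
# the spare for the next host in line.
spare="spare-node"
for host in node1 node2 node3; do
    echo "live-migrating VMs: $host -> $spare"
    echo "patching microcode/kernel and rebooting $host"
    spare=$host   # the host we just patched receives the next batch
done
echo "all hosts patched; $spare holds the last batch"
```

The appeal of this rotation is that you only ever need one spare host's worth of free capacity, at the cost of each VM migrating once.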
I've not heard from DO/Vultr/Hosthatch/ZX about it to be honest - but at the same time I've not logged in to check either lol.
This can work and is a good opportunity to see how well this works in the event of a real node failure.
IWStack with SAN storage supports this; the nodes with local SSD storage do not, though. Since OVZ is the first line lacking a defense, though, we will do those nodes first.
Apparently one set of migrations went wrong and the VMs had to be rebooted. None of my services were affected, other than a slight loss of network to one VM for about 3 minutes.
We have an iwstack node down atm, but it is not one with SAN storage; it is one of the SSD nodes. It is also unrelated: 2 of the disks died and we are trying to recover the data now.
You know that guy Murphy - I expect a lot of unrelated failures exactly when the workload from planned work is at its highest.
Ramnode had a big reboot yesterday.