★ VirMach ★ Black Friday & Cyber Week 2018 ★ RAID 10 SSD ★ OpenVZ & KVM ★ Check inside for offers!
This discussion has been closed.

Comments
I think it works to install Debian 7 template first and then start netboot.xyz by editing grub, but I can't test it because I haven't written down the order number lol
Hmm, I have received the mail with every piece of detail about the VM; however, the order still shows pending in my client area, which is odd. So I turned to SolusVM, which let me log in without an issue, and the machine status was ONLINE, but that's not true, since inside the VNC it complained "no bootable devices". I tried re-installing the system by clicking "reinstall" with no templates at all, and by mounting some ISOs, none of which could bring the VM back online.
IP is prefixed with 176.
p.s.
Order #553266
(edit) rescue mode is fine
Order #553390 Can you help me deploy it? Thanks. Going to sleep.
Order #553396
Thanks!
Order #553390 thanks
I want I want thanks!
Invoice ID #1398377
I’d like to become a guinea pig if this offer is still valid.
Can’t find order ID though.
Your waiting time has been successfully doubled and you have been put to the end of queue. Estimated time of deploy: 26th of April 2022.
Getting error "Decoding failed" when booting on Ubuntu 20, my guess is my disk is too small...
Ubuntu 18 works, but the whole VPS crashes (and goes offline) whenever I try to run the fio disk test; still trying to find what caused it.
I’d like to see your ass try.
Help VirMach delay the orders. Let’s see how much time you can postpone.
I really need to redo all the templates at this point for Ryzen. If only I could find the time, we'll see. But that's definitely concerning. Hey, at least the whole node didn't go down with it!
fio and geekbench both will crash the VPS, so let me just post the YABS without them for now... (I added 1GB SWAP) (Getting ~500Mbps-1Gbps to everywhere, that's good enough for me)
However, I did an iperf to several of my SG VPS... from 1-10Gbps ports, but not as impressive for now (I am guessing a lot of people are benchmarking at the same time)
Thanks VirMach for letting me have an alpha test on it!
Yeah, probably other tests going on. Here's the one I actually did yesterday, it's probably more indicative of the performance while nothing else is going on. I took out the disk speed portion because I don't want people to not read this portion and then take it the wrong way, as the disk was on the SATA SSD and not NVMe at the time of this particular test.
Rest in peace Perejilo.
Are you on the wrong set? The delayed volunteer was responding in another post
I'm so sorry too. They are enjoying the happy calm now.
Thank you all.
This still happens, only on virtualization though. Across multiple servers at this point, which isn't good. It looks like it's a long-time kernel bug that keeps re-appearing but I'm still looking into it.
Kind of one of those things where the motherboard manufacturer, kernel developers, and NVMe manufacturer/driver developers are out of sync. Maybe I could get rid of the templates that cause the issue but that doesn't stop anyone from just loading in the wrong thing on their own and crashing the drive. Or maybe it's something new on the updated kernel, or maybe it's something to do with the motherboard's or the kernel's handling of some NVMe feature. List goes on...
I guess I'll be spending several hours trying to find any fix that's a possibility on our end and contacting everyone else who could do something about it with a BIOS update, firmware update, or kernel patch.
Next step is probably trying out a bunch of combinations of hardware at the office to see which ones do and don't replicate the issue. My hunch is that it has to do with these X570 boards not playing well with the kernel version, since the NVMe issue seems to go across 3 different manufacturers now. I'm hoping we can get it working on an older version, or find some BIOS settings to make it work out. The nice thing is that I basically have every piece of hardware imaginable so I should be able to figure it out, the only problem is that it feels so wasteful at this stage spending so much time on it, and it's quite disappointing as I was excited to finally get Japan deployed. The next batch of shipments to Japan are going to have to be the X470/Gen3 which we know works fine.
(edit) Oh man I'm kind of glad we didn't also try to add in the RAID controllers as well at this stage, it would have been another layer of drivers and potential issues.
@FAT32 @FrankZ if you're interested in absolutely decimating the drives on the node I brought up right now let me know. Just private message me the service IP or invoice ID and I'll activate and/or migrate it there.
It looks like it may have actually been one of the things I initially came across and described briefly in one of my last ~50 messages. I remembered the new X570 board has the entire "overclocking" section and remembered that it doesn't necessarily mean overclocking; it also has a lot of "underclocking" or power saving measures. Many kernel issues were related to this, and the patch they had apparently only fixed some parts of it, namely the issues created by the "Autonomous Power State Transitions" not functioning properly on Linux. Before it was officially patched, the fix was just to mess with the latency manually. So basically, the NVMe tries to save power, and stops responding as a result of the mismatch in communication. Then Linux says "well, looks like this is permanently broken" and kicks it off. Or at least that's how I understood it while not trying to waste time.
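For anyone who wants to check the same thing on their own box: on Linux the APST latency knob is, as far as I know, the `nvme_core.default_ps_max_latency_us` module parameter, and setting it to 0 on the kernel command line turns APST off. The post above does not name the parameter, so treat that mapping as an assumption. A minimal Python sketch that just reads the current value (the helper names are made up for illustration):

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the nvme_core APST latency parameter.
APST_PARAM = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")


def read_apst_max_latency_us(path: Path = APST_PARAM) -> Optional[int]:
    """Return the configured max APST latency in microseconds, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_apst(latency_us: Optional[int]) -> str:
    """Turn the raw value into a short status string (hypothetical helper)."""
    if latency_us is None:
        return "unknown (nvme_core parameter not readable)"
    if latency_us == 0:
        return "APST disabled"
    return f"APST allowed for power states with latency up to {latency_us} us"


if __name__ == "__main__":
    print(describe_apst(read_apst_max_latency_us()))
```

This only reports the setting. Changing it means editing the kernel command line and rebooting, which is not shown here.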
Anyway, I just had a light bulb moment and said hey, before I start doing 12 hours of building and testing everything here, I should take a second (probably 30th actually) look. I kept digging through the BIOS and found something under the "overclock" settings related to PCIe and power. Dynamic power management coordination for PCIe devices, or as they like to call it, Local Clock and Dynamic Power Management. This is by default set to "lower LCLK frequency" to save power with an additional "enhanced" detection for PCIe 4.0 to "optimize" it automatically. It looks like they quietly added this to try to "fix" some problems with Gen4 PCIe devices. Sounds exactly related to power saving transitions, and it sounds like another thing that they need to patch for Linux. Until then, I figured disabling it would alleviate the same problem, and fingers crossed, it looks like it did. However, I need to be sure, so if you guys are interested, I need help trying to absolutely destroy the NVMe SSD.
Short version: I did a thing that might've fixed the stuff, are you bored, and if so, do you want to cosplay as an I/O abuser?
Also here's the missing portion of YABS.
fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 482.24 MB/s (120.5k) | 2.23 GB/s    (34.9k)
Write      | 483.51 MB/s (120.8k) | 2.24 GB/s    (35.1k)
Total      | 965.76 MB/s (241.4k) | 4.48 GB/s    (70.0k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 2.69 GB/s     (5.2k) | 3.08 GB/s     (3.0k)
Write      | 2.83 GB/s     (5.5k) | 3.28 GB/s     (3.2k)
Total      | 5.52 GB/s    (10.7k) | 6.37 GB/s     (6.2k)

Thanks for the invite, you mean trying to make as much IO and CPU usage as possible? Sounds interesting
To be honest I don't think I can do as well as you did since you have access to the host node, but sure, I can run some commands just to generate random loads
(Btw, I believe FrankZ will be interested to get access to Tokyo VPS)
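A quick sanity check on the fio numbers posted above: each throughput figure should be roughly IOPS times block size. The sketch below redoes that arithmetic for the read rows, assuming the tool counts 1 MB as 1000 KB (an assumption that happens to fit the figures). The posted IOPS are rounded, so a percent or so of slack is expected.

```python
# (block size in KB, reported IOPS, reported throughput in MB/s), read rows from the post.
READ_RESULTS = {
    "4k": (4, 120_500, 482.24),
    "64k": (64, 34_900, 2230.0),
    "512k": (512, 5_200, 2690.0),
    "1m": (1024, 3_000, 3080.0),
}


def estimated_mb_per_s(block_kb: int, iops: float) -> float:
    """Throughput estimate: IOPS * block size, with 1 MB taken as 1000 KB (assumption)."""
    return iops * block_kb / 1000.0


def relative_error(estimate: float, reported: float) -> float:
    """Absolute difference between estimate and reported value, as a fraction of reported."""
    return abs(estimate - reported) / reported


if __name__ == "__main__":
    for name, (block_kb, iops, reported) in READ_RESULTS.items():
        est = estimated_mb_per_s(block_kb, iops)
        print(f"{name}: estimated {est:.1f} MB/s, reported {reported:.1f} MB/s "
              f"({relative_error(est, reported):.1%} off)")
```

The same check works for the write rows by swapping in their numbers.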
And..... it already broke again, BUT my previous very long explanation definitely was an improvement and a step in the right direction as it allowed me to use it for a whole 5 minutes and even get the fio output, and for it not to immediately break on reboot.
I think I need to monitor the power state transitions more closely and see exactly where it still drops, and then perhaps add some additional kernel parameters. I'm going to spend a little bit more time on it to get it stable enough to where I don't have to reset it every 5 minutes and then I can get you guys on there to do whatever you want. It's probably better than telling you just to stress test it as you mentioned, because if you run into a problem in what you're trying to achieve I'll have more data.
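A rough idea of what watching this from the OS side could look like: pull the NVMe-related lines out of a saved kernel log and flag the ones that mention timeouts or resets, so the moment the drive drops is easier to spot. The keyword list and the function name are assumptions for illustration, not anything described above, and the exact kernel wording varies by version.

```python
from typing import Iterable, List

# Hypothetical keywords; adjust to whatever your kernel actually logs.
SUSPECT_KEYWORDS = ("timeout", "reset", "removing", "failed")


def suspicious_nvme_lines(lines: Iterable[str]) -> List[str]:
    """Return log lines that mention nvme together with any suspect keyword."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if "nvme" in lowered and any(k in lowered for k in SUSPECT_KEYWORDS):
            hits.append(line.rstrip("\n"))
    return hits
```

It takes any iterable of lines, so an open file handle on a saved `dmesg` dump can be passed in directly.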
@VirMach - Sorry for the delay in responding. I was out in the real world.
Double thanks for you.
That photo is a pretty good likeness.
Hello
Hello
Bonjour
Bus

I just ate some microwave burritos. The package said beef and green chile, but it just looked like thin refried bean paste with the tortilla wrapped around it twice. I don't know what I was thinking. I feel unhealthy and sad now.