IncogNET turns 4 years old! 4GB RAM, 40GB NVMe, 4TB @ 4Gbps... 4 YEARS special inside! + FLASH DEALS

hyperblast · April 2025

@MannDude said:
Can now pre-order Pennsylvania stock. Coupons that worked for the default VPS plans will now work in PA, including a small run of the 1776 deals... The 1776 deal is for the 2GB RAM plans, changing them to just $17.76 per year. Available now in Washington State and for pre-order in Pennsylvania with the delivery date being before May 1st (hopefully!)

@JusDomhim said:
@ISP Please have a look at my support ticket 0409B94N5 which I sent on the 9th of April, I just need you guys to change the email that is registered on the vps panel, so I can actually use my vps lol. Thanks

You can SSO from the portal and review your welcome email directly from https://portal.incognet.io/clientarea.php?action=emails if unable to access your account.

It's not written yet, so I've gone ahead and updated the email. Future policy will be made so we don't update end-user emails on their portal or VPS control panel accounts as this is often used and requested when people buy/sell/trade accounts. I don't think you did that, but just a general rule we'll likely implement in the future to help prevent more abuse.

teh 1776 cult. supreme!

MannDude · April 2025

Pennsylvania now in stock. Pre-orders delivered ahead of schedule.

MannDude · April 2025

Just realized the original offer that this thread was created for was not available in PA.

It can be ordered in PA here: https://portal.incognet.io/store/limited-promotional-packages-and-plans/pa-4gb-ram-40gb-str-4tb-bw-4gbps

JusDomhim · April 2025

@MannDude said:
Can now pre-order Pennsylvania stock. Coupons that worked for the default VPS plans will now work in PA, including a small run of the 1776 deals... The 1776 deal is for the 2GB RAM plans, changing them to just $17.76 per year. Available now in Washington State and for pre-order in Pennsylvania with the delivery date being before May 1st (hopefully!)

@JusDomhim said:
@ISP Please have a look at my support ticket 0409B94N5 which I sent on the 9th of April, I just need you guys to change the email that is registered on the vps panel, so I can actually use my vps lol. Thanks

You can SSO from the portal and review your welcome email directly from https://portal.incognet.io/clientarea.php?action=emails if unable to access your account.

It's not written yet, so I've gone ahead and updated the email. Future policy will be made so we don't update end-user emails on their portal or VPS control panel accounts as this is often used and requested when people buy/sell/trade accounts. I don't think you did that, but just a general rule we'll likely implement in the future to help prevent more abuse.

Thank you for the help, no account sharing/selling here, just my own stupidity as explained in the ticket. Grabbed myself the deal again in the "newly" added Pennsylvania location, here's to another 4 years! (8GB for 8 years deals incoming lol)

JohnFilch123 · May 2025

@MannDude said: in the coming days or week I'll have better documentation available

Any news on this?

MannDude · May 2025

@JohnFilch123 said:

@MannDude said: in the coming days or week I'll have better documentation available

Any news on this?

I forgot. =\

I'll see what I can whip up soon. VirtFusion is proving to be confusing to many. Have even had a cancellation or two because "idk what to do or how to use this"

JohnFilch123 · May 2025

@MannDude said: I forgot. =\

No probs, I remember

JohnFilch123 · May 2025

@MannDude Any news?

ServerBachelor · May 2025

@JohnFilch123 said:
@MannDude Any news?

Domain registration reopening??

MannDude · May 2025

@ServerBachelor said:

@JohnFilch123 said:
@MannDude Any news?

Domain registration reopening??

Soon. Aiming for within the next week.

The gist of it is this:

WHMCS isn't a great platform for domain registration. I'm still auditing all of the imported TLDs, removing the ones that don't have any existing orders, and adjusting pricing across the board.

Kind of hard to explain unless you have had to deal with their system before.

For example, say you're a reseller like we are for most domain names. You can import a list of supported TLDs from the registrar, in our case, InternetBS.

You can have WHMCS auto-create pricing for these TLDs based on the pricing of the reseller account, with a markup in the form of X dollars or X percent. (Ex: +$5 or +20%)

Several issues occurred because of this. Many TLDs were imported that we can not support, such as random ccTLDs, either due to requiring PII that we can not supply or because they require registration fields not present in the module, causing API errors when trying to register.

Additionally, if the imported pricing had a "first year" special on the TLD, then all the subsequent years were based on that first year promotional pricing in WHMCS even if the 2nd, 3rd, 4th+ years with the registrar were much higher. A handful of lucky customers got multi-year domains with us taking a loss. There was really no way for us to audit it even after I quickly discovered it, as there were hundreds of TLDs and you'd have to use WHMCS' annoying system to manually check things one by one.
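To make the first-year promo problem concrete, here is a rough sketch of the arithmetic. The dollar amounts and the function names (`marked_up`, `multi_year_margin`) are made up for illustration and are not WHMCS's or the registrar's actual logic; it only assumes, as described above, that every year of a multi-year order was priced from the promotional first-year cost plus the markup.

```python
from decimal import Decimal


def marked_up(cost: Decimal,
              flat: Decimal = Decimal("0"),
              percent: Decimal = Decimal("0")) -> Decimal:
    """Apply a flat (+$X) and/or percentage (+X%) markup to a registrar cost."""
    return (cost + flat + cost * percent / 100).quantize(Decimal("0.01"))


def multi_year_margin(first_year_cost: Decimal,
                      renewal_cost: Decimal,
                      years: int,
                      flat_markup: Decimal) -> Decimal:
    """Margin on a multi-year order when every year is priced off the promo cost."""
    # What the customer is charged: promo-based price repeated for every year.
    charged = marked_up(first_year_cost, flat=flat_markup) * years
    # What the registrar bills: promo for year 1, full price for the rest.
    paid = first_year_cost + renewal_cost * (years - 1)
    return charged - paid


# Hypothetical TLD: $2.00 first-year promo, $30.00 renewals, +$5 markup, 3 years.
print(multi_year_margin(Decimal("2.00"), Decimal("30.00"), 3, Decimal("5")))
```

With those hypothetical numbers the customer pays $21.00 for three years while the registrar bills $62.00, which is the kind of loss described above.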

So what I have been doing now:

I exported a list of all domains we've registered that are active and took note of what TLDs people are actually buying.

I'm manually removing a ton of unused TLDs that nobody has shown interest in, most of which were imported by mistake, but there is no easy way to remove these once imported. (Have to scroll down / ctrl+f it on the page, check it, click the red remove button. Page reloads, rinse and repeat a hundred+ times.) There's no way to do this for more than one at a time, and although the TLDs can be removed via a SQL query, this is dangerous because their pricing is still set in a different table in a very nonsensical structure. The safe way is to just do it manually.

Additionally, there are some weird DNS-related issues and we may swap out our DNS management system. We're using the ModulesGarden PowerDNS setup ( https://www.modulesgarden.com/products/whmcs/dns-manager ) but there are issues with it, it's large and clunky, and paying $400 every year for it is overkill when it's as confusing as it is for end-users. It's not incredibly intuitive.

And finally, I've just been so overwhelmed with work and life that not manually reviewing the order queue and processing new domain orders every day has been a nice relief. It's busy work that distracts from other things.

TLDR: It takes about 1 minute roughly to update a particular TLD, because I need to reference our pricing and remove multi-year options and reference what is actually being used by customers. Then I need to repeat this process about 500 times.

I'm about... 20% done but haven't been incredibly motivated to tackle it as I wait for more important things like new WA stock and the NL migration.

MannDude · June 2025

IF YOU ARE ON THE HOST NODE, "ALLEGIANCE"

Unfortunately, the situation is more serious and what was supposed to be a simple, quick window of maintenance has turned into something severe.

The VM host node "Allegiance" was originally provisioned with only 128GB of RAM, whereas our other host nodes are provisioned with 256GB of RAM. With all other server specs for new deployments being 1:1 matches for consistency, we decided to take this server offline for a quick maintenance to upgrade the RAM so that it could operate at the same capacity as the others. Otherwise, the server would be underutilized and only capable of serving half its potential.

After the maintenance was completed, the server failed to boot. Upon review, it appears to have been an issue with the NVMe RAID array (2 disk, RAID-1). What we believe happened is that sometime between the initial deployment of the hardware node in March of 2025 and now, one of the two NVMe drives died. All components were tested and working as expected upon deployment. It's believed that the second drive was beginning to suffer a similar fate, where the data was intact and there, but the drive failed to re-initialize on boot.

Data restoration is currently being attempted, but I've been given a low estimate of recovery by the techs working on this.

Since we've been in business, we've never had a data loss event like this, so this is a first for us. We're working hard on recovery, but failing that, we will extend all services by a month (You get a free month) after reprovisioning your account. If you created manual snapshots, these are stored offsite and should be recoverable.

Sorry for the inconvenience. We've never experienced anything like this before across dozens of nodes, and I'm being told to expect the worst by those more capable than me who are trying to recover everything.

TLDR: Took a node offline to do a RAM upgrade so the node could actually be utilized fully. Expected this to be a quick window of maintenance. Turned into a no sleep, "shit has hit the fan" worst case scenario situation.

PineappleM · June 2025

WHMCS is ass. I've heard similar configuration limitations/nightmares with WHMCS by other providers.
Data loss is unfortunate but is also a lesson for clients to always maintain their own backups (and that RAID is not backup). Any drive can fail at any time regardless of provider reputation.

Best of luck with both endeavors!

MannDude · June 2025

@PineappleM said:
1. WHMCS is ass. I've heard similar configuration limitations/nightmares with WHMCS by other providers.

Data loss is unfortunate but is also a lesson for clients to always maintain their own backups (and that RAID is not backup). Any drive can fail at any time regardless of provider reputation.

Best of luck with both endeavors!

Thanks. While we have a variety of different monitoring things in place, it looks like this node didn't have anything set up to alert us or warn us of drive health issues... So, as much as I'd like to just throw my hands in the air and say this was unexpected (which is true, it was), it's still possible that it could have been detected earlier. Tests run back in March were all fine, however.

In any case, we're going to improve our hardware health monitoring and alert system as well.

I know we catch a lot of flak for slow support, but one thing I always did take pride in was having a pretty thorough monitoring/alert system set up so that if anything actually critical or noteworthy occurred there could be a fast response. Will be reevaluating the setup to see how this can be improved as well.

MikeA · June 2025

@MannDude said:

@PineappleM said:
1. WHMCS is ass. I've heard similar configuration limitations/nightmares with WHMCS by other providers.

Data loss is unfortunate but is also a lesson for clients to always maintain their own backups (and that RAID is not backup). Any drive can fail at any time regardless of provider reputation.

Best of luck with both endeavors!

Thanks. While we have a variety of different monitoring things in place, it looks like this node didn't have anything set up to alert us or warn us of drive health issues... So, as much as I'd like to just throw my hands in the air and say this was unexpected (which is true, it was), it's still possible that it could have been detected earlier. Tests run back in March were all fine, however.

In any case, we're going to improve our hardware health monitoring and alert system as well.

I know we catch a lot of flak for slow support, but one thing I always did take pride in was having a pretty thorough monitoring/alert system set up so that if anything actually critical or noteworthy occurred there could be a fast response. Will be reevaluating the setup to see how this can be improved as well.

Mind sharing what motherboard and NVMe drive models? PM me if you want to share. I'm just curious since I had the same issue in the past.

MannDude · June 2025

@MikeA PMed.

Recovery efforts are still underway, of course. I got the Crunchbits A-team on it.

I'm only announcing here because there isn't a way for me to inform only those impacted by email (yet). I love VirtFusion but it's missing some features I enjoyed from Virtualizor, like being able to email only the users active on a particular host node. I think this node was mostly LET promotional users as well, so posting here and on the WHMCS service issue page is the best I can think to do while half awake.

PineappleM · June 2025

@MannDude said:
I know we catch a lot of flak for slow support, but one thing I always did take pride in was having a pretty thorough monitoring/alert system set up so that if anything actually critical or noteworthy occurred there could be a fast response. Will be reevaluating the setup to see how this can be improved as well.

Shit happens, even if you think you did absolutely everything right (and quadruple-checked everything), Murphy's Law will find a way. At least know that "big name providers" don't always have it good, like OVH's infamous datacenter fire and their "activate your disaster recovery plan" meme.

Glad to see that you're motivated to address deficiencies and work on improvements. Live and learn.

PineappleM · June 2025

Not related to server hosting, but I was working on a task with a friend (who doesn't post on LET but does buy from providers here) and we were reviewing some data that we had to feed to a third party API. We had only one attempt to get this right with zero margin of error (irreversible damage if we fuck up), so we each individually checked the data at least three times over the course of a week. He eventually became irate after saying "yes it's all good" a dozen times.

Well, we fired off the data, and 5 minutes in we realized we botched the input. The culprit was Microsoft Excel being unable to properly render 64-bit integers, so opening a csv file with 64-bit integers and re-saving it with Excel will truncate some of the digits. We didn't realize this happened until we saw that we were getting the wrong results back from the API. Upon further investigation, we saw that all the 64-bit integers ended in 000, and eventually traced it back to this "quirk" in Excel. We didn't ever consider that a tool that is used all over the planet would corrupt our data like this.

I can't share specific details on what exactly we were doing, but the point is, even after exercising our maximum due diligence, we still screwed up hard. It really can happen to anyone regardless of level of expertise or competence. I thought our system was quite robust, but this experience showed that there's still more to improve on.
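The Excel behaviour is easy to check for ahead of time. Excel keeps roughly 15 significant digits for numbers and zeroes out the rest, so a small pre-flight script can flag any ID that would not survive a round trip. This is only a sketch: the helper names and the `id` column are made up, and `excel_like_roundtrip` is an approximation of what Excel does, not its actual code.

```python
import csv
import io
from decimal import Context, ROUND_DOWN

# Excel stores numbers with about 15 significant digits of precision.
EXCEL_SIGNIFICANT_DIGITS = 15


def excel_like_roundtrip(value: int) -> int:
    """Approximate what happens to an integer opened and re-saved in Excel."""
    ctx = Context(prec=EXCEL_SIGNIFICANT_DIGITS, rounding=ROUND_DOWN)
    # Digits beyond the 15th are dropped and come back as zeros.
    return int(ctx.create_decimal(value))


def find_at_risk_values(csv_text: str, column: str) -> list[str]:
    """Return values in `column` that would be altered by the round trip."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row[column]
        for row in reader
        if excel_like_roundtrip(int(row[column])) != int(row[column])
    ]


sample = "id,name\n42,small\n1234567890123456789,big\n"
print(excel_like_roundtrip(1234567890123456789))  # 1234567890123450000
print(find_at_risk_values(sample, "id"))          # ['1234567890123456789']
```

Python's `csv` module reads every field as a string, so the IDs stay intact as long as the file never passes through a spreadsheet on the way.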

Motion3549 · June 2025

@PineappleM said:
Not related to server hosting, but I was working on a task with a friend (who doesn't post on LET but does buy from providers here) and we were reviewing some data that we had to feed to a third party API. We had only one attempt to get this right with zero margin of error (irreversible damage if we fuck up), so we each individually checked the data at least three times over the course of a week. He eventually became irate after saying "yes it's all good" a dozen times.

Well, we fired off the data, and 5 minutes in we realized we botched the input. The culprit was Microsoft Excel being unable to properly render 64-bit integers, so opening a csv file with 64-bit integers and re-saving it with Excel will truncate some of the digits. We didn't realize this happened until we saw that we were getting the wrong results back from the API. Upon further investigation, we saw that all the 64-bit integers ended in 000, and eventually traced it back to this "quirk" in Excel. We didn't ever consider that a tool that is used all over the planet would corrupt our data like this.

I can't share specific details on what exactly we were doing, but the point is, even after exercising our maximum due diligence, we still screwed up hard. It really can happen to anyone regardless of level of expertise or competence. I thought our system was quite robust, but this experience showed that there's still more to improve on.

When working with Excel, the first thing to check is the locale.

JohnFilch123 · June 2025

Oh, turns out I am on the affected node, but luckily there is no sensitive data on the box. Just another reminder to keep backups or at least to utilize snapshots. Hope you will recover soon.

MannDude · June 2025

Reinstalling the node, but it's showing the incorrect amount of RAM now.

When it rains, it pours.

Data recovery was a bust. Recovery tools (ex: HDD Raw Copy, DMDE, TestDisk, ddrescue) were unable to read or image either disk.

Only option now is to reinstall the node.

JohnFilch123 · June 2025

Well, taking into account the gravity of the situation and the time already spent on it, we may need to just swallow this, learn the lessons and move on.

Rubben · June 2025

@MannDude I really appreciate that you're not staying silent about this or giving us some half-assed PR announcement, but giving us regular, transparent updates. I know it’s a difficult situation right now but I believe in your ability to push through and fix this.

MannDude · June 2025

@Rubben said:
@MannDude I really appreciate that you're not staying silent about this or giving us some half-assed PR announcement, but giving us regular, transparent updates. I know it’s a difficult situation right now but I believe in your ability to push through and fix this.

Believe me, as much as I understand it's frustrating for a customer to be without their VM... We're without a node at the moment. Waiting for the DC to check the RAM issue.

On the bright side, this is an LTO node and the hardware replacements are not out of my pocket.

MannDude · June 2025

Oh, props to @VirtFusion for implementing the feature to mail users based on their assigned hypervisor. This feature was missing before. Something I really liked about Virtualizor was the ability to send an email informing users of a planned maintenance window or migration or something based on their host node / location. Now I can do that with VirtFusion, too.

Shoutout to @MikeA who pointed this out to me via PM.

Had I known this feature was implemented in the recent update I'd have used it earlier.

MannDude · June 2025

Still waiting for the proper build to get racked so I can get things back up for those impacted.

When I wait, you wait. It's 2AM my local time so will get some rest and check back later.

@MannDude said:
Reinstalling the node, but it's showing the incorrect amount of RAM now.

When it rains, it pours.

Data recovery was a bust. Recovery tools (ex: HDD Raw Copy, DMDE, TestDisk, ddrescue) were unable to read or image either disk.

Only option now is to reinstall the node.

Rubben · June 2025

Mr incognet getting eepy 😴 good night diva

MannDude · June 2025

Aight, an email went out 40 minutes or so ago with how to restore your service.

Status page shows the RFO. I've never really had to write anything like this, since it's the first time we've had anything noteworthy occur:

We're all very tired, as I am sure you are as well. The timeline of events is as follows:

The host node "Allegiance" in our Washington, USA location was taken offline for a general RAM swap/upgrade. This node was provisioned with only 128GB of RAM, making it half of what the specs call for in our current production fleet. The upgrade was fast and non-problematic. This should have been a 15-20 minute maintenance window, and this is what we had planned for.

It would appear that 1 of the 2 NVMe drives in the RAID-1 configuration died between original deployment and this incident. This node was originally deployed in March of 2025, and it passed all of our standard benchmarking / health checks before being put into production use.

It is assumed that the 2nd drive in the array suffered a similar fate: the data was still intact as long as it was powered on, but the drive failed to re-initialize on reboot after the RAM upgrade.

Data recovery attempts failed. Obviously this was our first and highest priority after realizing a quick maintenance task had resulted in something more severe. Recovery tools (ex: HDD Raw Copy, DMDE, TestDisk, ddrescue) were unable to read or image either disk.

The decision was made to reinstall the hypervisor. Unfortunately, in a troubleshooting step related to a concern about disk firmware and larger RAM totals, a datacenter tech had removed all but a single stick of RAM. This delayed our reprovisioning of the host node by 12+ hours as we waited for DC staff to reinstall the RAM they removed during this troubleshooting step.

Finally, on 06/12/2025 at about 11PM we had completed the reinstall of the hardware node.

An email was dispatched to impacted users on how to restore their VM.

While this was incredibly sudden and unexpected, we have enhanced our hardware-level health and performance monitoring. We've always kept a close eye on things like network metrics, CPU temps, fan speeds, and I/O usage, and have alerting systems in place for concerning metrics. However, where we did fail was in not having a solid system in place for monitoring things like RAID arrays and disk health, which we are now implementing across the board. It's possible that a better monitoring system for drive and RAID health could have given us early warning and made this preventable.

In the coming day or so, all impacted users will be given one free month of service. We will manually adjust your renewal date for billing and no action or request is required by you for this to be completed.

In the 4+ years we've been in business we've never experienced a data loss event like this. Please review your own data recovery and disaster plans as this, while rare, could occur with any provider. Also, if you are not already doing so, please consider using the FREE and INCLUDED offsite backup option we provide every VM customer.

Thank you for your patience and understanding.

IncogNET
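As a rough illustration of the drive-health monitoring the RFO mentions, the sketch below inspects a SMART report for an NVMe drive. It assumes the report is the JSON that smartmontools produces with `smartctl --json -a /dev/nvme0` (smartmontools 7.0 or newer); the function name and the 80% wear threshold are made up, and this is not IncogNET's actual monitoring.

```python
def nvme_health_warnings(report: dict, max_percentage_used: int = 80) -> list[str]:
    """Return human-readable warnings found in a smartctl JSON report."""
    warnings = []

    # Overall pass/fail verdict reported by the drive.
    if not report.get("smart_status", {}).get("passed", True):
        warnings.append("overall SMART status: FAILED")

    log = report.get("nvme_smart_health_information_log", {})

    # Non-zero means the controller itself has raised a critical condition.
    if log.get("critical_warning", 0) != 0:
        warnings.append(f"critical_warning flags set: {log['critical_warning']}")

    if log.get("media_errors", 0) > 0:
        warnings.append(f"media errors recorded: {log['media_errors']}")

    # Spare capacity dropping below the drive's own threshold is a bad sign.
    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare < threshold:
        warnings.append(f"available spare {spare}% below threshold {threshold}%")

    # Hypothetical wear cutoff; pick whatever suits your replacement policy.
    if log.get("percentage_used", 0) >= max_percentage_used:
        warnings.append(f"drive wear at {log['percentage_used']}%")

    return warnings
```

In practice the `report` dict would come from running `smartctl` on each drive on a schedule and passing its output through `json.loads`, with any non-empty result sent to whatever alerting channel is already in use.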

Rubben · June 2025

obligatory 'how do i enable disaster recovery plan i don’t see it in control panel' comment

JohnFilch123 · June 2025

I am back online but I do not see any Backups section where I can make a snapshot, where is it?

MannDude · June 2025

@JohnFilch123 said:
I am back online but I do not see any Backups section where I can make a snapshot, where is it?

PM me your server IP and I'll take a gander. Should just be here though:

I'm running a backup on my dev / test VM and it appears to be working. It's possible the feature may be disabled depending on what plan someone has, though. I'll be happy to check for you.
