Providers or managers of large Linux server fleets

SplitIce Member, Host Rep

Q. How often do you see non-fatal kernel oopses, warnings, or bug alerts? Do you monitor for them (netconsole, or other)?

Recently we have increased the number of managed servers quite significantly. On all systems we watch, log and triage.
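For anyone wanting to do the same kind of watching, here is a minimal sketch of the triage step, assuming the kernel log lines have already been collected somewhere (netconsole receiver, journal export, etc.). The marker patterns and the names `PATTERNS`, `classify` and `triage` are my own illustration, not anything from our tooling, and real kernels emit more variants than these three.

```python
import re

# Assumed marker patterns for illustration; not an exhaustive list.
PATTERNS = {
    "warning": re.compile(r"\bWARNING: CPU: \d+ PID: \d+ at "),
    "bug": re.compile(r"\b(?:kernel BUG at|BUG: )"),
    "oops": re.compile(r"\bOops: "),
}


def classify(line):
    """Return 'warning', 'bug', 'oops', or None for one kernel log line."""
    for kind, pattern in PATTERNS.items():
        if pattern.search(line):
            return kind
    return None


def triage(lines):
    """Count how many lines fall into each category."""
    counts = {kind: 0 for kind in PATTERNS}
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts
```

Feeding `triage()` the lines from each host gives a rough per-host count that can be graphed or alerted on; anything it flags still needs a human to read the full trace.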

One thing I've noticed is that the Linux kernel really isn't as defect-free as one might hope.

Q. Do you find there is significant benefit in high patch number releases of LTS branches? Do you find them significantly more stable than, say, low (i.e. <20) releases?

Comments

  • SplitIce Member, Host Rep
    edited June 2021

    For those curious as to the spark.

    Just today I found a netconsole (or virtio?) bug. Non-fatal, but a crash risk for sure (for example if an IRQ occurred during the op).

    [...]
    [194051.326140] ------------[ cut here ]------------
    [194051.326271] netpoll_send_skb_on_dev(): eth0 enabled interrupts in poll (start_xmit+0x0/0x4b0 [virtio_net])
    [194051.327739] WARNING: CPU: 0 PID: 9 at net/core/netpoll.c:351 netpoll_send_skb_on_dev+0x231/0x240
    [194051.327740] Modules linked in: [...]
    [194051.327810] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G           O      5.7.5+ #22
    [194051.327810] Hardware name: Vultr VC2, BIOS
    [194051.327811] RIP: 0010:netpoll_send_skb_on_dev+0x231/0x240
    [...]
    [194051.327838] Call Trace:
    [194051.327838]  netpoll_send_udp+0x2c4/0x3e6
    [194051.327839]  write_msg+0xda/0xf0 [netconsole]
    [194051.327839]  console_unlock+0x33b/0x4b0
    [194051.327839]  vprintk_emit+0x17d/0x270
    [194051.327840]  printk+0x58/0x6f
    [...]
    

    An unsafe printk (or in this case net_warn_ratelimited) is a scary idea.
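To group repeats of a warning like the one above across a fleet, the `WARNING:` header line can be pulled apart into its fields. This is only a sketch that assumes the header layout shown in the trace; `WARNING_RE` and `parse_warning` are hypothetical names, and other kernel versions may format the line differently.

```python
import re

# Matches a header of the form seen in the trace above (assumed layout):
# [ts] WARNING: CPU: n PID: n at file:line symbol+0xoff/0xlen
WARNING_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) (?P<symbol>[^+\s]+)\+"
)


def parse_warning(line):
    """Return the fields of a WARNING header line, or None if it does not match."""
    match = WARNING_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "timestamp": float(fields["ts"]),
        "cpu": int(fields["cpu"]),
        "pid": int(fields["pid"]),
        "source": f'{fields["file"]}:{fields["line"]}',
        "symbol": fields["symbol"],
    }
```

Using the `(source, symbol)` pair as a key is one way to tell a single noisy warning apart from many distinct ones.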

  • @SplitIce said:
    One thing I've noticed is that the Linux kernel really isn't as defect-free as one might hope.

    Correct, they're not that experienced, or do too little with it, to know where and how often it shits the bed.
