Meltdown and Spectre + KVM + Virtualizor

Hello,

Just wondering if anyone here has updated their KVM node with the latest kernel and has seen any issues or performance problems, etc.

I am running an older kernel (CentOS 7.3) and have not updated in a while. But I don't run a VPS business either.

Just wondering if it's safe to yum update.

Thanks
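
For context, a quick way to see where a node currently stands: kernels new enough to carry the fixes report mitigation status under sysfs. This is just a sketch; the directory only exists on patched kernels, so no output on an old kernel is itself an answer.

```shell
# Print the kernel's own view of the mitigation status (the
# vulnerabilities directory only exists on kernels new enough to
# carry the Meltdown/Spectre patches; no files = unpatched kernel).
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    [ -e "$f" ] || continue
    printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
done
```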

Comments

  • spectraipspectraip Member, Patron Provider

    I have updated all my servers - no problems with it.

  • vishvish Member

    Performance hit is real, about 11% for me. I recommend you update with a solid backup in place

  • BlaZeBlaZe Member, Host Rep

    Did it as soon as the proper patch came out.

    What metrics did you consider for measuring the performance? Please tell us so that we can try to replicate.

  • UmairUmair Member

    @vish said:
    Performance hit is real, about 11% for me. I recommend you update with a solid backup in place

    How did you come up with 11% there?
    I.e. did you monitor it closely? (Before and after the patch?)

    Have any graphs to share?

    Also, did you guys tweak anything specific after the patch?

  • vishvish Member

    The 11% is ballpark based on casual observation of execution times before and after the updates. I don’t have any solid numbers or graphs unfortunately but if you google for it you should find my results are within two standard deviations of the common scenario

  • randvegetarandvegeta Member, Host Rep

    Meh. Not a huge difference. But VPS nodes rarely seem to be limited by CPU power. The bottleneck is usually the disk and RAM. The CPU idles most of the time (unless you've got dreaded miners on your node!).

    Nothing significant.

  • ClouviderClouvider Member, Patron Provider

    @vish said:
    The 11% is ballpark based on casual observation of execution times before and after the updates. I don’t have any solid numbers or graphs unfortunately but if you google for it you should find my results are within two standard deviations of the common scenario

    So how did you come up with precisely 11% if you have no solid numbers ?

  • vishvish Member

    @Clouvider said:

    @vish said:
    The 11% is ballpark based on casual observation of execution times before and after the updates. I don’t have any solid numbers or graphs unfortunately but if you google for it you should find my results are within two standard deviations of the common scenario

    So how did you come up with precisely 11% if you have no solid numbers ?

    It was the average between my ballparks

  • @vish said:

    @Clouvider said:

    @vish said:
    The 11% is ballpark based on casual observation of execution times before and after the updates. I don’t have any solid numbers or graphs unfortunately but if you google for it you should find my results are within two standard deviations of the common scenario

    So how did you come up with precisely 11% if you have no solid numbers ?

    It was the average between my ballparks

    LowEndStatistics.

  • eva2000eva2000 Veteran
    edited March 2018

    randvegeta said: Meh. Not a huge difference. But VPS nodes rarely seem to be limited by CPU power. The bottleneck is usually the disk and RAM. CPU idles most of the time (unless you've got dreaded miners on your node!).

    Nothing significant.

    Not entirely true - there is a concrete impact on disk performance too, though it varies with the CPU family used and the disk configuration.

    Umair said: I am running an older kernel (CentOS 7.3) and have not updated in a while. But I don't run a vps business either.

    Just wondering if it's safe to yum update.

    You do realise what you're saying? Your system is currently vulnerable to Meltdown/Spectre and you're wondering whether you should fix that vulnerability!

    The performance overhead of the fixes varies depending on workload but can be anywhere between 5-50%. I've been closely following all the articles, news and benchmarks that folks have been posting regarding Meltdown/Spectre and have been posting them for my Centmin Mod users to keep them all informed regularly. You can see my thread at https://community.centminmod.com/threads/intel-processor-flaw-kernel-memory-leaking-spectre-meltdown.13632/ - it's still an ongoing issue, so that thread is still going. Nice read if you want. A lot of the linked articles go to Phoronix.com, which has a lot of benchmarks regarding the performance overhead of the Meltdown/Spectre fixes.

    I did my own pre-PTI vs PTI kernel benchmarks for Centmin Mod Nginx performance, and PTI came to around 5.5% performance overhead: https://community.centminmod.com/threads/nginx-benchmarks-after-centos-linux-kernel-kpti-meltdown-spectre-fixes.13694/. See also Nginx's official response: https://www.nginx.com/blog/nginx-response-to-the-meltdown-and-spectre-vulnerabilities/

    Once the patches are applied, processes that perform large numbers of system calls reportedly will incur a performance penalty due to the impact of the patches. NGINX and NGINX Plus, for example, may therefore require additional CPU resources; monitor the effect of the patch and be prepared to scale up or scale out if necessary.

    edit: Also, as you're on CentOS, you can - like on Red Hat - control the kernel's PTI patches via the tunables outlined at https://access.redhat.com/articles/3311301, so you can at least test KPTI vs no-KPTI performance after your kernel updates

    edit: nice read from Netflix's Brendan Gregg http://www.brendangregg.com/blog/2018-02-09/kpti-kaiser-meltdown-performance.html
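
    To make the Red Hat tunables mentioned above concrete, here is a sketch of checking them at runtime. The debugfs paths are an assumption of a RHEL/CentOS-patched kernel (they don't exist on mainline kernels), and root is needed for the mount and for any writes.

```shell
# Inspect the RHEL/CentOS runtime mitigation knobs described in the
# Red Hat article (paths assume a RHEL-patched kernel; root needed).
mount -t debugfs debugfs /sys/kernel/debug 2>/dev/null || true
for knob in pti_enabled ibpb_enabled ibrs_enabled; do
    f=/sys/kernel/debug/x86/$knob
    if [ -e "$f" ]; then
        printf '%s = %s\n' "$knob" "$(cat "$f")"
    fi
done
# For a no-KPTI benchmark run (re-enable afterwards!):
#   echo 0 > /sys/kernel/debug/x86/pti_enabled
```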

    Thanked by: Umair
  • WSSWSS Member

    Yeah. It hurts a little more on busy nodes.

  • virtualservervirtualserver Member
    edited March 2018

    @spectraip said:
    I have updated all my servers - no problems with it.

    @randvegeta said:
    Meh. Not a huge difference. But VPS nodes rarely seem to be limited by CPU power. The bottleneck is usually the disk and RAM. CPU idles most of the time (unless you've got dreaded miners on your node!).

    Nothing significant.

    @spectraip why?

    see log server

    -Scan Details-
    Process: 1
    Trojan.BitCoinMiner, C:\WINDOWS\HHSM\CLIENT.EXE, No Action By User, [69], [462360],1.0.4220

    Module: 1
    Trojan.BitCoinMiner, C:\WINDOWS\HHSM\CLIENT.EXE, No Action By User, [69], [462360],1.0.4220

    and more

  • 6ixth6ixth Member

    @spectraip said:
    I have updated all my servers - no problems with it.

    How are you gents going about patching it? I haven't looked into it at all, should probably do it.

  • datanoisedatanoise Member
    edited March 2018

    6ixth said: How are you gents going about patching it? I haven't looked into it at all, should probably do it.

    apt-get update && apt-get dist-upgrade, or whatever the equivalent command for your distro is.

    If you use FreeBSD you might have to wait a few more weeks (or months) though.
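
    After the upgrade (and the reboot that makes the new kernel take effect), a quick sanity check - this assumes a kernel recent enough to expose the sysfs mitigation files:

```shell
# Confirm the new kernel is the one actually running, then see what
# it reports as mitigated (sysfs files exist only on patched kernels).
uname -r
grep -H . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null || true
```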

  • @randvegeta said:
    Meh. Not a huge difference. But VPS nodes rarely seem to be limited by CPU power. The bottleneck is usually the disk and RAM. CPU idles most of the time (unless you'v got dreaded miners on your node!).

    They should move these miners to another instance, so they don't hurt performance. I really wish we had more info available about who you are sharing the server with.

  • bsdguybsdguy Member
    edited March 2018

    @WSS said:
    Yeah. It hurts a little more on busy nodes.

    THAT is the relevant point.

    I find @eva2000's "about 5.5%" number quite realistic. But we should always keep in mind that the price of the Spectre/Meltdown ticket increases brutally with syscall frequency.

    So on a more or less idling server one will hardly notice any pain. Within the wide range of "medium load", eva's "about 5.5%" is a quite realistic ballpark figure - and not really a serious concern.
    With heavy load, particularly with a high thread count and lots of async events (-> nginx and many others), however, it will really start to hurt, and numbers will be more in the 15% up to even 30% range.

    Side note: There have been some horror numbers floating around on the internet. I don't think they are completely made up, but I do think those are strange cases, i.e. experienced only with exotic hardware configurations and certain software features (e.g. bad libuv spaghetti and malloc per accept, which btw. is not rare).
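
    The syscall-frequency point is easy to eyeball yourself. One crude trick (an illustration, not a rigorous benchmark): `dd` with `bs=1` issues a read(2)/write(2) pair per byte, making it an almost pure syscall workload - time it before and after patching and compare.

```shell
# ~1 million read(2)+write(2) syscalls in a single process; the
# difference in elapsed time pre/post KPTI approximates the
# per-syscall tax. The iteration count is arbitrary.
time dd if=/dev/zero of=/dev/null bs=1 count=1000000 2>/dev/null
```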

  • virtualservervirtualserver Member
    edited March 2018

    I have a problem..

    -Scan Details-
    Process: 1
    Trojan.BitCoinMiner, C:\WINDOWS\HHSM\CLIENT.EXE, No Action By User, [69], [462360],1.0.4220

    Module: 1
    Trojan.BitCoinMiner, C:\WINDOWS\HHSM\CLIENT.EXE, No Action By User, [69], [462360],1.0.4220

    Registry Key: 4

  • virtualserver said: No Action By User, [69],

    On the next scan hit "Clean" rather than "Ignore". Otherwise take a backup and reinstall Winblows.

  • eva2000eva2000 Veteran
    edited March 2018

    bsdguy said: I find @Eva2000 "about 5.5%" number quite realistic. But we should always keep in mind that the price for the spectre/meltdown ticket brutally increases with syscall frequency.

    FYI, Centmin Mod Nginx is built using jemalloc memory allocator instead of default system Glibc malloc so that might factor into the lessened performance reduction. One of the many things I do to get better than default Nginx performance :)

    ldd $(which nginx) | grep jemalloc
            libjemalloc.so.1 => /lib64/libjemalloc.so.1 (0x00007f90b0863000)
    

    Yeah Brendan outlines in detail what factors influence meltdown/spectre performance impacts http://www.brendangregg.com/blog/2018-02-09/kpti-kaiser-meltdown-performance.html

    MariaDB's and Percona's tests also highlight that the memory allocator used - system default glibc vs jemalloc vs tcmalloc etc. - factors into the performance impact. At least for Nginx and MySQL, jemalloc/tcmalloc saw less performance reduction post-KPTI kernel updates than system default glibc, and given the impact is largely driven by system calls, I can see why.

    Interestingly, though, Redis is commonly shown to have a noticeable performance impact post-KPTI kernel patches, yet AFAIK Redis uses the jemalloc memory allocator by default - so I guess it can't save every type of workload :) Or could the reduction in performance have been even worse if Redis used system glibc instead of jemalloc?
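
    For anyone who wants to try the allocator angle without rebuilding anything, preloading jemalloc is the usual shortcut. A sketch, assuming the library path of the CentOS 7 jemalloc package (adjust for your distro); `/bin/true` below is just a stand-in for the real daemon binary:

```shell
# Preload jemalloc into an existing binary without recompiling.
# Library path assumes CentOS 7's jemalloc package; if the .so is
# missing, the dynamic loader warns on stderr and runs normally.
LD_PRELOAD=/usr/lib64/libjemalloc.so.1 /bin/true
# Check which allocator a binary was linked against (cf. the ldd
# output above):
ldd /bin/true | grep -i jemalloc || echo "no jemalloc linked in"
```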
