Does 64bit system makes sense on low ram systems (like a 128-512MB VPS)? - Page 2


Comments

  • jsg Member, Resident Benchmarker

    @Maounique said:

    @jsg said: The throughput isn't the issue. Crypto is.

    Yep, crypto load depends on throughput. The higher the speed the more encryption needed.

    Kinda. What I was referring to is the fact that crypto is very sensitive to memory and caching (and to not wasting either). But on the level we're talking about here (max. 1 or 2 Gb/s if that) a 32-bit OS and program usually will do fine (if the code is not idiotically designed and adapted).
    Now, when we're talking 25 or 40 Gb/s systems the story dramatically changes of course, but that is way beyond the LET level.

  • Maounique Host Rep, Veteran

    I am talking about multiple threads and multiple routes. That complicates the matter when each uses own keys and possibly cyphers.

  • pbx Member

    @jsg said: "small amount"? How about ca. 50% with many applications (also on the disk btw.).

    My experience has been around 30-40% saved. Indeed not a "small amount" in RAM-limited environments.

    @Maounique said: crypto load depends on throughput.

    Too high crypto load means lower throughput.

    @jsg said: But on the level we're talking about here (max. 1 or 2 Gb/s if that) a 32-bit OS and program usually will do fine (if the code is not idiotically designed and adapted).

    @Maounique said: I am talking about multiple threads and multiple routes. That complicates the matter when each uses own keys and possibly cyphers.

    Interesting. In low load/traffic setups I assume the difference should be negligible performance-wise.

  • jsg Member, Resident Benchmarker
    edited January 2022

    @Maounique said:
    I am talking about multiple threads and multiple routes. That complicates the matter when each uses own keys and possibly cyphers.

    TLS is a very major brake anyway, 32-bit or 64-bit. People don't seem to realize (probably due to all the "TLS everywhere!!!" hype noise) that asymmetric crypto, especially with ever-new (temporary) keys, is a very major bottleneck. On the nodes typically used for cheap VPS, a single core can generate about 10 - 30 key pairs per second (RSA 2k), and on Epycs, Ryzens, and the newer Xeons about 50 - 75 (key pair generation being a very major factor in terms of performance).
    Note though that on routers that factor usually is irrelevant.

    @pbx said:

    @Maounique said: I am talking about multiple threads and multiple routes. That complicates the matter when each uses own keys and possibly cyphers.

    Interesting. In low load/traffic setups I assume the difference should be negligible performance-wise.

    More No than Yes. You see, it's not just about the "famous" PK algos. One of the real major problems, besides RSA and ECC, is what we call 'state size'. A very clear example is pseudo random number generators (which aren't famous but are actually pretty much everywhere): a PRNG's quality is tightly related to the size of its internal state, and for cryptographically secure PRNGs that state is often in the 64+ bytes range (and not rarely well over 128 bytes). "Pumping" through and processing data that is larger than the L1 cache line size (typ. 64 bytes), and doing so in small 32-bit chunks rather than in 64-bit chunks, makes a very major difference. And keep in mind that the full state needs to be changed for each and every random byte. The same goes for many other crypto routines.
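To put rough numbers on the state size point, here is a minimal back-of-the-envelope sketch. It only counts how many native-word operations one full pass over a state of a given size takes with 32-bit vs. 64-bit words, and how many 64-byte cache lines the state occupies. The function names are made up for illustration, the state is assumed to be cache-line aligned, and real PRNGs do far more work per word than one operation, so treat this as the lower bound on the gap, not a benchmark.

```python
from math import ceil

CACHE_LINE_BYTES = 64  # typical L1 cache line size, as mentioned in the post


def words_per_state_pass(state_bytes: int, word_bits: int) -> int:
    """Native-word operations needed to touch every byte of the state once."""
    return ceil(state_bytes * 8 / word_bits)


def cache_lines_spanned(state_bytes: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Minimum number of cache lines the state occupies (assumes line alignment)."""
    return ceil(state_bytes / line_bytes)


if __name__ == "__main__":
    # State sizes picked to match the ranges named in the post (64+ and 128+ bytes).
    for state_bytes in (64, 128, 256):
        w32 = words_per_state_pass(state_bytes, 32)
        w64 = words_per_state_pass(state_bytes, 64)
        lines = cache_lines_spanned(state_bytes)
        print(f"{state_bytes:>4} B state: {w32:>3} x 32-bit ops, "
              f"{w64:>3} x 64-bit ops, {lines} cache line(s)")
```

Since the full state changes for every output, a 32-bit build needs at least twice as many word operations per update as a 64-bit build, and that repeats for every random byte produced.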

    Plus, the difference isn't just word size (32 vs. 64 bits) but quite a few other factors too, e.g. cache size. Let's have a quick look at a relatively old Xeon Nehalem (64-bit, X5560) and a shmick Pentium Prescott/Cedar Mill (32-bit):

            Nehalem     Prescott/CM
    L1D     32K         16K
    L1I     32K         12K uops
    L2      256K        512K
    L3      8M          n/a
    

    On top of that, the Nehalem has 4+ times the memory access speed, SSE 4.2, and more.

  • pbx Member

    @jsg very interesting, thank you very much!
