Comments
Not exactly first-hand experience, but I was involved in the design process for some router boards last month.
Generally, what matters most at first glance is plain clock speed. In the end, though, you have to select a CPU based on the number of NICs in your server. Ideally you allocate each NIC its own CPU core using smp_affinity, so that all of that NIC's IRQs are handled by one specific core (the affinity is a per-CPU bitmask, written in hex), which improves performance. The control-plane load would still be split across all threads, but that's not really a problem, and you could pin it to one particular core as well to leave CPU cycles free for the NICs on the other threads.
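A minimal sketch of how those hex affinity masks work — the CPU index and the IRQ number 44 are made up for illustration, and the actual write needs root plus a real NIC, so it's shown as a comment:

```shell
# Each bit in an smp_affinity mask selects one logical CPU:
# bit 0 = CPU0, bit 1 = CPU1, and so on; the value is written in hex.
cpu=5                                 # hypothetical: pin to logical CPU 5
mask=$(printf '%x' $((1 << cpu)))
echo "$mask"                          # -> 20 (only bit 5 set)

# With root, you would then bind a NIC's IRQ (hypothetical IRQ 44) to that CPU:
#   echo "$mask" | sudo tee /proc/irq/44/smp_affinity
```

Reading `/proc/irq/<N>/smp_affinity` back afterwards confirms the kernel accepted the mask.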
10Gbps on x86 is nothing special anymore these days; with plain packet processing you can push a dozen Gbps when the setup is sophisticated.
And one point that of course also matters: the actual NICs you're going to use. Moving a lot of the packet processing onto them (e.g. by tuning their ring buffers) generally gives you much more scalability in the entire system, but that involves a really customized setup and probably also requires adapting parts of the kernel and how the I/O queues are handled.
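At the driver level, the usual knob for ring buffers and hardware offloads is `ethtool`. The device name `eth0` is an assumption, and the commands need root and real hardware, so this is a config sketch with everything commented out:

```shell
# Grow the NIC's RX/TX rings toward the hardware maximum (eth0 is hypothetical):
#   ethtool -g eth0                    # show current and maximum ring sizes
#   ethtool -G eth0 rx 4096 tx 4096   # enlarge the rings (needs root)
# Check which work the NIC can take over from the kernel:
#   ethtool -k eth0                    # list checksum/segmentation offloads
```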
TL;DR: Get a CPU with at least as many threads as the number of NICs you're going to fit in the server, and go for the higher clock-speed models. And don't buy cheap-ass NICs.
That's very primitive advice. I have a dual-port NIC that uses 16 IRQs to load-balance traffic.
Most server NICs do the same.
Soooo do I still get one thread per NIC? Or 8 per NIC? Or 16? Or did you mean per port? Feels like you're just going off something "you've read", not having actually touched any of this stuff in real life.
I guess he means one core per queue (RX or TX), because that's ideally what you want: assigning a core to a specific queue.
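That one-core-per-queue idea can be sketched like this. The `/proc/interrupts` excerpt and the `eth0-TxRx` queue names are hypothetical (drivers name their queue IRQs differently), and the actual write is commented out because it needs root:

```shell
# Hypothetical /proc/interrupts excerpt for a NIC with two combined queues:
sample='  44:  123  0  PCI-MSI  eth0-TxRx-0
  45:  0  456  PCI-MSI  eth0-TxRx-1'

# Extract the IRQ number of each queue, then pin queue i to CPU i:
i=0
echo "$sample" | awk -F: '/eth0-TxRx/ {print $1}' | while read -r irq; do
    mask=$(printf '%x' $((1 << i)))
    echo "IRQ $irq -> CPU $i (mask $mask)"
    # With root: echo "$mask" > "/proc/irq/$irq/smp_affinity"
    i=$((i + 1))
done
```

In practice you would read the real `/proc/interrupts` instead of the sample string, and `irqbalance` should be stopped first or it will rewrite your masks.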