That's true, and I didn't actually mean to suggest using the cheapest server available as a single node either.
I just wanted to put the given numbers into the scale of the cheapest bare-metal servers available on the market, to illustrate that this is not an amount of data you'd have to consider for scaling. Realistically speaking, you'd want more than 2 nodes so you don't have to worry about reboots/updates etc.
Well, you probably want at least 2 for redundancy, but an active/passive setup on 2 nodes can cover a great many use cases if you're not running some memory-hogging Rails blob that needs hundreds of MB just to render a site.
Hey, I like this idea! Actually, I only have a single node running the control plane. And, to answer the original question, it felt like overkill to dedicate an entire NUC to running just the control plane. So, I decided to run two k8s nodes on each NUC. I guess there's potentially a bit of a performance hit due to the extra overhead of running multiple kubelets, but it hopefully won't matter too much in practice.
On the contrary, scaling and parallelism are much easier on a multicore machine (possibly with a NIC queue per CPU socket), because most nontrivial workloads are ultimately communication bound. A single load balancer can handle most real sites' traffic and direct it to different cores, and for those concerned about (for example) DDoS attempts, services like Cloudflare let you combine IP anycast with a reverse proxy so the problem is absorbed far from your machine.
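For illustration, here's a minimal Linux-only sketch (port and worker count are made up) of how one box can spread connections across its cores without any external balancer, using SO_REUSEPORT so the kernel distributes accepts over per-core workers:

```python
import os
import socket

PORT = 8080  # arbitrary demo port

def serve(worker_id: int) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)  # Linux/BSD only
    sock.bind(("0.0.0.0", PORT))
    sock.listen(128)
    while True:
        conn, _addr = sock.accept()
        conn.sendall(b"handled by worker %d\n" % worker_id)
        conn.close()

if __name__ == "__main__":
    # One worker process per core; the kernel spreads incoming connections
    # across their accept queues, so no external load balancer is involved.
    for i in range(os.cpu_count() or 1):
        if os.fork() == 0:
            serve(i)
            os._exit(0)  # never reached; serve() loops forever
    os.wait()
```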
For a large percentage of use cases, the only reason to have multiple machines at all is for fault tolerance, and for most of the others, it's just not being able to fit all your data in memory (increasingly rare). That said, modern networking has gotten good enough that in a really high-end datacenter you can sometimes still get increased performance out of the network (using RDMA, etc.), but you have to really work hard for it (usually bypassing the OS networking stack).
The recommendation is 3 servers, each with 2 CPU cores and 2 GB of RAM.
However, single server configurations are perfectly acceptable for personal clusters.
I would strongly recommend taking a cluster backup once you get it up and running, though, since single-server configurations do not handle unexpected power failures well.
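If the cluster keeps its state in etcd, the backup can be as simple as a scheduled etcdctl snapshot. A rough sketch, assuming etcdctl is installed and kubeadm-style cert paths (adjust for your distro; the backup directory is hypothetical):

```python
import datetime
import os
import subprocess

BACKUP_DIR = "/var/backups/etcd"      # hypothetical location
ENDPOINT = "https://127.0.0.1:2379"   # typical local etcd endpoint

def snapshot() -> str:
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    target = f"{BACKUP_DIR}/etcd-{stamp}.db"
    subprocess.run(
        [
            "etcdctl", "snapshot", "save", target,
            "--endpoints", ENDPOINT,
            "--cacert", "/etc/kubernetes/pki/etcd/ca.crt",    # kubeadm-style paths;
            "--cert", "/etc/kubernetes/pki/etcd/server.crt",  # adjust for your distro
            "--key", "/etc/kubernetes/pki/etcd/server.key",
        ],
        check=True,
        env={**os.environ, "ETCDCTL_API": "3"},
    )
    return target

if __name__ == "__main__":
    print("snapshot written to", snapshot())
```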
I have no idea about the OP's system, but I can see some advantages to breaking components up across a network rather than scaling vertically. First, server prices do not scale linearly: four 10-core machines will generally be cheaper than a single 40-core machine. Second, some programs can hog memory/CPU time; forcing programs onto their own machines can ease some of that contention.
These gains can seem rather small, but they can make a big difference at large scales. If you don't mind the added latency, it's a perfectly acceptable architecture.
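To make the pricing point concrete, a toy cost-per-core comparison with purely hypothetical list prices (not real quotes):

```python
# Purely hypothetical list prices, just to illustrate nonlinear per-core cost.
scale_out = {"machines": 4, "cores_each": 10, "price_each": 3_000}
scale_up  = {"machines": 1, "cores_each": 40, "price_each": 18_000}

for label, cfg in [("4x 10-core", scale_out), ("1x 40-core", scale_up)]:
    cores = cfg["machines"] * cfg["cores_each"]
    price = cfg["machines"] * cfg["price_each"]
    print(f"{label}: {cores} cores for ${price:,} (${price / cores:,.0f}/core)")
```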
When you approach the limits of what your kernel can handle, then it may be time to split your workload across boxes, or to carve smaller boxes out of your metal (probably attaching NICs directly to the VMs so the host OS doesn't have to deal with them). Making your workload horizontally scalable is always a sound engineering choice.
But...
Splitting a horizontally scalable workload across a dozen virtual servers that are barely larger than the smallest laptop you can get from Best Buy is just self-inflicted pain. Chances are the smallest box you can get from Dell can comfortably host your whole application.
The fact remains that the odds of you needing to support more than 10K simultaneous connections are vanishingly small.
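And even 10K is within reach of a single process on one core these days. A minimal sketch of a single-process asyncio echo server that can hold that many mostly idle connections (the port is arbitrary, and you'd likely need to raise the file-descriptor limit, e.g. ulimit -n 20000):

```python
import asyncio

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Echo bytes back until the client closes its side of the connection.
    while data := await reader.read(4096):
        writer.write(data)
        await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main() -> None:
    server = await asyncio.start_server(handle, "0.0.0.0", 9000)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```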
If you're eventually going to hit the upper limit of a single machine, have you really gained that much? Eventually you'll have to figure out a way to coordinate between processes, because some will be running on other nodes.
Hmm, good to know. And to update another assumption: my guess is that this would be a pretty expensive configuration, and you may be better off picking a lower-TDP server that can still get the job done, yes?
Also, do you actually have room for 40 nodes in a rack? Between ToR switches, storage, and miscellaneous other gear, it seems unlikely.
Personally, I run 2 clusters of 5 nodes each, in geographically disparate zones.
There are some shared infrastructure dependencies that aren't redundant, but generally speaking the system can survive loss of multiple nodes and/or an entire site.
I'm not really sure what you mean. The number of nodes you need depends on the capacity you need and your replication factor.
Most C* deployments are in the 9-15 node range. You could safely deploy with 6 nodes and RF3 if 6 nodes gives you enough capacity in terms of both disk space and IOPS.
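For anyone following along, the quorum math behind that (assuming QUORUM reads/writes) is simple enough to sanity-check in a few lines:

```python
# With QUORUM reads/writes, each token range stays available as long as
# quorum(RF) replicas are up, i.e. you can lose RF - quorum(RF) of them.
def quorum(rf: int) -> int:
    return rf // 2 + 1

for rf in (3, 5):
    q = quorum(rf)
    print(f"RF={rf}: quorum={q}, tolerates {rf - q} down replica(s) per token range")
```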
This is standard practice in some domains. Reserve 1 or 2 cores for admin and possibly for the interrupt daemon, and isolate/affinitize app processes to the remaining cores: ideally giving more resources to your "hot" threads, and better resources (the core "closest" to your NIC) to your network thread, etc.
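On Linux you can do the app-side pinning without any extra tooling; a minimal sketch (core numbers are illustrative, and you'd check NUMA/NIC locality with lscpu or similar before choosing them):

```python
import os

ADMIN_CORES = {0, 1}                          # reserved for admin / interrupt handling
ALL_CORES = set(range(os.cpu_count() or 1))

def pin_to_app_cores() -> None:
    app_cores = (ALL_CORES - ADMIN_CORES) or ALL_CORES  # fall back on tiny machines
    os.sched_setaffinity(0, app_cores)  # 0 = the current process
    print("pinned to cores:", sorted(os.sched_getaffinity(0)))

if __name__ == "__main__":
    pin_to_app_cores()
```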
I think this is the right approach, and I really admire the work you do at ScyllaDB. For something truly critical, you do want multiple nodes available (at least 2, and probably 3 is better). But you should also want backup copies in multiple datacenters, not just one.
Today, if I were running something that absolutely needed to be up 24/7, I would run a 2x2 or 2x3 configuration with async replication between primary and backup sites.