
Yes, and it works fine as long as you accept that data will need to be migrated at some point when you run out of memory and/or instances.

To put this off as long as possible, you can max out the memory in your boxes and run a large number of server instances - 100s in some cases - on those boxes, ideally with fewer than one instance per CPU core to maximize performance.

Then, if you begin maxing out memory, you can easily split those instances out onto their own hardware. My calculations showed that this would allow scaling to trillions of keys without problems.
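For a sense of the orders of magnitude involved, here's a back-of-envelope sketch; the per-key overhead, RAM per box, and box count are my illustrative assumptions, not figures from the comment above:

    # Back-of-envelope capacity estimate; all numbers are illustrative assumptions.
    bytes_per_key = 100        # assumed average key + value + overhead
    ram_per_box_gb = 512       # assumed maxed-out box
    boxes = 500                # assumed fleet size

    keys_per_box = ram_per_box_gb * 2**30 // bytes_per_key
    total_keys = keys_per_box * boxes
    print(f"{keys_per_box:,} keys/box, {total_keys:,} keys total")
    # ~5.5 billion keys per box, ~2.7 trillion keys in total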




Sure, if what you’re doing is easily scalable to another machine.

There’s a lot of software out there, legacy and otherwise, where the throughput ceiling is what you can do with one box.


Yes, in my experience -- but that's just with one-off servers.

I'm not sure how things work if you're a larger client with needs in the dozens of servers.


Yeah, but going from 1 to N instances is a big jump in complexity for not much more uptime.

I'd rather have a host that can do around three nines and just accept that as my upper bound.
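For reference, "three nines" translates to just under nine hours of allowed downtime a year; a quick check:

    # What a 99.9% ("three nines") availability bound allows per year.
    availability = 0.999
    hours_per_year = 24 * 365
    downtime_hours = (1 - availability) * hours_per_year
    print(f"{downtime_hours:.2f} hours/year")   # -> 8.76 hours/year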


Yeah, but not all of them allow you to scale independently without downtime.

For example, in traditional MySQL setups, scaling up compute capacity means your server has to be unavailable for some time.


Yeah, the whole 34/68GB is all on a single beefy server. There are systems that let you access remote nodes' memory in a NUMA-type system over fast interconnects, but they're _very_ expensive and have horrible latency.

Actually..

If you had all of the instance data stored on a fast SAN, so that it was available to all of the hosts simultaneously (possible)...

And you had an ultra-high-speed interconnect (40Gbit InfiniBand would do) between hosts for sharing the memory state when you migrate...

...then 4GB of memory at 40Gbit/s would be transferred in 0.8s (assuming perfect throughput, all cows are spherical, etc.).
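The arithmetic, for anyone checking (same spherical-cow assumptions: no protocol overhead, full link utilization):

    # Time to ship 4 GB of memory state over a 40 Gbit/s link,
    # assuming perfect throughput.
    mem_gb = 4
    link_gbit_per_s = 40
    seconds = mem_gb * 8 / link_gbit_per_s   # gigabytes -> gigabits
    print(f"{seconds:.1f} s")                # -> 0.8 s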


You could, but it would be slow, or at least pretty heavy per client. That said, if you have a beefy enough server, I suppose it might work.

I tend to agree with this approach, especially given how much RAM and how many cores you can get in a single box today. You can run what took a cluster of servers 10 years ago on a single box. Just make sure your code doesn't assume that everything is running on the same host: use logical names that map to an actual host, e.g. db_write -> 127.0.0.1 today, and hopefully later db_write -> db123.cloudfoo.net.
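A minimal sketch of that indirection (the role names and hosts are made up for illustration): code asks for a logical role rather than hardcoding an address, so moving a role to new hardware is a one-line change.

    # Map logical service roles to hosts; application code never hardcodes an address.
    DB_HOSTS = {
        "db_write": "127.0.0.1",   # everything on one box today
        "db_read":  "127.0.0.1",
    }

    def host_for(role: str) -> str:
        return DB_HOSTS[role]

    # Later, when the database moves to its own hardware:
    # DB_HOSTS["db_write"] = "db123.cloudfoo.net"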

I don't think that would be an efficient use of computing resources. Each instance explores the same instruction space, keeps track of where it has explored, and uses various techniques to explore different parts of that space.

It's very likely that multiple instances, if run in parallel and with no data sharing, will explore a lot of the same space.

Also, making a public cluster would be a security challenge. It runs arbitrary C/C++ code, and can trigger code paths that the developers didn't even realize existed. How would your box stand up to multiple grabs of 4GB of memory?
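For the memory-grab concern specifically, one common mitigation (my addition, not something the comment proposes) is to cap each worker's address space with an rlimit before exec'ing it. A sketch only; a real deployment would layer seccomp/namespaces/containers on top, and the ./fuzz_worker binary is hypothetical:

    import resource, subprocess

    LIMIT_BYTES = 512 * 2**20   # illustrative 512 MB cap per worker

    def limit_memory():
        # Runs in the child between fork and exec; caps total address space.
        resource.setrlimit(resource.RLIMIT_AS, (LIMIT_BYTES, LIMIT_BYTES))

    # A worker that tries to grab 4 GB now fails its allocation
    # instead of taking down the box.
    subprocess.run(["./fuzz_worker"], preexec_fn=limit_memory)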


It's possible, and probably advantageous to a point. Eventually you'll hit bottlenecks somewhere, at which point you throw more servers at it.

Only if you're crazy enough to put something in production running a single node instance.

Sure, but the implied message of your comment was that you could replace all of your instances and containers with just 9 machines, since Stack Overflow "serves a lot more traffic than you do" (i.e. "has more actual compute need"). I think most reasonable engineers would say that "thousands" of containers would be a massive mistake for a task of that size, even if few of them would go to the extent Stack Overflow did of using only 9 machines.

Doesn't work that way with containers - they can all still have access to all the memory if you are confident you can safely give them access to it (and if you're not, then you certainly can't co-locate them in the same process).

But I have yet to deal with a server where adding more RAM was more than a rounding error compared to getting a fast IO subsystem.

And running them all on a single server means you need to take down all of them to upgrade any one of them, and are contingent on all of them being able to run on the same database version, and with the same extensions.


Yes but the storage used by that instance is shared among 1000 users, rather than all devoted to just one user. Theoretically there would be a fair amount of duplication and therefore less net resource use.

I don't think so, provided it has the necessary resources to run everything in a single node. There are a few more moving parts which you won't really be using to any great extent.

There are still limitations to this architecture, and instances when you have to run something else. If I were doing an RTS with thousands of units, I'd still prefer the lock-step version described above, for example. Or, for a more concrete example, right now I'm revamping the multiplayer architecture for a game with a huge open world, and the only way to get it to work without paying for a dedicated server farm is to run a distributed-authority key-value store.

Sure. But it's $15 to run an instance on Linode and it's not too hard. If you can't do it you can probably find someone else who can. I don't think it defeats the purpose since separate instances is the basic idea.

In theory yes, and certainly if we're talking about Cassandra. In practice, most people do not distribute over multiple data centers. Doing so is very difficult and costly.

Yeah, you can explicitly launch multiple instances under your own control; each one uses the same amount of memory, so you could launch 10 instances with 200MB of RAM each, or 1 instance with 2GB of RAM.

It's definitely worth signing up for. I've been using it for a couple of months and overall I've been impressed, though they still have too many glitches for me to be entirely happy deploying an important application to it.

