
Well, they could start by using something faster than Python. I would tend to use Common Lisp, but Clojure would be the more modern choice.

But yes, scaling up is far easier than scaling out. A box with 72 cores and 1.5TB of DRAM can be had for around $50k these days. I think it would take a startup a while to outgrow that.




It does cost a little RAM, but stack size is rarely the bottleneck, especially when using a language like Python.

I don't think it's inconceivable that there would be a product where you'd have ample storage and CPU power for your application and would want to optimize for speed of development instead. I've built things on MicroPython that would be just fine to ship as they were, for example, if I were to commercialize them.

I run most of my Python programs on fairly low-end boxes, so consuming 9 MB per process vs. 90 would be a pretty big deal. If you have RAM to burn, share :).

One interesting question is: if you started today, would you be able to afford boxes that cost $60k? Or would you build your software some other way that doesn't require 2TB of RAM?

Obviously, when you started, your stuff didn't need 2TB of RAM. If I read your history correctly, I don't think you could even buy a box with 2TB of RAM back then. That's enterprise-grade hardware, which today costs a fortune; back then it would have been an even bigger fortune.

Instead, you probably started the way everyone else did: with maybe a 4GB or 8GB Linux box at home, a bunch of curl scripts, and a local instance, then built things up from there.

So why the resource requirement? An in-ram database?

Mainly curious.


Hey, thanks for the input.

I need 3 GB of RAM at all times to keep the models pre-loaded, otherwise the request time goes exponential. Then, ideally, memory could scale up/down with each request, probably at around +2 GB per request, i.e. RAM_GB = 2r + 3.
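That formula is easy to turn into a small capacity-planning helper. The 3 GB baseline and 2 GB-per-request figures come from the comment above; the function name is just for illustration:

```python
def required_ram_gb(concurrent_requests: int,
                    per_request_gb: float = 2.0,
                    baseline_gb: float = 3.0) -> float:
    """RAM needed: a fixed baseline for pre-loaded models
    plus a per-request share, i.e. RAM_GB = 2r + 3."""
    return per_request_gb * concurrent_requests + baseline_gb

# Two concurrent requests need 2*2 + 3 = 7 GB, so an 8 GB box fits.
print(required_ram_gb(2))
```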

It's working fine right now with an 8GB box on DO for 1 user, but each additional user adds roughly 50% more compute overhead. The cost isn't really an issue; $40/mo is a microscopic fraction of the user's monthly fee. It's just that I'd prefer an auto-scaling system, because I'm not a devops specialist and downtime for these users isn't really an option.

I don't think a message queue is needed right now with request pooling and socket limits, though I will need it later on.

I can shoot you an email if you'd like to chat about it. I'm no CoreNLP pro but I might be able to provide a few tips in return.


Obvious option: you might want that memory for processes. Dynamic scaling does seem sane, though.

Agree 100%.

A 96 vCPU box with >200 GB of RAM is ~$1.55/hour as an EC2 Spot instance. Back that with a big SSD and you have a data-processing monster. Use some nice Go with well-formed goroutines to leverage all those cores, and a great many data-processing tasks could be crushed on a single box.
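The "one big box, saturate every core" idea isn't Go-specific; as an illustrative sketch of the same pattern in Python (the thread's other language), using a process pool rather than goroutines, with a made-up CPU-bound task:

```python
import math
from multiprocessing import Pool
from os import cpu_count

def crunch(n: int) -> float:
    # Stand-in for one chunk of a CPU-bound data-processing job.
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    workers = cpu_count() or 1
    # Fan chunks out across all cores; on a 96 vCPU box this keeps them all busy.
    with Pool(processes=workers) as pool:
        results = pool.map(crunch, [100_000] * (workers * 4))
    print(f"processed {len(results)} chunks across {workers} cores")
```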

Metaphorically: Every gear you add to a machine (distributed this and that) is a gear that needs to be cared for (configured, managed) and could break the overall machine.

Simpler is better.


But once you achieve a critical mass - today, you could run most of the web on anything from MIPS to SPARC T2, as most of the source for pretty much everything is portable and available - you should get an avalanche effect.

Economies of scale will dictate prices, but this software availability will make processors tailored to specific roles viable. You will see ARM-based low-power servers and more designs like Tilera's in cloud-computing applications.


If the available RAM were somehow expanded to the full amount available in hardware, it seems like this could be an interesting type of HPC system. No bloat, simple execution.

The idea of running workloads on the cheapest compute platform is interesting.

How hard can it be, though? Like taking a normal CS person and making them versatile with hadoop and so on? Could it be done for 20K$?

It may also spell trouble for horizontal scaling. A 128-core computer with a few terabytes of RAM could handle loads that would otherwise need dozens of computers. There are huge advantages in terms of ease of management and programming.

Yes, if you want to participate in rooms with >10K users or >500 servers, you need quite a large box (several GB of RAM), although over the last few weeks we've had several massive algorithmic performance breakthroughs which should help this a lot. These are currently being tested and implemented in Synapse (the Python implementation).

You can build a $6,000 machine with 12-channel DDR5 memory that's big enough to hold an 8-bit quantized model. The generation speed is abysmal, of course.
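A back-of-the-envelope for why generation is so slow: each generated token has to stream essentially all of the weights through memory, so tokens/sec is roughly bounded by bandwidth divided by model size. The figures in the example are illustrative assumptions (12 channels of DDR5-4800 is around 460 GB/s peak), not measurements:

```python
def est_tokens_per_sec(model_params_billions: float,
                       bytes_per_param: float,
                       mem_bandwidth_gb_s: float) -> float:
    """Rough upper bound on generation speed: every token
    reads all weights once, limited by memory bandwidth."""
    model_gb = model_params_billions * bytes_per_param
    return mem_bandwidth_gb_s / model_gb

# Illustrative: a 70B-parameter model at 8-bit (~70 GB) on ~460 GB/s
# of 12-channel DDR5 tops out somewhere around 6-7 tokens/sec.
print(est_tokens_per_sec(70, 1.0, 460))
```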

Anything better than that starts at $200k per machine and goes up from there.

Not something you can run at home, but definitely within the budget of most medium-sized firms.


14 million? Okay, that's not very much. A RAM-based approach is fine there.

Also, benchmarks would have to be carefully set up. Sure you can do more MIPS on a cluster of generic underpowered x86 boxes, but what happens if you remove a RAM stick, shoot a bullet through an SSD or are hit by an earthquake? The numbers alone are hard to compare if the guarantees are totally different.

A second thing to consider is that it's vastly cheaper to run thousands of services on a small group of very large machines than to run the same workload on a large number of cheap machines.


Eh, it's not really that consequential, because anything big will need way more horsepower than you're going to get from any mobile GPU to finish in a reasonable amount of time. We built a CLI tool for our stuff on AWS and a gaming/ML desktop at the office, specifically because everyone is on laptops and training or evals are so slow.

Thanks, that's a good data point.

I'm especially wondering what the trajectory of future developments will be. Most of the large language models are out of my league anyway (e.g. the new Yandex Russian-English model was trained on 800 A100s and needs 200GB of GPU RAM to fine-tune).

So maybe it would be more effective to go for high speed instead of large capacity. But then you probably end up with a custom chassis and PSU, since 4x 400 watts is not something most off-the-shelf workstations can supply.


The biggest problem might well be the memory requirements, given so many parameters. It won't be as cheap as a high-end computer in the foreseeable future.
