> With tools like message queues and Docker making it so easy to scale horizontally you don't even have to go vertically.
That depends entirely on the workload. It's not always a good idea to move from one SQL instance to a cluster of them. Just buy the better machine; that gives you time to build a real scalable solution.
> Yes, hardware has limits and you just cannot have a single machine with unlimited number of cores and disks.
Ok, but it still makes me wonder how far you can go today with a good old-fashioned RDBMS, running natively on modern server hardware. I believe you can go a lot further than most developers seem to think, without having to go to more complex distributed setups.
In the environments I work in, it seems people often look at scaling horizontally first instead of checking how far they could scale vertically... my preference is to scale vertically first before trying to scale horizontally. (I am aware both have different advantages, but I'm mainly thinking from a performance point of view.)
> This can easily veer off in another direction, but it astounds me how many people shun the database and choose to reimplement innate features in procedural code outside of the database.
I've done this from time to time, and it's usually because the database clients are much easier to scale than the database itself. The question becomes: what can the clients do to reduce database I/O and CPU? I can add more clients easily, but turning a database into a cluster is comparatively difficult, so database machines have to scale up instead.
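To make that concrete, here's a rough sketch of the kind of thing I mean, assuming a hypothetical users table and a psycopg2 connection (all names are made up for illustration): repeated reads get served from client memory, so the database only sees one query per key per TTL window instead of one per request.

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    _cache = {}                            # user_id -> (expires_at, row)

    def get_user_profile(user_id, ttl=60):
        # Serve repeated reads from client memory; the database only sees
        # one query per user per TTL window instead of one per request.
        hit = _cache.get(user_id)
        if hit and hit[0] > time.time():
            return hit[1]
        with conn.cursor() as cur:
            cur.execute("SELECT name, plan FROM users WHERE id = %s", (user_id,))
            row = cur.fetchone()
        _cache[user_id] = (time.time() + ttl, row)
        return row

Since each extra client process brings its own cache, this trades cheap client RAM for scarce database CPU and I/O.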
OTOH, the limits of machine scaling are quite high these days. You can get a single-socket EPYC with 64 cores, 3 TB of RAM, and tons of NVMe lanes.
> SQL Server (the db you mention) is no slouch when paired with additional CPUs (for vertical linear scaling) and allows for transactionally safe replication to distributed clusters of SQL Servers
Additional CPUs are $7,000 USD per core, and replication is labor intensive. Transactional replication has a nasty habit of breaking as the source tables are changed, and Availability Groups have a ton of bugs (as evidenced by any recent Cumulative Update.)
Saying that SQL Server scales is like saying your wallet scales to hold any amount of money. Sure, it might, but it's up to you to put the coin in, and it's gonna take a lot of coin - compared to scaling out app servers, which generally have near-zero license cost and whose code is synchronized at deploy time.
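Rough napkin math on that trade-off, using the ~$7k/core figure above; the app-server count and instance price below are placeholders, not quotes:

    cores_added = 32
    sql_license_per_core = 7_000        # USD, per-core figure quoted above
    scale_up_license = cores_added * sql_license_per_core
    print(scale_up_license)             # 224000 -- license alone, before hardware

    app_servers_added = 10
    app_server_monthly = 300            # USD/month, placeholder instance price
    scale_out_year_one = app_servers_added * app_server_monthly * 12
    print(scale_out_year_one)           # 36000 -- and no per-core license fee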
As to recommending you a place to read, I hate to say this, but you could start with my blog. Pretty much every week, I’ve got real life examples turned into abstract tutorials on there from companies who hit scaling walls and had to hire me for help. (Past client examples: Stack Overflow, Google.)
>- about scaling: you have to get very far before saturating a single postgres server. A lot of applications certainly do get to that point, but most don't. And once you get there, scaling postgres is definitely more work than scaling a stateless service, but it also gives you a lot more in terms of performance, reliability and further scalability.
Thank you for saying this. Most of the time when Postgres as a platform is brought up, the horizontal-scaling people will bring up the parent thread's argument. I have been developing since Apple computers shipped with floppy disks, and there have rarely been situations where I needed to saturate a single Postgres server.
And even if one did get to the situation where that happens, with the introduction of Hydra or other Postgres columnar databases, we can just put that in. Most users will never get to the point where they need to saturate a single Postgres server. Also keep in mind that when processing large amounts of row data, writing the logic in middleware is just not as efficient or as fast as doing it in Postgres, which has native access to the data and to data manipulation.
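To make the "native access to the data" point concrete, here's a hedged sketch (psycopg2, with a made-up orders table) of the same aggregate done in middleware versus pushed down into Postgres:

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical

    # Middleware version: every row crosses the network, then Python sums them.
    def revenue_by_region_in_middleware():
        totals = {}
        with conn.cursor() as cur:
            cur.execute("SELECT region, amount FROM orders")
            for region, amount in cur:     # potentially millions of rows shipped
                totals[region] = totals.get(region, 0) + amount
        return totals

    # Postgres version: the scan, grouping, and summing happen next to the data.
    def revenue_by_region_in_db():
        with conn.cursor() as cur:
            cur.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
            return dict(cur.fetchall())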
>Each tier is easy to reason about scaling out horizontally except for the database.
Vitess [1], a database clustering system for horizontal scaling of MySQL, or PlanetScale [2], which is the SaaS version. Of course everything is good on paper until you run into edge cases. But I am convinced that within this decade the scaling problem, or hassle, will be a thing of the past for 95% of us.
> I think the only really scalable part here is the database
That sums it up pretty well.
The great irony here is that those thin, stateless web servers that are the only things to easily autoscale are pretty much guaranteed to never be your performance bottleneck, unless you're doing something really strange.
>"try all the common vertical scale-up approaches and tricks. Try to avoid using derivative Postgres products, or employing distributed approaches, or home-brewed sharding at all costs – until you have, say, less than 1 year of breathing room available."
Very healthy approach. I've always followed the idea of vertical scalability when writing my modern C++ app servers with local Postgres. Since I don't sell those to FAANG, I've never failed to find a decent, very reasonably priced piece of hardware, be it dedicated hosting or on-prem, that satisfies the client's needs for any foreseeable future. More than that, I've never needed even top-of-the-line hardware for it. I concentrate on features and robustness instead. Using C++ also gives a nice speedup.
> I tried to scale Postgres few months ago but I found that it is quite tedious.
If your application is big enough that you really need to scale postgres, I assure you, you will have a whole lot of tedious problems - not just sharding.
> If I can move CPU to the frontend from the db, that's a win, because scaling database servers is harder than frontends.
This may only be true for data manipulation that's necessarily [1] more CPU-bound than I/O-bound.
It's difficult to imagine a situation where that would actually be true, however, in the context of a manipulation like a join.
Given how much faster CPU power has increased relative to I/O over the entire history of computing, CPU on database servers has become progressively less of an issue.
This isn't to say that CPU power is infinite, and I've seen naive analyses attribute performance problems to inadequate CPU power, even when it's an I/O issue masquerading as (system, not user) CPU time.
In that respect, it is hard to scale a database server, in that it takes the now-rare, undervalued-by-programmers Ops knowledge of things like hardware, I/O, and offloading [2]; but that difficulty doesn't actually go away by moving the data around in a partly-distributed or even fully distributed system. It just drives up the cost.
See also Amdahl's Law and the Fallacies of Distributed Computing (in particular the ones about bandwidth and latency).
[1] as opposed to merely as-implemented, such as due to PEBKAC, as a sibling comment points out
[2] e.g. why ZFS might have wonderful features, but hardware RAID cards can offload remarkably significant pressure from the CPU and main memory, when it matters most
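A quick worked example of the Amdahl's Law point above, with made-up numbers: if only 80% of the work parallelizes across nodes, the serial 20% caps the speedup at 5x no matter how many machines you add.

    def amdahl_speedup(parallel_fraction, n_nodes):
        # Speedup = 1 / ((1 - p) + p / n)
        return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_nodes)

    for n in (2, 8, 64, 1024):
        print(n, round(amdahl_speedup(0.8, n), 2))
    # 2 -> 1.67, 8 -> 3.33, 64 -> 4.71, 1024 -> 4.98; the ceiling is 1/0.2 = 5x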
> I may be a bit of an old fart, but this is the exact reasoning behind my decision to never go with "distributed X" if there's a "single-machine X" where you can just vertically scale.
I use this same reasoning to use an RDBMS until I find a reason it will not work. An ACID compliant datastore makes so many things easier until you hit a very large scale.
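As one small illustration of the "so many things easier" part, here's a sketch with psycopg2 and a hypothetical accounts table: the transfer either fully happens or fully doesn't, and that guarantee is exactly what you end up hand-rolling in application code on a non-transactional store.

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical

    def transfer(from_id, to_id, amount):
        # psycopg2 runs the block as one transaction: both UPDATEs commit
        # together, or an exception rolls both back. No half-done transfers.
        with conn:
            with conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                            (amount, from_id))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                            (amount, to_id))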
> I have a mobile game API that does ~415rps and about 2100qps. (About 1500qps are in MySQL, more below.) We do that with one DB server with 122m RAM. While we can continue to scale RAM, we can see the handwriting on the wall - indexes will continue to grow, and we won't be able to scale vertically forever.
I don't know how many users you have now or how quickly you're growing, but you can buy a server with more than 10x as much RAM as you have now. You may not "be able to scale vertically forever", but if you ever hit that wall, you'll have reached at least 5x Stack Overflow's level. Congrats!
> We use NoSQL to store certain types of info, just as SO does - Redis, Elasticsearch, and so on. The more data we store and the more we use it in new ways, the more types of non-relational DBs we use.
> I don't think we'll ever have a situation where MySQL goes away entirely, but it stops being the singular solution pretty quickly as you scale up.
Sure, why not. Use what fits your needs. But I guess you used Redis and Elasticsearch for their features, not because they scale better - for that, some more RAM would have done it.
> Sure you shouldn't care about scaling at the beginning. But why should you start using a system that you already know won't scale in the future?
Because it's well supported and solid otherwise? There's a wealth of documentation, resources of many kinds, software built around it (debugging, tracing, UIs, etc.). Because there's a solid community available that can help you with your problems?
What alternative technology is there that scales better? I guess MySQL could be it, but doesn't MySQL also come with a ton of its own footguns?
> Very few organizations in the world truly need to scale out their database.
Every organization that has a database wants to scale out to at least a 2-node cluster, just to avoid having a single point of failure. Scaling is not always about performance; at first, it's more about availability.
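And you don't need a fancy middle tier for that: libpq-based clients can already be pointed at both nodes and will pick whichever one is the writable primary. A sketch with psycopg2 and placeholder hostnames, assuming the client's libpq is new enough (10+) for multi-host connection strings:

    import psycopg2

    # Two hosts in one connection string; target_session_attrs=read-write
    # makes the client skip whichever node is currently the read-only standby.
    conn = psycopg2.connect(
        "host=db1.internal,db2.internal port=5432 dbname=app "
        "target_session_attrs=read-write connect_timeout=3"
    )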
> If you’ve created a database before, you probably had to estimate how many servers to use based on the expected traffic.
The answer is "one". If you have less than 10k req/s you shouldn't even start to think about multiple DB servers or migrating from bog-standard MySQL/MariaDB or Postgres.
I will never understand this obsession with "scaling". Modern web dev seriously over-complicates so many things, it's not even funny anymore.
> In my experience, the DB layer is always the hardest to scale. At a certain point, vertically scaling system resources starts producing ever diminishing returns.
You can get very, very far with vertical scaling. Much farther than people seem to think. A dual EPYC 9654 system has 384 logical cores, can hold 12 TB of DDR5 RAM, and has enough PCIe lanes to support 20 NVMe SSDs at full speed (over 1 Tbps) and a 200 Gbps NIC at line rate. All without bottlenecking the memory controller, mind you.
As long as you're properly using indices and batching inserts/updates, a DB hosted on a system like that will handle millions of row reads/writes per second without breaking a sweat.
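By "batching inserts" I mean something along these lines (psycopg2's execute_values; the events table is made up): one multi-row statement per chunk instead of one round-trip per row.

    import psycopg2
    from psycopg2.extras import execute_values

    conn = psycopg2.connect("dbname=app")  # hypothetical

    def insert_events(rows):
        # rows is an iterable of (user_id, kind, payload) tuples.
        # execute_values sends them 1000 at a time as a single INSERT each,
        # which keeps the write path from drowning in per-row round-trips.
        with conn:
            with conn.cursor() as cur:
                execute_values(
                    cur,
                    "INSERT INTO events (user_id, kind, payload) VALUES %s",
                    rows,
                    page_size=1000)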
> Eh, when the time comes to scale I’d still very likely pick a RDBMS but hand my team copies of Eric Evans and ask them to design data schema and sharding scheme such that any given entity has a bounded context that doesn’t join across the network.
That will only help you scale out to N servers, where N is the number of bounded contexts. And it might end up being less helpful than it sounds, if the scaling needs are dominated by data in one or two bounded contexts.
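In practice that design tends to boil down to a thin routing layer in front of N databases, something like this sketch (shard count and DSNs are placeholders): it keeps every query for a given tenant on one node, but it only buys you anything if the hot data really is spread across the contexts.

    import psycopg2
    from zlib import crc32

    SHARDS = [
        psycopg2.connect("dbname=app_shard0"),  # placeholder DSNs
        psycopg2.connect("dbname=app_shard1"),
    ]

    def conn_for(tenant_id):
        # All of a tenant's bounded context lives on one shard, so joins
        # inside that context never have to cross the network.
        return SHARDS[crc32(str(tenant_id).encode()) % len(SHARDS)]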
> You don't want to scale vertically once you hit millions of users
Hundreds of the top websites in the world (the majority, I would hazard, though without the data to support it) have scaled vertically to millions and tens of millions of entities just fine. Far more than have scaled horizontally. It works. It's been done. And it doesn't give up the sort of transactional niceties that make problems like this easier.
> most other web companies like Google, Facebook, Yahoo and Microsoft
You're confusing "the very biggest web companies" with "most other web companies." Most other web companies continue to use commodity products, and Google/Facebook/Yahoo!/MS certainly would (and do) insofar as it's possible at that scale. Expending resources now to be as horizontally scalable as Google is wasteful premature optimization.
Notably, Yahoo! runs the largest PostgreSQL installation in the world, and Google and Facebook both continue to use MySQL.
> horizontally scaling without big iron is the way to go.
You can get 32-core machines with 128GB of ram from Dell (a mildly tweaked R910) for $30k these days. Is that big iron? How does its price compare with the amount of developer salary and benefits you'll have to spend to grok a non-relational data store, migrate your data to it, and reimplement the ACID features of a relational store in the code for your app? How many developer-days will you spend maintaining that code and how many developer-nights will you spend triaging a crashed site because of the complexity and likely bugginess of that reimplementation? How many users' feature requests will you have to reject as "too difficult to implement" because you feel the need to scale to Google/Facebook levels despite having only a few million users now and predicted growth which shows you'll never in a million years catch up to them?
> Currently the only way to scale a relational database such as MySQL or Postgre is by sharding and partitioning
It will be years before the vast majority of startups exhaust reasonable, cost-effective options for vertical scaling. The recent fervor for non-relational, horizontally scalable data stores is simply the new way of scratching the intellectually masturbatory premature optimization itch that programmers have had since ENIAC.