
> * our enterprise db was bursting at the seams containing Literally Everything. Now, every part of the split-up monolith has its own self-contained data store tailored to what is appropriate for that particular thing. (some use MariaDB, others Redis etc etc)

Why do you consider an enterprise DB "bursting at the seams" to be a bad thing? Isn't that what enterprise DBs are built for? Seems like you traded having everything in one large database for having everything scattered across different databases. You probably sacrificed some referential integrity in the process.

> * developing, building, testing and deploying took ages. Eg if I only needed to capture some new detail about a business partner user (eg their mfa preference, app vs sms) I would still have to deal with the unwieldy monolith. Now, I can do it in the dedicated business partner user service, which is much easier and faster.

You traded a clean codebase with a solid toolchain for, most likely, a template repository that you hope your users actually use, or else everyone is reinventing some kind of linting/testing/deployment toolchain for every microservice.

> * the whole monolith, including business partner facing operations, could go down because of issues with completely unrelated, non-critical things like eg internal staff vacation hours.

This could apply to any software. Sure, a monolith can have a large blast radius, but I can guarantee one of your microservices is on the critical path and would cause the same outage if it went offline.

> The few callers that do need to obtain both pieces of data just make concurrent calls to both and then zip them into a single result.

Almost like a database join?
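For reference, the single-database version of that pattern is just a join; a minimal sketch with hypothetical table names:

    -- One query, one round trip, consistency guaranteed by the database,
    -- instead of two concurrent service calls zipped together in app code.
    SELECT u.id, u.name, p.mfa_preference
    FROM users u
    JOIN user_preferences p ON p.user_id = u.id
    WHERE u.id = 42;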




>I am curious what HN thinks as major reasons for why everyone seems to have moved away from 1 big SQL database

For the places I've worked:

1. We transitioned to microservices

2. Performance: 1 BIG database slows everything down

3. Ops/maintenance is very hard in a huge DB

4. In a huge DB there can be a lot of junk no one uses; no one remembers why it's there, but no one is certain whether that junk is still needed

5. We had different optimization strategies for reads and writes

6. Teams need to have ownership of their databases/data stores so we can move fast instead of waiting for DBAs to reply to tickets.


>Probably an unpopular opinion, but I think having a central database that directly interfaces with multiple applications is an enormous source of technical debt and other risks, and unnecessary for most organizations. Read-only users are fine for exploratory/analytical stuff, but multiple independent writers/cooks is a recipe for disaster.

In my org I've felt the pain of having centralized DBs (with many writers and many readers). A lot of our woes come from legacy debt: some of these databases are quite old (a number date back to the mid-90s), so over time they've ballooned considerably.

The architecture I've found that makes things less painful is to split the centralized database into two databases.

On Database A you keep the legacy schemas etc. and restrict access to only the DB writers (in our case we have A2A messaging queues as well as some compiled binaries which write directly to the DB). Then you have data replicated from Database A into Database B. Database B is where the data consumers (BI tools, reporting, etc.) interface with the data.

You can exercise greater control over the schema on B which is exposed to the data consumers without needing to mass recompile binaries which can continue writing to Database A.
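To make that concrete, here's a minimal sketch of the consumer-facing side, assuming Postgres and hypothetical schema/table names; consumers only ever see the view, so the legacy shape on A never leaks:

    -- On Database B: the raw table arrives via replication from Database A.
    -- Expose a curated shape to BI/reporting consumers instead.
    CREATE VIEW reporting.orders AS
    SELECT order_id,
           customer_id,
           created_at::date     AS order_date,    -- hide legacy timestamp quirks
           total_cents / 100.0  AS total_dollars
    FROM replicated.legacy_orders
    WHERE is_deleted = 0;

    GRANT SELECT ON reporting.orders TO bi_readers;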

I'm not sure how "proper" DBAs feel about this split, but it works for my use case and has helped control the ballooning legacy databases somewhat.


> They took great pains to keep data in sync across A and B datastores and I'm not so sure that extra cost was worth the perceived stability of this approach.

Such great pains come with huge systems. What's the alternative?

Taking the platform offline for a few hours? Management will say no. Or maybe Management will say yes once every three years, severely limiting your ability to refactor.

Doing a quick copy and hoping nobody complains about inconsistencies? Their reputation would suffer severely.


> BTW, monolith DB has its own form of eventual consistency. Process A puts data into DB, process B picks it up later and affects the DB. There's no reasonable way that everything in such a multi-use DB is always logically in agreement. You're just guaranteeing that B sees what A writes immediately, which comes at a cost

Transactions have existed for years, and DB execution engines have been able to handle non-dependent transactions concurrently since the mid-90s.
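Concretely, the A-then-B sequence above doesn't have to be eventually consistent inside one database; a minimal sketch with hypothetical tables:

    -- Both writes commit together or not at all; no process can observe
    -- the new order without the matching balance deduction.
    BEGIN;
    INSERT INTO orders (customer_id, total_cents) VALUES (42, 1999);
    UPDATE account_balances
       SET balance_cents = balance_cents - 1999
     WHERE customer_id = 42;
    COMMIT;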

> It gives you a smaller blast radius when something goes wrong, avoids over-stressing a single DB, alleviates single points of human dependency like the DB curator, lets you scale separate pieces independently, and yes forces you to silo the data. There are several good reasons larger orgs have been doing things this way for a long time.

Vertical scaling of DBs is a non-issue. If your DB is complex, get a DBA...

Have you actually done any of this in production before? It doesn't seem like it honestly.


> Of course you run into issues with transactions occurring across multiple databases but these problems are hard but solvable.

The only thing you need to do to fix this is run all the services on the same DB.

> This sounds crazy. I don't know any large companies that have successfully implemented it. This is basically arguing for a giant central database across the entire company. Good luck getting the 300 people necessary into a room and agreeing on a schema.

You don't need every service to use the same schema. You only need transactions that span all services. They can use any data schema they want. A single DB is only used for the ACID guarantees.
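A sketch of what that can look like, assuming one physical database with a schema per service (names hypothetical): each service keeps its own schema, but a single transaction still spans them.

    -- Service-owned schemas, but ACID guarantees across both.
    BEGIN;
    INSERT INTO billing.invoices (customer_id, amount_cents)
    VALUES (42, 1999);
    UPDATE inventory.stock
       SET quantity = quantity - 1
     WHERE sku = 'WIDGET-1';
    COMMIT;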


> One of the selling points, which is now understood to be garbage, is that you can use different databases.

It was a major selling point back in the day. You can say that's legacy now, but it was definitely a thing pre-cloud/SaaS.

Lots of software used ORMs to offer multi-database support, which was required when you sold a license and users installed it on-premises. Some organizations strictly allowed only a certain brand of database.

You couldn't spin up a random database flavor in AWS, Azure or GCP. There were in-house DBAs and you were stuck with what they supported.


> Use One Big Database.

I emphatically disagree.

I've seen this evolve into tightly coupled microservices that could be deployed independently in theory, but required exquisite coordination to work.

If you want them to be on a single server, that's fine, but having multiple databases or schemas will help enforce separation.

And, if you need one single place for analytics, push changes to that space asynchronously.
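One way to do that asynchronous push, assuming Postgres logical replication (table names and connection details hypothetical):

    -- On the operational DB: publish only what analytics needs.
    CREATE PUBLICATION analytics_pub FOR TABLE orders, customers;

    -- On the analytics DB: changes then stream in asynchronously,
    -- without anyone querying the operational schema directly.
    CREATE SUBSCRIPTION analytics_sub
        CONNECTION 'host=ops-db dbname=app user=replicator'
        PUBLICATION analytics_pub;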

Having said that, I've seen silly optimizations being employed that make sense when you are Twitter, and to nobody else. Slice services up to the point where they still do something meaningful in terms of the solution and avoid going any further.


> My impression from the article is that this is a single SQL database being discussed.

Even if it's initially a single database, it's bad to assume that it will stay that way forever and that you are not going to use third-party providers in the future.

How well does ON UPDATE CASCADE work if there are millions of existing relations to that entity?
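For reference, the construct in question, with hypothetical tables; the catch is that a single key change on the parent fans out into one write per child row, all inside one transaction:

    CREATE TABLE line_items (
        order_id BIGINT NOT NULL
            REFERENCES orders (id)
            ON UPDATE CASCADE,  -- a parent key change rewrites every child row
        sku      TEXT   NOT NULL,
        qty      INT    NOT NULL
    );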


> The long term solutions end up being difficult to implement and can be high risk because now you have real customers (maybe not so happy because now slow db) and probably not much in house experience for dealing with such large scale data; and an absolute lack of ability to hire existing talent as the few people that really can solve for it are up to their ears in job offers.

This is a problem of having succeeded beyond your expectations, which is a problem only unicorns have.

At that point you have all this income from having fully saturated the One Big Server (which, TBH, has unimaginably large capacity when everything is local with no network requests), so you can use that money to expand your capacity.

Any reason why the following won't work:

Step 1: Move the DB onto its own DBOneBigServer[1]. Warn your customers of the downtime in advance. Keep the monolith as-is on the current OriginalOneBigServer.

Step 2: OriginalOneBigServer still saturated? Put copies of the monolith on separate machines behind a load-balancer.

Step 3: DBOneBigServer is still saturated, in spite of being the biggest Oxide rack there is? Okay, now go ahead and make RO instances, shards, etc. Monolith needs to connect to RO instances for RO operations, and business as usual for everything else.

Okay, so Step 3 is not as easy as you'd like, but until you get to the point that your DBOneBigServer cannot handle the load, there's no point in spending the dev effort on sharding. Replication doesn't usually require a full-time team of engineers, like a distributed DB would.

If, after Step 3, you're still saturated, then it might be time to hire the full-time team of engineers to break up everything into microservices. While they get up to speed you're making more money than god.

Competitors who went the distributed route from day one have long since gone out of business because while they were still bugfixing in month 6, and solving operational issues for half of each workday (all at a higher salary) in month 12, and blowing their runway cash on AWS for the first 24 months, you had already deployed in month 2, spending less than they did.

I guess the TLDR is "don't architect your system as if you're gonna be a unicorn". It's the equivalent of you, personally, setting your two-year budget to include the revenue from winning a significant lottery.

You don't plan your personal life "just in case I win the lottery", so why do it with a company?

[1] backed up / failover as needed


> why not use different databases? They cost nothing and provide perfect separation.

I understand the sentiment, but this is a pretty simplistic take that I very much doubt will hold true for meaningful traffic. Many databases have licensing considerations that aren't amenable to it. Beyond that you get into density and resource problems as simple as IO, processes, threads, etc. But most of all there's the time and effort burden of supporting migrations, schema updates, etc.

Yes, layered logical separation is a really good idea. It's also really expensive once you start dealing with organic growth and a meaningful number of discrete customers.

Disclaimer: Principal at AWS who has helped build and run services with both multi-tenant and single-tenant architectures.


>But database optimization has become less important for typical applications. <..> As much as I love tuning SQL queries, it's becoming a dying art for most application developers.

We thought so, too, but as our business started to grow, we had to spend months, if not years, rewriting and fine-tuning most of our queries, because every day there were reports about query timeouts in large clients' accounts... Some clients left because they were disappointed with performance.

Another issue is growing the development team. We made the application stateless so we can spin up additional app instances at no cost, or move them around between nodes, to make sure the load is evenly distributed across all nodes/CPUs (often a node simply dies for some reason). Since they are stateless, if an app instance crashes or becomes unstable, nothing happens, no data is lost, it's just restarted or moved to a less busy node. DB instances are now managed by the SRE team, which consists of a few very experienced devs, while the app itself (microservices) is written by several teams of varying experience, and you worry less about the app bringing down the whole production because microservice instances are ephemeral and can be quickly killed/restarted/moved around.

Simple solutions are attractive, but I'd rather invest in a more complex solution from the very beginning, because moving away from SQLite to something like Postgres can be costlier than investing some time in setting up 3-tier if you plan for your business to grow; otherwise you can eventually end up reinventing 3-tier, but with SQLite. But that's just my experience, maybe I'm too used to our architecture.


> we opted for having one DB per tenant which also makes sharding, scaling, balancing, and handling data-residency challenges easier

When you say 1 DB, I suspect you mean you have a single DB server and multiple DBs on that server. Then I don't think this really solves the data-residency problem, as the client's data is just in a different DB but still on the same instance. It creates other problems for you as well: for example, you now have two DBs to run maintenance, upgrades, and data migrations on. My current company uses a similar model for multiple types of systems and it makes upgrading the software very difficult.

It also makes scaling more difficult: instead of having a single DB cluster that you can tweak for everyone, you'll need to tweak each cluster individually depending on the tenants that are on those clusters. You also have a practical limit to how many DBs you can have on any physical instance, so your load balancing will become very tricky.

There are other problems it causes, like federation, which enterprise customers often want.


> I have started looking at tying together clusters of business apps that each have independent SQLite datastores using an application-level protocol for replication & persistence (over public HTTPS). This allows for really flexible schemes which can vary based upon the entity being transacted. With hosted SQL offerings, you are typically stuck with a fairly chunky grain for replication & transactions. If you DIY, you can control everything down to the most specific detail.

Ugh, I'm totally going to sympathize with whoever eventually joins your company after you've left and has to own this.


> However, microservices and other modern application architectures introduced new complexities into application design: Developers needed to manage data across different services and ensure consistency between them, which forced them to build complex data synchronization and processing mechanisms in-house.

I always wonder how many of these "modern" deployments could be replaced by a monolith + DB running on a ~single server. What are we up to these days for off-the-shelf hardware? 128 cores, 6TB+ of RAM, x NVMe SSDs in RAID? Application on one server, cache-backed DB on another. Failover to a cloned setup.

Is anybody here running something like this on latest hardware and can provide rough performance metrics?


> The whole point of backups is to protect against hardware failures and unforeseen software bugs/defects.

And to then restore to a point in time before the software or hardware bugs occurred, no?

> A single machine/database with all your business data on it mixed with personal data and the only record of requests from subjects is criminally irresponsible.

Um, what? There are a ton of good reasons to stick to a single DB; it's not unreasonable at all. The main one is that it massively simplifies ensuring data integrity compared to a distributed system. Also it's just easier for a small team to manage.


> Are you at all sad that you had to give that up in order to outsource compliance to Azure?

A little bit, but it's honestly much better this way in our use case. For our specific business & application, going from 50µs to 5000µs feels like nothing. We were squandering 2+ orders of magnitude over JSON blobs in columns anyways. We really didn't need 50µs transactions for anything. What turned out to be far more important was one source of truth to worry about. We have a lot of installs and a lot of SQLite databases out there. Now we are looking at exactly one for the whole enterprise.

Moving to a hosted DB forced cleaner schema design. No-JSON-blobs is law now because of IO scaling concerns. We can't afford to move 5 megabytes of JSON across the datacenter every time someone clicks a button anymore. The average per-user-action database hit has fallen by 4~5 orders of magnitude: megabytes down to hundreds of bytes. Overall perceived performance is actually much better because of these design constraints. We also try to get the web views composed using as few queries as possible (i.e. finally make the database do its damn job instead of lazily peddling JSON).
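A sketch of the kind of query that rule produces, with hypothetical tables: ask the database for exactly what the click needs, rather than shipping a blob and filtering in the app.

    -- Before: SELECT settings_json FROM accounts WHERE id = 42;  (megabytes)
    -- After: hundreds of bytes, and the DB does the filtering.
    SELECT key, value
    FROM account_settings
    WHERE account_id = 42
      AND key IN ('theme', 'locale', 'mfa_preference');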


> Now this is just a queue service with extra steps running in a relational DB instead of natively as an OS process. You did cite it as just an option but I don't see why this is an attractive option.

Your DB then shares a WAL log with your queue. Meaning a single managed physical replication pipeline for them both. Meaning only one set of leader-election issues to debug, not two. Meaning one canonical way to do geographic high-latency async replication. Meaning disaster recovery brings back a whole-system consistent snapshot state. Etc.

Honestly, if I had my way, every stateful component in the stack would all share a single WAL log. That’s what FoundationDB and the like get you.
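For the record, the queue-with-extra-steps usually looks something like this; a minimal sketch assuming Postgres and a hypothetical jobs table:

    -- Workers claim the next pending job without blocking each other;
    -- the claim rides the same WAL as everything else in the database.
    WITH next_job AS (
        SELECT id
        FROM jobs
        WHERE status = 'pending'
        ORDER BY enqueued_at
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    UPDATE jobs
    SET status = 'running'
    FROM next_job
    WHERE jobs.id = next_job.id
    RETURNING jobs.id, jobs.payload;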

> In fact I would argue running a DB (queue or no queue) is just more complex than running a queue service.

Well, yeah, but you usually need a DB. So, if you’re going to be paying the OpEx costs of the DB either way, then you may as well understand it deeply in order to wring the most productive use you can out of each OpEx dollar/man-hour spent.

(I feel the same way about Redis, as it happens: if you need it, and are locking your code into its model anyway, then you may as well take advantage of its more arcane features, like Redis Streams, Lua scripting, etc.)

However, maybe our company is uncommon in how much our service literally is doing fancy complex DB queries that use tons of DB features. We’re a data analytics company. Even the frontend people know arcane SQL here :)

> that ties you in with that library

The difference between what you / apps / abstracting libraries do in Redis, and what they do in an SQL DB, is that in the DB, the shape of everything has to be explained in a vendor-neutral manner: SQL DDL.

Sometimes Redis-based solutions converge on conventional schemas; see e.g. Sidekiq’s informal schema, which several other queuing systems are implemented in terms of. But when they don’t, there’s nothing you can really do — beyond hacking on the libraries involved — to bring them into sync.

In an SQL DB, anything can be adapted into the expected shape of anything else, by defining SQL views. (Heck, in an SQL DB with Redis support, like Postgres with redis_fdw, the Redis data can be adapted into any shape you like using SQL views.)

And that’s further enabled by the fact that the DB had received from the app, through DDL, a schema, that you can examine, manipulate, and refactor; or even synthesize together with other schemas.
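Concretely, if one app expects a Sidekiq-like job shape and another writes its own queue table, a view can reconcile them without touching either app (schemas and column names hypothetical):

    -- Adapt app_two's queue table into the shape app_one expects.
    CREATE VIEW app_one_compat.jobs AS
    SELECT id          AS jid,
           payload     AS args,
           queue_name  AS queue,
           enqueued_at
    FROM app_two.work_items;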

> you can't be assured that future versions of your DB would support those features

You can if those features are in the SQL standard. I’ve never heard of a DBMS regressing on its level of SQL standard support.


> reasons for why everyone seems to have moved away from 1 big SQL database

I'm sure there are going to be other answers for the code side of things, but for ops:

Depends a lot on the size of the service, but in some cases: we got enough data that 1 big SQL store makes ops hard. (It took me 3 days recently to drop a table in a way that wouldn't affect the users.) And splitting data became easier than it used to be, with specialised backends. (A sharded 2nd-layer cache of live data seems way simpler to achieve than, say, 2 decades ago.)


> monolithic large relational databases are hard to scale

DB2 on z/OS was able to handle billions of queries per day.

In 1999.

Some greybeards took great delight in telling me this sometime around 2010 when I was visiting a development lab.

> When you have one large database with tons of interdependencies, it makes migrating data, and making schema changes much harder.

Another way to say this is that when you have a tool ferociously and consistently protecting the integrity of all your data against a very wide range of mistakes, you sometimes have to do boring things like fix your mistakes before proceeding.

> In theory better application design would have separate upstream data services fetch the resources they are responsible for.

A join in the application is still a join. Except it is slower, harder to write, more likely to be wrong and mathematically guaranteed to run into transaction anomalies.

I think non-relational datastores have their place. Really. There are certain kinds of traffic patterns in which it makes sense to accept the tradeoffs.

But they are few. We ought to demand substantial, demonstrable business value, far outweighing the risks, before being prepared to surrender the kinds of guarantees that a RDBMS is able to provide.

