You're refusing to learn from someone who lived it. I also know of zero services which failed because they couldn't scale their technology. But I know of 100s that failed because they couldn't get enough users or usage.
Your example about the DB tables is exactly the trap to avoid while you are iterating rapidly to try and find product/market fit.
I mentioned something similar in a separate comment.
My initial DB schema was pretty bad. We did at least 2 schema rewrites and migrations between launch and 5M users. Each one took 2 weeks of sleepless nights.
The machines today are really powerful. You can do a lot with 244 GB RAM machines backed by SSD.
Someone who doesn't have the skill set to scale once they get traction likely doesn't have the skills to design for scale at the start either.
My recommendation to everyone would be: pick the language and DB you are most comfortable with and get started as soon as you can. You will fail on the product side many more times before you fail on the technical side.
And if you are failing on the technical side, reach out to me. I will definitely be able to help you find a way out. I am not sure there is any product guy in the world who can make a similar claim on the product front. However, there are at least dozens of technical guys in the world who can make a claim like mine.
So focus on launching the product as soon as possible. Work hard, reach out for help if needed. You will eventually get success.
I've more than once seen a new database company grow from nothing to a serious business by being 2-5x for one narrow set of users and 0.1x-0.5x for the rest, but being sold as 2-10x for everyone. Most of their market doesn't end up coming from the people for whom they're actually a decent choice, because capturing even a small amount of the larger market is more valuable than 100% of that tiny market segment. Trying to deprogram people who've fallen for the marketing can be really frustrating, if you're trying not to be saddled with subpar crap that's going to make your life harder.
I'm obviously being hyperbolic, not sure about you. Point remains that scaling RDBMSs is not exactly trivial and it does look like most companies eventually give up.
A big part of my job is talking to people who have scale pain with legacy systems. I don't know what percentage of the DB market this is, but it's nontrivial and growing fast.
Most of the markets I mentioned in the previous comment are nearly impossible to be successful in with a single node of legacy RDBMS sitting behind your app. Zynga is far from alone in social gaming scale pain.
Consider digital ad-tech. How many ads do you have to show before someone clicks on one? How many clicks do you need to earn $1? That can translate into: all those DB operations together need to cost way less than $1 or I'm hosed. Enter systems that can scale with less pain.
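To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it (revenue per click, impressions per click, DB operations per impression, share of revenue you can spend on the database) is a made-up assumption for illustration, not data from any real ad platform.

```python
def max_cost_per_db_op(revenue_per_click, impressions_per_click,
                       db_ops_per_impression, db_share_of_revenue):
    """Upper bound on what a single DB operation may cost, in dollars."""
    # Total DB operations performed to earn one click.
    ops_per_click = impressions_per_click * db_ops_per_impression
    # Slice of the click revenue that can be spent on the database.
    db_budget_per_click = revenue_per_click * db_share_of_revenue
    return db_budget_per_click / ops_per_click


# Hypothetical inputs: $0.50 per click, 1 click per 1000 impressions,
# 5 DB operations per impression, 10% of revenue available for the DB.
budget = max_cost_per_db_op(
    revenue_per_click=0.50,
    impressions_per_click=1000,
    db_ops_per_impression=5,
    db_share_of_revenue=0.10,
)
print(f"max cost per DB operation: ${budget:.8f}")
```

With those assumed inputs, each operation has to come in at about a thousandth of a cent. Change the assumptions and the ceiling moves, but the shape of the problem stays the same.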
Streambase is a good example of a specialized system that can outperform legacy RDBMSs. Still, it's not like you can say, "Every financial problem with scale pain can be fixed with Streambase." It's too specialized. What if you need 100 GB of state? There are lots of problems in finance and some of them can be solved with Oracle/DB2 while others can't.
As the author touches on, the main problem here isn't learning about indexes. It's about "infinity scaling" working too well for people who do not understand the consequences.
In no sane version of the world should "not adding a db index" lead to getting a 50x bill at the end of the month without knowing.
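For anyone who hasn't seen what a missing index does to a query, a small sketch using Python's built-in sqlite3 module. The `events` table and `idx_events_user_id` index are hypothetical names, and SQLite stands in for whatever managed database is actually billing you; the exact plan text differs between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 1000, "x") for i in range(10000)],
)

lookup = "SELECT payload FROM events WHERE user_id = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")

# With the index the planner can search just the matching rows.
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

On a box you own, the full scan just makes the query slow. On a service billed per row read or auto-scaled on load, that same scan is where the surprise bill comes from.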
I am a strong believer that services based on "scale infinitely" really need hard budget controls and slower scaling (unless explicitly overridden/allowed, of course).
If I accidentally push very non-performant code, I kind of expect my service to get less performant, so that I quickly realize the problem and fix it. I don't expect a service to seemingly magically detect my poor code, increase my bill by a couple of orders of magnitude, and only alert me hours (if not days) later.
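A minimal sketch of what I mean by a hard budget cap plus slower scaling. `ScalePolicy` and `next_instance_count` are hypothetical names I made up for illustration; this is not any cloud provider's API, just the decision rule I would want one to expose.

```python
from dataclasses import dataclass


@dataclass
class ScalePolicy:
    monthly_budget: float     # hard spend cap, in dollars per month
    cost_per_instance: float  # dollars per instance per month
    max_step: int = 1         # most instances added in one scaling decision


def next_instance_count(current: int, desired: int, policy: ScalePolicy) -> int:
    """Move toward the desired size, but slowly and never past the budget."""
    # Largest fleet the budget can pay for.
    affordable = int(policy.monthly_budget // policy.cost_per_instance)
    # Scale up by at most max_step per decision; scaling down is not limited.
    stepped = min(desired, current + policy.max_step)
    return min(stepped, affordable)


policy = ScalePolicy(monthly_budget=500, cost_per_instance=100, max_step=2)
size = 2
for _ in range(3):
    # A runaway query keeps asking for 100 instances.
    size = next_instance_count(size, desired=100, policy=policy)
    print(size)
```

With these assumed numbers the fleet stops growing at 5 instances no matter how much load the bad code generates. The service gets slow, someone notices, and the bill stays bounded.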
For every 1k companies who think they'll be the next Amazon, probably 1 is. Program to reality, not ego. Get walking down before you run.
Normalize your database in a practical way and you should be able to scale fairly well into typical growth. Bad database designs are the most common culprit I see.
My hypothesis: it’s hard to have a decent grasp of a technology without having actually used it. Tie that with “let’s not include too many moving parts” and it’s easy to end up in a situation where the edict is “Kafka”. Let’s say you have never used an RDBMS, only RethinkDB, and that turned out to be problematic for whatever reason. On the next project, the founder hires you on the premise that the system you build needs to scale to billions of requests per minute ASAP (even though there is currently 0 traffic).
I'm not sure it even makes sense for smaller platforms any more. The metric of most business success is really growth. Tying yourself to something which is troublesome to grow because it favours doing things that are difficult to scale is an evolutionary dead end. I've spent the last 25 years constantly battling this problem in relational databases. Moore's Law has been pretty kind so far assuming you have the money to buy ridiculous computers and licenses.
I am out of touch with what is available on the NoSQL side of things, but about 5 years ago there were too many things with a terrible failure mode somewhere in the stack, so I'm never sure which way to go instead.
99% of startups and in-house business apps will never need to scale at a level that an RDBMS can't support. We should all be so lucky to have the success that brings these horizontal scalability problems.
Also, I hope he lets us know what kind of incredible data he ends up mining from his mouse movement logs. Surely every website and app on planet Earth has completely exhausted basic solutions like A/B testing and usability testing to make their apps better. The only solution now is to embark on a data mining expedition so that artificial intelligence will tell us how to make better software.
It's not so much the DBMS that doesn't scale. It's particular data access patterns in combination with particular requirements that don't scale. RDBMS alternatives do little more than educate me about that fact.
Additionally, when you do a startup and it succeeds it is already too late to redesign your app completely to make it scale. Changing the RDBMS in the middle of the game and migrating data is risky and would cost you probably more than using a properly scalable database right from the beginning.
Additionally, once you know you need to scale, your competition will see your product. Knowing the idea works, if they start from scratch but use a better database for the job, they can easily put you out of business, because instead of competing and adding new features, you'll be busy fighting scalability problems.
I wonder how many people in the NoSQL and "SQL doesn't scale" crowd have either never met a truly competent, much less good, DBA (trust me, they are very, very rare) or decided it could not scale because they applied their programmatic and procedural logic to a tool that operates in a very different (SET-based) paradigm.
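To make the procedural vs. set-based difference concrete, a small sketch with Python's built-in sqlite3 module. The `orders` table and both function names are hypothetical, chosen only for illustration.

```python
import sqlite3


def make_db():
    """Build an in-memory table with 100 open orders, totals 1..100."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total, status) VALUES (?, ?, ?)",
        [(i % 10, float(i), "open") for i in range(1, 101)],
    )
    return conn


def close_large_orders_procedural(conn, threshold):
    # Procedural habit: pull every row into the app, decide in a loop,
    # and issue one UPDATE statement per matching row.
    rows = conn.execute("SELECT id, total FROM orders").fetchall()
    for order_id, total in rows:
        if total > threshold:
            conn.execute(
                "UPDATE orders SET status = 'closed' WHERE id = ?", (order_id,)
            )


def close_large_orders_set_based(conn, threshold):
    # Set-based: describe the set of rows and let the database do the work
    # in a single statement.
    conn.execute(
        "UPDATE orders SET status = 'closed' WHERE total > ?", (threshold,)
    )


def closed_count(conn):
    return conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status = 'closed'"
    ).fetchone()[0]


a, b = make_db(), make_db()
close_large_orders_procedural(a, 50)
close_large_orders_set_based(b, 50)
print(closed_count(a), closed_count(b))
```

Both versions end in the same state. The first one ships every row to the application and makes one round trip per update, which is the pattern that tends to get blamed on the database once the table is large and the connection is over a network.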
I can only assume the reason this was at negative 1 was that most people interested in this topic are already on the (not just SQL) bandwagon. IMO, scaling is a non-issue for most well-designed websites, and as computers get faster this only becomes more apparent. There is a significant advantage to separating complex sites into independent modular components, and only a tiny fraction of sites need to scale beyond this point. When you actually need to expand, fine, go down that rabbit hole, but for most people it's a complete waste of time.
PS: I suspect the main problem developers actually have with SQL databases is that their ORM is significantly less powerful than SQL. All too often developers focus on row-as-object and forget the power of more abstract data structures.
This is akin to everyone’s dream to start a cozy, atmospheric coffee shop. These types of coffee bars fold very quickly or at best give founders years of servitude below minimum wage. The reason is that the type of behaviour this kind of establishment encourages - lounging, book reading, laptop work is exactly the opposite of the quick serve model that is conducive to high revenue. The customer loves this model, the owner hates it.
Analogously, the NoSQL DB market is the exact wrong market to enter. Large companies willing to pay big bucks for enterprise features will pay established vendors for the stodgy but battle-tested stuff like Oracle or DB2. The hipster startup market pays nobody for anything, as there is a myriad of free choices in every common flavour, and the few that will pay will mostly do so to purchase managed hosting. And that’s only if their PoC built on your new and untrusted database ever makes it to production. And then it’s probably your cheapest tier. Don’t be selling to paupers!
If you have fewer than ten thousand simultaneous users, this is not you. After ten million, you're not using an ORM in front of a bare relational database either.
Scale at high numbers adheres to no rules or off-the-shelf tools.