DBAs are done. A certain level of database proficiency is demanded of all the other engineers and non-engineers. And if you are really working on a super-awesomely-complex system that needs super awesome scaling, then..... you make shitty queries anyway and just increase the size of the memory cache or the number of database server instances. What!??? Heresy! Well okay then, go find a bad query and improve it so you get a nice bonus later. See, look, still no DBA on staff.
Major products are released and operating right now with incredibly poorly performing queries and database usage. The infrastructure people and services keep the databases running.
It amazes me how quickly our industry has forgotten the need for DBAs. With these MongoDBs, MPP cloud dbs and Hadoops, everyone seems to have assumed that engineers can now do all db work. This is reflected in the titles too: Data "Engineer".
But from my perspective, this is delusional. There is a lot that goes into a DBA's experience that is not solved by the performance improvements in databases over the past decade. What has changed is that there are more choices. 20-30 years ago, you would have been forced to write code on Oracle and you would have asked for help before deciding how to structure the data. Today, with more choices, you just read some online opinions and jump on one without any internal resource to guide you.
Not saying the world of Oracle was great, but the young on this thread (me included) would benefit from respecting the experience of the old.
Sometimes you just need a DBA to help you optimise your DB for your workload and future growth. In general, developers make shitty DBAs and vice versa.
Databases are becoming pretty good at managing themselves and the marginal performance gains from tuning usually are easily offset by throwing bigger kit at the problem or throwing more cash at the plan you are on.
Most people don't need dedicated DBAs and, worse, the DBA often lacks sufficient domain knowledge of the problem to make an intelligent suggestion anyway.
DB performance is complicated, yes, but the vast majority of it is extremely simple. It's just that none of this simple stuff has to be learnt until it's too late and the cost of fixing it has dramatically increased.
There's a certain level where you need a DBA and that bar has been getting higher for years.
What we really need is to demystify DB performance, which for the most part is fairly simple.
What you need to do is teach your developers how to read those query plans. Show them how to find the expensive queries. Give them tools to easily see what queries their ORM is spitting out. Show them how to use a SQL profiler.
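A minimal sketch of the "see what queries are being spat out" part, assuming Python's built-in `sqlite3` module as a stand-in for whatever driver or ORM you actually use. The `users` table and the `run_timed` helper are made up for illustration; most ORMs and servers have their own statement logging, which is what you would reach for in practice.

```python
import sqlite3
import time

def run_timed(conn, sql, params=()):
    """Run a query and return (rows, elapsed_seconds). Hypothetical helper."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start

conn = sqlite3.connect(":memory:")

# Record every statement SQLite executes on this connection,
# including the ones a higher layer would generate for you.
seen = []
conn.set_trace_callback(seen.append)

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)

rows, elapsed = run_timed(
    conn, "SELECT id FROM users WHERE email = ?", ("user42@example.com",)
)
print(rows, f"{elapsed * 1000:.3f} ms")
print(len(seen), "statements traced")
```

Once the statements are visible, sorting them by elapsed time is usually enough to find the handful worth looking at.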
Tell your developers how query plan caching actually works. How a clustered index works and what you should and shouldn't put it on. Explain how indexes work. Explain how DB pages actually work and then it's obvious why certain indexes are a bad idea. Explain how relational keys are very important for the DB engine and leaving them off is not an 'oops', it's a serious mistake with long term consequences.
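To make "explain how indexes work" concrete, here is a toy model in Python. It is only an illustration: real engines use B-trees stored in pages, not a sorted Python list, and the `rows`, `scan`, and `seek` names are invented for this sketch. The idea it shows is the same, though: an index is a sorted copy of the key that points back at the row, so a lookup becomes a search instead of a scan.

```python
import bisect

# The "table": 10,000 rows of (rowid, email), in insertion order.
rows = [(i, f"user{i:05d}@example.com") for i in range(10_000)]

def scan(email):
    """Full table scan: compare every row until a match is found."""
    steps = 0
    for rowid, value in rows:
        steps += 1
        if value == email:
            return rowid, steps
    return None, steps

# The "index": emails kept in sorted order, each paired with its rowid.
index = sorted((email, rowid) for rowid, email in rows)
keys = [email for email, _ in index]

def seek(email):
    """Index lookup: binary search over the sorted keys."""
    pos = bisect.bisect_left(keys, email)
    if pos < len(keys) and keys[pos] == email:
        return index[pos][1]
    return None

print(scan("user09999@example.com"))  # walks all 10,000 rows
print(seek("user09999@example.com"))  # about 14 comparisons
```

It also hints at the cost: every insert has to keep that sorted copy up to date, which is why indexing every column is a bad idea.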
And that's at most a few days' work. So why do you need that DBA?
Aside from that you need someone who knows how to maintain a DB, but again that's not particularly complicated and once it's done you can forget about it apart from the occasional sanity check that it's all working properly.
I agree. The number of engineers who don't understand simple indexing in RDBMSes is tragic. Too many engineers see a slow query and immediately say "this database can't scale, let's switch databases!"
Not having DBAs, the people with the actual skills to run the databases you need, available when you need them, certainly sounds like a business staffing problem, and not an engineering problem. Why are inexperienced developers managing databases in the first place? Or is this something people think you can just wing and be okay?
As someone who does DB optimization for a living, I'd say there is a tradeoff. I've seen far too many places where DB instances run with out-of-the-box default settings, queries are missing basic indexes, and no one even cared to run an EXPLAIN ANALYZE.
I've seen far too many experienced developers with no basic understanding of how relational databases operate. And it's gotten worse over the years, as they can keep throwing hardware at the problem.
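For anyone who has never looked at a plan, a small sketch of what a missing basic index looks like. This uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3` module as a stand-in; on PostgreSQL you would run `EXPLAIN ANALYZE` instead and the output looks different. The `orders` table, the index name, and the `plan` helper are assumptions made up for the example.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each plan row is (id, parent, notused, detail); detail is the readable part.
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(5000)],
)

query = "SELECT total FROM orders WHERE customer_id = ?"

before = plan(conn, query, (7,))   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(conn, query, (7,))    # same query, now a search via the index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first should report a scan of `orders` and the second a search using `idx_orders_customer`.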
Performance optimization has become a lost art.
This is as much a problem of negligence as it is of education.
I won't even touch on proper design, which requires someone with significant database expertise from the get-go (which most startups may not have).
You'll need an engineer with database skills, not a dedicated DBA. I haven't seen a small company with a full time DBA in well over a decade. If you can learn a programming language, you can learn about indexes and basic tuning parameters (buffer pool, cache, etc.)
DBAs still have their place. In my shop, we have more DBAs than infrastructure people.
When you have a small team working on a given tool that only really needs to manage its own data, it really doesn't matter. But at some point, you do need expert gatekeepers to tell engineers when they're Doing It Wrong, when there are many heterogeneous clients accessing large datastores for different purposes, complex audit requirements, etc.
Not to mention that engineering expertise is just the potential. You then also need the time and the willingness to actually do that kind of tedious and potentially slow moving work instead of all the other things on your list. And as we all know, the list of things that can be improved in any system typically grows over time.
The 'out of the box' or naive and un-optimized performance of something is the baseline. And with something as huge and self-contained as a database you want the happy path to be fine in terms of performance.
Same here, but no one hires for a DBA; they just keep moving me between projects to put out fires or get them finished on time, or at least no later than they're already running. Few are actually interested in learning the details of designing schemas/indexes/queries to be good before there's a problem. I keep telling them to put example SQL queries in PR descriptions with EXPLAIN query plans, hoping it will click for them.
My view is that the schema + queries is the essential performance consideration. The code that goes from one persisted state to another is largely plumbing and implementation detail. Of course there is also lots of precise logic there, but it's a healthy alternate view. The first thing I do on any project I join is make a big ERD to know the names, cardinalities, and relations (and wonder how they keep changing that without an up-to-date picture).
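One way to lower the friction of that habit is a tiny helper that prints the query and its plan in a form you can paste into a PR description. This is a rough sketch assuming SQLite via Python's `sqlite3`; the `pr_snippet` function, the `tickets` table, and the index name are all hypothetical, and on another database you would swap in that engine's own EXPLAIN statement.

```python
import sqlite3

def pr_snippet(conn, sql):
    """Format a query and its SQLite plan as plain text for a PR description."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    lines = ["Query:", "    " + sql, "Plan:"]
    # Column 3 of each plan row holds the human-readable step.
    lines += ["    " + row[3] for row in plan_rows]
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, title TEXT)")
conn.execute("CREATE INDEX idx_tickets_status ON tickets (status)")

text = pr_snippet(conn, "SELECT title FROM tickets WHERE status = 'open'")
print(text)
```

A reviewer can then see at a glance whether the new query searches an index or scans the table.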
Engineering builds some schema, creates and uses multiple data stores, message queues, etc. Eventually the queries no longer work properly as the company scales and gets more and larger customers, along with hundreds of other issues. Doesn't engineering need a proper data engineering team/DBA/you name it to handle those?
These days the DBAs I see most in demand are really database optimizer consultants: people who know how to scale databases and deal with all the junk that rushed developers set up.
Almost all companies need databases. Very few companies can or want to hire highly skilled database engineers with years of experience of debugging and performance profiling complex applications.
If you're like a company I was at before, you'd pay $10k+ for DB consultants to tune some queries and your prod database, and when migrating your DB to new hardware forget to re-tune it and waste the extra 64GB and even SSDs installed. There should still be a bare minimum floor of competence for actually developing against and maintaining databases organizationally whether it's a DBA, better engineers, etc. Throwing hardware at a problem is fine when you're sure that you are actually throwing it in the first place which I have seen surprisingly few places do well.
DBA is still a necessary job. I would say more and more so: Over my own now 25 year career in IT, the average developer seems to have less interest/knowledge in data modelling and administration. I don't think you'll find great DBA skills let alone a dedicated role in a small high velocity exciting startup - I feel they'll cloud brute force their way :). But in a large organization, a good DBA is vital. I have 2-3 on my project team and always looking for more.
It is astonishing how much a poorly tuned data request / SQL, or a poorly designed data model / table, can impact performance, often by multiple orders of magnitude. It is also fascinating how much the addition of a specific index or tuned statistics can help execution. There comes a scale where throwing more hardware at a performance problem costs more than hiring a good DBA, as much as current thinking is that "hardware is cheap". But on a regular basis I'm seeing experienced, serious, dedicated developers create SQL that is functionally correct and reads 10 billion rows into buffer in order to spit out a single row answer (sometimes of course that's necessary, but usually not :)
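A scaled-down sketch of that last pattern, assuming SQLite via Python's `sqlite3` and an invented `events` table. Both functions return the same total, but the first drags every row into the application to get it, while the second lets the database filter and aggregate and hands back a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO events (kind, amount) VALUES (?, ?)",
    [("refund" if i % 10 == 0 else "sale", i) for i in range(1000)],
)

def total_in_app(conn, kind):
    """Functionally correct, but fetches the whole table to compute one number."""
    rows = conn.execute("SELECT kind, amount FROM events").fetchall()
    total = sum(amount for k, amount in rows if k == kind)
    return total, len(rows)   # (answer, rows transferred to the app)

def total_in_db(conn, kind):
    """Same answer, with the filter and the sum pushed into the query."""
    rows = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM events WHERE kind = ?", (kind,)
    ).fetchall()
    return rows[0][0], len(rows)

print(total_in_app(conn, "refund"))
print(total_in_db(conn, "refund"))
```

Without an index on `kind` the database still has to read the table in the second version, so this only fixes the transfer and buffering cost; whether an index is also worth adding depends on the workload.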