You know, in years and years and years of programming, of applications and web and back-end and front-end etc etc, I have never had a real-life use case for Big-O notation, except in interviews. Nor have I ever had a real-life use case for caring about sort algorithms. I mean, not once.
I'd say the true foundation needed is entirely different. Avoiding spaghetti code, good refactoring practices, understanding when more architecture is needed, avoiding premature optimization or over-architecture, concepts like technical debt, writing code designed to be readable, how to establish practices and standards and make sure they're communicated properly... These are the kinds of things that actually matter in most projects.
(Obviously, if you're writing video codecs or kernel code, it's a whole different story...)
You're correct that most programmers don't have to write sort functions by hand, but there are certainly a few issues.
Big O notation, or at least a Big O way of thinking, is definitely useful when doing things like optimising queries, data structures, etc.
For example, I've solved performance issues in the past with DBs that had poor indexing strategies (or no index at all). When I read the DB manual and it talks about different types of indexes and says "doing X will cause Y, or doing Z will cause A", knowing something about Big O notation and related things means I can immediately and intuitively understand what the key differences between Y and A would be.
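To make that concrete, here's roughly how I picture it, as a made-up Python sketch (the "table", the ids and the lookup functions are all invented, not any particular database):

    import bisect

    # Made-up data standing in for a table: (user_id, name) rows, sorted by id.
    rows = [(i, "user%d" % i) for i in range(1_000_000)]

    # No index: a full scan inspects every row -- O(n).
    def find_without_index(target_id):
        for row in rows:
            if row[0] == target_id:
                return row
        return None

    # A B-tree-style index behaves roughly like binary search on the sorted
    # key column -- O(log n) comparisons instead of n.
    ids = [row[0] for row in rows]

    def find_with_index(target_id):
        i = bisect.bisect_left(ids, target_id)
        if i < len(ids) and ids[i] == target_id:
            return rows[i]
        return None

That intuition (n versus log n) is usually all the manual is really telling you.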
Thinking about performance in terms of growth also helps me guess which parts of the application require the most optimisation: for example, which parts have a roughly fixed n and in which parts n will grow over time.
Abstractions leak, especially when you are dealing with parallel processing, which will become increasingly important as time goes on. How you store and think about your data is going to have huge implications for just about everyone in software in the coming years, when having to scale sort operations etc. over hundreds of discrete CPU cores becomes the norm.
I think a degree in CS is not intended to teach you "what you will need to be a successful software developer today" but more "here are the tools to think about whatever the problems may be for software developers in 10 years' time".
I agree that there may be a gap between what is required of a CRUD implementer vs what is required of someone who designs DBMS systems for Oracle, and that employers want people qualified for the latter to perform the former.
OTOH I feel that this is something that the free market will correct over time and is already beginning to do so. I know a few programmers who are productively employed who are not well versed in CS theory but are still paid well because they can bring other skills to the table.
Algorithms aren't foundational, in my opinion. Once you understand the syntax of programming, you can start learning about the applications of that syntax, which for 99% of developers rarely ends up in the shape of an explicit algorithm. It's not all bad, debugging is foundational for sure.
I'd place a person's debugging skills, their ability to predict bugs, system design, knowledge of common (applicable) libraries (more so knowing when to use them, not method signatures), and perhaps even ethics above algorithms. Probably many more but there's a couple to start. Having never needed to build my own sorting algorithm in 14 years of coding, I'm pretty confident I don't need an engineer who can do that, either.
These questions have very little to do with software development as practiced by 99% of developers. They relate to computer science, a branch of mathematics with different concerns.
I've been developing software professionally for 23 years. I have a BSc in Computer Science [1] and an MSc in a computing-related field. And I have never needed to be able to answer questions like these to do my job. I have a vague understanding of big-O notation, and equally vague knowledge of the characteristics of various algorithms like heapsort, but that's all. If I need a sort algorithm then the framework or standard library provides a selection, all written by smarter people than me.
If you are developing a relational database or Google-scale distributed platform then, yes, you probably need to know this stuff. But most developers don't. I have to understand how to capture customer requirements, fit them into systems that already exist, estimate and implement them quickly and reliably, deal with the practical details of IIS or an RDBMS, maintain code, etc. I wonder if the author needs to get out more.
Sounds like you're an algorithms guy, and the other coders didn't have the background needed for understanding their use?
I'm self taught, and for many years didn't know about algorithms, big-O notation, and similar. You can do a lot of stuff without that knowledge, but there are definitely some areas that require it.
In most complex jobs (that is, jobs with a theoretical component someone writes books about), the theory part mostly comes to the fore when something goes wrong, and you have to diagnose and fix the problem.
So, in this case, programmers would use this stuff when a sort takes up too much time or too much RAM for what it's meant to do. Then you need to know what big-O is and how to fix it.
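For example (a contrived Python sketch, not anyone's real code), the fix is often as small as swapping the data structure once you can see why the original is quadratic:

    # Deduplicating while preserving order. The "slow" version is O(n^2)
    # because `in` on a list scans the whole list each time.
    def dedupe_slow(items):
        seen = []
        for x in items:
            if x not in seen:      # O(n) membership test on a list
                seen.append(x)
        return seen

    # Same result in O(n) by using a set for the membership checks.
    def dedupe_fast(items):
        seen = set()
        out = []
        for x in items:
            if x not in seen:      # O(1) average membership test on a set
                seen.add(x)
                out.append(x)
        return out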
It seems a bit odd to spend all this time preparing for things that don't happen every day, but, really, that's what people pay the big bucks for: Someone who can smooth over life's little difficulties, at least in that specific realm.
So, I see where you are coming from (I actually love academic CS). But the VAST majority of the programming work out there does not require any Big-O analysis. It just does not. It's used as a tool in interviews to (essentially) look for rigor. The problem is that this harms people who are rigorous as hell in low-level details of JS and V8 (something I'd posit is actually more useful to many more companies), but never studied academic CS. There's nothing wrong with valuing the skill of complexity analysis. But there's a mismatch in how common it is to look for this.
I absolutely agree that knowing basic CS is a good thing, and it will make you a better developer. Also, we do internalize a lot of the concepts so they become second nature. That being said, it surprised me how much code there is where this doesn't matter.
As for the specific questions: It's definitely good to know big O notation, so maybe I was a bit quick to dismiss the first question. However, a more realistic/useful comparison would be between O(n^2) and O(n log n).
For picking a collection to use – definitely knowing when to use a hash map versus a list (there's a quick sketch of this after these answers). Asymptotic run-time behavior of Quicksort – no.
Binary search tree – that's the one example I mentioned in my blog post, so yes, useful.
Graphs – never needed in the applications I've worked with. Same for using a heap, never seen/needed.
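The hash map versus list point, as a quick made-up Python sketch (the record layout is invented):

    # Looking up users by id.
    users = [{"id": i, "name": "user%d" % i} for i in range(100_000)]

    # As a list: each lookup scans until it finds a match -- O(n) per lookup.
    def get_user_from_list(user_id):
        return next((u for u in users if u["id"] == user_id), None)

    # As a dict keyed by id: each lookup is O(1) on average.
    users_by_id = {u["id"]: u for u in users}

    def get_user_from_dict(user_id):
        return users_by_id.get(user_id)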
This is probably an unpopular opinion here, but only a very small subset of developers need to design algorithms or even know any of them by heart.
Disregarding a few years that was mostly WordPress consulting, my experience is largely enterprise C#. Lots of line of business applications, glorified CRUD apps, and some client work. Zero need for any ability to write a BST or radix sort.
If you're working for SpaceX, or Twitter, or a Big 4, of course you should know those things. But most developers don't work for one of those companies. The vast majority of programming is done to further a business other than programming.
"Maybe it has to do with the problem domain, but I have a hard time accepting that."
Your profile says that you are a web application hacker. I think that explains why you may not feel that data structures or algorithms are important. I was the same way myself (started my programming life by building Perl CGI scripts, made a nice little CMS, definitely didn't use any fancy stuff to do that).
You should take it from me though. You will suddenly develop a deep and profound appreciation for understanding algorithms when you have a problem that requires having to comb through over a hundred thousand records, perform some type of operation on them, and sort by the results of that operation - all within a fairly narrow time period.
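Something like this, as a rough Python sketch (the records and the scoring operation are made up):

    import random

    # ~150k invented records; score() stands in for whatever the real operation is.
    records = [{"id": i, "value": random.random()} for i in range(150_000)]

    def score(record):
        return record["value"] ** 2  # placeholder for the real work

    # Compute the operation once per record, then sort by the result:
    # O(n) work plus an O(n log n) sort. Recomputing the score repeatedly,
    # or sorting with a quadratic algorithm, is exactly where a tight time
    # budget gets blown.
    results = sorted(records, key=score)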
I think that it actually is a question of problem domains. If you never encounter this type of problem, then not knowing about things like big o notation and the various sorting algorithms will never impact you. However, when you do, that knowledge quickly pays for the time invested in learning it.
The point is this: even if I'm the fastest coder on earth, write the most elegant software, quickly and with a very low bug rate, innovate and spread my creativity to the team, if I don't know things like the above, I probably won't be able to work for Google... or ever be considered a "real" software engineer. Computer Science and Application Development overlap, but they are not the same thing, and they often require different skills. A great front end developer can understand performance and write efficient code regardless of whether he knows all the details of big O notation.
This is like saying that a car mechanic needs to have a PhD in thermodynamics to work on internal combustion engines.
99.9% of coders just need to know what things in their chosen language(s) are performance sensitive and what aren't - very few actually need to know the exact reason why. And even fewer need to be able to improve on said algorithms.
If something needs sorting, you call the sort function in the object. There's absolutely no need to know what algorithm it uses, the only thing that matters is that the end result is sorted correctly.
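To give a sense of what "performance sensitive without knowing the exact reason why" looks like in practice, a small made-up Python example:

    # Building a big string out of many pieces.
    parts = ["line %d" % i for i in range(100_000)]

    # Often slow: repeated concatenation can copy the growing string each time.
    def build_slow(parts):
        s = ""
        for p in parts:
            s += p + "\n"
        return s

    # The usual idiom: collect the pieces and join once.
    def build_fast(parts):
        return "\n".join(parts) + "\n"

You can learn and apply that rule of thumb without ever writing down the complexity analysis behind it.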
As a software engineer, I rarely use the algorithmic computer science stuff. It is useful to know (and it is a requirement for my job occasionally), but the only times I have ever had to implement a sort routine were for interview questions or exams.
Do I really _have to_ know 1) Big O notation, 2) common algorithms and their time and space complexity, 3) which opcodes exist for a CPU of your choice, and 4) a kernel's main system calls?
I don't know any of these. I know they exist, I know of the concepts, but no specifics. I would argue that these are good to know, and a must if you are working in a domain where they matter. For design, architecture and implementation in a high-level language, when the problem doesn't touch these areas, they are not a MUST have.
On that last part: it's a junior CS question that you may not even use in real life in some CS jobs. People forget that not everyone is writing awesome ML things or other awesome things you read about in newspapers. Most people are writing CRUD code, doing some data transformation, automating some simple office process, etc. I never had much use for Big-O knowledge there...
That doesn't mean I've never used this knowledge; I have needed it at some points. But by that time I had to refresh it again...
Data structures and algorithms are the foundation. If I ever find myself unable to write a list or a tree or a sort, I'd better be doing a different job.
I have learned from a lot of commercial code. It doesn't mean I can't also write "free" code.
I find most things in all code to be generic and obvious. The same functions are used by lots of other software, just with different names, so there's nothing novel.
After you've learned to classify lots of functions with big O notation, you end up writing things in the obvious way that gives you a natural base level of optimization.
I concur:
Nowadays, the basic building blocks are done. Good sorting, database indexing, O(1) set membership, many variants of lists, hash tables, and other higher-level data structures are well covered by modern libraries. The foundations of modern system design are algorithms and data structures, and they are largely commoditized for general use. Certainly, working on general-purpose, clever-but-foundational algorithms is becoming increasingly niche.
I disagree with the final conclusion that somehow algorithms have faded into obscurity.
Anyone writing any program has to consistently track their expected program performance as it shuffles data between commoditized library calls or between network endpoints. This is algorithm design. Many "foundational" algorithms like max-flow actually make use of "foundational(er)" algorithms and shuffle data between them.
This is what the modern programmer does when he decomposes a data-intensive problem into sub-problems and uses commoditized libraries and re-composes those solutions.
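A rough sketch of that re-composition, with an invented "top spenders per region" problem and nothing but commoditized library pieces:

    from collections import defaultdict
    import heapq

    orders = [
        {"region": "EU", "customer": "a", "amount": 120.0},
        {"region": "EU", "customer": "b", "amount": 80.0},
        {"region": "US", "customer": "c", "amount": 200.0},
        # ... imagine millions of rows
    ]

    # Sub-problem 1: group and aggregate in one linear pass.
    totals = defaultdict(lambda: defaultdict(float))
    for o in orders:
        totals[o["region"]][o["customer"]] += o["amount"]

    # Sub-problem 2: per region, take the top spenders with a heap-backed
    # library call instead of a full sort.
    top_per_region = {
        region: heapq.nlargest(10, customers.items(), key=lambda kv: kv[1])
        for region, customers in totals.items()
    }

The individual pieces are commodities; deciding how the data flows between them, and what that costs as the data grows, is still on you.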
The push to catalog and archive a wide array of well-implemented algorithms and data structures tracked the emergence of personal and commodity hardware. The emergence of new hardware will require new solutions. Mobile, WebAssembly (perhaps), ASIC-based data centers, and CUDA are all exciting new areas where our "foundational" algorithms and canonical solutions need to be revisited, redesigned, and re-tuned. Or at least re-composed.
I agree that there's a balance between the two. My ideal team is probably one quarter who know core CS concepts and how/when to use them, and the other three quarters devs who can write good code whether or not they know core CS concepts.
The part that makes me frustrated enough to comment on this is that I've found it a lot easier to convince a dev who doesn't know Big O that they need to learn it than it is to convince a dev who does know Big O that it's not always necessary. Convincing is significantly harder if the dev graduated from college in the past 3 years.
I've been doing full stack web programming and server administration for the past 20 years, and I almost never encounter "Big O" notation and algorithm problem solving except in interviews.
99% of the things I have built in the past (and continue to build on a daily basis) consist of just gluing things like APIs together, or building user interfaces that essentially just allow users to display and modify data in various databases.
I've had to learn tools like Docker, Git, Travis, Ansible, and Kubernetes. I've had to be able to learn the basics of languages like Javascript and Python and Ruby in a matter of months or even weeks. I've had to pick up frameworks like Angular and React and whatever new shiny thing exists, at the drop of a hat.
I have built large web-based software projects that power multi-million dollar companies, and that handle millions of users on a daily basis.
I don't have a college degree, and my knowledge of advanced math ends at Algebra. But that didn't stop me from assembling and launching a machine learning platform for the company I work for.
But yeah. I don't know shit about algorithms or your "Big O" that you love so dearly and love to judge every "programmer" by.