This is the most overrated and overquoted piece of (bad) software engineering advice ever. Sure, it's harder to read unfamiliar code than to write new code, because you understand the reasoning behind the new code. That's exactly why rewrites often make much more sense than labor-intensive incremental changes.
I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked. All that without understanding how legacy code works. You're wondering what's the secret here? Instead of trying to deduce the undocumented logic behind legacy code I gathered current requirements and implemented them in the simplest way possible.
In most cases the results were dramatically more readable, because I used built-in language capabilities and standard libraries, whereas the old code spectacularly failed to do so. Also, I didn't have to worry about requirements that were no longer relevant.
> I gathered current requirements and implemented them in the simplest way possible.
If you can get complete current requirements, that's fine. The issue with this is "all the stuff which falls through the cracks".
In the example from the post, there was some code to "handle case where internet explorer is not installed". That's the sort of thing which would often be missed in requirements gathering, discovered in the field and then patched into the codebase.
Sure - if you have one of:
* close-enough-to-perfect requirements
* close-enough-to-perfect regression tests
then a rewrite isn't too hard. But achieving either of the above is something I'd consider a hard problem.
Do you have a particular way of getting signoff that your current requirements are "complete"? That would be worth knowing :-)
Yeah, my favorite definition of "legacy code" is "code that has poor test coverage."
The regression test suite should literally implement all of the business requirements, and then the two become the same thing (if there's no test for it, it's not a requirement, and vice versa), so then you can work on the codebase with confidence, and it's therefore not legacy code, no matter how old it is or what old school language it's written in.
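To make "if there's no test for it, it's not a requirement" concrete, here is a minimal sketch of what I mean. Everything in it is made up for illustration: `shipping_cost`, the `REQ-n` labels and the prices are hypothetical, not from any real codebase. The point is only that each requirement is a named, executable check that any implementation (old or rewritten) can be run against.

```python
# Hypothetical example: every business requirement is a named test case.
# `shipping_cost`, the REQ ids and the prices are invented for illustration.

def shipping_cost(weight_kg: float, express: bool = False) -> float:
    """Toy implementation under test (could be legacy code or a rewrite)."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    cost = 5.0 if weight_kg <= 1 else 5.0 + 2.0 * (weight_kg - 1)
    return cost * 2 if express else cost

# Each entry: (requirement label, positional args, keyword args, expected result)
REQUIREMENTS = [
    ("REQ-1: flat rate up to 1 kg", (1.0,), {}, 5.0),
    ("REQ-2: 2.00 per extra kg", (3.0,), {}, 9.0),
    ("REQ-3: express doubles the price", (3.0,), {"express": True}, 18.0),
]

def check_requirements(impl):
    """Return the labels of all requirements that `impl` does not satisfy."""
    failed = []
    for label, args, kwargs, expected in REQUIREMENTS:
        if impl(*args, **kwargs) != expected:
            failed.append(label)
    return failed
```

Calling `check_requirements(shipping_cost)` returns an empty list for the toy implementation above. Swapping in a rewrite is then just a matter of passing a different function and reading off which requirement labels come back.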
> I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked. All that without understanding how legacy code works. You're wondering what's the secret here? Instead of trying to deduce the undocumented logic behind legacy code I gathered current requirements and implemented them in the simplest way possible.
Except one of the main reasons legacy code tends to be so long is that it deals with tens/hundreds of small edge-cases or bugs that your "sub-100-line method" simply cannot account for. So what you are doing when you re-write the legacy code without even understanding what it does is regressing the product by several years in terms of maturity and stability. Your new code will face the same problems the legacy code did, except it will break. And then you will have to start adding to it...
Now, it is absolutely possible that those edge-cases and bugs that the legacy code dealt with are no longer an issue: maybe they were added back in Windows XP days, and your users no longer use Windows XP, or something. In that case, yes, go ahead and rewrite it. But you need to think about and understand what it does first, and why.
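One cheap way to do that "think about what it does first" step is to run the old and new code side by side on recorded inputs and look at where they disagree. A rough sketch follows, under stated assumptions: `legacy_parse_quantity`, `new_parse_quantity` and the sample inputs are hypothetical stand-ins, and the sketch assumes the legacy function itself does not raise on the recorded inputs.

```python
# Hypothetical sketch: compare a legacy function with its rewrite on
# recorded inputs. All names and inputs here are invented for illustration.

def legacy_parse_quantity(text):
    """Stand-in for old code that accumulated edge-case patches."""
    text = text.strip()
    if text == "":                      # patch: empty field means zero
        return 0
    if text.endswith("k"):              # patch: "2k" means 2000
        return int(float(text[:-1]) * 1000)
    return int(text.replace(",", ""))   # patch: thousands separators

def new_parse_quantity(text):
    """Stand-in for a short rewrite built from 'current requirements'."""
    return int(text.strip())

def find_divergences(old, new, recorded_inputs):
    """Return (input, old result, new result) wherever the two disagree."""
    diffs = []
    for value in recorded_inputs:
        expected = old(value)
        try:
            actual = new(value)
        except Exception as exc:        # the rewrite may crash on edge cases
            actual = f"raised {type(exc).__name__}"
        if actual != expected:
            diffs.append((value, expected, actual))
    return diffs

RECORDED_INPUTS = ["42", " 7 ", "", "1,200", "2k"]
```

Here `find_divergences(legacy_parse_quantity, new_parse_quantity, RECORDED_INPUTS)` flags the empty string, `"1,200"` and `"2k"`. Each flagged input is a question to answer (is this edge case still relevant, like the Windows XP ones?) rather than a behavior to drop silently.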
Exactly. It comes across as really arrogant to replace thousands of lines like that without even looking at them.
Maybe those old-timers' first iteration only had 100 lines too.
I've done massive refactorings in the past as the parent is suggesting. But before I do it, I try to figure out whether the code is hard to read because the developer was bad, or because a good developer was forced to add support for hundreds of edge cases over the years.
If I notice at a cursory glance that I can replace more than a few ridiculously convoluted chunks of code with something simpler and be sure that I broke no edge case for those chunks, I just assume that the previous developer was simply incompetent and rewrite the whole thing to meet the original requirements. (And perhaps down the line a developer better than I will do the same with my code!)
Well, maybe. It really depends and you won't know unless you're reading the code and cross-referencing with applicable unit/regression tests.
Some of my most productive days have been in eliminating unnecessary/poorly written code, not in adding lines.
But you can't do that unless you know:
1. What the program should do
2. What the specific code you're looking at is trying to accomplish vs. what it actually does
Bringing #2 (what it actually does) into line with #1 (what it's supposed to do) can allow you to eliminate code, and the opportunities for reducing the overall amount of code grow as you look at a wider scope/% of the overall project.
I didn't say I never looked at them. I said I didn't need to understand how they worked. There is a huge difference.
For example, I've seen logic that looked at roughly 10 user input fields and generated one out of 30 pre-defined values. That was the extent of what it did. It had multiple classes, complex global state management, database and session interactions and thousands of lines of code (with no automated tests or comments). That's how real legacy code looks.
When asked to modify something like that, a bad developer would simply add a few more lines of code, do some manual testing and move on.
Another kind of bad developer would start refactoring that code 'by the book', writing unit tests... and would never finish.
A good developer would start by asking what is the point of that final value the code generates and why we really want to modify the logic. Sure, you want another classification of stuff, but why?
In my case I realized that everything and everyone that were relying on that final value could use something else. Instead of spending time messing around with pathological code from above I spent my time switching systems and people to using other values. Then I simply removed that legacy classification mess altogether. The number of tickets we got for the system dropped by at least 50%. I still don't know exactly how all that code worked, because I merely traced out where it interacted with the rest of the world.
Someone will surely comment "But that's refactoring!". I don't care what you call it, because at the time and place I was working on the project it was considered a rewrite, and that's what matters. If everyone followed Joel's advice, no one would question the necessity of that code, no one would look at deleting it altogether, and it would still be there, except bigger and buggier.
A full rewrite is a great excuse to re-examine all those exceptions and see if they are really, or still, needed. In fact this can be a reason to choose a full rewrite over incremental change.
> I've had many real-life situations where I successfully chain-deleted thousands of lines of legacy code and replaced it with some sub-100-line method that simply worked.
This just about never happens outside of start-up world. You can't go around deleting thousands of lines that you don't understand and just pray that your new implementation won't silently (or catastrophically) break subtle side-effects (errors or otherwise) that the organization has adapted its business around. That's just madness and would probably get you fired in most places.
Eh, we just did the equivalent in an enterprise app.
It's amazing how convoluted code can get when someone doesn't sit back and think.
It's true -- if a codebase is predominantly written and maintained by competent developers who write good tests and are given the authority to refactor as they go, this shouldn't happen.
On the other hand, who wrote it can be a significant factor in the decision.
I hate working with people that have this attitude. Blowing away old code that works and rewriting it, because obviously they are the smartest person, and can do it better than anyone else.
I've had no end of regressions and just full out failures because someone rewrites code and doesn't understand what it does or why it does it. Then I come in and see that if they had only spent an hour reviewing the code and understanding it, we would have saved weeks of development and testing time.
I would be hesitant to make a broad statement like 'overquoted piece of (bad) software engineering advice ever' based only on your experience.
It's one thing to replace legacy plumbing code (which is likely what you describe) with new 5-line plumbing code.
It's quite another to walk through lines and lines of classic ASP code (sub your own 'legacy' language here) that mixes data access with business logic and perfectly meets the business's needs.
I have found it hard to go to a client and say we will build you a new version of the application but most of the cost is in the rewrite to make the software more "modern" and only a small portion of the cost is new features.
Also, try getting current requirements from business stakeholders who sometimes don't even have a clue what behind-the-scenes 'magic' is happening in the application. It fits their business needs, and gathering requirements amounts to 'do what the app already does'.
I certainly don't disagree with your opinion but I would venture that if Joel's advice were to be treated as a software development pattern, it would generally fit the situation.