You still have to test all 80 lines even if they're broken down into multiple functions, so it's something you have to evaluate on a case-by-case basis.
It might even make it harder to test: if you break a function wrong, then you might end up with a group of functions that only work together anyway.
For example, say you break a big function into 3 smaller ones. If the first acquires a resource (transaction, file) and the third releases it, then it might be simpler to test the whole group rather than each one separately.
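Here's a rough sketch of that situation in Python. The function names (`begin`, `add_item`, `finish`, `save_item`) and the `items` table are made up for illustration; the only point is that the middle and last steps can't run without the connection the first one opened, so the test that makes the most sense exercises all three together.

```python
import sqlite3


def begin(path=":memory:"):
    """Step 1 (hypothetical): acquire the resource."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
    return conn


def add_item(conn, name):
    """Step 2 (hypothetical): only meaningful with a connection from begin()."""
    conn.execute("INSERT INTO items (name) VALUES (?)", (name,))


def finish(conn):
    """Step 3 (hypothetical): commit, count the rows, release the resource."""
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return count


def save_item(name):
    """The original 'big' function, now just the three steps in order."""
    conn = begin()
    add_item(conn, name)
    return finish(conn)


def test_save_item_stores_one_row():
    # Testing the group: each helper alone would need the others as setup anyway.
    assert save_item("pizza") == 1
```

Testing `add_item` or `finish` on their own would mean recreating `begin` inside the test, which is the group test again with extra steps.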
Breaking an 80-line function into eight 10-line functions does not necessarily make it easier to test. Most of the time it just adds unit-testing busywork for no clear benefit. This becomes clearer if you imagine you wanted to test every possible input. Splitting the function into eighths introduces roughly 8x the work if each new function has the same number of possible input states. The math is more complicated in the general case, so you have to evaluate it case by case. That said, if you're trying to isolate a known bug, it might be beneficial to split the function and test each part in isolation.
One thing to consider here, though, is that oftentimes when you realize splitting up a function will make it easier to test, it's because your implementation sucks: it's doing too much and is too tightly coupled. I mean, realistically, how would splitting up a function make testing it easier unless the function is already complex and performing multiple tasks?
You can certainly argue that some of the clean code folks do a lot of needless abstraction that makes it harder to work on code, and I think that's true at times. But at the same time, a 200 line method doing 19 different things is also quite hard to understand and modify, and the reason testers want to split that method up is because it's really hard to understand and has too many possible outcomes.
I don't like to overly abstract things and I try to strike a balance here, but I can say without a doubt that I've never found it harder to understand and work on a single class with 20 methods that each do one thing (with descriptive method names) than I have a method with 200 lines of code doing the same 20 things. And the former is much easier to test as well.
Uhhh if you have a 1,000 line function it definitely should be broken down. Even a 100 line function is borderline. It's gonna be nigh impossible to write comprehensive unit tests for a 1,000 line function.
However, if you have a 1000+ line function that you split into small functions you can pretty easily write a few unit tests per function to see which, if any, of those chunks have problems and then need to be fixed. It's pretty much impossible to write unit tests that can sensibly test a non-trivial 1000+ line function. You might get away with it if it's doing something very straightforward but I wouldn't be very confident in it.
Do you test every line of code you write separately? Probably not. You test a function that has 5 lines of code.
Same for anonymous functions. You test the functions that use them and that's usually enough. If it isn't, that's a good sign they should be separated out into named functions.
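A small Python sketch of what that looks like in practice. `top_scores` and the record layout are hypothetical; the lambda never gets its own test, it's covered by testing the function that uses it.

```python
def top_scores(records, n=3):
    """Return the n highest-scoring records (hypothetical example)."""
    # The lambda is never tested directly; it is exercised through top_scores.
    return sorted(records, key=lambda r: r["score"], reverse=True)[:n]


def test_top_scores_orders_by_score_descending():
    records = [
        {"name": "a", "score": 1},
        {"name": "b", "score": 3},
        {"name": "c", "score": 2},
    ]
    assert [r["name"] for r in top_scores(records, n=2)] == ["b", "c"]
```

If the key function grew its own branches or special cases, that would be the point to give it a name and its own tests.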
For a function with 2,000 lines of code, we have to be honest with ourselves and accept that testing will never be sufficient; if there are 2,000 lines of code, you can bet good money that there's global state manipulation as well.
Assuming that anybody can sufficiently test such a function and then rewrite it safely will only lead to new bugs and old regressions.
Seems harsh, but I've had to work with a few such monstrosities. Global state galore.
This is the wrong way to think about it, though. A function encapsulates some behaviour, regardless of how short or long it is. You don't test each line (directly); you test each function's mode of operation. So a function with one if statement in it potentially needs two (happy path) tests.
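To make the "one if statement, two tests" point concrete, here's a minimal Python sketch. `shipping_cost` and its threshold are invented for the example; what matters is that there's one test per mode of operation, not one per line.

```python
def shipping_cost(order_total):
    """Hypothetical rule: orders of 50 or more ship free, otherwise a flat 5."""
    if order_total >= 50:
        return 0
    return 5


def test_free_shipping_at_or_above_threshold():
    # First mode of operation: the branch is taken.
    assert shipping_cost(50) == 0


def test_flat_fee_below_threshold():
    # Second mode of operation: the branch is skipped.
    assert shipping_cost(49) == 5
```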
You could deliberately do that. But most of the time length and complexity correlate. More importantly, when you're splitting up functions to make them testable, you're very much reducing complexity and length in tandem.
Let me come out and say it: I think OP is wrong. When people write 500-line functions, the only way they are going to test them is to run them with real input and see if they break.
Such testing is not only insufficient; any regression introduced by a later change will also go unnoticed until a real user hits the bug. You will notice that most people who defend long functions don't even talk about how the heck they are going to test them.
Or even worse you start writing code for "testability" and it becomes a bloated mess of one- or few-liner functions that are only called in one location by some other function.
If the function were truly linear, having a long function wouldn't be so bad. But it actually isn't: the example contains multiple branches!
Will people bother testing all of them? Or will they write a single test, pass in a pizza and just glance at it actually working? My guess is the latter, as testing multiple branches from outside is often tedious, vs testing smaller specialized functions.
Splitting up functions just to make them testable is a code smell, in my experience. It has correlated highly with cultish adherence to testing for the sake of testing rather than to achieve an engineering or quality goal.
It might reduce "cyclomatic complexity" but not necessarily overall complexity. Often the opposite is true: reduced locality imposes its own costs, in maintenance and efficiency for example.
At first I thought how horrible, but basically you have sort of 9 functions within the same scope, each having a docstring. So I guess not too different from splitting them up.
I read you have "end to end" tests.
One question, though: wouldn't each part benefit from having its own unit tests?
> Would testing it be easier if it were broken out further?
In my experience, you get the most effective use of your time by testing at interface boundaries.
How the code is structured inside of those doesn't terribly matter in most cases.
That said, you are almost always better off structuring a method as a linear set of steps rather than trying to turn it into some sort of "choose your own adventure" journey.
Basically a typical method of the sort I'm thinking of would be something like:
#{ Check early out conditions
...
#}
#{ Prepare the values
...
#}
#{ Perform I/O operation
...
#}
#{ Clean up / build return value
...
#}
You could break that out into four or five methods, but most likely the contents are going to be a bit of filler and then calling into some other library.
Easier to keep everything consolidated and avoid the spaghetti.
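For what it's worth, here's how that four-step layout might look as one consolidated Python function. `load_settings` and its behavior are assumptions made up for the sketch, not anything from the code being discussed; the comments mirror the section markers above.

```python
import json
from pathlib import Path


def load_settings(path, defaults=None):
    """Hypothetical example: merge a JSON settings file over some defaults."""
    # Check early out conditions
    if not path:
        return dict(defaults or {})

    # Prepare the values
    source = Path(path)
    merged = dict(defaults or {})

    # Perform I/O operation
    if source.exists():
        merged.update(json.loads(source.read_text(encoding="utf-8")))

    # Clean up / build return value
    return merged
```

Each step is a couple of lines of filler around a library call, so splitting it into four methods would mostly add names to jump between. The tests go at the boundary: call `load_settings` and check what comes back.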
Wouldn't in-lining make unit testing more difficult?
Admittedly I am not a very experienced programmer, but I thought the general line of thought with regard to making your program easier to test was that each function should do as little as possible.
A lot of those if-/case-blocks are precisely where I'd put functions :)
If you changed a bunch of those to separate, pure (i.e. side-effect-free) functions, it would, if nothing else, make unit testing a breeze, and then you'd be free to fix bugs in the logic without fear. As it is, if I had a bug in that huge function, I'd be really worried about breaking some edge condition or implied state 500 lines up.
Usually it's about keeping it granular enough that various sub-activities can be tested in isolation. If you have one function with 10,000 lines of code, that's not really testable beyond "something doesn't work".
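As a rough idea of what I mean, here are two branch blocks pulled out into pure functions, in Python. The names (`bake_minutes`, `order_price`) and the numbers are hypothetical stand-ins, not taken from the function under discussion.

```python
def bake_minutes(crust):
    """Pure: the result depends only on the argument, no shared state."""
    if crust == "thin":
        return 8
    if crust == "deep":
        return 20
    return 12


def order_price(base, toppings):
    """Pure: a hypothetical flat 1.50 per topping on top of the base price."""
    return round(base + 1.5 * len(toppings), 2)


def test_bake_minutes_covers_each_branch():
    assert bake_minutes("thin") == 8
    assert bake_minutes("deep") == 20
    assert bake_minutes("regular") == 12


def test_order_price_counts_toppings():
    assert order_price(10.0, ["ham", "olives"]) == 13.0
```

Every branch gets a one-line assertion with no setup, and nothing 500 lines up can change the answer.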
Then of course you need some way to automatically run these tests, but that's usually provided by IDEs or standard libraries these days.