
Uhhh if you have a 1,000 line function it definitely should be broken down. Even a 100 line function is borderline. It's gonna be nigh impossible to write comprehensive unit tests for a 1,000 line function.



However, if you split a 1000+ line function into small functions, you can pretty easily write a few unit tests per function to see which, if any, of those chunks have problems and need to be fixed. It's pretty much impossible to write unit tests that sensibly cover a non-trivial 1000+ line function. You might get away with it if it's doing something very straightforward, but I wouldn't be very confident in it.
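A rough sketch of what that looks like (Python; the helpers parse_header and validate_row are made-up stand-ins for chunks pulled out of a formerly huge function):

    import unittest

    # Hypothetical helpers extracted from a formerly huge function.
    def parse_header(line):
        # Split "Key: value" into a (key, value) pair.
        key, _, value = line.partition(":")
        return key.strip(), value.strip()

    def validate_row(row):
        # A row is valid if it has exactly three non-empty fields.
        return len(row) == 3 and all(field != "" for field in row)

    class TestExtractedChunks(unittest.TestCase):
        def test_parse_header(self):
            self.assertEqual(parse_header("Host: example.com"),
                             ("Host", "example.com"))

        def test_validate_row_rejects_missing_field(self):
            self.assertFalse(validate_row(["a", "", "c"]))

    if __name__ == "__main__":
        unittest.main()

Each extracted chunk gets a couple of focused tests, which is exactly what you can't do while the same logic is buried in the middle of the original function.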

Let me come out and say it: I think OP is wrong. When people write 500-line-long functions, the only way they are going to test them is to run with real input and see if anything breaks.

Such testing is not only insufficient; any regression introduced by a change will also go unnoticed until a real user hits the bug. You will notice that most people defending long functions don't even talk about how the heck they are going to test them.


You still have to test all 80 lines even if they're broken down into multiple functions, so it's something you have to evaluate on a case-by-case basis.

It might even make testing harder: if you split a function along the wrong lines, you can end up with a group of functions that only work together anyway.

For example, say you break a big function into three smaller ones. If the first acquires a resource (a transaction, a file) and the third releases it, it might be simpler to test the whole group rather than each one separately.
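A minimal sketch of that shape (Python; the three-way split and the names open_db / apply_changes / close_db are invented for illustration). The middle piece only makes sense with the resource the first piece produced, so the natural test exercises the wrapper rather than each function alone:

    import os
    import sqlite3
    import tempfile
    import unittest

    def open_db(path):
        # First piece: acquire the resource.
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
        return conn

    def apply_changes(conn, names):
        # Middle piece: only meaningful with the connection from open_db.
        conn.executemany("INSERT INTO items (name) VALUES (?)",
                         [(n,) for n in names])

    def close_db(conn):
        # Third piece: release the resource.
        conn.commit()
        conn.close()

    def import_items(path, names):
        # The wrapper that ties the three pieces together.
        conn = open_db(path)
        try:
            apply_changes(conn, names)
        finally:
            close_db(conn)

    class TestImportItems(unittest.TestCase):
        def test_round_trip(self):
            # Simpler to test the whole group than apply_changes alone.
            path = os.path.join(tempfile.mkdtemp(), "test.db")
            import_items(path, ["a", "b"])
            check = sqlite3.connect(path)
            count = check.execute("SELECT COUNT(*) FROM items").fetchone()[0]
            check.close()
            self.assertEqual(count, 2)

    if __name__ == "__main__":
        unittest.main()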


For a function with 2,000 lines of code, we have to be honest with ourselves and accept that testing will never be sufficient; if there are 2,000 lines of code, you can bet good money that there's global state manipulation going on as well.

Assuming that anybody is capable of sufficiently testing such functions, and then rewriting them on the strength of those tests, will only introduce new bugs and resurrect old regressions.

Seems harsh, but I've had to work with a few such monstrosities. Global state galore.


Or, even worse, you start writing code for "testability" and it becomes a bloated mess of one- or few-line functions that are each only called in one place by some other function.

Breaking an 80-line function into 8x 10-line functions does not necessarily make it easier to test. Most of the time it just adds unit-testing busywork for no clear benefit. This becomes clearer if you imagine you wanted to test every possible input: splitting the function into eighths introduces roughly 8x the work if each new function has the same number of possible input states. The math is more complicated in the general case, so you have to evaluate it on a case-by-case basis. That said, if you're trying to isolate a known bug, it might be beneficial to split the function and test each part in isolation.

Big functions are hard to unit-test.

The article includes comments from Carmack from 2014 saying he now favours breaking the code up into pure functions.


One thing to consider here, though, is that oftentimes when you realize splitting up a function will make it easier to test, it's because your implementation sucks: it's doing too much and is too tightly coupled. Realistically, how would splitting up a function make testing it easier unless the function is already complex and performing multiple tasks?

You can certainly argue that some of the clean-code folks do a lot of needless abstraction that makes it harder to work on code, and I think that's true at times. But at the same time, a 200-line method doing 19 different things is also quite hard to understand and modify, and the reason testers want to split that method up is precisely that it's hard to follow and has too many possible outcomes.

I don't like to over-abstract things and I try to strike a balance here, but I can say without a doubt that I've never found a single class with 20 methods that each do one thing (with descriptive method names) harder to understand and work on than a single 200-line method doing the same 20 things. And the former is much easier to test as well.
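As a toy illustration of that shape (Python; the InvoiceProcessor class and its method names are invented), the split version reads like a table of contents and each step can be exercised on its own:

    class InvoiceProcessor:
        # Each method does one thing; run() reads as an outline of what a
        # 200-line version would bury in nested blocks.
        def run(self, invoice):
            self.validate_totals(invoice)
            self.apply_discounts(invoice)
            self.compute_tax(invoice)
            return invoice

        def validate_totals(self, invoice):
            if invoice["total"] < 0:
                raise ValueError("negative total")

        def apply_discounts(self, invoice):
            if invoice.get("coupon") == "HALF":
                invoice["total"] *= 0.5

        def compute_tax(self, invoice):
            invoice["tax"] = round(invoice["total"] * 0.2, 2)

    # Each step is testable in isolation:
    p = InvoiceProcessor()
    inv = {"total": 100.0, "coupon": "HALF"}
    p.apply_discounts(inv)
    assert inv["total"] == 50.0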


Yes, I also think this is what I should do too. Unfortunately, when I first wrote the code I spend most of my time working on, I didn't write unit tests at the function level (all my tests are at the module level). I have been meaning to get around to correcting this, but the task is now so huge I fear it would take me more than a year to write all the unit tests :(

That argument applies to pretty much every single unit test ever written. A function taking a single long has 2^64 possible input values; impossible to test, by your logic. Yet such functions are tested without issue all the time.

What you do is put together a long list of sample functions and sample arguments that cover the expected edge cases, and then test those for equivalence. Hardly impossible; it just takes time. Not bulletproof, but better than nothing.
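A sketch of that idea, assuming the old implementation is kept around as a reference oracle while you refactor (Python; popcount is just a stand-in example):

    import unittest

    # The original implementation, kept as a reference oracle.
    def popcount_old(x):
        return bin(x & 0xFFFFFFFFFFFFFFFF).count("1")

    # The refactored implementation under test.
    def popcount_new(x):
        x &= 0xFFFFFFFFFFFFFFFF
        count = 0
        while x:
            x &= x - 1   # clear the lowest set bit
            count += 1
        return count

    # Sample arguments chosen to cover the expected edge cases,
    # not all 2^64 possible values.
    EDGE_CASES = [0, 1, 2, 3, 2**63, 2**64 - 1, 0xDEADBEEF, -1]

    class TestEquivalence(unittest.TestCase):
        def test_matches_reference(self):
            for x in EDGE_CASES:
                self.assertEqual(popcount_new(x), popcount_old(x), x)

    if __name__ == "__main__":
        unittest.main()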


A lot of those if-/case-blocks are precisely where I'd put functions :)

If you changed a bunch of those into separate, pure (i.e. side-effect-free) functions, it would, if nothing else, make unit testing a breeze, and then you'd be free to fix bugs in the logic without fear. As it is, if I had to fix a bug in that huge function, I'd be really worried about breaking some edge condition or implied state 500 lines up.
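A minimal sketch of pulling one such branch out as a pure function (Python; the pricing logic and names here are made up):

    # Before: buried hundreds of lines into a giant function, touching
    # shared state that other branches rely on:
    #
    #     elif order.kind == "bulk":
    #         price = base * qty
    #         if qty > 100:
    #             price *= 0.85
    #         state.last_price = price
    #
    # After: the same logic as a pure function, trivial to unit test.

    def bulk_price(base, qty, discount_threshold=100, discount=0.85):
        """Pure: the result depends only on the arguments, no side effects."""
        price = base * qty
        if qty > discount_threshold:
            price *= discount
        return price

    assert bulk_price(2.0, 10) == 20.0
    assert bulk_price(2.0, 200) == 2.0 * 200 * 0.85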


Surely it wouldn't be aimed at proving a 10 million line black box of code, right?

In my mind it would have to be built from the ground up, with unit tests for function-level proofs, maintaining 100% coverage as you go along. As long as the constituent parts are proven, you don't have to zoom out to the macro level.


Even with type-driven development you still want unit tests in the form of property-based tests. Also, a nice way to resolve the 10-50-line issue is to follow the Arrange, Act, Assert pattern, where you can look at the second block to see the action. Move as much of the Arrange section as possible into a setup and it should be a bit clearer :)
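A rough sketch of both ideas together (Python; assumes the Hypothesis library is available, and dedupe is just a stand-in for the code under test):

    import unittest
    from hypothesis import given, strategies as st

    # Stand-in function under test: order-preserving de-duplication.
    def dedupe(items):
        seen, out = set(), []
        for item in items:
            if item not in seen:
                seen.add(item)
                out.append(item)
        return out

    class TestDedupe(unittest.TestCase):
        def setUp(self):
            # Arrange: shared setup pulled out of each test.
            self.sample = [3, 1, 3, 2, 1]

        def test_removes_duplicates(self):
            # Act
            result = dedupe(self.sample)
            # Assert
            self.assertEqual(result, [3, 1, 2])

        # Property-based test: for any list of ints, the result has no
        # duplicates and keeps exactly the same set of members.
        @given(st.lists(st.integers()))
        def test_properties(self, items):
            result = dedupe(items)
            self.assertEqual(len(result), len(set(result)))
            self.assertEqual(set(result), set(items))

    if __name__ == "__main__":
        unittest.main()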

i've found that too much inlining, especially with a lot of inline closures, can throw testability out the window, but the examples here are pretty good, as each one shows how to recompose a function for clarity and usefulness rather than turning it into a huge inline monstrosity, with which i am all too familiar ^_^;

[edit: fix grammar]


10,000 god damn global variables. It would be unreasonable to expect any kind of unit testing with that many straggling race conditions bouncing around. Wow.

Yeah, sure, if there are some pure functions involved then by all means unit-test them.

I mean, yeah, it should be tested, but a code change that propagates through the rest of the repository is a sign of really poor abstraction design. At that point, you might as well have one class called tester that holds pointers to everything and is about 100K lines long.

This test gives an extreme advantage to lazy functional languages that they don't enjoy in any realistic context.

Trying to come up with a single good test turns out to be surprisingly hard. In modern languages I'm looking for the ability to write clean functional code, stateful object code, and async code, plus a solid module system and a type system that doesn't get in my way. I really can't imagine a single 5-line program that would exercise all of those areas.

