A developer who uses a narrower integer type, or one who wants to make clear that the argument is a member of an enumerated set of options rather than an arbitrary integer. What kind of developer represents non-integer data as an integer type?
> I think random testing is a good way to get almost exhaustive tests.
I think that testing known/predictable edge and corner cases makes more sense (in some cases), but I also think that I was responding to a comment about the impossibility of using tests to prove correctness of a function, not about how to derive practical benefits despite theoretical limitations.
Testing / verification is the hardest problem in computing. I don’t think the author actually understands why that is; he just has a few superficial gripes with how codebases are tested. That’s not to say that most test suites are actually good (they’re not). That criticism may be fair.
Static types do not help at all with software quality, beyond catching some typos. You can never, ever write a static type that enforces anything of substance, because for that to happen you’d need to execute arbitrary code within the compiler (as happens with dependent types, which bring the constant fear of undecidable or infinite type checking).
The classic example is that there’s an infinite number of functions with type int -> int. The logic of these functions is extremely varied, and the type doesn’t help you at all. Enum types are very useful for knowing what values you are allowed to write, but they don’t help with correctness.
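To make the point concrete, here's a minimal sketch (the function names are invented for illustration): three functions sharing the type int -> int that a type checker cannot tell apart, but a value-level test can.

```python
# Three functions with the identical type signature int -> int.
# A type checker accepts any of them wherever an (int -> int) is
# expected; only checking actual values distinguishes their behavior.

def double(x: int) -> int:
    return x * 2

def square(x: int) -> int:
    return x * x

def always_zero(x: int) -> int:
    return 0

# Value-level checks tell them apart where the type cannot:
assert double(3) == 6
assert square(3) == 9
assert always_zero(3) == 0
```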
The author brings up large input spaces, but conveniently leaves out input space partitioning, which any non-novice tester has to employ in their test case management. The category partition method is a fantastic way of covering large swaths of a seemingly infinite input space with relatively few tests. Of course there’s combinatorial explosion, but that can be mitigated with constraints on the input combinations.
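As a sketch of the category-partition idea (the `shipping_cost` function and its categories are hypothetical, chosen only to illustrate): each parameter is reduced to a few representative categories, combinations are enumerated, and a constraint prunes invalid ones to contain the combinatorial explosion.

```python
# Category-partition sketch: partition each input into representative
# categories, enumerate the combinations, and prune with constraints.
from itertools import product

def shipping_cost(weight_kg, express, international):
    # hypothetical function under test
    cost = 5.0 + 2.0 * weight_kg
    if express:
        cost *= 2
    if international:
        cost += 10.0
    return cost

weights = [0.1, 5.0, 50.0]        # categories: light / typical / heavy
express_opts = [False, True]
intl_opts = [False, True]

cases = [
    (w, e, i)
    for w, e, i in product(weights, express_opts, intl_opts)
    # constraint (assumed): express shipping is not offered internationally
    if not (e and i)
]

for w, e, i in cases:
    assert shipping_cost(w, e, i) > 0   # every valid combination costs something

# 3 * 2 * 2 = 12 raw combinations, reduced to 9 by the constraint
assert len(cases) == 9
```

A handful of cases covers the whole partitioned space, rather than sampling blindly from the billions of raw input values.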
I’m not saying testing is perfect - it’s not. But, the burden of proof is on the critic of testing, not testing itself. How else can you reliably change a large software product dozens of times a week, for 20 years in a row? And what’s the alternative when I see products that large teams produce break weekly?
This is not something that “ship it and iterate” has had any meaningful impact on. Eventually you get people who don’t intimately know the whole codebase, and make a breaking change unintentionally. What is the alternative to some kind of testing?
I do not want static types or unit tests. I want integration tests, system tests, and regression tests, none of which amount to trivial checks of correctness. Whether an integer is stored as 4 bytes or 8 bytes, big-endian or little-endian, should be up to the computer to decide. A test verifying that the right kind of integer is sent to a function is something I shouldn't, and don't want to, be spending time on. The language should be competent enough to figure it out, and if it needs to optimize, it should do so dynamically when such optimization is possible.
What I find problematic is that system and integration tests are generally relegated to outside the programming process. It instead becomes the administrator's job to write an extensive test system to verify that the program actually does the job it was written for. You start to wish for a very different set of tools when the programmer and the administrator are the same person.
But more commonly what is going to happen is that someone (two years from now) is going to change a 'person' parameter from legacy SSN to database id and some users are going to get "you don't exist" when they show up at the hospital to get medical service.
You're right though, most folks in the JS/Ruby/Python ecosystems don't do this kind of testing. It's a recurring joke:
I wasn't trying to catch the error of "program author is a moron who doesn't know the difference between Fibonacci and factorial". Were I trying to catch that error I would have been aware of it, and then much less likely to write the bug in the first place. This is a truism that is well accepted by testing proponents: which tests you write are incredibly important, and you need to write your tests first in order to avoid a curve fitting problem (so to speak). Any non-trivial test would have shown my function to be very broken, what type would you have used to represent that so it wouldn't compile?
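A sketch of the test-first point: values in the table below come from the Fibonacci definition, not from the code under test, so a "factorial instead of Fibonacci" mix-up (a deliberately buggy implementation here) fails immediately.

```python
def fib(n: int) -> int:
    # Deliberately buggy: this computes factorial, not Fibonacci,
    # mimicking the "wrong algorithm entirely" class of mistake.
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

# Expected values taken from the Fibonacci definition itself:
expected = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8}

failures = [n for n, want in expected.items() if fib(n) != want]
# Nearly every case exposes the bug; no type signature could.
assert failures == [0, 2, 3, 4, 5, 6]
```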
I love this idea, it's quite interesting. On the flip side, I made a change in some software at my work the other day to replace the use of randomness in a unit test. The random values were being used to populate some data. This language would make it incredibly difficult to recreate bugs, or even to test software at all.
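One common mitigation for that reproducibility problem, sketched below with a hypothetical data generator: record the RNG seed, so a failing randomized run can be replayed exactly.

```python
import random

def make_test_data(seed=None):
    """Generate random test data; return the seed so the run is replayable."""
    if seed is None:
        seed = random.randrange(2**32)
    rng = random.Random(seed)            # isolated, seeded generator
    data = [rng.randint(-1000, 1000) for _ in range(10)]
    return seed, data

seed, data = make_test_data()
# On a test failure, log the seed; feeding it back reproduces the run:
_, replay = make_test_data(seed)
assert replay == data
```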
You’re flip-flopping all over the place with your descriptions of tests. Broadly speaking, there are two types of tests being discussed here: positive tests and negative tests.
Positive tests check that the code does what you expect under normal operation. Those you’d obviously need in any language.
Negative tests are, amongst other things, checking that your functions behave correctly when you put garbage in. If your compiler flat out refuses to send a string type to a function that is designed to accept an integer then you’re already covering a whole class of bugs without needing to write specific tests for it.
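In a dynamic language, the guarantee a compiler gives for free has to be written as an explicit negative test. A minimal sketch (the `days_in_month` function is invented for illustration):

```python
def days_in_month(month: int) -> int:
    # In a dynamic language, the "must be an int in range" guarantee
    # is enforced at runtime rather than by the compiler.
    if not isinstance(month, int) or not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month!r}")
    return [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1]

# Positive test: normal operation.
assert days_in_month(2) == 28

# Negative tests: garbage in should fail loudly, not silently.
for garbage in ("2", 0, 13, None):
    try:
        days_in_month(garbage)
    except ValueError:
        pass
    else:
        raise AssertionError(f"accepted garbage input: {garbage!r}")
```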
This article is an anti-resume, and I recommend not reading it. I'll take a look at two claims.
> Unit tests are unlikely to test more than one trillionth of the functionality of any given method in a reasonable testing cycle. ... (Trillion is not used rhetorically here, but is based on the different possible states given that the average object size is four words, and the conservative estimate that you are using 16-bit words.)
An int may contain "four billion states", but for the requirements, it's highly likely that we can classify the integer into three states: less than zero, zero, greater than zero. As a bank, I might not care how much money you have, only whether you have more than zero. In a transaction, I don't care how much money changes hands, as long as no money is lost.
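A sketch of that collapse from billions of states to three equivalence classes, using a hypothetical `can_withdraw` check: one representative per class, plus the boundary, is enough.

```python
def can_withdraw(balance_cents: int) -> bool:
    # Hypothetical bank rule: withdrawal allowed only with a positive balance.
    return balance_cents > 0

# The 2**32 possible int values collapse into three equivalence classes;
# one representative per class (plus the boundary and an extreme) suffices:
assert can_withdraw(-1) is False          # negative class
assert can_withdraw(0) is False           # zero boundary
assert can_withdraw(1) is True            # positive class
assert can_withdraw(2**31 - 1) is True    # extreme of the positive class
```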
Pointing at memory-as-bits, as if we're still using punch cards, and then hand waving "I can't possibly test this", ignores sixty years of progress. The refusal to imagine a class with range checking is a damning statement about the author's own ability as an engineer.
> Programmers have a tacit belief that they can think more clearly (or guess better) when writing tests [than] when writing code, or that somehow there is more information in a test than in code.
Consider writing a sorting algorithm vs. testing a sorting algorithm. Would you feel more confident writing the test for a sorting algorithm than writing the algorithm itself? The test is simple: is every item in the list less than or equal to the next item? The code is far more complex. We're in the same realm as NP problems: I can easily write a test to verify that a graph is correctly 3-colored, but writing the code to produce the coloring is rather harder.
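The asymmetry can be sketched directly: the checker is a one-liner, while the sorter (an insertion sort here, as one arbitrary choice) is a full algorithm.

```python
import random

def is_sorted(xs):
    # The whole test: every item is <= its successor (<= allows duplicates).
    return all(a <= b for a, b in zip(xs, xs[1:]))

def insertion_sort(xs):
    # The code under test is an entire algorithm, far more intricate
    # than the one-line property that verifies it.
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out

rng = random.Random(42)
for _ in range(100):
    data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
    result = insertion_sort(data)
    assert is_sorted(result)
    assert result == sorted(data)   # cross-check against the reference sort
```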
Perhaps, then, the author's experience of other developers believing they can "think more clearly" is actually his observation that the developers are solving simpler problems, and are thus more confident. And that is the point of tests: it is easier to verify than solve.
In short, every conclusion in this article begs the question, "Might there be another explanation?"
Hm, this is pretty cool. I really like how this generates test cases by code inspection.
However, some of the puzzles are really obscure. For example, I'm talking about the one that takes an int x as a parameter, with tests like 0 -> 0, 1 -> 0, and 769 -> some really arbitrary number. Those are just not fun.
I can somewhat understand, because this is kind of the goal of property based testing—the actual values themselves matter so little to the test that you’re willing to subject those inputs to randomness
That said, this doesn’t sound like a very good way to pull that off, because the developer has no control over that randomness, precisely where control is greatly needed.
Or, you know, write property-based tests instead, so you only need to worry about the logic and not the test values.
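A hand-rolled sketch of the property-based idea (real frameworks such as Hypothesis add shrinking and edge-case injection on top of this): the test states a property, here a round-trip law, and the input values are generated rather than chosen.

```python
import random

def encode(s: str) -> bytes:
    return s.encode("utf-8")

def decode(b: bytes) -> str:
    return b.decode("utf-8")

# Property: decoding an encoded string returns the original string.
# The concrete values are irrelevant to the logic, so they are generated.
rng = random.Random(0)
for _ in range(200):
    length = rng.randint(0, 30)
    # code points below the surrogate range, so every string is valid UTF-8
    s = "".join(chr(rng.randint(32, 0x10FF)) for _ in range(length))
    assert decode(encode(s)) == s     # round-trip property
```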
I've always found that if you let the author of a piece of code decide what values that code should be tested with, they'll test the edge cases they've thought of (and dealt with), not the ones that actually crash production.
I find it odd that there was no discussion about figuring out edge cases and testing specifically for those instead of random values. Especially for functions with large domains, a few hand-picked test values can give much more confidence than thousands of random ones.
Additionally, with generated test values the validation logic becomes complex, and in complexity lies potential for bugs. If your testing code is more complex than the code being tested, do you really gain that much confidence in its correctness? With hard-coded test values you can also hard-code the expected results, which can be obtained from some reference or just checked manually for correctness.
Of course, ideally you'd have both random fuzzing and more manually defined tests, but that sets quite a high bar.
In Java, if my function doesn't modify the int, there's no reason to test the boundaries, other than 0 if the int is used in a calculation. The type system doesn't solely determine what tests to write. Tests are inferred from a combination of the type system, timings, statements, and variable usage.
> You do it by checking that foo() actually works and provide the correct result.
But you're testing a tiny subset of possible scenarios of behaviour of the function. If you can anticipate all possible input types and value ranges, maybe unit tests are enough, but that's not realistic in a dynamic language/program where complex values are non-deterministically generated at runtime - based on user input, db, file, etc.
That's par for the course if you use an actual framework for Property-based Testing. All the ones I've tried certainly do mix in the usual suspects (-1, 0, 1, MAX_INT, empty list, etc.) into the randomly sampled data.
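The "usual suspects" trick can be sketched in a few lines (a simplified stand-in for what PBT frameworks do internally): yield the known boundary values deterministically first, then fall back to random sampling.

```python
import random

# Deterministic boundary values, injected before any random sampling.
EDGE_CASES = [-1, 0, 1, 2**31 - 1, -2**31]

def int_samples(n, seed=0):
    """Yield n ints: the known edge cases first, then random fill."""
    rng = random.Random(seed)
    yield from EDGE_CASES
    for _ in range(n - len(EDGE_CASES)):
        yield rng.randint(-2**31, 2**31 - 1)

samples = list(int_samples(100))
assert samples[:5] == EDGE_CASES   # edge cases are always covered
assert len(samples) == 100
```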
The author of this piece probably chose to make the post framework-agnostic by rolling their own random generation, but left that nuance out.
If you’re not using a language that can properly support algebraic structures and randomized property-based testing you’re essentially getting no guarantees about your code from tests. You wrote the code, you wrote the tests, they’re equally likely to be incorrect.
Their point is that the test case provided would lead the developer to believe that the implementation is a correct identity function, when in fact it isn't.
Just clarifying, I do not take this as a good reason to not write tests.
> Write a property based test, ie generate a bunch of random inputs, then assert that all of them are within some (loose) margin of error.
So someone comes along later, looks at your test, and wonders: why did you go through all that trouble? You can definitely write tests for a lot of that stuff, but they still don't fluently communicate the why of your choices.