This is reasoning backwards in a misleading way. The point is not changing the fuzzing setup to find this specific bug that we now know, with hindsight, was there. There are a zillion paths, and you would need to ensure that fuzzing reaches all vulnerable code with values that trigger all the vulnerable dynamic behaviours.
I wonder if one possible solution is making things more "the Unix way" or like microservices. Then instead of depending on some super specific inputs to reach deep into some code branch, you can just send input directly to that piece and fuzz it. Even if fuzzers only catch shallow bugs, if everything is spread out enough then each part will be simple and shallow.
Fuzzers can already do this. When you set up a fuzzer, you specify which functions it will call and how it should generate inputs for them. So you can fuzz the X.509 parsing code and hope it hits the punycode parsing paths, but you can also fuzz the punycode parsing routines directly.
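For example, a minimal libFuzzer-style harness that hands fuzzer input straight to a punycode decoder could look like this; the decode_punycode name and signature are hypothetical stand-ins for whatever routine your library actually exposes:

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical decoder under test. */
    int decode_punycode(const char *in, size_t in_len,
                        uint32_t *out, size_t *out_len);

    /* libFuzzer entry point: feed the raw fuzzer bytes directly to the
     * decoder, bypassing the X.509 layers that would otherwise gate it. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        uint32_t out[256];
        size_t out_len = 256;
        decode_punycode((const char *)data, size, out, &out_len);
        return 0;
    }

Build it with clang -fsanitize=fuzzer,address and the fuzzer hammers that one routine directly, no deep input needed to reach it.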
This is the flip side of fuzzing: the approach called property testing. It's legit, but it involves unit-test-style manual creation of lots of tests for the various components of the system, plus a lot of specification of the contracts between components and aligning the property tests to those contracts.
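A sketch of the idea, driving a round-trip property from a fuzzer; the encode_punycode/decode_punycode names and signatures are made up for illustration:

    #include <assert.h>
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical codec under test. */
    int encode_punycode(const uint32_t *in, size_t n,
                        char *out, size_t *out_len);
    int decode_punycode(const char *in, size_t in_len,
                        uint32_t *out, size_t *out_len);

    /* Property (the contract between the two components): anything that
     * encodes successfully must decode back to the original code points. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        size_t n = size / 4;
        if (n == 0 || n > 64)
            return 0;
        uint32_t in[64];
        memcpy(in, data, n * 4);
        char enc[512];
        size_t enc_len = sizeof(enc);
        if (encode_punycode(in, n, enc, &enc_len) != 0)
            return 0; /* not every input is encodable; skip those */
        uint32_t dec[64];
        size_t dec_len = 64;
        assert(decode_punycode(enc, enc_len, dec, &dec_len) == 0);
        assert(dec_len == n && memcmp(dec, in, n * 4) == 0);
        return 0;
    }

The manual work is exactly what you describe: writing a check like this per component, and deciding what the contract actually is.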
It's not backwards: you run the fuzzer, you look at the code coverage, and you compare that against what you expect to be tested. Then you update the fuzzing harness to allow it to find missing code paths.
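Concretely, with clang/libFuzzer and the LLVM coverage tooling, that loop looks something like this (stock flags and default file names; adjust for your build):

    # build the harness with coverage instrumentation
    clang -g -fsanitize=fuzzer,address -fprofile-instr-generate \
        -fcoverage-mapping harness.c -o harness
    ./harness corpus/ -runs=0          # replay the corpus, no new fuzzing
    llvm-profdata merge -sparse default.profraw -o harness.profdata
    llvm-cov report ./harness -instr-profile=harness.profdata

Anything the report shows as uncovered is a candidate for a new harness or smarter input generation.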
It's far more doable than you are suggesting: fuzzing automatically covers most branches anyway, so you just need to manually deal with the exceptions (which are easy to locate from the code coverage).
I used fuzzing to test an implementation of Raft, and with only a little help, the fuzzer was able to execute every major code path, including dynamic cluster membership changes, network failures, and delays. The Raft safety invariants are checked after each step. Does this guarantee there are no bugs? Of course not. It did, however, find some very hard-to-reproduce issues that would never have been caught by manual testing. And this is with a project not even particularly well suited to fuzzing! A parser is the dream scenario for a fuzzer; you just have to actually run it...
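For anyone curious about the shape of such a harness: the trick is to treat the fuzzer's byte stream as a script of cluster events and assert the safety invariants after every step. All of the cluster_* and check_* names below are hypothetical scaffolding, not a real library:

    #include <assert.h>
    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical simulated cluster and invariant checks. */
    struct cluster;
    struct cluster *cluster_new(int nodes);
    void cluster_free(struct cluster *c);
    void cluster_deliver_message(struct cluster *c); /* deliver one in-flight msg */
    void cluster_drop_message(struct cluster *c);    /* simulate network failure */
    void cluster_tick(struct cluster *c);            /* advance timers/elections */
    void cluster_client_request(struct cluster *c);  /* propose a new log entry */
    int check_election_safety(const struct cluster *c); /* <= 1 leader per term */
    int check_log_matching(const struct cluster *c);    /* logs agree up to index */

    /* Each input byte picks the next event, so the fuzzer explores event
     * orderings; invariants are checked after every step, so any violation
     * pinpoints the exact offending trace. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        struct cluster *c = cluster_new(5);
        for (size_t i = 0; i < size; i++) {
            switch (data[i] % 4) {
            case 0: cluster_deliver_message(c); break;
            case 1: cluster_drop_message(c);    break;
            case 2: cluster_tick(c);            break;
            case 3: cluster_client_request(c);  break;
            }
            assert(check_election_safety(c));
            assert(check_log_matching(c));
        }
        cluster_free(c);
        return 0;
    }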
Yep, code coverage can tell you when code is definitely entirely untested, but it doesn't tell you that you've covered enough of the input space to have high assurance that there aren't vulnerabilities.
Coverage might have helped here (or not), but it doesn't fix the general problem of fuzzing being stochastic and only testing some behaviours of the covered code.