Most programming languages are not expressive enough to reliably eliminate redundant overflow checks (or overflows); afaik doing so in a general way requires something in the ballpark of refinement/dependent types.
> Checking overflow for addition on the other hand is something that is very seldom used
Arithmetic on numbers larger than your word size. Poster child: crypto. It's 2023 and crypto is not rare. This post cannot get from me to you without crypto in the datapath.
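To make that concrete, here's a rough Rust sketch (the name add_256 and the fixed 4-limb width are mine, not from any particular library) of the add-with-carry loop that bignum and crypto libraries run constantly. The per-limb wraparound is intentional, and the carry is recovered explicitly:

    // Add two 4-limb (256-bit) numbers. Each 64-bit limb addition is allowed
    // to wrap; the carry-out is captured and propagated by hand.
    fn add_256(a: [u64; 4], b: [u64; 4]) -> ([u64; 4], bool) {
        let mut out = [0u64; 4];
        let mut carry = false;
        for i in 0..4 {
            let (s1, c1) = a[i].overflowing_add(b[i]);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            out[i] = s2;
            carry = c1 || c2;
        }
        (out, carry) // final carry = overflow of the full 256-bit sum
    }

    fn main() {
        let max = [u64::MAX; 4]; // 2^256 - 1
        let one = [1, 0, 0, 0];
        assert_eq!(add_256(max, one), ([0; 4], true)); // wraps, with carry out
    }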
We don't need arbitrary-sized integers; we need exceptions on overflow (or underflow, but I'll stick to overflow for the rest of this post) to be the default, or similar language features as appropriate.
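Rust is probably the closest mainstream example of this default: overflow panics in debug builds, and release builds can opt in with overflow-checks = true in Cargo.toml. A minimal illustration (the parse is just to keep the compiler from spotting the overflow at compile time):

    fn main() {
        let x: i32 = "2147483647".parse().unwrap(); // i32::MAX, opaque to the compiler
        // Debug build: panics with "attempt to add with overflow".
        // Release build (default profile): wraps silently to i32::MIN.
        println!("{}", x + 1);
    }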
I for one am tired of the chicken-and-egg problem of "CPUs don't support efficient overflow checking because nobody uses it, so it's slow" and "overflow checking is slow because the CPU doesn't support it, so nobody uses it". For all the other good security work done in both software and hardware, much of it for things far more complex than this, this seems like an absolutely batshit insane oversight considering the cost/benefit of fixing it.
> Chasing safety features which result in difficult to reason about semantics will inevitably lead to low language adoption.
Python solves this problem by not having integer overflow at all; instead, all integers are arbitrary-precision. It's still one of the most widely used languages in the world.
There are also functions which people use to specify a particular behaviour regardless of debug or release mode, for cases where handling overflow properly is particularly important - e.g. cryptography-related code.
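Concretely, in Rust those are the explicit arithmetic methods, which behave the same in every build mode:

    fn main() {
        let a: u32 = u32::MAX;
        assert_eq!(a.wrapping_add(1), 0);            // wraps, in every build mode
        assert_eq!(a.checked_add(1), None);          // Option instead of overflow
        assert_eq!(a.saturating_add(1), u32::MAX);   // clamps at the boundary
        assert_eq!(a.overflowing_add(1), (0, true)); // result plus an overflow flag
    }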
Personally, I'd like to see more programming done in languages that simply don't allow integer overflow in the first place. Many languages already have arbitrary-precision integers; well-implemented arbitrary-precision integers are quite efficient when the value fits in a machine word, and as efficient as possible when it's larger. Sure, you lose a bit of performance to overflow checks, but those checks need to exist anyway, and at that point it seems preferable for arithmetic to Just Work rather than fail.
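A rough sketch of why the word-sized case stays cheap - this is the usual small-int optimization, with a plain i128 standing in for a real heap-allocated bignum:

    // Stay on the machine-word fast path; promote only when an add overflows.
    enum Int {
        Small(i64),
        Big(i128), // stand-in for a real bignum in this sketch
    }

    fn add(a: Int, b: Int) -> Int {
        match (a, b) {
            (Int::Small(x), Int::Small(y)) => match x.checked_add(y) {
                Some(s) => Int::Small(s),                // common case: one add, one branch
                None => Int::Big(x as i128 + y as i128), // rare case: promote
            },
            (a, b) => Int::Big(wide(a) + wide(b)), // slow path (a real bignum can't overflow here)
        }
    }

    fn wide(v: Int) -> i128 {
        match v {
            Int::Small(x) => x as i128,
            Int::Big(x) => x,
        }
    }

    fn main() {
        match add(Int::Small(i64::MAX), Int::Small(1)) {
            Int::Big(v) => assert_eq!(v, i64::MAX as i128 + 1),
            Int::Small(_) => unreachable!(),
        }
    }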
The problem with overflow isn't just needing to store numbers larger than 2 billion. Sometimes, intermediate values are larger than that even if the final result isn't.
Take averaging as a very simple example. Doing (a + b) / 2 will overflow if a and b are sufficiently large, even if the average will always fit in 32 bits. Things like this go unseen for years.
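A small Rust sketch of the bug and the standard fix:

    fn main() {
        let (a, b): (i32, i32) = (2_000_000_000, 2_100_000_000);
        // Broken: a + b overflows i32 even though the average fits.
        // let avg = (a + b) / 2; // panics in debug, silently wrong in release
        // Standard fix: keep every intermediate within the inputs' range
        // (valid here because 0 <= a <= b; mixed signs need more care).
        let avg = a + (b - a) / 2;
        assert_eq!(avg, 2_050_000_000);
    }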
The fact that many languages don't check for overflow by default really saddens me. Integer overflow is the cause of so many bugs (many of them user-facing: http://www.reddit.com/r/softwaregore/), and yet people keep making new languages that don't check it. They check buffer overruns, they check array bounds, and yet not integer overflow. Why? The supposed performance penalty.
The reckless removal of safety checks in the pursuit of performance would be considered alarming were it not commonplace.
(Disclaimer: I really, really care about integer overflow for some odd reason, going so far as to work through the entire PHP codebase to add big integer support...)
Come on guys, you're getting hung up on the wrong aspect here. I'm starting with precisely the operations the OP mentions as requiring wraparound overflow support: computing digests, hashes, random numbers, things like that. Currently we use unsigned numbers when we require overflow, but things are still extremely error-prone as the OP shows. So my thought was to come up with a new dichotomy: instead of signed vs unsigned, just ask for whether you want wrap-around overflow or not. Pretty please can we just forget I ever used the word 'nominal'?
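Rust's standard library already encodes more or less this dichotomy: the plain integer types treat overflow as an error (at least in checked builds), while the Wrapping<T> newtype makes wraparound part of the type's contract rather than a property of signedness:

    use std::num::Wrapping;

    fn main() {
        let x = Wrapping(u32::MAX);
        let y = x + Wrapping(1u32);
        assert_eq!(y.0, 0); // wraparound is the documented contract, not an accident
    }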
If you think adding two phone numbers can have no possible useful application, what do you say to taking strings of text, converting them to numbers, and repeatedly folding them over each other to compute a cryptographic digest? You're right that it is meaningless in the context of the original domain, but it clearly has application. The two are distinct ideas.
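As a concrete (if non-cryptographic) stand-in, here's FNV-1a in Rust: it folds text bytes into a number with deliberately wrapping multiplication, arithmetic that is meaningless in the text's own domain and useful anyway:

    // FNV-1a: xor each byte in, then multiply by the FNV prime, wrapping on purpose.
    fn fnv1a(data: &[u8]) -> u32 {
        let mut h: u32 = 0x811c_9dc5; // FNV offset basis
        for &b in data {
            h ^= b as u32;
            h = h.wrapping_mul(16_777_619); // FNV prime; wraparound is intended
        }
        h
    }

    fn main() {
        println!("{:08x}", fnv1a(b"555-0100")); // even a phone number hashes fine
    }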
I totally disagree. Overflow almost certainly represents a bug and the violation of various invariants. Your code is doing something, even if it isn't giving radiation doses. Computing the wrong data and storing it can cause problems for customers, modify data you didn't plan on touching, or render data entirely inaccessible if various encryption operations are performed incorrectly.
Just like most things are not the Therac-25, most things are also not safety-critical when they crash. A web service that crashes just returns a 500, and you see it on whatever stability dashboard you use.
Overflow is fine if you're aware of it and have code that either doesn't need to care about it, or can work around it.
Consider protocols with wrapping sequence numbers. Pretty common in the wild. If I increment my 32-bit sequence counter and it goes from 2^32-1 back to 0, that's likely just expected behavior.
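Such code usually compares sequence numbers with wrapping subtraction, in the spirit of RFC 1982 serial number arithmetic. A Rust sketch (the function name is mine):

    // "Is b after a?" for 32-bit sequence numbers that wrap. Correct as long
    // as the two numbers are less than 2^31 apart.
    fn seq_after(a: u32, b: u32) -> bool {
        let d = b.wrapping_sub(a);
        d != 0 && d < (1u32 << 31)
    }

    fn main() {
        let seq: u32 = u32::MAX;        // 2^32 - 1
        let next = seq.wrapping_add(1); // wraps to 0: expected behavior here
        assert!(seq_after(seq, next));  // 0 still compares as "after" 2^32 - 1
    }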
I don't know if I agree. Overflow is like uninitialized memory: it's a bug almost 100% of the time, and the cases where it is tolerated or intended are the exception.
I'd rather have a special type with defined behavior. That's actually what a lot of shops do anyway, and there are some niche compilers that support types with defined overflow (ADI's fractional types in their Blackfin toolchain, for example). It's just annoying to do in C; this is one of those cases where operator overloading in C++ is really beneficial.
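The same idea ports to Rust, where a newtype plus an Add impl gives you a type whose overflow behavior is part of its definition. Here it saturates, as a rough stand-in for DSP-style fractional types that clamp instead of wrapping:

    use std::ops::Add;

    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Sat32(i32);

    impl Add for Sat32 {
        type Output = Sat32;
        fn add(self, rhs: Sat32) -> Sat32 {
            // Addition on this type clamps at the representable limits.
            Sat32(self.0.saturating_add(rhs.0))
        }
    }

    fn main() {
        assert_eq!(Sat32(i32::MAX) + Sat32(1), Sat32(i32::MAX)); // clamps, by definition
    }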
From the OP, examples of intentional overflow are "..hashing, cryptography, random number generation, and finding the largest representable value for a type."
The problem is that these sorts of things are trivia. They are things sufficiently smart tooling should handle for us, so people can spend more time building higher-level constructs and less time worrying about individual bits.
A smart compiler should have caught (a + b) / 2 and fixed it to be correct; there's no way the overflow is what the programmer wanted.
It's annoying that negation, abs(), and division can overflow in two's complement. But the way I look at it: lots of operations can already overflow; that's just a fact of signed integers, and you need to guard against overflow in portable code anyway. It doesn't seem fundamentally worse that these extra operations can overflow too.
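Concretely, all of those corner cases come down to i32::MIN having no positive counterpart in two's complement:

    fn main() {
        let m = i32::MIN;
        assert_eq!(m.checked_neg(), None);   // -i32::MIN (2^31) doesn't fit in i32
        assert_eq!(m.checked_abs(), None);   // same value, same problem
        assert_eq!(m.checked_div(-1), None); // the one division that can overflow
        assert_eq!(m.checked_add(-1), None); // and plain addition already could anyway
    }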