
This is possible, but unless it's accelerated by the processor it comes at a great cost. You'd need a branch after every arithmetic operation, unless the compiler could prove the operation can't overflow.

The ARM mode described elsewhere in the thread where there's an overflow flag that persists across operations would help; then you could do a check less frequently.

A mode where you get a processor exception would be great, if adding it doesn't add significant cost to operations that don't overflow. Assuming the expected response to such an exception is a core dump, the cost of raising it can be high without mattering; of course, if someone builds their bignum library around catching the exception, that won't be great.
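
For illustration, a rough sketch in C of what that per-operation cost looks like, assuming the GCC/Clang __builtin_add_overflow intrinsic (which compiles to the add plus a branch on the flag):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Every arithmetic step gets its own check-and-branch. */
    int64_t checked_sum3(int64_t a, int64_t b, int64_t c)
    {
        int64_t t, r;
        if (__builtin_add_overflow(a, b, &t))  /* branch #1 */
            abort();
        if (__builtin_add_overflow(t, c, &r))  /* branch #2 */
            abort();
        return r;
    }

    int main(void)
    {
        printf("%lld\n", (long long)checked_sum3(1, 2, 3));
        return 0;
    }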




I have to ask again: why isn't there better hardware support for overflows on x86? Mainframe CPUs do have an option to raise an exception condition on arithmetic overflow.

ARM actually kind of has that. The status register has an overflow flag that is not cleared by subsequent operations (sticky overflow). So instead of triggering an exception, which is prohibitively expensive, you can do a series of calculations and check afterwards whether an overflow happened in any of them. A bit like NaN for floating point. From what I understand the flag alone is still costly, so we will have to see if it survives.
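
You can approximate that pattern in software today; a hedged sketch in C, assuming the GCC/Clang __builtin_add_overflow intrinsic: OR the per-operation overflow results into one "sticky" flag and test it once after the loop.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Sum an array, OR-ing the per-op overflow bits into one "sticky" flag
       and testing it only once at the end. */
    bool sum_i32(const int32_t *v, size_t n, int32_t *out)
    {
        int32_t acc = 0;
        bool ovf = false;
        for (size_t i = 0; i < n; i++)
            ovf |= __builtin_add_overflow(acc, v[i], &acc);
        *out = acc;
        return !ovf;   /* true = valid result, false = something overflowed */
    }

    int main(void)
    {
        int32_t v[] = { 2000000000, 2000000000, 1 };
        int32_t s;
        bool ok = sum_i32(v, 3, &s);
        printf(ok ? "ok: %d\n" : "overflowed (partial result %d)\n", s);
        return 0;
    }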

Not really. Every processor has an Overflow Flag [1] that can be used to check for overflows in arithmetic operations.

Most programming languages don't have a way to explicitly act on the contents of that flag, since exposing it would complicate optimisations and program flow, but it's still a pity that this has caused so many bugs.

[1] https://en.wikipedia.org/wiki/Overflow_flag


I wish x86 had an easy way to accumulate overflow flags: you could compile entire basic blocks as if they were using native integers, do a single check at the end, and, if needed, roll back the computation.

The yielding part is harder. You need to have infrastructure in place to dynamically flush certain operations when unexpected yields happen.
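
To illustrate the flag-accumulation part (a rough sketch in C, assuming GCC/Clang builtins): run the whole block on native ints, keep one accumulated flag, and only if it is set "roll back" by redoing the block on a wider type.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Fast path: whole "basic block" in 32-bit with a single deferred check.
       Slow path: redo the same computation in 64-bit if anything overflowed. */
    int64_t dot3(const int32_t a[3], const int32_t b[3])
    {
        int32_t acc = 0;
        bool ovf = false;
        for (int i = 0; i < 3; i++) {
            int32_t p;
            ovf |= __builtin_mul_overflow(a[i], b[i], &p);
            ovf |= __builtin_add_overflow(acc, p, &acc);
        }
        if (!ovf)
            return acc;                       /* single check at the end */

        int64_t wide = 0;                     /* "roll back" and redo wider */
        for (int i = 0; i < 3; i++)
            wide += (int64_t)a[i] * b[i];
        return wide;
    }

    int main(void)
    {
        int32_t a[3] = { 100000, 200000, 300000 };
        int32_t b[3] = { 100000, 200000, 300000 };
        printf("%lld\n", (long long)dot3(a, b));
        return 0;
    }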


Python does just that for integers (since it automatically casts up to bigints instead of overflowing), as does Swift, and that's arguably fairly low-level/"compiled"! It doesn't strike me as an absurd idea.

Compilers could optimize for common cases where overflows are guaranteed to not happen, or perform the "isn't b100..000" check at the entrance to an inner loop and defer to the hardware's native overflow detection capabilities (if any).


Modern CPUs don't like traps ("exceptions"). The exception causes a pipeline flush which kills performance for math-intensive code.

For example, detecting integer overflow on x86 and x86_64 CPUs is easy: check the overflow flag after every arithmetic operation. It would only be slightly more difficult to detect overflow for SSE (vector) operations, which would require doing some bit masking and shifting.

For a language such as Swift, building it in is simple.


Which, on x86, one could trap on with the legacy INTO instruction. That is one of those legacy features people complain about because they aren't used by C. I'm not sure it is as easy on ARM etc., because IIRC there isn't an integer overflow exception.

For whatever reason most newer languages (say Rust) don't actually solve this problem by default either. They could diverge from the norm and use saturating ints (which ARM does have), or throw exceptions on overflow/underflow, but they don't, because that would be too hard when they have to manually check for overflow on each operation, since it's not a common feature of many processor architectures.

edit: although LEA is one of the instructions that don't update the flags, so even if you wanted to trap, it probably wouldn't.
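
For what it's worth, saturating behaviour is easy to layer on top of an overflow check; a hedged sketch in C using the GCC/Clang builtin (ARM's QADD does the same thing in one instruction):

    #include <stdint.h>
    #include <stdio.h>

    /* Saturating 32-bit add: clamp to INT32_MAX / INT32_MIN on overflow
       instead of wrapping. */
    int32_t sat_add_i32(int32_t a, int32_t b)
    {
        int32_t r;
        if (!__builtin_add_overflow(a, b, &r))
            return r;
        return a > 0 ? INT32_MAX : INT32_MIN;  /* direction of the overflow */
    }

    int main(void)
    {
        printf("%d\n", sat_add_i32(INT32_MAX, 1));   /* clamps, doesn't wrap */
        return 0;
    }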


So it depends on how you want to detect overflow.

Trapping on math exceptions or otherwise pulling an emergency brake doesn't really fit with modern hardware design if you want performance (for one thing, what do you do with the rest of your vector? Do you just live with a pipeline flush?).

Adding targeted checks used sparingly can work, but probably requires a deeper understanding of the underlying numerical analysis of your problem. Generally just checking for NaN or Inf at the end is a better solution as these are in most cases absorbing states.
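
For the floating-point case the deferred check can be as simple as this sketch in C (one isfinite() test after the whole loop, since NaN/Inf propagate):

    #include <math.h>
    #include <stddef.h>
    #include <stdio.h>

    /* No per-iteration checks: NaN and Inf are absorbing, so a single
       test at the end catches anything that blew up along the way. */
    double mean(const double *v, size_t n)
    {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += v[i];
        double m = s / n;
        if (!isfinite(m))          /* one check after the whole loop */
            fprintf(stderr, "numerical problem in mean()\n");
        return m;
    }

    int main(void)
    {
        double v[] = { 1.0, 2.0, INFINITY };   /* an Inf sneaks in */
        printf("%f\n", mean(v, 3));
        return 0;
    }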


One must not forget that on any non-toy CPU, any instruction may generate exceptions, e.g. invalid opcode exceptions or breakpoint exceptions.

Roughly one in every 4-5 instructions is a load or store, which may generate a multitude of exceptions.

Allowing exceptions does not slow down a CPU. However, they create the problem that a CPU must be able to restore the state prior to the exception, so instruction results must not be committed to architectural state before it becomes certain that they could not have generated an exception.

Allowing overflow exceptions on all integer arithmetic instructions would increase the number of instructions that cannot yet be committed at any given time.

This would increase the size of various internal queues, so it would indeed increase the cost of a CPU.

That is why I have explained that overflow exceptions can be avoided while still having zero-overhead overflow checking, by using sticky overflow flags.

On a microcontroller with a target price under 50 cents, which may lack a floating-point unit, the infrastructure to support a flags register may be missing, so it may be argued that it is an additional cost, even if in truth that cost is negligible. Such infrastructure existed in 8-bit CPUs with far fewer than 10 thousand transistors, so arguing that it is too expensive in 32-bit or 64-bit CPUs is BS.

On the other hand, any CPU that includes the floating-point unit must have a status register for the FPU and means of testing and setting its flags, so that infrastructure already exists.

It is enough to allocate some of the unused bits of the FPU status register to the integer overflow flags.
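
That existing infrastructure is exactly the sticky-flag model; C already exposes it through <fenv.h>, e.g. (a sketch; fenv details vary by platform, and glibc wants -lm):

    #include <fenv.h>
    #include <stdio.h>

    /* The FP status flags are already sticky: clear once, run any number of
       operations, test once. */
    int main(void)
    {
        feclearexcept(FE_ALL_EXCEPT);

        volatile double x = 1e308;
        volatile double y = x * 10.0;       /* overflows to +Inf */
        (void)y;

        if (fetestexcept(FE_OVERFLOW))
            printf("some FP operation overflowed\n");
        return 0;
    }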

So, no, there are absolutely no valid arguments that may justify the failure to provide means for overflow checking.

I have no idea why they happened to make this choice, but the reasons are not those stated publicly. All this talk about "costs" is BS made up to justify an already taken decision.

For a didactic CPU, which is what RISC-V was actually designed as, lacking support for overflow checking or for indexed addressing is completely irrelevant. RISC-V is a perfect target for student implementation projects.

The problem appears only when an ISA like RISC-V is taken outside its proper domain of application and forced into industrial or general-purpose applications by managers who have no idea about its real advantages and disadvantages. After that, the design engineers must spend extra effort on workarounds for the ISA's shortcomings.

Moreover, the claim that overflow checking may have any influence upon the parallel execution of instructions is incorrect.

For a sticky overflow bit, the order in which it is updated by instructions does not matter. For an overflow bit that reflects only the last operation, the bit updates must be reordered, but that is also true for absolutely all the registers in a CPU. Even if 4 previous instructions that were executed in parallel had the same destination register, you must ensure that the result stored in the register is the one corresponding to the last instruction in program order. One more bit alongside hundreds of other bits does not matter.


x86 processors already have overflow and carry bits in their flags register to tell when overflow has occurred.

It makes more sense to me to have compiler writers check the flags if they care about overflow, and avoid the slowdown if they don't.


Would you consider adding a built-in way to safely multiply two numbers?

Numeric overflows in things like calculation of buffer sizes can lead to vulnerabilities.

Signed overflow is UB, and due to integer promotion, signedness creeps into unexpected places.

It's not trivial to check if overflow happened due to UB rules. A naive check can make things even worse by "proving" the opposite to the optimizer.

And all of that is to read one bit that CPUs have readily available.
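
For reference, GCC and Clang already expose that bit as a builtin, which is roughly what such a language built-in could wrap; a hedged sketch (alloc_array is just an illustrative name):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    void *alloc_array(size_t count, size_t elem_size)
    {
        size_t bytes;
        /* A naive form such as `if (count * elem_size / elem_size != count)`
           already performs the overflowing multiply; for signed types that
           is UB and the optimizer may delete the check entirely. */
        if (__builtin_mul_overflow(count, elem_size, &bytes))
            return NULL;                       /* size would overflow */
        return malloc(bytes);
    }

    int main(void)
    {
        void *p = alloc_array(SIZE_MAX / 2, 4);
        printf(p ? "allocated\n" : "refused: size overflow\n");
        free(p);
        return 0;
    }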


As you say, detecting the overflow is easy, but efficiently handling it is not. It adds a branch to every single arithmetic operation, and it makes it much harder for the compiler to optimise things, e.g. it is hard to vectorise a loop summing an array if every + has a conditional branch on the overflow flag.

(Also, I believe it introduces a lot of data dependencies, getting in the way of the out-of-order execution of modern CPUs.)


It could be, if the hardware supported it. Consider this quote from that page:

"in some debug configurations overflow is detected and results in a panic"

That's not good enough. We want to always detect it! Many critical bugs are caused by this in production builds too. Solving it at the language level would require inserting branches on every integer operation, which is obviously not acceptable.
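
(For reference, the C equivalent of that debug-only behaviour is a sanitizer: the flags below are real GCC/Clang options, and the cost of enabling them everywhere is exactly the per-operation branch being discussed.)

    /* Build with:  cc -fsanitize=signed-integer-overflow demo.c   (UBSan report)
       or:          gcc -ftrapv demo.c                              (abort on overflow)
       Either way, every signed overflow is checked at run time. */
    #include <stdio.h>

    int main(void)
    {
        volatile int x = 2147483647;   /* volatile: keep the add at run time */
        x = x + 1;                     /* flagged at run time when enabled */
        printf("%d\n", x);
        return 0;
    }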


The processors I've used all have an overflow flag that will tell you if an addition result exceeded the size of the register. But I'm not aware of any compilers that will use the flag, because it adds overhead that isn't wanted or needed 99.99% of the time.

Instead of being stuck with yet another inaccurate number type for integers, I want to see hardware assisted bigints. Something like:

- A value is either a 63 bit signed integer or a pointer to external storage, using one bit to tell which.

- One instruction to take two operands, test if either is a pointer, do arithmetic, and test for overflow. Those cases would jump to a previously configured operation table for software helpers.

- A bit to distinguish bigint from trap-on-overflow, which would differ only in what the software helper does.

- Separate branch predictor for this and normal branches?

I don't know much about CPUs, but this doesn't seem unreasonable, and it could eliminate classes of software errors.
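
In software this is essentially the classic tagged fixnum/bignum scheme; a rough sketch in C of just the representation and the fast path, assuming a 64-bit target (generic_add, bignum_add, and the tagging layout are illustrative names, with the "software helper" left as a stub):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Low bit 0: a 63-bit "fixnum" stored shifted left by one.
       Low bit 1: a pointer to out-of-line bignum storage. */
    typedef uintptr_t value;

    static bool    is_fixnum(value v)       { return (v & 1) == 0; }
    static int64_t fixnum_to_int(value v)   { return (int64_t)v >> 1; }
    static value   int_to_fixnum(int64_t i) { return (value)((uint64_t)i << 1); }

    /* Hypothetical slow path: a real system would dispatch through the
       "operation table for software helpers" described above. */
    static value bignum_add(value a, value b) { (void)a; (void)b; abort(); }

    static value generic_add(value a, value b)
    {
        if (is_fixnum(a) && is_fixnum(b)) {
            /* Two 63-bit operands cannot overflow a 64-bit add; the only
               question is whether the result still fits in 63 bits. */
            int64_t r = fixnum_to_int(a) + fixnum_to_int(b);
            if (r >= INT64_MIN / 2 && r <= INT64_MAX / 2)
                return int_to_fixnum(r);           /* fast path */
        }
        return bignum_add(a, b);                   /* pointer or overflow case */
    }

    int main(void)
    {
        value v = generic_add(int_to_fixnum(40), int_to_fixnum(2));
        printf("%lld\n", (long long)fixnum_to_int(v));   /* 42 */
        return 0;
    }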


The Boost safe_numerics library (C++) has support for this (up to a point), in addition to detecting overflow at compile or run time.

> Overflow detection at the CPU level is almost free.

Yes, detecting the overflow is free, but reacting to it is expensive. If you do care about it, you'll fire off some kind of trap handler or at least do a data-dependent branch, which has a performance hit with pipelining.

It's definitely not something that should be enabled for every integer operation for all languages.


It's that, or have every + risk integer overflow.

It's a small thing, but it's nice to do. Or at least detect overflows and move to bignum in the default numerical implementation. It's not end-of-the-world if you don't, but it helps avoid a lot of bugs...


> With hardware support, it is even fairly cheap to implement, as you just connect the overflow bit of your integer ALUs to something interrupt-triggering.

You can exploit the IEEE-754 inexactness exception with doubles if you're okay with 53 bits of integer precision. Tim Sweeney showed me this trick a long time ago, which he combined with run-time code patching so that an arithmetic operation would be speculatively compiled as either fixnum or bignum based on observed run-time behavior. It'd compile it to fixnum by default with an overflow trap using the FP inexact exception. The trap would patch the fixnum site with the equivalent bignum code and resume. With hindsight inspired by modern trace compilers like LuaJIT, the better approach is probably to compile entire traces speculated on either case, so you can hoist most of the guard predicates and not speculate based solely on the proximate call site but on longer traces.
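
A minimal sketch in C of the flag-checking part of that trick (leaving the code patching aside), assuming <fenv.h> is usable: do the "fixnum" add in doubles and use the sticky FE_INEXACT flag to detect when the result stopped being the exact integer, i.e. when the bignum path is needed.

    #include <fenv.h>
    #include <stdio.h>

    /* Add two "fixnums" held in doubles; returns 1 if the sum is still the
       exact integer result, 0 if it was rounded and the bignum fallback is
       needed. (Strictly conforming code also wants #pragma STDC FENV_ACCESS ON;
       link with -lm on glibc.) */
    static int fixnum_add(double a, double b, double *out)
    {
        volatile double va = a, vb = b;     /* keep the add at run time */
        feclearexcept(FE_INEXACT);
        *out = va + vb;
        return !fetestexcept(FE_INEXACT);   /* inexact => result was rounded */
    }

    int main(void)
    {
        double r;
        /* still exact integer arithmetic in a double */
        printf("%d\n", fixnum_add(1e15, 1.0, &r));                 /* 1 */
        /* (2^53 - 1) + 2 is not representable: bignum path needed */
        printf("%d\n", fixnum_add(9007199254740991.0, 2.0, &r));   /* 0 */
        return 0;
    }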
