As for C, I have to contradict you. Implicit (numeric) type conversions do actual conversions (like char to int, or int to double; even pathological cases like pointer to what-counts-as-Boolean-in-C). The casts you are talking about must be explicit, and thereby are squarely the programmer's responsibility.
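To make that concrete, here's a minimal C sketch of what converts silently versus what has to be spelled out (the variable names are just for illustration):

    #include <stdio.h>

    int main(void) {
        char c = 'A';
        int i = c;            /* implicit: char promoted to int */
        double d = i;         /* implicit: int converted to double */
        int *p = &i;
        if (p)                /* implicit: pointer treated as a Boolean */
            printf("%d %f\n", i, d);

        /* Reinterpreting a pointer, by contrast, takes an explicit cast;
           compilers reject or at least warn about the assignment without one. */
        double *q = (double *)&i;
        (void)q;
        return 0;
    }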
Of course, a language that forces all conversions to be explicit is preferable.
To underscore your general point: You're totally right. But... a but:
Implicit type conversions in C are cumbersome. In JavaScript, for example, they are far worse. If JavaScript had a similar role (i.e. bare-metal code), the world would be even messier than it already is.
Languages without strict type checking are in general open to problems like this, to a degree that depends on how leniently potential type errors are checked. C, being a bare-metal language, is especially ugly since it allows casting anything to anything else, granted - but a cast is spottable and reviewable, and for new code you won't get away with ugly casts. In JavaScript you don't even see the casts; they just happen.
Still, JavaScript doesn't have an unsigned integer type, so this attack vector would not work.
Some implicit conversions are okay, like type promotion from int to double. Some type coercions are fraught, like char to int or back again. I agree that array decay to pointer could be explicit, and pointers shouldn't cast to arrays.
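A small C illustration of the distinction (nothing here is from any particular codebase):

    #include <stdio.h>

    int main(void) {
        int n = 300;
        double x = n;         /* okay: this promotion from int to double is lossless */

        char c = n;           /* fraught: 300 doesn't fit in an 8-bit char */
        printf("%d\n", c);    /* typically prints 44 (300 modulo 256)      */

        int a[4] = {1, 2, 3, 4};
        int *p = a;           /* the array decays to a pointer implicitly */
        printf("%d %f\n", p[2], x);
        return 0;
    }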
I agree that implicit conversions in C/C++ are something that ideally wouldn't exist.
IMO, though, the only way to avoid having those rules is by requiring explicitness on the part of the programmer. All other options just substitute a different set of rules which for some cases will be expected and in other cases will be unexpected.
This seems to mostly be an argument that implicit conversions are harmful. I think that's generally agreed to be true; most newer programming languages require all type conversions to be explicit.
But whether implicitly or explicitly converted, the compiler would reject the code all the same, no? (Unless an explicit conversion would actually succeed, which is even worse. I'm not a Rust programmer so don't know. But this is why explicit casts are frowned upon in C.)
I worked on a project that compiled a declarative DSL to both vectorized CPU code and GPU code.
For the vectorized CPU code, I found ISPC generally pleasant to use, but don't be fooled by its similarities to C. It's not a C dialect, and I got burned by assuming that implicit type conversion rules which are identical across C, C++ and Java would also hold for ISPC. The code in question was a pretty simple conversion of a pair of uniformly distributed uint64_t values to a pair of normally distributed doubles (the Box-Muller transform). As I remember, operations between a double and an int64_t result in the double being truncated rather than the int64_t being converted. I wrote some C++ code and ported it to Java more or less without modification, but was scratching my head as to why the ISPC version was buggy. I remember the feeling of the hair standing up on the back of my neck as it dawned on me that the implicit cast rules might be different from those of C/C++/Java.
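For reference, this is what the C/C++/Java rules do in that situation - a tiny sketch with made-up constants, not the actual Box-Muller code:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint64_t u = 12345678901234567ULL;
        double   scale = 0.5;

        /* In C (and C++ and Java), the usual arithmetic conversions convert
           the integer operand to double here; the double operand is never
           truncated to an integer. */
        double r = u * scale;
        printf("%.1f\n", r);
        return 0;
    }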
Seemingly arbitrary deviations from C/C++ behavior in a language that's so syntactically close are a big footgun. Honestly, I think that C made a mistake. An operation between an integer and a floating-point number should result in the smallest floating-point type whose mantissa and exponent range are both at least as large as the floating-point operand's, and which can losslessly represent every value in the range of the integer operand's type. If no such type is supported, then C should have forced the programmer to choose what sort of loss is appropriate via explicit casts. Disallowing implicit type conversion is also reasonable. However, if your language looks and feels so close to C, you really need good reasons to change these sorts of details of implicit behavior.
Similarly, I think C should have made operations between signed and unsigned integers of the same size result in the next larger signed integer type (uint32_t + int32_t = int64_t), or just not allowed operations between signed and unsigned types without explicit casts.
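Here's the classic C surprise that motivates this, assuming a typical platform where int is 32 bits:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint32_t u = 1;
        int32_t  s = -2;

        /* The usual arithmetic conversions turn s into an unsigned value,
           so the sum wraps instead of being -1. */
        if (u + s > 0)
            printf("u + s = %u\n", (unsigned)(u + s));   /* prints 4294967295 */

        /* The suggestion above: widen both operands to a signed 64-bit type. */
        int64_t sum = (int64_t)u + (int64_t)s;
        printf("sum = %lld\n", (long long)sum);          /* prints -1 */
        return 0;
    }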
Forcing the programmer to convert integer types explicitly doesn't really help much. It makes it obvious that a conversion is happening, but that's it. People will simply add the required explicit casts without thinking about possible truncation or sign changes. UBSan has a very useful -fsanitize=implicit-conversion flag which can detect when truncation or sign changes occur, but this stops working when you make the cast explicit. So in practice, implicit casts actually allow more errors to be detected, especially in connection with fuzzing. Languages like Go or Rust would really need two types of casts to detect unexpected truncation.
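A tiny demonstration of that trade-off (the file name in the build line is just a placeholder):

    /* Build with: clang -fsanitize=implicit-conversion demo.c */
    #include <stdint.h>

    uint8_t narrow_implicit(uint32_t x) {
        return x;             /* implicit truncation: the sanitizer reports it at runtime */
    }

    uint8_t narrow_explicit(uint32_t x) {
        return (uint8_t)x;    /* explicit cast: same truncation, no report */
    }

    int main(void) {
        narrow_implicit(300);
        narrow_explicit(300);
        return 0;
    }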
I dislike implicit type conversions, but I think every cast needs to be accompanied by an assertion. Just sprinkling casts everywhere does nothing but silence important compiler warnings.
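For instance, a narrowing helper along these lines - the name is just illustrative:

    #include <assert.h>
    #include <stdint.h>

    /* Pair the cast with a check that the value actually fits. */
    static int32_t checked_to_i32(int64_t v) {
        assert(v >= INT32_MIN && v <= INT32_MAX);
        return (int32_t)v;
    }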
> Many languages "without implicit casts" e.g. Java still have this problem because they allow widening casts.
I'm reading this as, "Many red boxes are blue circles." If the language allows implicit widening casts, it has implicit casts.
I don't think more powerful checks are necessary. It's just that the implicit conversions in C are a bit wild; they result in unexpected behavior and surprise programmers. From experience, making all casts explicit is not such a burden (except for stuff like char ptr -> const char ptr).
I think the fantasy here is simple... as much as we want to explore new ways to make safe programs with better type systems and runtime checks, there is still some design space near C which is safer but not really more complicated.
The trouble with explicit casting is if the code is refactored to change the underlying integer type, the explicit casts may silently truncate the integer, introducing bugs.
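A sketch of how that bites, with made-up names throughout:

    #include <stdint.h>
    #include <stdio.h>

    /* Originally offset_t was int32_t and the cast below was harmless.
       After a refactor widens it to int64_t, the old explicit cast
       silently truncates large offsets. */
    typedef int64_t offset_t;           /* was: typedef int32_t offset_t; */

    static void seek_to(int32_t pos) { printf("seeking to %d\n", pos); }

    int main(void) {
        offset_t off = 5000000000LL;    /* no longer fits in 32 bits */
        seek_to((int32_t)off);          /* the cast hides the truncation */
        return 0;
    }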
D follows the C integral promotion rules, with a couple crucial modifications:
1. No implicit conversions are done that throw away information - those will require an explicit cast. For example:
int i = 1999;
char c = i; // not allowed
char d = cast(char)i; // explicit cast
2. The compiler keeps track of the range of values an expression could have, and allows narrowing conversions when they can be proven to not lose information. For example:
int i = 1999;
char c = i & 0xFF; // allowed
The idea is to safely avoid needing casts, in order to avoid the bugs that silently creep in with refactoring.
Continuing with the notion that casts should be avoided where practical, the cast expression has its own keyword. This makes casts greppable, so code review can find them. C casts require a C parser with lookahead to find.
One other difference: D's integer types have fixed sizes. A char is 8 bits, a short is 16, an int is 32, and a long is 64. This is based on my experience that a vast amount of C programming time is spent trying to account for the implementation-defined sizes of the integer types. As a result, D code out of the box tends to be far more portable than C.
D also defines integer math as 2's complement arithmetic. All that 1's complement stuff belongs in the dustbin of history.
What type errors? Do you mean that, to get it to compile as C++, you have to add an explicit cast? That isn't a significant burden.
Requiring explicit casts, and forbidding implicit conversions, reduces the odds of being surprised by an unwanted conversion. There's a good argument to be made for adding explicit casts to C code even when the language doesn't require them. GCC's -Wconversion flag can help with this.
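A quick illustration of what -Wconversion does and doesn't catch (again, the file name is a placeholder):

    /* Build with: gcc -Wconversion -c demo.c */
    #include <stdint.h>

    uint16_t f(uint32_t x) { return x; }            /* warning: conversion may change value */
    uint16_t g(uint32_t x) { return (uint16_t)x; }  /* no warning: the cast is explicit */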
The languages I work with most often in decreasing order of frequency are: C, Python, C++, Fortran, MATLAB, and Julia. I don’t regularly cause problems for myself by running afoul of implicit casting rules, but I also spend all day writing numerical software. I can see that it can cause problems for other people who don’t spend as much time dealing with IEEE754.
There are definitely days where I feel like an OCaml-like level of casting pedantry would be nice, and other days I feel like it would be exhausting and make my code harder to read. I’ve never written a big piece of code in a language that required explicit casts for everything. I’m open to it.
I would say that unsigned integers cause me the most problems of any numeric type, but not so many problems that I’m looking to change languages. If I worked in a different field it’d be a different story I’m sure.
Maybe so, maybe so. I do agree that numeric implicit conversion is not as simple as it first seemed to me, and would have knock-on effects elsewhere.
Edit: For the record, here are the rules we've discovered so far:
1. Numeric casts: A -> B happens automatically if every value of A can be represented exactly in B. Both must be base types.
2. Aliased casts: A -> B happens automatically if A is an alias of B.
3. Automatic casts only happen if a single automatic cast is required, not more. In other words, x + y should not cause x and y to both be cast to a common type.
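To illustrate rule 1 in C terms: int32 -> double would qualify, because every 32-bit integer is exactly representable in a double, while int32 -> float would not:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int32_t i = 2147483647;               /* INT32_MAX */

        double d = i;                         /* exact: 53-bit significand */
        float  f = (float)i;                  /* rounds: 24-bit significand */

        printf("%.1f\n%.1f\n", d, (double)f); /* 2147483647.0 vs 2147483648.0 */
        return 0;
    }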