Personally I would prefer i64, but the following are defensible:
- Treat it as a special arbitrary precision type, similar to Go
- Choose isize because it's the native size for the platform
- Make it a compile error when the type can't be inferred
The rationale for i32 in RFC 0212 is basically, "well, C compilers usually do this", and some hand-waving about cache (why not i16 or i8 then?!?). Then they chose f64 instead of f32 for floating-point literals, which undermines the cache argument. So really, Rust did it because C compilers do it, and C compilers do it to simplify porting old software, which doesn't apply to Rust.
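For reference, the defaulting being argued about looks like this in practice (a minimal sketch; the `name_of` helper is just mine, for printing the inferred types):

```rust
// Sketch of Rust's literal-defaulting rules: i32 / f64 fallback when nothing
// else constrains the literal's type.
fn name_of<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    let a = 1;        // unconstrained integer literal: falls back to i32
    let b = 1.0;      // unconstrained float literal: falls back to f64
    let c = 1u64;     // a suffix pins the type explicitly
    let d: i16 = 1;   // as does an annotation
    println!("{} {} {} {}", name_of(&a), name_of(&b), name_of(&c), name_of(&d));
    // prints: i32 f64 u64 i16
}
```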
That's another thing. I don't want to see 32 and 64 crap in code that is supposed to be high level and portable.
i32 means I'm doing FFI with C libraries.
If you check the Rust documentation, you'll see that every integer type it offers has a fixed size (the only exceptions being the pointer-sized usize and isize).
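For the FFI case specifically, the portable spelling is arguably `std::ffi::c_int` rather than a literal `i32`; a minimal sketch, where the C function `add` is purely hypothetical:

```rust
use std::ffi::c_int; // aliases whatever C's `int` is on the target, usually i32

extern "C" {
    // Hypothetical C function: int add(int a, int b);
    fn add(a: c_int, b: c_int) -> c_int;
}

fn call_add(x: i32, y: i32) -> i32 {
    // On the common targets c_int is i32, so these conversions compile away.
    unsafe { add(x as c_int, y as c_int) as i32 }
}
```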
"All the world is x86/ARM (formerly DEC VAX / Sun SparcStation )"
C's roots go back to 18-bit machines (the PDP-7 that its ancestor B ran on). A program that correctly used "int argc" for the main argument count runs on an 18-bit machine or a 16-bit machine without modification.
It may be crazy but it's not exactly without precedent. Neither C nor C++ fixes the sizes of the fundamental integer types, although for backward-compatibility reasons popular 64-bit platforms still have a 32-bit `int`, and on Windows even a 32-bit `long`. But yeah, there's a reason e.g. Rust has no `int` and friends but `i32` etc. instead.
I find `usize` in Rust is like the default unsigned integer, used for nearly everything and especially everything index-related.
I do think that in practice using `u64` instead of `usize` makes little difference, since there are so few 32-bit systems today. The nice thing with the newtype pattern, though, is that if for whatever reason your program is going to run on a 32-bit instance and you need 64-bit IDs, it's a one-line change.
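Roughly what I mean by the newtype pattern (a sketch; `UserId` is just an illustrative name):

```rust
// The ID's representation lives in exactly one place, so switching it from
// usize to u64 (or anything else) is a one-line change.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct UserId(usize); // <- the one line to change

fn lookup(id: UserId) {
    println!("looking up {id:?}");
}

fn main() {
    lookup(UserId(42));
}
```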
Both Linux (in C) and Rust choose to name types based on the physical size so as to be unambiguous where possible, although they don't entirely agree on the resulting names. Linux and Rust agree that u32 is the right name for the 32-bit unsigned integer type and u64 for the 64-bit unsigned integer type, but Rust calls the signed 32-bit integer i32 while Linux names it s32, for example.
It would of course be very difficult for C itself to declare that the 32-bit unsigned integer is now named u32, due to compatibility. But Rust would actually be able to adopt the Linux names (if it wanted to) without issue, because of the Edition system: simply say that in the 2023 edition i32 is aliased as s32, and it's done. All your old code still works, and raw identifiers even let any maniacs who named something important "s32" still access it from modern "Rust 2023" code.
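The aliasing half of that is already a one-liner today, no edition required (a sketch; an edition would only be needed to bless the names officially or deprecate the old ones):

```rust
// Hypothetical Linux-style spellings via plain type aliases.
#[allow(non_camel_case_types)]
pub type s32 = i32;
#[allow(non_camel_case_types)]
pub type s64 = i64;

fn main() {
    let x: s32 = -5;
    let wider: s64 = x as s64;
    println!("{x} {wider}");
}
```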
Which is why Rust has `usize` for indexes and memory offsets. Beyond that, a good programming language can't fix a bad programmer.
Were you aware that, because people use `int` so liberally, the LLP64 model 64-bit Windows uses and the LP64 model Linux, macOS, and the BSDs use keep `int` 32-bit on 64-bit platforms? Hello, risk of integer overflow.
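To spell out the risk: even on a 64-bit machine a 32-bit int still tops out at 2^31 - 1. In Rust terms (a minimal sketch):

```rust
fn main() {
    let a: i32 = 2_000_000_000;

    // `a + a` overflows i32: it panics in a debug build and wraps in release.
    // The checked form makes the hazard explicit in both:
    assert_eq!(a.checked_add(a), None);

    // Widening first sidesteps it entirely:
    assert_eq!(a as i64 + a as i64, 4_000_000_000_i64);
}
```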
"The problem now is that there is no 64-bit type in the mix. One solution might be to "ask the compiler folks" to provide a __int64_t type. But a better solution might just be to switch to Rust types, where i32 is a 32-bit, signed integer, while u128 would be unsigned and 128 bits. This convention is close to what the kernel uses already internally, though a switch from "s" to "i" for signed types would be necessary. Rust has all the types we need, he said, it would be best to just switch to them."
Does anybody know why they don't use the existing fixed-size integer types [1] from C99, i.e. uint64_t etc., and define a 128-bit-wide type on top of that (which will also be there in C23 IIRC)?
My own kernel dev experience is pretty rusty at this point (pun intended), but in the last decade of writing cross platform (desktop, mobile) userland C++ code I advocated exclusively for using fixed width types (std::uint32_t etc) as well as constants (UINT32_MAX etc).
I understand that there is probably a performance motivation for having int/uint be machine-dependent sizes. However, it seems to me that having different sizes on different platforms is a potential security hazard if the programmer doesn't think of this. It also gives rise to porting bugs. Isn't the idea of Rust to make a more secure language than C++? I would have thought mandating 32 or 64 bits for int/uint would make more sense, and if the programmer needs more or less they would have to think about that.
The C standard describes an int type whose minimum range is -32767 to 32767.
Any programmer worth their salt knows this. If writing maximally portable code, they either stick to that assumption or else make some use of INT_MIN and INT_MAX to respect the limits.
The minimum range is adequate for all sorts of common situations.
Some 32-bit targets had a 16-bit int! E.g. this was a configuration for the Motorola 68K family, which has 32-bit address registers and 32-bit data registers.
Personally, having just suffered the issue of uint64_t (unsigned long vs unsigned long long...): I prefer having "bit size" + "semantics" in the type, and that's it. If the hardware doesn't support it, fail horrendously or use an approximation/slower path, as specified by compiler options.
In my opinion, developers should think about semantics first and optimization (device-specific or software) after, unless you already know your device.
I slightly disagree with JVM "int = 32bit" but they essentially are forcing their own "virtual hardware", so I can understand that. For portable native code... I can only say that I'm disappointed in uint64_t. But also that, maybe, the current Rust isn't the end-all be-all of portable types.
...which is a bit of a weak reason when there are hardly any 32-bit CPUs around anymore though. Picking a narrower integer type makes sense in specific situations like tightly packed structs, but in that case it usually doesn't matter whether the integer range is 2^31 or 2^32, if you really need the extra bit of range it's usually ok to go to the next wider integer type.
You might as well just use a 64 bit int if you're going to be changing code anyways
But this blog post is about how code that is already using 64-bit ints can silently truncate. Another case of C's loose handling of types, whereas in Rust an explicit cast would have been required (granted, in this case the macro expands to having explicit type casts).
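For comparison, here's roughly what that looks like on the Rust side (a sketch): the implicit narrowing simply doesn't compile, and you choose between a visible truncating cast and a checked conversion.

```rust
fn main() {
    let big: u64 = u32::MAX as u64 + 1;

    // let narrowed: u32 = big;       // does not compile: no implicit narrowing
    let truncated = big as u32;       // `as` truncates, but at least it's visible
    let checked = u32::try_from(big); // the checked conversion reports the loss

    println!("{truncated} {checked:?}"); // prints: 0 Err(TryFromIntError(()))
}
```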
Even worse, there are platforms with >=128-bit pointers but 64-bit address space. Rust has chosen usize to be uintptr_t rather than size_t, even though it mostly uses it as if it was size_t. A ton of code is going to subtly break when these two sizes ever differ. Rust is likely already doomed on >64-bit platforms, and is going to be forced to invent its own version of a LLP64 workaround.
Having spent a lot of time porting 32-bit system code to 64-bit, I developed a dislike for these explicit types. It's a slippery slope to hard-coding your bitness, with people making assumptions about where size_t or pointers fit.
Now maybe if you're already 64-bit that's fine (it's unlikely that we'll ever need 128-bit, and code is unlikely to grow down), but for anything starting smaller it's a pain.
> it's code bloat smeared across the entire binary.
That's probably not true in the usual case. Most architectures are 64-bit nowadays. If you are working on something that isn't 64-bit you are doing embedded stuff, and different rules and coding standards apply (like using embedded assembler rather than pure C or Rust). In 64-bit environments only pointers are 64 bits by default; almost all integers remain 32-bit. Checking for a 32-bit overflow on a 64-bit RISC-V machine takes the same number of instructions as everywhere else. Also, in C integers are very common because they are used as loop counters (i.e., stepping along things in for loops). But in Rust, iterators replace integers for this sort of thing. There is still an integer under the hood of course, and perhaps it will be bounds checked. But that is bounds checked, not overflow checked. 2^32 is far larger than most data structures in use. Which means that while there may be some code bloat, the scarcity of full 64-bit integers in your average Rust program means it's going to be pretty rare.
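To make the iterator point concrete, a sketch (the overflow-checkable integer only exists if you write the index arithmetic yourself):

```rust
fn sum_indexed(v: &[u32]) -> u64 {
    let mut total = 0u64;
    let mut i = 0usize;
    while i < v.len() {
        total += v[i] as u64; // bounds-checked access
        i += 1;               // explicit index arithmetic that could, in principle, overflow
    }
    total
}

fn sum_iter(v: &[u32]) -> u64 {
    // No visible index at all: the iterator tracks the position internally,
    // so there is no user-level integer left to overflow-check.
    v.iter().map(|&x| x as u64).sum()
}

fn main() {
    let v: Vec<u32> = vec![1, 2, 3, 4];
    assert_eq!(sum_indexed(&v), sum_iter(&v));
}
```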
Since I'm here, I'll comment on the article. It's true the lack of carry will make adds a little more difficult for multi-precision libraries. But - I've written a multi-precision library, and the adds are the least of your problems. Adds just generate 1 bit of carry. Multiplies generate an entire word of carry, and they are almost as common as adds. Divides are not so common, fortunately, but the execution time of just one divide will make all the overhead caused by a lack of carry look like insignificant noise.
I'm no CPU architect, but I gather the lack of carry and overflow bits makes life a little easier for just about every instruction other than adc and jo. If that's true, I'd be very surprised if the cumulative effect of those little gains didn't completely overwhelm the wins adc and jo get from having them. Have a look at the code generated by a compiler some time. You will have a hard time spotting the adc's and jo's because there are bugger all of them.
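For the curious, carry propagation without an adc instruction looks roughly like this in portable Rust (a sketch using the stable `overflowing_add`; the nightly-only `carrying_add` would shorten it):

```rust
// Word-by-word addition of two equal-length big integers, least significant
// word first. Each step needs two overflowing adds to fold in the carry bit.
fn add_words(a: &[u64], b: &[u64], out: &mut [u64]) -> bool {
    let mut carry = false;
    for i in 0..a.len() {
        let (s1, c1) = a[i].overflowing_add(b[i]);
        let (s2, c2) = s1.overflowing_add(carry as u64);
        out[i] = s2;
        carry = c1 || c2;
    }
    carry // true if the sum overflowed the fixed width
}

fn main() {
    let a = [u64::MAX, 0];
    let b = [1, 0];
    let mut out = [0u64; 2];
    let overflow = add_words(&a, &b, &mut out);
    assert_eq!(out, [0, 1]); // MAX + 1 carried into the next word
    assert!(!overflow);
}
```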
You are nitpicking. size_t is architecture-dependent: on x86 it will usually be 32-bit, on x86_64 it will usually be 64-bit. It's not equivalent to the i32, u32 types.
Also, I find i32 and u32 types much cleaner than D's "unsigned int" and "int". What is the difference between "unsigned int" and "unsigned long" in D? It's still better than C++, but I'd rather write i32 and i64 rather than "int" and "long".
AFAIK until the switch to 64-bit architectures, int actually was the natural word size. Keeping int at 32-bits was probably done to simplify porting code to 64-bits (since all structs with ints in them would change their memory layout - but that's what the fixed-width integer types are for anyway, e.g. int32_t).
In hindsight it would probably have been better to bite the bullet and make int 64 bits wide.