
> Take some correct code that works with int16_t, and replace the "int16_t" with plain "int". What breaks?

Memory. Defaulting to 16 bits when an 8-bit variable will do can be incredibly wasteful on an 8-bit µC. Keep in mind that we are not just talking about one variable in isolation: we are talking about all the integers we pass between functions, and about code space, data space, and stack space. There are compilers that can optimize arithmetic down to 8-bit registers when they can prove that's all their operands need.
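For instance, a minimal sketch (hypothetical struct and field names) of how the choice shows up in data space:

  #include <stdio.h>
  #include <stdint.h>

  struct sample16 { int16_t a, b, c; };  /* typically 6 bytes per record */
  struct sample8  { int8_t  a, b, c; };  /* typically 3 bytes per record */

  int main(void)
  {
    printf("int16_t record: %u bytes, int8_t record: %u bytes\n",
           (unsigned)sizeof(struct sample16), (unsigned)sizeof(struct sample8));
    return 0;
  }

Multiply the difference across arrays of records, plus every copy pushed on the stack, and it adds up fast on a part with a couple of kilobytes of RAM.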

> Would you suggest a uint29_t?

If you are aware of a machine that provides that, yes! Otherwise, I'd suggest uint32_t. Yes, "long" is guaranteed to be at least 32 bits wide, but it can also be 64 bits wide. I would not recommend defaulting to "long", as that could be wasteful. Here's an interesting discussion about the meaning of "long" and "long long" for their compiler: https://www.dsprelated.com/showthread/comp.dsp/42108-1.php. I see this discussion as a failure of the C standard.

I would much rather the compiler provide int64_t or int40_t or whatever else it can that is not inefficient.




> Memory. Defaulting to 16-bits when an 8-bit variable will do can be incredibly wasteful on an 8-bit µC.

I hope it doesn't seem like I'm just moving the goalposts, but the obvious answer here is to use a type which is at least 8 bits wide if you only need 8 bits. Such a type exists, and it is called "char" (with the appropriate signedness modifiers).

This whole discussion has been about writing portable code. Using a "char" here is going to work perfectly on your 8-bit uC. It is also going to work perfectly on your SHARC chip with 32-bit chars.
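A minimal sketch of that point - the standard guarantees UCHAR_MAX is at least 255, so this behaves the same whether char is 8 bits or 32 bits wide:

  #include <limits.h>
  #include <stdio.h>

  int main(void)
  {
    unsigned char level = 200;   /* always fits: UCHAR_MAX >= 255 everywhere */
    printf("CHAR_BIT = %d, level = %u\n", CHAR_BIT, (unsigned)level);
    return 0;
  }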

> If you are aware of a machine that provides that, yes! Otherwise, I'd suggest uint32_t.

I think we are talking about different things. I am talking about writing portable code. You are talking about writing code that only works on a particular machine (or the class of machines that have a 29-bit int).

I know there is a place for that code, and once you are writing code where you actually need to make use of knowledge about the machine, then I'm all for using types that make this clear. However, typically this code only lives at the edges of the system, and the actual "computation" can be written in portable, efficient, readable code without a great deal of trouble.

I'm completely aware that long can be 64-bits on some machines. If you care so much about the wasted memory, then uint_least32_t should make you happy - if you are happy with the C99 dependency (which can limit portability, though things are getting better), then I don't see how you can see this as being worse than the fixed width uint32_t.
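Sketch of what I mean, using only what C99 guarantees to exist (unlike the exact-width types, the least-width types are mandatory):

  #include <stdint.h>
  #include <inttypes.h>
  #include <stdio.h>

  int main(void)
  {
    uint_least32_t count = 100000;   /* at least 32 bits wide, always present in C99 */
    printf("count = %" PRIuLEAST32 "\n", count);
    return 0;
  }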

Personally, I have found that while it sounds nice in theory, the domains I've been working in have meant that the memory doesn't make much difference (if it is just sitting on the stack or being passed between functions, then on 64-bit machines there are typically no differences, as calling conventions tend to pad things out). It is only when you have an array of these in memory that it might start to matter, and here it tends not to matter a great deal - you are typically now optimizing an algorithm for an amd64 machine (read: it has plenty of memory) and the algorithm typically doesn't actually use a lot of it (since it needs to run on tiny micros too).

Anyway, I think we probably agree for code which isn't supposed to be portable. Potentially just that I tend to work more on the code that is supposed to be portable, and you work on the code at the edges?


> good old C has uint16_t

Firstly, no, good old C doesn't. These things are a rather new addition (C99). In 1999 there were already decades of good old C which didn't have int16_t.

It is implementation-defined whether there is an int16_t; so C doesn't really have int16_t in the sense that it has int.

> It's funny because C opted to leave the number of bits machine dependent in the name of portability, but that turns out to have the opposite effect.

Is that so? This code will work nicely on an ancient Unix box with a 16-bit int, or on a machine with a 32- or even 64-bit int:

  #include <stdio.h>
  int main(void)
  {
    int i;
    char a[] = "abc";
    for (i = 0; i < sizeof a - 1; i++)  /* -1: don't print the terminating NUL */
      putchar(a[i]);
    putchar('\n');
    return 0;
  }
Write a convincing argument that we should change both ints here to int32_t or whatever for improved portability.

On 8-bit platforms, 8-bit values are still bytes/sbytes. Most of them have native instructions that work on 16-bit integers; they just take up a pair of registers. You have access to the same range of integer sizes: plain int is 16 bits, longs are 32 bits, and long longs are still 64. Still, I much prefer using the integer typedefs introduced in C99, where you state the bit size explicitly. uint16_t is a lot more explicit than int, especially if you've got code that's being shared between a few different micros with different word sizes.
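As a rough illustration of why I prefer the explicit typedefs when the same code runs on micros with different word sizes (a wraparound counter is a hypothetical but typical case):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
    uint16_t ticks = 65535;   /* wraps to 0 on every platform */
    unsigned int raw = 65535; /* wraps only where int is 16 bits; becomes 65536 elsewhere */
    ticks++;
    raw++;
    printf("%u %u\n", (unsigned)ticks, raw);
    return 0;
  }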

A very long time ago, the Microsoft C/C++ compiler used 16-bit ints. I had a boss who insisted we use long instead of int because he had been burned by this. It hadn't been a problem for at least 20 years, but that didn't matter to him.

>int8, int16, int32, int64 are all explicit and force the compiler (and the hardware) to obey the wishes of the programmer.

At least in C99, the compiler doesn't need to support exact-width integer types.

>People make much ado about the fact that "a byte isn't necessarily 8 bits"

Well, POSIX.1-2004 requires that CHAR_BIT == 8.
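If code does rely on that 8-bit-byte assumption, one sketch of making it explicit rather than silent (just a compile-time check):

  #include <limits.h>

  #if CHAR_BIT != 8
  #error "this module assumes 8-bit bytes"
  #endif

  int main(void) { return 0; }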


Well, that's a shame, because the bit width of the standard integer types is quite uncertain. CHAR_BIT can be (and is, on some platforms) 16 or 32; long was never guaranteed to be 64 bits (it quite often was 32 bits on platforms with 16-bit ints); etc. Not to mention that if that is what the standard integer types were for, they'd probably have names like int8/uint8/int16/int32/etc.

It's almost as if they were not, in fact, intended for precise control of bit widths in a portable manner...


> Unfortunately this is less suitable for integers larger than 64 bits, because then you run out of bits in the first byte.

Just spitballing here, but is it necessary to support every increment of octets? Do we really need a 7-octet/56-bit width, for instance? Instead, couldn't we just allow power-of-two octet counts? An int64 would then only need 2 bits of length info (8, 16, 32, 64). Or maybe I'm thinking wrong because not all the bits can be used.
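A hypothetical sketch of that idea - a 2-bit tag picks one of four power-of-two widths instead of encoding every possible octet count (the names here are made up):

  #include <stdint.h>
  #include <stdio.h>

  /* Map a value to the smallest power-of-two width that holds it. */
  static unsigned width_tag(uint64_t v)
  {
    if (v <= UINT8_MAX)  return 0;  /* 1 octet  */
    if (v <= UINT16_MAX) return 1;  /* 2 octets */
    if (v <= UINT32_MAX) return 2;  /* 4 octets */
    return 3;                       /* 8 octets */
  }

  int main(void)
  {
    printf("tag for 300 = %u\n", width_tag(300));  /* 300 needs 2 octets -> tag 1 */
    return 0;
  }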

> its efficiency really depends on the expected input distribution

Right. But everything else in computers is rounded to the nearest power of 2, so maybe it works here too haha.


> What is the advantage of having "int", "long" types that just give minimum byte requirements instead of more specified types like int32_t and uint8_t?

The advantage is that these more specific types might not even exist. For instance, some machines had 36-bit words, so a 32-bit type would have to be emulated, at the cost of extra instructions.


I think it's funny. C was originally invented in an era when machines didn't have a standard integer size - 36-bit architectures were in their heyday - so C's integers (char, short, int, and long) only have a guaranteed minimum size that can be taken for granted, but nothing else, in order to achieve portability. But after the world's computers converged on multiple-of-8-bit integers, the inability to specify a particular integer size became an issue. As a result, in modern C programming the standard practice is to use uint8_t, uint16_t, uint32_t, etc., defined in <stdint.h>; C's inherent support for different integer sizes is basically abandoned - no one needs it anymore, and it only creates confusion in practice, especially in the bitmasking and bitshifting world of low-level programming. Now, if N-bit integers are introduced to C, it's kind of a negation of the negation, and we complete a full cycle - the ability to work on non-multiple-of-8-bit integers will come back (although the original integer-size independence and portability will not).

The variable-size int, unfortunately, made a lot of sense in the early days of C. On processors like the x86 and 68000, it made sense for int to be 16-bit, so you don't pay for the bits you don't need. On newer systems, it makes sense for int to be 32-bit, so you don't pay to throw away bits.

Turbo C on MS-DOS for one. In fact 16-bit int was the norm on that platform, because the architecture didn't have 32-bit general purpose registers.

In the C89 days, you'd use 'short' in aggregates (structs and arrays) for values you knew wouldn't exceed 16 bits so didn't want to potentially waste space; 'long' in situations where you knew 16 bits wouldn't be enough; and 'int' the rest of the time (where 16 bits was enough, and there weren't any storage benefits to outweigh the performance benefit of using the native word size).
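A small sketch of that convention (hypothetical struct and function, C89-style declarations):

  struct record {
    short code;    /* known to fit in 16 bits, so don't waste space in arrays */
    long  offset;  /* known to need more than 16 bits */
  };

  static int count_nonzero(const struct record *r, int n)  /* plain int elsewhere */
  {
    int hits = 0;
    int i;
    for (i = 0; i < n; i++)
      if (r[i].code != 0)
        hits++;
    return hits;
  }

  int main(void)
  {
    struct record tbl[2] = { { 1, 100L }, { 0, 200L } };
    return count_nonzero(tbl, 2) == 1 ? 0 : 1;
  }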


> How do you port it to a machine without a 16-bit type?

The problem is made more severe when you use "int" and let the compiler decide the size of the variable. I guess I don't know what you're advocating.

> The stuff about communicating with the outside world is quite different.

I mentioned communicating with the outside world only as a throwaway example. It's a valid example, but really, everything gets affected. For example, a CAN identifier is 29 bits. What generic "int" would you use when you want to fill in this identifier? And no, I work at a level that's lower than an ABI - no OS, no file I/O, etc.
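For the CAN case, a sketch of what I end up doing (the macro and function names here are hypothetical): the identifier is 29 bits, so it lives in a uint32_t behind a mask.

  #include <stdint.h>
  #include <stdio.h>

  #define CAN_EID_MASK UINT32_C(0x1FFFFFFF)  /* 29 identifier bits */

  static uint32_t make_ext_id(uint32_t raw)
  {
    return raw & CAN_EID_MASK;  /* drop the 3 bits that don't belong to the ID */
  }

  int main(void)
  {
    printf("0x%08lX\n", (unsigned long)make_ext_id(0xFFFFFFFFul));  /* 0x1FFFFFFF */
    return 0;
  }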


Leaving out UB, what is the size of an "int"? Any of 16 bits, 32 bits, or 64 bits is correct. However, I'm absolutely certain the code you write for an 8-bit microcontroller is a lot different from the code for your 64-bit server, so why is this language pretending to scale like that? Yeah, you can now write "uint8_t" instead of "char", but the point is that the ecosystem is a minefield, and all the maimed people are shamed with "you should have read the 400-page map".

At the risk of sounding crazy, what if we just took this undefined behavior and defined it? There are like maybe 100 CPU ISAs in the world we care about, and the web and the transitory nature of tech have made legacy support kinda useless. Life would probably be simpler if we just said "OK, here's how these constructs work" and removed the ambiguity.


Do you have any reason for supporting uint16_t instead of uint_least16_t?

Their point is that these exact-width types work against portability, as they may not exist at all.


> If I specify that I want a type that's 32 bits large, my code won't run on 16 bit systems.

Why not?


It made more sense in an era when computers hadn't settled on 8-bit bytes yet. A better idea (not mine) is to separate the variable type from the storage type. There should be only one integer variable type with the same width as a CPU register (e.g. always 64-bit on today's CPUs), and storage types should be more flexible and explicit (e.g. 8, 16, 32, 64 bits, or even any bit-width).
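You can approximate that separation with what's already in <stdint.h> - compute in a "fast" register-friendly type, store in an exact narrow type (a minimal sketch, nothing more):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
    uint8_t samples[16] = {0};  /* storage: exactly 8 bits per element */
    uint_fast32_t sum = 0;      /* variable: whatever width is fast on this CPU */
    size_t i;
    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
      sum += samples[i];
    printf("sum = %lu\n", (unsigned long)sum);
    return 0;
  }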

C99 specified stdint.h, which includes int16_t/uint16_t, so compliant compilers are required to support them even if they don't map to a built-in type. So you won't lose the short.

That said, it wouldn't make it any less insane; so no one does this and in fact int is 32 bits everywhere except microcontrollers (and 8086, for the tiny handful of people writing BIOS or bootloader code).


> Why on earth would anyone use a 16 bit wchar_t?

Fun fact: the C standard does not specify the width of wchar_t. MSVC uses a 16-bit wchar_t (UTF-16), while gcc on Linux uses a 32-bit one.
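Easy to check on any given toolchain; this trivial probe prints 2 under MSVC and 4 under a typical Linux gcc:

  #include <stdio.h>
  #include <wchar.h>

  int main(void)
  {
    printf("sizeof(wchar_t) = %u, WCHAR_MAX = %lu\n",
           (unsigned)sizeof(wchar_t), (unsigned long)WCHAR_MAX);
    return 0;
  }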


`uint8_t` cannot exist on any platform with `CHAR_BIT > 8`. Such platforms are non-existent in the mainstream CPU world, but surprisingly common in the DSP world.

And yes, it's a compile-time failure, which is great. My comment should not be read as criticism at all (though I would likely use `uint16_t`, as the OP says the code is intended to work with [presumably aligned] 16-bit words).
