
Wow, that sounds nightmarish. Three different ways to store the string, pointers to buffers everywhere, and it is so complex that your string library has to account for every possibility, because nobody is going to want to twiddle those strings by hand.

If you get it right and hide all of the complexity from the user then it's not quite as bad, although undoubtedly confusing the first few times someone inspects it in their debugger.




I used to work for a company whose C++ string class worked like this, although it was not quite as intricate. It had a separate mode for statically compiled strings too. It was pretty nice, although I didn't agree with every style choice. Makes for some branchy code though.

In the Visual Studio debugger you can write scripts to pretty-print string variables in the "watch" window; I would guess other good debuggers can do this too.


String operations are intrinsically complex, at least now that you can no longer assume the encoding is always ASCII.

The nice thing about this approach is that if you just need to pass the string around, a shallow copy of the first 16-byte block is enough. It might or might not contain the whole of the array, but if you need to know, you have to go through all the corner cases. No strcat(small_buffer, unsanitized_user_input), thank you very much!
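
To make that concrete, here is a rough sketch of what such a 16-byte block could look like in C. Everything in it is my own assumption for illustration, not the layout being discussed: the field names, the tag value STR_HEAP, the 15-byte inline limit, and the helpers str_from, str_len and str_data are all hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: one tag byte, then either inline bytes or a
   length plus heap pointer. Names and sizes are assumptions. */
enum { STR_INLINE_MAX = 15, STR_HEAP = 0xFF };

typedef union {
    struct { uint8_t tag; char data[STR_INLINE_MAX]; } small; /* tag = length */
    struct { uint8_t tag; uint8_t pad[3]; uint32_t len; char *ptr; } big;
} str_t;

static_assert(sizeof(str_t) == 16, "header should be one 16-byte block");

/* Build a string from n bytes at s. Short input is stored inline;
   longer input goes to the heap. On failure the result is empty. */
static str_t str_from(const char *s, size_t n) {
    str_t out;
    memset(&out, 0, sizeof out);              /* empty inline string */
    if (n <= STR_INLINE_MAX) {
        out.small.tag = (uint8_t)n;
        memcpy(out.small.data, s, n);
        return out;
    }
    char *p = (n <= UINT32_MAX) ? malloc(n) : NULL;
    if (p != NULL) {
        memcpy(p, s, n);
        out.big.tag = STR_HEAP;
        out.big.len = (uint32_t)n;
        out.big.ptr = p;
    }
    return out;
}

/* Accessors hide which of the two representations is in use. */
static size_t str_len(const str_t *s) {
    return s->small.tag == STR_HEAP ? s->big.len : s->small.tag;
}

static const char *str_data(const str_t *s) {
    return s->small.tag == STR_HEAP ? s->big.ptr : s->small.data;
}
```

With something like this, `str_t copy = original;` is the shallow copy: for a short string the copy carries its own bytes, while for a long one both headers point at the same heap buffer, so somebody still has to decide who frees it. The bytes are not NUL-terminated in this sketch, which is one more reason to go through the accessors rather than strcat.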


Yes, you are right... it is very complex. But it is consistent enough that the compiler can do it all in the background.

If you think that it is a bad idea to do:

   str_t* string = (str_t*)malloc(string_length);
then half of the goal has been achieved right there.

The other half is to have this nailed down in the language definition in sufficient detail so that implementations do not differ widely and you can poke inside it with assembly when you have to.

