
Because a string is just a piece of data, and if your program can take it as input, then it must be handled correctly.

“But writing parsers was sooo boring in college, and who has to do this in real life?”




Because here you're not doing string manipulation but AST manipulation (i.e., manipulating the data structure you get after parsing the language).

Imagine thinking the difference between parsing a string and using a pre-parsed data structure was "trivial for everyone" because you could count off how many classes of character were involved.

One major difference is that, the other way, you don't have to write the parser at all.
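
A minimal sketch of the difference, using a toy arithmetic AST (the Expr type and the rename pass are made up for illustration, not taken from any particular compiler):

    // Toy expression AST -- a stand-in for whatever a real parser produces.
    #[derive(Debug)]
    enum Expr {
        Num(i64),
        Var(String),
        Add(Box<Expr>, Box<Expr>),
    }

    // AST manipulation: rename a variable by walking the structure.
    fn rename(expr: &mut Expr, from: &str, to: &str) {
        match expr {
            Expr::Var(name) if name.as_str() == from => *name = to.to_string(),
            Expr::Add(lhs, rhs) => {
                rename(lhs, from, to);
                rename(rhs, from, to);
            }
            _ => {}
        }
    }

    fn main() {
        // The structure is already parsed; no string processing is needed
        // to transform it.
        let mut ast = Expr::Add(
            Box::new(Expr::Var("x".to_string())),
            Box::new(Expr::Num(1)),
        );
        rename(&mut ast, "x", "y");
        println!("{:?}", ast);
    }

The string-level equivalent, something like source.replace("x", "y"), would also rewrite "x" wherever it happens to appear, including inside longer identifiers and string literals.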


I think it’s just because everyone knows how to work with strings: they’re tangible, and you can see exactly what data a string contains right there in print(thing).

You don’t even have to know the type of the thing to work on its string representation.


> allowing you to write compile-time string parsers

I’m not entirely sure this is a good thing. But it’s certainly convenient in some instances.


Sorry, but this article starts off with an excellent example of why this is horrible:

   val x: String = "hello"
   String x = "hello"
The first line reads: "value x is of type String and contains hello"

The second line reads: "String x contains hello"

val and : are fluff and add nothing. Arguments about the second form being tougher to parse would have some merit if this hadn't all been figured out almost 50 years ago.


It makes it easier to parse a text file without special cases.

It's an insignificant reason, but that's why it's traditionally recommended.


Because people are taught to think in strings. And programming languages coddle them with tools like concatenation and string formatting. And because we let people think they can do useful things with strings as a result.

But what people actually need are grammars.

The exact same reason that parsing HTML with a regex unleashes Zalgo is why generating HTML with string templates is bad: both treat HTML as a string, not as a grammatically restricted language.
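
A minimal sketch of the contrast (the escape function is hand-rolled purely for illustration; a real HTML builder or templating library would enforce escaping through its types rather than by convention):

    // The same page fragment built two ways.
    fn escape_html(input: &str) -> String {
        let mut out = String::with_capacity(input.len());
        for c in input.chars() {
            match c {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                _ => out.push(c),
            }
        }
        out
    }

    fn main() {
        let user_input = "<script>alert(1)</script>";

        // Treating HTML as a string: the input is spliced in as markup.
        let templated = format!("<p>{}</p>", user_input);

        // Treating HTML as a language: text nodes are escaped before they
        // ever become markup.
        let escaped = format!("<p>{}</p>", escape_html(user_input));

        println!("{}", templated); // <p><script>alert(1)</script></p>
        println!("{}", escaped);   // <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
    }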


Almost all of the programs I have ever read parse strings far too often. Almost all scientific and engineering software could eliminate string parsing everywhere except at the UI and the database.

You are not forced to pass strings. That's merely a convenience (for Java). You can construct and pass data structures instead.

A lot of people reach for string handling when the correct thing to do is to intentionally avoid it and only handle strings as opaque UTF-8-encoded bytes that cannot be reasoned about in terms of human language.

I would even argue that having string handling in a standard library (or language) has the potential to cause a net increase in bugs, because people think they are handling strings when they are actually just screwing around with codepoints. Go's string handling is completely broken, for example. Because strings are built into the language, Go programs tend to be more broken than C programs in terms of string handling.
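
To illustrate the codepoint trap in general terms (a Rust sketch of the same class of bug, not a claim about any particular Go API):

    fn main() {
        // "é" written as 'e' plus a combining acute accent: one visible
        // character, two Unicode codepoints, three UTF-8 bytes.
        let s = "e\u{0301}";

        println!("bytes:      {}", s.len());           // 3
        println!("codepoints: {}", s.chars().count()); // 2

        // Per-codepoint operations can split the accent from its base letter.
        let reversed: String = s.chars().rev().collect();
        println!("{:?}", reversed); // the accent now precedes the 'e'
    }

Anything that reasons per codepoint (reversing, slicing, taking the "first character") can produce that kind of result, which is the argument for treating strings as opaque bytes everywhere except the boundaries where they are actually rendered or parsed.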


Different literal types are only necessary when the parser needs to treat the string differently; that's why you need them for raw strings, for instance.

When you don't need to do that, a function call is a much better answer. There are many things that could be string literals but aren't because they don't have to be.
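
For example (standard-library Rust; the specific values are just illustrations): an IP address or a duration could plausibly have had dedicated literal syntax, but an ordinary function call over a string or a number does the job, whereas a raw string genuinely needs the parser's cooperation.

    use std::net::Ipv4Addr;
    use std::time::Duration;

    fn main() {
        // Things that could have been literal forms, but a plain function
        // call works fine:
        let addr: Ipv4Addr = "192.168.0.1".parse().expect("valid IPv4 address");
        let timeout = Duration::from_millis(250);
        println!("{addr} / {timeout:?}");

        // Raw strings are different: the parser itself has to skip escape
        // processing, so they need their own literal syntax.
        let raw = r"C:\temp\new";
        let escaped = "C:\\temp\\new";
        assert_eq!(raw, escaped);
    }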


I was actually looking more for practical issues. The code I write usually doesn't handle strings much. Maybe I reach for other languages when I do, maybe I use other approaches where others would use strings, or maybe I just subjectively don't find them as bad as others do. I'd just like to see exactly what people are complaining about, so I can figure out why I usually don't run into it.

Because then, to parse out a number, you would first need to get it into a string.

Good point!

It is my opinion, however, that string types are fine, just not perfect. Maybe I should have made that clearer.


Looks great. I wonder why the author went for a specific syntax for interned strings, as opposed to doing what Java does (automatically interning String literals).
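
I haven't looked at the article's syntax, but for anyone unfamiliar with what interning buys you, a minimal sketch (the Interner type and its methods are made up for illustration): each distinct string is stored once and handed back as a small integer id, so equality checks become integer comparisons.

    use std::collections::HashMap;

    #[derive(Default)]
    struct Interner {
        ids: HashMap<String, u32>,
        strings: Vec<String>,
    }

    impl Interner {
        fn intern(&mut self, s: &str) -> u32 {
            if let Some(&id) = self.ids.get(s) {
                return id;
            }
            let id = self.strings.len() as u32;
            self.strings.push(s.to_string());
            self.ids.insert(s.to_string(), id);
            id
        }

        fn resolve(&self, id: u32) -> &str {
            &self.strings[id as usize]
        }
    }

    fn main() {
        let mut interner = Interner::default();
        let a = interner.intern("hello");
        let b = interner.intern("hello");
        assert_eq!(a, b); // same string, same id
        println!("{} -> {:?}", a, interner.resolve(a));
    }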

> also `String` and `OsString`

Because these are different types with different semantics.

Why should there be one string type? Languages have to interface with the real world, where string data could be anything, but programs still need to be written for the common cases... hence, &str and String.
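
A small sketch of why those types stay separate (these are the standard Rust types; only the example values are made up):

    use std::ffi::OsString;
    use std::path::PathBuf;

    // Takes a borrowed view of UTF-8 text; no ownership, no copy.
    fn shout(s: &str) -> String {
        s.to_uppercase()
    }

    fn main() {
        // String: owned, growable, guaranteed valid UTF-8.
        let owned: String = String::from("hello");

        // &str: a borrowed slice of UTF-8 text (string literals are &str).
        println!("{}", shout(&owned));
        println!("{}", shout("world"));

        // OsString: whatever the OS gives you. Filenames are not required
        // to be valid UTF-8, so they get their own type and an explicit,
        // fallible conversion to String.
        let os: OsString = PathBuf::from("some_file.txt").into_os_string();
        match os.into_string() {
            Ok(utf8) => println!("valid UTF-8 name: {}", utf8),
            Err(raw) => println!("non-UTF-8 name: {:?}", raw),
        }
    }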


Text (aka strings) exists in virtually all software projects.

For me it is useful to distinguish between text, meaning something intended to be read by humans, and strings, meaning serial sequences of characters that may or may not be human-readable but will be processed by one or more computing automata. For example, in C the string "Hello World" is terminated by a null character. The null character is not part of the text the string encodes.

Or to put it another way, I find that treating strings and text as two different layers of abstraction clarifies my intent. Code that manipulates text is built on code that manipulates strings, and in between there's parsing that has to occur.
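
The C example, rendered through Rust's FFI types just to keep the illustrations in one language: the NUL terminator belongs to the string layer, not the text layer.

    use std::ffi::CString;

    fn main() {
        let text = "Hello World";

        // The text layer: 11 bytes of human-readable content.
        println!("text bytes: {}", text.len()); // 11

        // The string-for-the-machine layer: a C-style string carries a
        // trailing NUL so the automaton on the other side knows where to
        // stop. That byte is part of the encoding, not part of the text.
        let c_string = CString::new(text).expect("no interior NUL bytes");
        println!("C string bytes: {}", c_string.as_bytes_with_nul().len()); // 12
    }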


Working with strings as code causes editors to get confused, is hard to compose, and means you lose out on any kind of static analysis.

This is what I was referring to at the end of my post:

> Few work directly on strings. But they do it naturally, without enforcement - so like, a function might take a 'str', but the 'str' passed in was parsed into a wrapping structure already.

i.e., most programs in typed languages already do what you're saying: they parse the data directly into a structure, and therefore they validate some aspects of it naturally, so even when you do see a 'str' in typed code, it has very often already gone through some sort of parsing phase.
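
A minimal sketch of that pattern, with made-up names (sometimes called the newtype or "parse, don't validate" style):

    // The raw &str only exists at the boundary; past the constructor, the
    // rest of the program works with a type that is known to be valid.
    #[derive(Debug)]
    struct UserId(u32);

    impl UserId {
        fn parse(raw: &str) -> Result<UserId, String> {
            raw.trim()
                .parse::<u32>()
                .map(UserId)
                .map_err(|e| format!("not a valid user id: {e}"))
        }
    }

    // Downstream code never sees the raw string, so it can't forget to validate.
    fn greet(id: &UserId) -> String {
        format!("hello, user #{}", id.0)
    }

    fn main() {
        let input = "  42  "; // untrusted text from the UI, a file, the network
        match UserId::parse(input) {
            Ok(id) => println!("{}", greet(&id)),
            Err(e) => eprintln!("{e}"),
        }
    }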
