Hacker Read

jen20 · 2019-05-27 14:11:17+00:00

I assume given the rest of the context you were not implementing a FIX parser - but you’d find the same escape code layering violations if you were!

babuskov | karma 5151 | avg karma 4.35 · | 2016-06-10 10:33:30

I looked at the code... writing my own parser would be faster than fixing it. But that's beside the point. I decided to use a 3rd party library because I did not want to invest my time into that. The moment I had to even look at the source code, that was broken.

niftich | karma 12086 | avg karma 5.57 · | 2017-02-24 05:01:47+00:00

Your comment doesn't apply for this particular case, because the submission goes into great detail that the parser in question was written with Ragel, a parser generator. The code written by them in Ragel contained a bug, which lay uncaught and dormant for years, and manifested only when calling/wrapping code was altered.

jheriko | karma 1542 | avg karma 0.94 · | 2023-07-19 17:19:40

Good to see someone avoiding horrible parser generators, even if the code is ugly, poorly styled, and bug prone.

"In short, there are a few reasons that parsing is a mess, and none of those reasons are actually resolvable by parser generators."

I'm pretty sure this is untrue /and/ part of the problem. Build quality on these tools is appalling...

davemp | karma 2089 | avg karma 3.43 · | 2024-01-23 07:18:49

I don't see how writing a parser from scratch would mitigate bugs vs using a regex parser. Parsers are just a hot spot for security bugs that should get extra scrutiny.

imkevinxu | karma 1080 | avg karma 6.88 · | 2012-10-24 15:57:18

Awesome detective work there! Yes, you are correct: the basic idea of the parser is that it just replaces values for x, and while I did a check for 10x to be 10x, I did not check for x10 to turn into x10.

Just pushed the bug fix, should work now! Hacker News is awesome, thanks :)

chriswarbo | karma 7592 | avg karma 2.33 · | 2014-10-25 13:05:43

> Just write code to parse the language in a straightforward way.

This approach is why many consider parsing to be a solved problem, so it's certainly a valid approach. However, it's not the only valid approach.

For example, "straightforward" parsers often give terrible error messages: when the intended branch (eg. if/then/else) fails, the parser will backtrack and try a more general alternative (eg. a function call). Not only does this give an incorrect error (eg. "no such function 'esle'"), but it might actually succeed! In that case, the parser will be in the wrong state to parse the following text and will give a nonsensical message (eg. "unexpected '('" several lines later).

This is an important problem, since these messages can only be deciphered by those who know enough about the syntax to avoid hitting them very often! Inexperienced users will see error messages over and over again, and have no idea that they're being asked to fix non-existent errors in incorrect positions.
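
To make the backtracking problem concrete, here is a minimal sketch in Python. The toy grammar (an if/then/else form and a one-argument call) and every name in it (`parse_if`, `parse_call`, `parse_backtracking`, `parse_committed`) are made up for illustration and are not taken from any parser discussed in this thread. It contrasts a parser that tries each alternative and reports the last failure with one that commits to the if branch as soon as it has seen `if`.

```python
class ParseError(Exception):
    pass

KEYWORDS = {"if", "then", "else"}

def found_at(tokens, pos):
    return tokens[pos] if pos < len(tokens) else "end of input"

def expect(tokens, pos, word):
    if pos >= len(tokens) or tokens[pos] != word:
        raise ParseError(
            f"expected '{word}' but found '{found_at(tokens, pos)}' at token {pos}")
    return pos + 1

def parse_name(tokens, pos):
    if pos >= len(tokens) or not tokens[pos].isalpha() or tokens[pos] in KEYWORDS:
        raise ParseError(
            f"expected a name but found '{found_at(tokens, pos)}' at token {pos}")
    return tokens[pos], pos + 1

def parse_if(tokens, pos):
    pos = expect(tokens, pos, "if")
    cond, pos = parse_name(tokens, pos)
    pos = expect(tokens, pos, "then")
    yes, pos = parse_name(tokens, pos)
    pos = expect(tokens, pos, "else")
    no, pos = parse_name(tokens, pos)
    return ("if", cond, yes, no), pos

def parse_call(tokens, pos):
    func, pos = parse_name(tokens, pos)
    pos = expect(tokens, pos, "(")
    arg, pos = parse_name(tokens, pos)
    pos = expect(tokens, pos, ")")
    return ("call", func, arg), pos

def parse_backtracking(tokens):
    # Try every alternative from the start; report whichever failed last.
    error = None
    for alternative in (parse_if, parse_call):
        try:
            return alternative(tokens, 0)[0]
        except ParseError as exc:
            error = exc
    raise error

def parse_committed(tokens):
    # Once the leading 'if' is seen, commit: errors come from parse_if only.
    if tokens and tokens[0] == "if":
        return parse_if(tokens, 0)[0]
    return parse_call(tokens, 0)[0]

def error_message(parser, text):
    try:
        parser(text.split())
    except ParseError as exc:
        return str(exc)
    return None

typo = "if a then b esle c"
print(error_message(parse_backtracking, typo))  # blames token 0, far from the typo
print(error_message(parse_committed, typo))     # blames the misspelled 'esle'
```

In this sketch the backtracking version complains about the `if` at token 0, because the last alternative it tried was the call form, while the committed version points at the misspelled `esle`. Real grammars need more care about where committing is safe.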

ralphb | karma 141 | avg karma 3.71 · | 2022-11-21 11:03:10

I'm confused and very far from an expert here. What is wrong with parsers, and what is the alternative?

dotancohen | karma 9525 | avg karma 1.72 · | 2023-12-29 12:54:07

Well, then, you can rest knowing that other parser writers came to the same workaround as you did!

wallnuss | karma 340 | avg karma 5.96 · | 2020-05-10 12:49:32+00:00

It's being worked on. I am rather excited for the upcoming work that makes the parser replaceable and allows us to actually give good syntax errors! There is some discussion about making error printing more configurable so that one can skip stack-frames that are unlikely to be the cause (albeit that's a double-edged sword).

hctaw | karma 917 | avg karma 3.25 · | 2021-02-23 23:19:52+00:00

Well, anything that takes untrusted input that might need to be validated with parsing is a massive security hole. Parser generators don't fix this class of bug, they just change how it manifests.

TeMPOraL | karma 106045 | avg karma 3.04 · | 2021-06-07 08:53:28+00:00

Not if you're actually parsing, with an explicit parse step and a well-defined language. Exploits tend to happen when parsing is done by the so-called "shotgun parser" - aka. various checks and conditionals randomly scattered throughout your code, that implicitly define an input language that's different from what you think it is.
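
A rough sketch of the difference, in Python. The input format (`account=...;amount=...`) and the names `Transfer`, `parse_transfer`, and `shotgun_amount` are hypothetical and exist only for this example. The first function parses the whole input against one explicit definition before anything else runs; the second shows the scattered style, where the accepted language is whatever the checks happen to let through.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    account: str
    amount: int

# The whole input language, written down in one place (an assumed format).
TRANSFER_RE = re.compile(r"account=([A-Za-z0-9]{1,16});amount=([0-9]{1,9})")

def parse_transfer(raw: str) -> Transfer:
    """Explicit parse step: accept exactly the defined language or raise."""
    match = TRANSFER_RE.fullmatch(raw)
    if match is None:
        raise ValueError(f"not a valid transfer request: {raw!r}")
    return Transfer(account=match.group(1), amount=int(match.group(2)))

def shotgun_amount(raw: str) -> int:
    """Shotgun style: pull out what we need and hope the rest is fine.

    Implicitly accepts duplicate keys, stray fields, and negative numbers,
    none of which the author necessarily intended.
    """
    fields = dict(part.split("=", 1) for part in raw.split(";") if "=" in part)
    return int(fields.get("amount", "0"))

suspicious = "amount=5;junk;amount=-7"
print(shotgun_amount(suspicious))  # quietly yields a negative amount
try:
    parse_transfer(suspicious)
except ValueError as exc:
    print(exc)                     # the explicit parser rejects it outright
```

Code after `parse_transfer` only ever sees a `Transfer` value, so it does not need its own scattered validity checks.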

tseabrooks | karma 1313 | avg karma 3.83 · | 2012-07-17 09:52:48

Is it a bug though? Sounds like a conscious design limitation in the parser. A bad limitation, but not a bug. Sounds like it's behaving "as expected".

robryan | karma 6222 | avg karma 1.7 · | 2012-09-01 20:13:18

Without having investigated it, I would guess that the parser isn't abstract and modular enough so they end up with a mess of code trying to handle all the different possible combinations of syntax.

tester756 | karma 3905 | avg karma 1.96 · | 2021-07-18 16:40:07

Why are there so many parsing-related exploits?

likeclockwork | karma 713 | avg karma 1.35 · | 2013-02-19 17:01:54

My parser seems to be broken.

Zef | karma 30 | avg karma 1.88 · | 2011-07-05 14:17:25+00:00

Maybe he could have adapted an existing one, but I think the main issue was that parsers traditionally just quit when they find an error, whereas language.js has an error recovery feature (naughty or) to proceed with parsing.
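
For readers unfamiliar with error recovery, here is a small Python sketch of one common technique, panic-mode recovery: on an error, record it, skip ahead to a synchronizing token (`;` here), and keep parsing. This is not how language.js's naughty or works; the statement grammar (`name = number;`) and the names `StatementError`, `parse_assignment`, and `parse_program` are assumptions made up for this example.

```python
import re

class StatementError(Exception):
    def __init__(self, pos, message):
        super().__init__(message)
        self.pos = pos  # token index where the problem was noticed

def parse_assignment(tokens, pos):
    """Parse one 'name = number ;' statement starting at tokens[pos]."""
    def need(pattern, what):
        nonlocal pos
        if pos >= len(tokens) or not re.fullmatch(pattern, tokens[pos]):
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise StatementError(
                pos, f"expected {what} but found '{found}' at token {pos}")
        pos += 1
        return tokens[pos - 1]

    name = need(r"[A-Za-z_]\w*", "a name")
    need(r"=", "'='")
    value = int(need(r"[0-9]+", "a number"))
    need(r";", "';'")
    return name, value, pos

def parse_program(text):
    """Return (bindings, errors); one bad statement does not stop the parse."""
    tokens = re.findall(r"[A-Za-z_]\w*|[0-9]+|\S", text)
    pos, bindings, errors = 0, {}, []
    while pos < len(tokens):
        try:
            name, value, pos = parse_assignment(tokens, pos)
            bindings[name] = value
        except StatementError as exc:
            errors.append(str(exc))
            # Panic mode: resume just past the next ';' after the error.
            pos = exc.pos
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1
    return bindings, errors

bindings, errors = parse_program("a = 1; b = ; c = 3;")
print(bindings)  # the statements on either side of the bad one still parse
print(errors)
```

A parser that quits on the first error would never reach `c = 3;`, which is the behaviour the comment above is describing.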

harperlee | karma 4912 | avg karma 3.77 · | 2020-11-26 17:31:13

That’s not like what was said above. They said that a strict parser would choke on unrecognized tags, thus making experimentation non-viable.

Sloppy programming is not about enabling new syntax at all. That simile is not useful.

ReleaseCandidat | karma 1898 | avg karma 2.07 · | 2024-01-20 08:38:51

> I haven't found a parser generator that makes it painless to provide good error messages.

This. And letting the user add their own infix function with configurable precedence and associativity is easy using Pratt too.
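
A minimal sketch of that idea in Python, limited to integers, parentheses, and binary infix operators (for that subset, Pratt parsing amounts to precedence climbing). The `Pratt` class and its `add_infix` method are hypothetical names for this example, not an existing library API. Operators live in a table, so a user can register a new one with its own precedence and associativity at runtime.

```python
import re

class Pratt:
    def __init__(self):
        # symbol -> (precedence, right_associative, function)
        self.infix = {}

    def add_infix(self, symbol, precedence, func, right_assoc=False):
        self.infix[symbol] = (precedence, right_assoc, func)

    def evaluate(self, text):
        # Numbers, parentheses, and runs of other non-space characters.
        tokens = re.findall(r"[0-9]+|[()]|[^\s0-9()]+", text)
        value, pos = self._expr(tokens, 0, 0)
        if pos != len(tokens):
            raise SyntaxError(f"unexpected token {tokens[pos]!r}")
        return value

    def _expr(self, tokens, pos, min_prec):
        left, pos = self._atom(tokens, pos)
        while pos < len(tokens) and tokens[pos] in self.infix:
            prec, right_assoc, func = self.infix[tokens[pos]]
            if prec < min_prec:
                break
            # Left-assoc: the right side may only hold tighter operators.
            # Right-assoc: it may also hold the same operator again.
            next_min = prec if right_assoc else prec + 1
            right, pos = self._expr(tokens, pos + 1, next_min)
            left = func(left, right)
        return left, pos

    def _atom(self, tokens, pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if tok == "(":
            value, pos = self._expr(tokens, pos + 1, 0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            return value, pos + 1
        if tok.isascii() and tok.isdigit():
            return int(tok), pos + 1
        raise SyntaxError(f"expected a number but found {tok!r}")

calc = Pratt()
calc.add_infix("+", 10, lambda a, b: a + b)
calc.add_infix("-", 10, lambda a, b: a - b)
calc.add_infix("*", 20, lambda a, b: a * b)
calc.add_infix("^", 30, lambda a, b: a ** b, right_assoc=True)
# A user-defined operator that binds more loosely than '+':
calc.add_infix("max", 5, max)

print(calc.evaluate("2 + 3 * 4"))        # 14
print(calc.evaluate("2 ^ 3 ^ 2"))        # 512, because '^' is right-associative
print(calc.evaluate("1 + 5 max 2 * 2"))  # 6, i.e. (1 + 5) max (2 * 2)
```

Adding an operator is a single table entry and the parsing loop itself never changes.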

jgalt212 | karma 2685 | avg karma 0.84 · | 2017-11-17 15:08:30

A parser that is 97% correct is broken.