Well, that is just choosing the semantically correct parse from multiple syntactically correct parses. This parser isn't even finding syntactically correct parses.
I looked at the code... writing my own parser would be faster than fixing it. But that's beside the point. I decided to use a third-party library because I did not want to invest my time in writing a parser. The moment I had to even look at the source code, that premise was broken.
I don't think you actually disagree with the author. I think they would basically agree with everything you wrote and just add on, "therefore write your parser so it actually does recover correctly." Which is what most of the post boils down to.
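For what it's worth, here is a rough sketch of what "recover correctly" could look like. It uses plain panic-mode recovery on a toy grammar (`name = number;`). The grammar and the names `tokenize` and `parse_assignments` are made up for illustration; none of this comes from the post or from the library being discussed.

```python
import re

# Toy tokenizer: numbers, identifiers, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")


def tokenize(text):
    """Split text into (kind, value) pairs; whitespace is skipped."""
    tokens = []
    for number, name, other in TOKEN.findall(text):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("name", name))
        elif other.strip():
            tokens.append(("punct", other))
    return tokens


def parse_assignments(text):
    """Parse statements of the form `name = number ;`.

    Returns (assignments, errors). A malformed statement is reported,
    then skipped, so later statements still get parsed.
    """
    tokens = tokenize(text)
    assignments, errors = {}, []
    i = 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        kinds = [kind for kind, _ in stmt]
        values = [value for _, value in stmt]
        if (kinds == ["name", "punct", "num", "punct"]
                and values[1] == "=" and values[3] == ";"):
            assignments[values[0]] = int(values[2])
            i += 4
            continue
        errors.append(f"bad statement at token {i}: {values}")
        # Panic-mode recovery: discard tokens up to and including the next ';'.
        while i < len(tokens) and tokens[i] != ("punct", ";"):
            i += 1
        i += 1
    return assignments, errors


assignments, errors = parse_assignments("a = 1; b = = 2; c = 3;")
```

The one bad statement in the middle costs exactly one error message, and `a` and `c` are still parsed. A real parser would pick its synchronization points more carefully than "next semicolon", but the shape is the same.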
It's a shame parsers are such a PITA to write. So many problems could be trivially solved if writing a grammar and generating a parser for it were in any way a pleasant process.
Without having investigated it, I would guess that the parser isn't abstract and modular enough so they end up with a mess of code trying to handle all the different possible combinations of syntax.
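To make "abstract and modular" a bit more concrete: one common way to get there is parser combinators, where each piece of syntax is a small function and the grammar is built by composing them. This is only a minimal sketch with made-up helper names (`literal`, `seq`, `alt`, `many`); it is not how the library in question is structured.

```python
def literal(s):
    """Parser that matches the exact string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return parse


def seq(*parsers):
    """Parser that runs each parser in order; fails if any one fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Parser that returns the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def many(parser):
    """Parser that applies `parser` zero or more times."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None:
                return values, pos
            value, new_pos = result
            if new_pos == pos:  # no progress; stop to avoid looping forever
                return values, pos
            values.append(value)
            pos = new_pos
    return parse


# Each rule is defined once and reused, instead of hand-coding every combination.
digit = alt(*[literal(d) for d in "0123456789"])
number = many(digit)
pair = seq(number, literal(","), number)

result = pair("12,34", 0)
```

Each parser takes the text and a position and returns `(value, new_position)` or `None`, so adding a new bit of syntax means writing one more small function rather than touching every place that handles a combination.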
So what you're saying is that, except for all the times you come back to work on the code because something broke, it works reliably? Nothing about the parser could be improved so that it doesn't break on data format changes? Nothing could be improved so that, instead of alerting you to failures, it could be proactively adjusted to accept new formats before something fails?