
Does it use NFA?

http://swtch.com/~rsc/regexp/regexp1.html

Because the issue with the URL regex mentioned is with backtracking.




Sigh. https://swtch.com/~rsc/regexp/regexp1.html

Sadly, regex has evolved far away from the original regular expressions we learnt in school, and it is certainly less NFA-like. This makes it harder to execute at high speed, e.g. backtracking makes matching more context-sensitive.


Here is some further reading on the topic that I found useful.

https://swtch.com/~rsc/regexp/regexp1.html


I always use https://regex101.com when I write regex. The interface is great and provides instant feedback so you can dial in your expressions quickly. But to answer your question directly, no. I don’t really know it.

I found this today, I think it's worth reading. It's regarding A) http://swtch.com/~rsc/regexp/regexp1.html

http://swtch.com/~rsc/regexp/regexp1.html should be required reading before any developer tries to become a "regex wizard".

Also see: A series of posts on regex parsers by Russ Cox.

https://swtch.com/~rsc/regexp/


Regex101 is especially useful; I use it to check for catastrophic backtracking¹ before committing a regex.

¹ https://www.regular-expressions.info/catastrophic.html
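
For anyone wondering why nested quantifiers blow up, here is a rough sketch in Python. It assumes a hypothetical pattern like `(a+)+b` (my example, not one from the thread) run against a string of n `a` characters with no `b`. A naive backtracking matcher can end up trying every way of cutting that run into groups before it gives up, and the helper below (`count_splits`, a name I made up) just counts those ways.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n):
    """Count the ways to cut a run of n identical characters into non-empty groups."""
    if n == 0:
        return 1  # nothing left to split: exactly one complete way
    # Pick the length of the first group, then split whatever remains.
    return sum(count_splits(n - first) for first in range(1, n + 1))


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, count_splits(n))
```

The count doubles with every extra character (it is 2^(n-1) for n >= 1), which is the shape of the slowdown you would expect to see in a regex debugger.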


You might be interested in this: http://swtch.com/%7Ersc/regexp/regexp1.html

Most don't, but re2 and a few others do. The ones that do avoid exponential runtime on malicious inputs, at the cost of a few features (mostly backreferences). https://swtch.com/~rsc/regexp/ is a great resource on this.
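
To illustrate the difference, here is a minimal Python sketch, not how re2 or any real engine is implemented. It handles a toy dialect I made up for the demo (literal characters, each optionally followed by `?`) and compares two strategies on the `a?^n a^n` pattern from the Russ Cox article: tracking the set of possible NFA states, versus a naive backtracking search that counts its own calls. All function names are my own.

```python
def parse(pattern):
    """Parse the toy dialect into a list of (char, is_optional) items."""
    items = []
    i = 0
    while i < len(pattern):
        optional = i + 1 < len(pattern) and pattern[i + 1] == "?"
        items.append((pattern[i], optional))
        i += 2 if optional else 1
    return items


def closure(items, states):
    """Extend a set of positions with everything reachable by skipping optional items."""
    result = set()
    for s in states:
        while True:
            result.add(s)
            if s < len(items) and items[s][1]:
                s += 1
            else:
                break
    return result


def nfa_match(pattern, text):
    """Set-of-states simulation: work is bounded by len(text) * len(pattern)."""
    items = parse(pattern)
    states = closure(items, {0})
    for ch in text:
        advanced = {s + 1 for s in states if s < len(items) and items[s][0] == ch}
        states = closure(items, advanced)
    return len(items) in states


def backtrack_match(pattern, text):
    """Naive greedy backtracking; returns (matched, number_of_recursive_calls)."""
    items = parse(pattern)
    calls = 0

    def go(i, j):
        nonlocal calls
        calls += 1
        if i == len(items):
            return j == len(text)
        ch, optional = items[i]
        # Try consuming a character first, then fall back to skipping the item.
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)

    return go(0, 0), calls


if __name__ == "__main__":
    n = 16
    pattern, text = "a?" * n + "a" * n, "a" * n
    print(nfa_match(pattern, text))
    print(backtrack_match(pattern, text))
```

Both strategies agree on the answer, but the backtracking version makes more than 2^n calls on this input, while the state-set version does one pass over the text.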

You can always use tools like Regex101[0] to verify if they actually work or not. I have tried a few generated by the AI, and it seems to do the job most of the time.

[0]: https://regex101.com/


Seems broken. It doesn't implement a proper regular expression engine.

http://regexr.com/3f8ge

cf. https://swtch.com/~rsc/regexp/regexp1.html


Hopefully they use re2[1] or a similar regexp engine without backtracking.

[1]: https://code.google.com/p/re2/


Alternatively, where possible, use a regular expression engine that does not have that issue: https://swtch.com/~rsc/regexp/regexp1.html

By coincidence, I found this link a bit earlier today. It tries to avoid flavors and exotic syntax.

https://rexegg.com/regex-quickstart.html


For more background on this library, read Russ' excellent articles on regular expressions:

http://swtch.com/~rsc/regexp/


It sounds like someone read Russ Cox's work on regexp.

http://swtch.com/~rsc/regexp/

To future regexp implementors: Do it this way first!


boom. https://regex101.com/r/PxSY4U/1 technically it does parse it. :P

Russ Cox has an excellent write up on it.

"Regular Expression Matching Can Be Simple And Fast" (2007)

https://swtch.com/~rsc/regexp/regexp1.html


Note that this "regex", unlike the ones in the article, is not actually a regular expression. You could perhaps call it an irregular expression. There can be no such thing as a backreference in a regular expression.

More info: https://swtch.com/~rsc/regexp/regexp1.html
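
To make the backreference point concrete, here is a small sketch using Python's `re` module. The pattern `(a+)b\1` is my own illustrative example, not the regex being discussed. It accepts only strings of the form a^n b a^n, a language no finite automaton can recognize, so an engine that supports `\1` is doing more than regular-expression matching.

```python
import re

# \1 must repeat exactly what group 1 captured, so the matcher has to
# "remember" an unbounded amount of input. Illustrative pattern only.
PATTERN = re.compile(r"(a+)b\1")


def same_run_on_both_sides(text):
    """True if text is some run of 'a', then 'b', then the same run of 'a'."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("aba", "aabaa", "aabaaa", "abaa"):
        print(sample, same_run_on_both_sides(sample))
```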
