Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login

I guess you mean "For real?" as in "Is this really a primality test?" Yes, it is, and Krumins describes how it works in the post that I linked (http://www.catonmat.net/blog/perl-regex-that-matches-prime-n...).

Essentially, although it looks complicated, it's the simplest possible idea: it's just trying factorisations, with the `(11+?)` part being the factor, and the `\1+` part insisting that we repeat that factor. It is of course wildly inefficient, and limitations in the backtracking engine mean that it can give false positives for large numbers (http://zmievski.org/2010/08/the-prime-that-wasnt, also linked from that post).



sort by: page size:

I think it’s worth noting that the original author of the regex primality test is Abigail, a well known Perl hacker.

https://www.masteringperl.org/2013/06/how-abigails-prime-num...

This blog article does a nice job discussing Abigail’s work.


Meh.

I think an even greater Regexp trick is the regular expression that determines primality:

http://stackoverflow.com/questions/3296050/how-does-this-reg...


For some this will be obvious, but for others it may still be interesting:

If this was a "true" regular expression, in the sense that it describes a regular language (http://en.wikipedia.org/wiki/Regular_language), then checking whether the number is prime or not, by checking if the expression matches, would run in linear time. This would be worth a few nobel prizes at least.

Unfortunately, it is an "extended" regular expression, i.e. not regular at all in the strictest sense, and (as the article says) requires backtracking.


Could the AKS primality test, which showed that PRIMES is in P, be implemented as an honest regular expression? Presumably not because then we'd know that PRIMES is not just in P but is a regular language, which would be too good to be true.

https://en.wikipedia.org/wiki/AKS_primality_test


You're mostly right, I made a similar point few months ago[1].

However, I must disagree that we are able to use regular expressions to test for primality all the number we're interested in. Your argument only proves that there exists a regular expression that tests for primality all the numbers below N, where N is arbitratily large. It does not shows any way how to construct it[2], nor does it guarantee that the DFA we will be using to test for primality will fit in available memory.

[1] - http://news.ycombinator.com/item?id=3046623

[2] - It's actually trivial: example regular expression we're looking for is /a^p_1|a^p_2|...|a^p_N/, where p_1, ..., p_N are primes below. Sadly, to use it to test for primality, we need to know if the number is prime beforehand.

Actually, it sounds like an interesting problem -- to come up with an interesting and nontrivial regular expression schema that for every N gives us a regular expression that tests for primeness all numbers less than N.

Come to think of it, the existence of any such interesting schema seems highly unlikely -- the Parikh's theorem implies that the set of lengths of words matched by regular expression (or even context-free grammar) seems to be too constrained to allow for such scheme -- we can easily find non-prime matched by an such regular expression, and Parikhs theorem seems to imply that it is not much bigger than the automaton state count.


Perl's regex engine is really fast, so presumably:

    perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'

For real?

     # Check if a number is a prime
     perl -lne '(1x$_) !~ /^1?$|^(11+?)\1+$/ && print "$_ is prime"'

As this thread mentions cryptic code, Perl, regexps, and primes, I'm compelled to paste the cryptic Perl regexp to determine primes:

  /^1?$|^(11+?)\1+$/

Invented in 1998 by Perl hacker Abigail:

http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-...

(check out Abigail's other JAPHs if you like stuff like this)


Me too. I had to look it up. This page has pretty good breakdown:

https://itnext.io/a-wild-way-to-check-if-a-number-is-prime-u...

The main trick for me was you first have to convert the number to unary, which was done outside of the regex.


There is literally an entirely deterministic algorithm for checking primality, so you can be 100% sure that a randomly-generated number is prime: https://en.wikipedia.org/wiki/AKS_primality_test

Note: this is not a regular expression in the technical sense. It's common to talk about "Perl-compatible regular expressions" (PCREs) or regexes to avoid confusion.

No regular expression can check for prime numbers because the language { 1^n | n is prime } is not regular. You can see this by applying the pumping lemma: http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_langu...


In case anyone is wondering, gpt4 provides the same solution. It takes some prodding though, and you get treated with a lecture of how inefficient it is before it provides the solution.

"Apologies for any confusion before, and I understand the nature of your request now. It's a fun challenge to think about! While it's not practically useful or efficient, and it is actually quite limited, a creative theoretical solution exists that involves regular expressions. This specific solution only works for very small numbers due to the inherent limitations of regex. It's based on the principle that a number is prime if it cannot be expressed as a product of two smaller natural numbers. We represent the number as a repeated string of 1s, then check for repeating sub-patterns.

For instance, this Perl regex checks if a number, represented as unary (a string of 1s), is prime:

perl Copy code /^1?$|^(11+?)\1+$/ In this regex:

^1?$ checks for the case where you have zero or one '1', which represent the numbers 0 and 1, neither of which are prime. ^(11+?)\1+$ attempts to match the input with a repeating group of '1's. If a match is found, the number is not prime, because a repeating subgroup of '1's represents a divisor. This pattern should be used in a negative sense: if a string of 1s (representing your number in unary form) matches this pattern, it's not prime. If it doesn't match, then it is prime.

Again, this is quite impractical for real use cases as the unary representation is extremely inefficient, and the computation grows quickly with larger numbers, but it's a neat way of showing how flexible and powerful regular expressions can be."


Can you decide if a number is prime using a true regular expression?

No. This won't work because the time cost of primality testing is too high (cubic in the number of binary digits?), and checking if a string satisfies a regular expression takes only linear time. Since cubic is bigger than linear, this won't work. Too good to be true!

Not a proof, but a rule of thumb. No need to even use the pumping lemma.


The prime number detector is quite clever. It reminds me of the number based logic constructs from Godel's incompleteness theorem. It's essentially maps a mathematical concept to a string and operates on it.

For any unary integer greater than 1, the regular expression /^(11+?)\1+$/ matches it iff the number is not a prime. In other words, if there exists no integer less than the number itself and greater than one (first capturing group) that is a divisor of the number. The non-greedy + is also a nice touch, it will start out from the smallest factors and build from there.


Modern prime testing is quite far from brute force. Tests such as AKS[1], ECPP[2], and APR[3] can determine whether a number is prime in polynomial time (in the number of bits/digits), whereas a true brute force search would be exponential.

[1]https://en.wikipedia.org/wiki/AKS_primality_test

[2]https://en.wikipedia.org/wiki/Elliptic_curve_primality_provi...

[3]https://en.wikipedia.org/wiki/Adleman%E2%80%93Pomerance%E2%8...


Performing a Miller-Rabin primality test or filling a Sieve of Eratosthenes will almost always be the way to go. However, this line of code seems like magic and that's kind of impressive.

This is indeed beautiful. But the author limits itself to a low-level unpacking of the regex and running a few tests cases, all the while remaining mystified as to why this works.

A high-level description helps makes obvious what's happening:

  To check the primality of a number N with string matching,     
    repeat a string (say, "1") N times
    and declare N prime:
      UNLESS the repetitions match the string 0 or 1 times
      OR the repetitions can be EVENLY matched into groups of MORE than one string.

The pattern detects if a number N is not prime by searching for A and B such that N = A*B. The (11+?) searches for A bigger than 1, and the repetition of this group (the \1+ part) repeats A as much as possible, eventually B times. If an exact match isn't possible, it backtracks and advances A. If no such A and B exist, A will keep increasing until it becomes bigger than N, and then the match fails.

An interesting pattern that wasn't immediately obvious, but a very inefficient way to detect primes :)

next

Legal | privacy