Hacker Read top | best | new | newcomments | leaders | about | bookmarklet login
The number of legal chess positions estimated at 4.5x10^44 – proof games wanted (github.com) similar stories update story
1 points by tromp | karma 9374 | avg karma 3.0 2021-09-05 05:18:55 | hide | past | favorite | 101 comments



view as:

Cool project!

Why do they make such a big deal out of having numbered the positions? That's kind of a basic thing about a countable set, you can enumerate its elements if you want. We already knew the set of legal chess positions was countable.

Because it's very hard to make a numbering that's both:

1. compact, in that it uses a relatively narrow range of numbers.

2. efficiently computable, in that you can quickly map numbers (or preferably, thousands of them) to positions, and back.

The combination of these is what allows for estimating the number of legal positions.


Thanks, that is helpful! Let me ask a more specific question. Combinatorics gives us lots of ways to calculate the sizes of sets, and in my experience, they rarely if ever involve an explicit enumeration. Why then is an explicit enumeration an important milestone in this particular problem? To a combinatorialist, it would be more interesting to see the more-easily-countable-structure you're presumably mapping these positions to/from.

This chess position ranking wouldn't be possible without making use of such combinatorics, as you can see with the use [1] [2] of the multinomial ranking [3].

You're right that counting is much easier than ranking. That's reflected in my Haskell counting program [4] being MUCH simpler and MUCH faster than the ranking program. But counting is only good for upper bounding and doesn't let you draw a random sample to determine the fraction of legal ones, as needed for estimation.

[1] https://github.com/tromp/ChessPositionRanking/blob/main/src/...

[2] https://github.com/tromp/ChessPositionRanking/blob/main/src/...

[3] https://github.com/tromp/ChessPositionRanking/blob/main/src/...

[4] https://github.com/tromp/ChessPositionRanking/blob/main/src/...


So at a high level, you need the ranking because you need to do computations on individual elements of the set (e.g. random samples) to refine the estimate. Which is not necessary in a lot of ordinary combinatorics problems, but is here because this one is so hard. Cool, thank you!

I can imagine this project being of interest in the seminar of the right university department.


> Combinatorics gives us lots of ways to calculate the sizes of sets, and in my experience, they rarely if ever involve an explicit enumeration.

I don't think that this is true. I think that explicit enumerations are rarer than exact size computations, which are themselves rarer than asymptotic size estimations, for presumably obvious reasons; but a bijective proof (https://en.wikipedia.org/wiki/Bijective_proof) is still regarded as the gold standard in combinatorics, and as a desideratum.


A bijective proof isn't the same as an enumeration, which is a bijection to the set 1,2,3...

Of course you can derive an enumeration by bijection to a set with an existing enumeration, but the point is that that's usually not of interest.


I've experimented with using factoradic (mixed radix) numbers to index into the combinatorial gamestates for Magic the Gathering.

I think the knowledge that the set is countable and can thus be counted is, as you say, quite obvious. However, actually counting it or generating the complete set is terrifically hard in practice. There are many more illegal positions than legal ones (I suspect, would be curious to know if this is actually the case). Moreover each legal position may be the result of thousands of games played legally or thousands of games played illegally. Computers historically haven't been able to enumerate this set, or judge from a position if it is conclusively legal or illegal since the complexity of that question is immense. We're on the cusp of being able to do that and I think that's what makes this project more than a trivial exercise in counting countable things.

Are there any intelligent approaches to automating proof games? Can I just play the position backwards, unwinding it with a bias to returning pieces to their home positions?

Also isn’t this relatively easy if you can just waste turns (pointless moves by a Queen) while returning pieces back home?

This is fascinating because on the surface it seems very do-able but I bet the devil is in the details.


Great question. This was foreseen long ago (1500s?); the turn wasting thing is limited by the fifty move rule. A game ends in a draw if no pawn is moved or piece captured for 50 moves. Since pieces can only be removed not added and pawns can only move forward, never backward, this proves that cycles are not possible in chess (they certainly would be without this rule of course)

See https://en.m.wikipedia.org/wiki/Fifty-move_rule


Wouldn’t the 3-fold repetition rule also prevent cycles?

https://en.m.wikipedia.org/wiki/Threefold_repetition


Yes, definitely. I hadn't remembered that one. You could have much longer chains with the 3 fold repetition rule alone though. I guess both together potentially make it easier on the players and arbiters

It seems the 50 move rule is probably older according to Wikipedia (1800s vs 1500s), which is kind of surprising to me, I would have expected the other way around. It's easier to keep track of "moves since last capture/pawn advance" than every board position played so far. Maybe just in theory not in practice. I am a terrible chess player.


You should find it unsurprising that the simpler rule to keep track of came first, right? :)

Well when you put it that way I feel silly thinking otherwise haha

Which begs the question: without the fifty-move rule (only threefold repetition), what is the longest legal chess game?

Let's assume the 50-move rule is automatically applied. We can have 49 moves per pawn move, followed by 49 moves per piece captured. A maximum of 8 pawns (4 white and 4 black, for example) can possibly be advanced over the course of the game to the opposite side (6 squares ahead from their home square). Therefore, it seems to me that the maximum number of pawn moves could involve one player advancing their pawns 4 spaces, followed by the opposing player capturing all those pawns (using pieces rather than pawns, as both moving a pawn and capturing a pawn would be a waste), and then advancing their pawns to the back rank. This would be 8x4 + 86 = 80 pawn moves, plus 8 pawn captures. That's 88 allowances for 49-move knight sequences, plus the initial 49-move knight sequence made from move one. So, with no pawns remaining, and all of black's pawns converted to pieces on the back rank, we have 8949=4361 moves. Now, there are 22 pieces and 2 kings remaining on the board. That puts an upper limit of 22 captures, leading to an additional 2249 allowable moves. All together, that is 11149=5439 total moves. I'm probably off by a few.

> A maximum of 8 pawns (4 white and 4 black, for example) can possibly be advanced over the course of the game to the opposite side (6 squares ahead from their home square).

Not so! Pawns move diagonally via capturing, which theoretically could allow white to promote all 8 pawns just in the d column, and black to promote all 8 in the e column (at the expense of various other pieces).

edit: of course, you'd run out of pieces first, but you can accomplish something similar by using multiple columns.

Sample game start that accomplishes a subset of this (2 pawns promoted from each side, 6 pawns remaining from each side):

1. Nf3 Nc6 2. Ng5 Nb4 3. Ne6 Nd3+ 4. exd3 dxe6 5. Ke2 Qd7 6. Kf3 Qc6+ 7. Kg3 e5 8. d4 e4 9. d5 e3 10. d6 e6 11. d7+ Ke7 12. d4 Kf6 13. d8=R e2 14. Rd5 e1=R 15. Ra5 Re5 16. d5 Rh5 17. d6 e5 18. d7 e4 19. d8=R e3 20. Rdd5 e2 21. Rdb5 e1=R

https://lichess.org/wtJoIdT3#42


> A maximum of 8 pawns (4 white and 4 black, for example) can possibly be advanced over the course of the game to the opposite side

I think they all can.

If white’s A pawn takes a black piece in the B file and black’s B pawn takes a white piece in the A file, both white’s two B pawns and black’s two A pawns have a clear route to promotion.

You can do the same for files C and D, E and F, and G and H, at the cost of 8 taken pieces.

Of course, that “white’s A pawn takes a black piece” is wasteful in the sense that it is both a pawn move and a piece being taken, but I think it still is better than letting the pawns end up facing each other without being able to move more.


> http://tom7.org/chess/longest.pdf

This paper concludes 17,697 moves. 50-move rule is not automatically applied per FIDE rules.


The 75 move rule is, though.

Yes, that's the rule used to generate the game in the paper.

That probably involves creating the maximum number of pieces, by capturing 4 pawns to promote the 12 others. Let's give each side 1 king, 3 queens, 3 rooks, 3 knights, 2 light squared bishops, and 2 dark squared bishops.

Then have these move through as many board permutations as possible without triple repetition. This would give a multinomial of (64 choose 36 1 3 3 3 2 2 1 3 3 3 2 2) ~ 4.6 x 10^41 positions to move through, which could be multiplied by 2 for side to move (over estimating a lot since many positions will have the wrong king in check, or the right king in impossible check) and another factor 2 for first repetition, giving roughly a 2e42 upper bound.


For a hilarious read, which answers a related question, I recommend the paper "Is this the longest Chess game?", published on April 1st, 2020:

http://tom7.org/chess/longest.pdf


There was a sigbovik paper a few years back by the inimitable Tom Murphy that found this.

http://tom7.org/chess/longest.pdf


3 fold is by mutual agreement, if one side wants to continue they can. At 5 there is no choice

No mutual agreement is needed. Either player can decide to claim a draw by threefold repetition (§9.2) or 50 moves (§9.3). The only difference with the fivefold repetition and 75 move rules (§9.6) is that those draws are automatic.

https://handbook.fide.com/chapter/E012018


I stand corrected.

The threefold repetition has to be claimed, so the game can legally go on if both players overlook (or ignore) it.

There's also the fivefold repetition rule, which draws the game automatically.


The rule says a player can claim a draw, but they are not obligated to do. So, if you're a player and want to exhaust the search space, this only helps you if you seek a draw; but perhaps there would be a guaranteed win somewhere if you kept playing (and the opponent didn't claim their draw).

There's the usual limits where one may claim a draw (threefold repetition or 50 moves without capture or pawn move), but there also are limits where a draw is automatic and should be enforced by the arbiter even if players do not request it (fivefold repetition or 75 moves without capture or pawn move).

One could imagine some games require captures, castles and en passants to reach, so maybe in the general case, but devil indeed. I thought I recognized Tromp from Equihash work and indeed same person.

I was thinking about writing a reverse move generator for chessIO just recently. Shouldn't be too hard. The question is how large the tree will grow. Its a fascinating idea looking for useful applications :-)

I think it may be possible to automate finding what I would call proof game kernels, which is just the subsequence of pawn moves (including promotions) and pawn captures.

One could further simplify the search by abstracting away from the exact location of pawns, but recording only their color and order within each file.

Many of the 381 illegal positions can be proven illegal just by absence of such proof kernels.

Proof game kernels can also guide construction of full proof games.


Peter Österlund just outlined his tools and methodology for constructing proof games, which use similar ideas, at

http://talkchess.com/forum3/viewtopic.php?f=7&t=77685&p=9051...


Somewhat relatedly, François Labelle has some great pages on chess statistics and problems:

https://wismuth.com/chess/statistics-games.html

https://wismuth.com/chess/statistics-positions.html

https://wismuth.com/chess/chess.html


I in fact corresponded with Francois in the past 2 weeks about the possibility of using his retrograde analysis program jacobi [1] for finding proof games, which he was kind enough to explore. It turned out to be infeasible, even when providing the program with a sequence of requires pawn captures and promotions.

[1] https://wismuth.com/jacobi/


You nerds needs to leave chess alone and learn how to have fun without putting your fun through a computer.

"They're all legal. That is, reachable from the starting position in a valid game of chess."

Help me out on the legality of some of these boards. I see excess pieces on many boards (for example, 2 or more queens, 4 or more knights of one color).

Are these considered legal by pawn conversion?


Pawn promotions are legal moves, yes.

Yes. You do not need to choose a queen when the pawn reaches the opposite side.

Were redundant choices culled? No point in tracking all variations of pawn > queen/rook/bishop when one includes the moves of the others.

There is sometimes a point when you need to avoid giving stalement, like in the famous Savedraa study [1] or the vastly more intricate Babson task [2].

[1] https://en.wikipedia.org/wiki/Saavedra_position

[2] https://en.wikipedia.org/wiki/Babson_task


Sometimes the queen having both rook and bishop moves is a liability. There are positions (real positions from real games, not merely hypothetical ones) where promoting to a queen results in immediate stalemate, while promoting to a rook wins the game.

So those choices are not redundant.


There are even times when this is the right move. Sometimes you can put them in check by promoting to a knight, for example, where a queen would not. Common in beginner puzzles.

This is a very nice illustration of this, where a player had to turn off autoconversion to queen on the last seconds to win the game: https://youtu.be/HFG8pCInJKw


Another situation where you would want to convert to something other than a queen is when doing so would result in a stalemate.

So assuming all pawns of one color are converted, I guess 9 queens or 10 knights would be a legal, if unlikely, board setting, but 10 queens or 11 knights are not?

Yes. If you only have eight pawns, then you can only get eight extra pieces as a result of promotion.

The 10 queens situation is obviously impossible, but so is having 2 white queens and black has all of his pawns on the starting squares.

It's more difficult than you're thinking. If white has 6 dark squared bishops, they you know that he promoted at least 5 pawns on dark squares so one pawn must have taken a piece since it's promoting on a dark square instead of a white one. Not every board set-up will have enough missing pieces that 5 pawns can maneuver to get promoted on dark squares.


About the same order of magnitude of the number of air molecules in Earth's atmosphere.

Curiously, a similar observation (but just for carbon dioxide molecules) was made back in 2008 by Eric Holcomb in

http://www.nwchess.com/articles/misc/chessic_coincidence.htm


I'd call some board positions irrelevant, if a win situation had to be traversed to get there.

Can you define "win situation" ?

I assume he means a forced win. Meaning if the winning side makes the correct moves, the losing side cannot win.

We don't know whether the game is already a forced win for white before a single move is made. It could be. So this system is not workable because we don't know what positions are not forced wins.

AFAIK it's also unproven that it's not a forced win for black. (Not trying to correct you, I just think it's an interesting addition.)

Do you mean like if a mate in 2 was passed over? Seems strange to dismiss that since even great players will miss a winning position sometimes.

I think the goal is estimating all legal moves, not all you'd find in a typical game.

It would be interesting to define parameters for a "good move" and see how many there are, though!


So at only 10 seconds per game, we could annotate all these positions with Stockfish in a mere 10^38 years. We might even be done before all the stars go out!

Don't count on those stars! They might go out in a lot less years, if those guys in Tibet keep chanting all of those names.

> Black to move, and in the final position, the pawn at h4 can be captured en-passant.

> They're all legal.

But then, looking at the final position[1], black is in check by the knight on b2. If white's pawn can be captured en passant, this implies the knight was not the most recent move, so the black king was in check on the previous turn as well.

It's obviously not legal to remain in check. What am I missing?

[1] https://raw.githubusercontent.com/tromp/ChessPositionRanking...


To quote from Seinfeld: "It's not you. It's me"

I missed something, and clearly misclassified this position, leaving only 51 legal in the 1k set (and only 537 in the 10k set).

Oops. It's embarrassing since I did pay attention to the en-passant positions in the 919-93 positions not from the smaller 1k sample, such as this one: https://github.com/tromp/ChessPositionRanking/issues/916

I'll have to make some changes to my README. Meanwhile, congratulations on spotting the error. And I'll be sure to mention you in the eventual publication.

This underscores the need to back up my classification with proof games...


As I went to look for the corresponding issue, I was surprised to not find the position among the open issues.

In fact it was closed as

https://github.com/tromp/ChessPositionRanking/issues/915

So I had classified it correctly, but only AFTER making the diagrams for the 1k sample.

That means the 538 count for the 10k sample still stands. Good. Less things to fix on the README :-)


A misclassified issue was found by Github user 1pab42 at https://github.com/tromp/ChessPositionRanking/issues/878

That changes the prospective number of legal positions to 539...


But just today github user @mcdallas found another position misclassfied as legal [1], so the two misjudgements cancel out and the original 538 count currently still stands. Lovely:-)

[1] https://github.com/tromp/ChessPositionRanking/issues/794


Peter Osterland found yet another misclassification in

https://github.com/tromp/ChessPositionRanking/issues/824





Github user @Ananas1729 (Ben Jones) found another at https://github.com/tromp/ChessPositionRanking/issues/243

Why isn't it legal to remain in check?

The rules of Chess say so. Duh.

Ha!

It's funny-- except for Tenoke all the responses so far boil down to essentially this.

I'll try again.

Assume an alternate reality where this incredibly important rule doesn't exist. In it, the other player simply takes the king on the next turn from a neophyte who didn't realize they are still in check.

What bad things would happen as a result of the non-existence of the gratuitous rule?

And don't just speculate based on the zero cost of posting on HN. Give me the real reason based on the history of the creation of chess rules and/or the documented behavior of real chess players in history.


> Assume an alternate reality where this incredibly important rule doesn't exist. In it, the other player simply takes the king on the next turn from a neophyte who didn't realize they are still in check.

This is not merely an alternate reality; this is effectively the reality of blitz chess, where capturing the king is a way of claiming victory by declaring that the opponent made an illegal move when they didn't move out of check. In blitz chess you do not have the right to correct illegal moves after the opponent points them out.

https://www.reddit.com/r/chess/comments/jrvmk7/capture_the_k...


I feel as if I'm being gaslighted.

If the penalty for the specific illegal move of not getting oneself out of check is to lose the game, what could the practical significance of the rule possibly be?

The other respondent is claiming that chess would become a game of "gotcha," as if the rule somehow prevents chess players from faking injuries to distract the chess ref and get free moves.


That penalty applies only in blitz-chess.

In regular chess, the rule just helps a player avoid one particular kind of blunder. I would agree that makes it a poor excuse for a rule.


I'm so confused-- what is the penalty in normal chess for trying to move a piece when your king would be left in check?

There is no immediate penalty, normally the opponent notices the illegal move and points it out, and the player has to undo the illegal move and continue with a legal move. Now if the player disobeys, then the opponent is able to declare the game forfeit.

An interesting edge case is if the opponent does not notice the illegal move. See FIDE for details.


> What bad things would happen as a result of the non-existence of the gratuitous rule?

Chess would be a game of gotcha instead of the game it currently is.

The rule is there to support the preferences of the kind of people who think chess is fun.


I feel as if I'm asking something fundamental that hasn't been considered by insiders. As if I asked music afficionados why Beethoven's Appassionata Sonata couldn't be performed in F-sharp minor and the response is something like, "Because arbitrarily changing the keys of sonatas would make a travesty of classical music, and performing the piece in F minor supports the preferences of the people who enjoy listening to classical music."

In other words, complete horseshit.


> I feel as if I'm asking something fundamental that hasn't been considered by insiders.

You're not. You're asking a stupid question and complaining that other people don't share your views on what makes a game fun.


I don't know the history, but it's consistent with all the other rules, e.g. checkmate stops the game one move before the king is captured, stalemate occurs if the king would have to open himself up to capture, etc.

Chess is a politically-themed wargame featuring monarchy. The King, being top kick, makes the rules of the land. Of course the King would enact such a rule that makes putting/leaving him in danger illegal.

Think about actual, medieval monarchs in their castles, holding literal absolute power over everyone. If someone were to invent a board game featuring a King, and the King were to be made aware that you moved him into danger and got him captured, he might have your head.


Because the next move made by your opponent would trivially be to capture your king, ending the game.

Being in check means it’s your move and the pieces are positioned so that if it was your opponent’s move they could capture your king. So, to “remain in check” would mean to make a move which still allows your opponent to capture your king. If this were allowed the opponent would just capture your king and you would lose. Instead, the rules of chess require you to “get out of check”, i.e. make a move which prevents your opponent from capturing your king. If there is no such move you are said to be in “checkmate” and you lose.

If you remain in check you essentially lose the game during the next move.

It's a good question as most games, including chess have few rules about making otherwise legal moves illegal just because they might be suboptimal. In novice games the other player might not even realize they can win, so it being illegal to stay in check is an oddity but one that's actually in the rules.

As far as I’m aware, the only game-theory difference between chess and a variant of chess where moving into check is legal, is that stalemate exists. Otherwise the “can’t move into check” rule is basically useless.

This is pretty interesting, too. Linked in the README:

Number of legal Go positions https://tromp.github.io/go/legal.html


Question for Tromp: do you prefer working with computable big numbers or uncomputable big numbers?

I'm assuming you're refering to the number of legal Go positions versus the number of legal chess positions.

Obviously the former is more fun, since you feel like you scaled the summit when you determine the exact count in all its glory.

When you can only estimate, you cannot reach the summit, only reach a little higher than the previous explorer by spending much more effort on your estimation climb. I do feel some satisfaction in having established base camp though:-)


No, I mean the number of legal Go and chess positions vs Busy Beaver numbers. BB sequences don't have much in the way of summits, but they do allow for establishing basecamps more precisely.

I very much enjoyed determining functional busy beaver numbers [1].

Maybe somewhere in between the enjoyment from the computable numbers of Go and chess. There is no estimation involved there. Although there are lower bounds.

The other difference being that it's a much more esoteric subject (on account of the prevalence of the Turing Machine based definition) with a much more limited audience.

[1] https://oeis.org/A333479


Absolutely awesome project. I will definitely try to contribute as I can in finding proof games. Though, I want to understand the problem and all the methods and mathematics used a bit better. Do you have any recommendations of resources? Hell, even if you just list topics I should look into it'd be helpful.

You could check out all the references at the bottom of the README. The ranking implementation was heavily inspired by Brent Yorgey's Data.Enumeration.Invertible [1] that he blogged about at [2].

[1] https://hackage.haskell.org/package/simple-enumeration-0.2/d...

[2] https://byorgey.wordpress.com/2019/07/05/lightweight-inverti...


Legal | privacy