Can anyone elaborate on what “search” means in this context? It seems they are trying to determine the most probable Nash equilibrium, but I'm not sure what this means when applied to RL.
The "search" here refers to the idea that, in principle, you could search the entire space of possible hands of cards and exhaustively compute the optimal action by considering every possibility. However, as in the game of Go, this is computationally intractable, so instead they use machine learning to "guide" the search toward more promising "moves". In AlphaGo (and here) this learning happened as part of a reinforcement learning pipeline.
Imagine searching the entire game tree - billions of nodes, given the branching factor of chess.
Now use RL to help decide where to search and which branches to prune.
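To make the idea concrete, here's a minimal sketch (not the paper's actual algorithm) of search pruned by a learned prior, in the spirit of AlphaGo-style guided search. The game, `policy_prior`, and `value_estimate` below are toy placeholders standing in for learned networks:

```python
def policy_prior(state, moves):
    """Stand-in for a learned policy: score each legal move.
    This toy version just prefers larger moves; a real system uses a net."""
    return {m: m for m in moves}

def value_estimate(state):
    """Stand-in for a learned value function at the search horizon."""
    return state

def legal_moves(state):
    """Toy single-player game: from total n you may add 1, 2, or 3, capped at 10."""
    return [m for m in (1, 2, 3) if state + m <= 10]

def apply_move(state, move):
    return state + move

def guided_search(state, depth, branch_limit=2):
    """Depth-limited search that only expands the top `branch_limit`
    moves suggested by the policy, instead of the full branching factor."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return value_estimate(state)
    scores = policy_prior(state, moves)
    pruned = sorted(moves, key=lambda m: scores[m], reverse=True)[:branch_limit]
    # Take the best outcome among the branches that survived pruning.
    return max(guided_search(apply_move(state, m), depth - 1, branch_limit)
               for m in pruned)
```

With `branch_limit=2` each node expands two children rather than three, so the tree shrinks exponentially with depth; that's the whole trick, just with a learned network doing the scoring instead of a toy heuristic.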
This JPEG link downloaded a WebP file to my computer, and when I try to open it, it asks if I'm sure I want to open it. WebP seems to be an image format, but is it safe to open?
When I was in grad school, I was working on general game playing AI. Unfortunately, I was in a "pure logic" research group, founded on the old-school AI principles that believed AI could be derived from deterministic logic.
Of course, this limited the games that we could simulate to purely deterministic games (checkers, chess, go, etc.). Any game that included an aspect of chance required a hack like a "dice player" or a "deck player" that injected the random aspects of the game. This led to other problems, though, since the engines would try to calculate the current state of the game based on the "optimal" play of the random player.
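A toy illustration of that failure mode, in Python (the game and payoffs are made up): modeling a die roll as a "player" whose moves get maximized, versus averaging over the chance node as expectimax does.

```python
def payoffs(choice, roll):
    """Hypothetical game: pick 'safe' (fixed payoff of 3) or 'risky'
    (payoff depends on a fair six-sided die)."""
    if choice == "safe":
        return 3
    return 6 if roll >= 5 else 1  # risky pays off only on a 5 or 6

def best_choice_assuming_optimal_dice():
    """The bug: treat the die as a player that rolls whatever is best for us."""
    return max(("safe", "risky"),
               key=lambda c: max(payoffs(c, r) for r in range(1, 7)))

def best_choice_expectimax():
    """The fix: average over the chance node's outcomes."""
    return max(("safe", "risky"),
               key=lambda c: sum(payoffs(c, r) for r in range(1, 7)) / 6)
```

The buggy version picks "risky" (it imagines the die always rolls high), while expectimax picks "safe" (expected value 3 vs. about 2.67), which is exactly the kind of miscalculation a "dice player" treated as an optimizing agent produces.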
This is a much more interesting approach, and I imagine will prove to be far more useful.
Underlying paper here. Originally published several months ago and updated with more information last week.