
> Cho won a close game 1,[28] lost game 2 when his invasion into enemy territory was killed,[29] and won game 3. Zen uses neural network techniques similar to AlphaGo,[30] however ran on more modest hardware during the match.

Not sure if this suggests any comparison between AlphaGo and Zen. I'd like to see Cho play against AlphaGo.




>Hassabis at AAAI indicated DeepMind’s intent to try to train AlphaGo entirely with self-play. This would be more impressive, but until that happens, we may not know how much of AlphaGo’s performance depended on the availability of this dataset, which DeepMind gathered on its own from the KGS servers.

As an amateur (10k) Go player who watched the Fan Hui games, I can say that AlphaGo seems to rely heavily on its training data. All of its moves feel very human, even in circumstances where strong humans find better, weird-looking moves. This is in contrast to chess AI (and even other Go AI), which feels distinctly robotic in how it plays. Watching a professional (9p) commentary on the games, it seems that Fan Hui lost not because of particularly good moves by AlphaGo, but because of specific mistakes he made (this is not unusual, as most professional games come down to losing moves rather than winning moves). In this sense, AlphaGo seems to be playing at a human level, but with fewer mistakes.

This is certainly impressive, but unfortunately AlphaGo currently seems to be a demonstration of synthesizing and automating expert human knowledge, instead of creating new knowledge.


> Either way, it's interesting to note that AlphaGo had literally thousands of games to learn from to find weaknesses in human play, but Lee Sedol seems to have only needed 3 before he was able to find weaknesses in AlphaGo's play.

To be fair, we can't know how many games Sedol played in his own head to figure this out.


>> Cho Chikun had the large advantage over Lee Sedol that his opponent (Zen) and similar programs (Crazy Stone, Leela) are simply available for him to practice against and probe for weaknesses.

That's an argument against Go having been solved by AlphaGo, not for. If Lee Sedol could beat AlphaGo given the opportunity to practice against it, then nothing is solved.

Also, for a system to dominate humans at a game, it must be able to beat any opponent every time. AlphaGo hasn't demonstrated this yet, so while in popular opinion AlphaGo "dominates", in the technical sense it does not, and Go won't be "solved" conclusively until it has.


> It would be surprising if AlphaGo hadn't be trained on historic matches with Lee giving him an early edge until Lee can adapt.

This kind of learning doesn't work like that. It doesn't learn from specific examples or make meaningful inferences from single data points; it learns tiny gradients from millions of examples. If we had an approach that could produce a meaningfully distinct strategy depending on whether every game Lee Sedol has played in his life was included in or excluded from training, that would be wildly more significant than just beating him.
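To make the "tiny gradients" point concrete, here's a minimal sketch (a toy one-parameter model with hypothetical numbers, nothing like AlphaGo's actual training) showing how little any single training example moves the learned weights:

```python
# Toy model: fit w so that w * x ~ y, taking one small SGD step per example.
# The point: a single example nudges w only slightly; the learned value is
# shaped by the aggregate of the whole dataset, not by any one data point.

def sgd(examples, lr=1e-3):
    w = 0.0
    for x, y in examples:
        grad = 2 * (w * x - y) * x   # gradient of the squared error (w*x - y)**2
        w -= lr * grad               # tiny step in the direction that reduces error
    return w

data = [(1.0, 0.5)] * 100_000        # 100k identical toy "games"
w_full = sgd(data)
w_minus_one = sgd(data[:-1])         # same run, but with one example excluded
print(abs(w_full - w_minus_one))     # negligible: one example barely matters
```

Excluding one example changes the result by far less than the noise floor of any real system, which is the commenter's point about single data points.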


> I hoped that when an AI beat a pro at go, it would be with a more adaptive algorithm, one not specifically designed to play go.

The particular algorithm used by AlphaGo is of course specific to Go (the neural network inputs include a number of hand-crafted features), but the overall structure of the algorithm - MCTS, deep neural nets, reinforcement learning - is very general. So there are two ways to look at it. One is that what you wanted has actually transpired.

The other is that what you asked for is completely unreasonable. I think it highly unlikely that an algorithm not specialised to Go will ever be able to beat all specialist Go playing programs.

AlphaGo can't explain the outputs of its two NNs, but it can still explain its moves by showing which variations it thinks are likely.


> I never thought the AI player had a 'deep understanding' of Go any more than I think string sort routines

It has a neural network trained on millions of games. Of course the model captures more than a sort routine, which is non-parametric (in other words, not trainable from data). The game of Go is complex enough that we can't code it up manually.

The fact that AI is implemented on matrix multiplication does not detract from it; humans are implemented in electro-chemical reactions. These reactions are all local: no single cell in our body has the big picture.

On the other hand, human understanding is also brittle - how many Go players still can't beat the AI even after this article has been posted? Blind spots/adversarial attacks exist in both humans and AI. It was AlphaGo that influenced the way Go is played at the top levels by introducing new strategies, techniques and moves that were previously considered unconventional or suboptimal by human players. AlphaGo found blind spots in our thinking first.


>AlphaGo did a lot of things that the professional commentator found odd

AlphaGo is maximizing its odds of winning, not maximizing its score. Humans usually do the opposite, which is not an optimal strategy.

AlphaGo really is better.


>Ke Jie claimed he would beat AlphaGo [1]

He changed his mind after watching just one more match:

http://english.donga.com/List/3/all/26/527586/1

>But after watching three matches, he said, “AlphaGo was perfect and made no mistake. If the conditions are the same, it is highly likely that I can lose.”

>“As AlphaGo learns endlessly, all human beings could be defeated in the near future,” Ke said on AlphaGo’s capabilities.


> ... it must sometimes lose ...

Well, yea. In training, AlphaGo played against itself, and lost every one of those games. (Incidentally, it also won every one.)


> I think it's unlikely that the human mind tackles Go in the same way.

This is almost certainly true, which is what makes AlphaGo interesting to watch and study. The human mind, even one that has trained on Go for years on end, will still work with abstractions and ideas that do not relate to the game. AlphaGo and other computers lack this attribute, as any and all abstractions they may have learned relate entirely to the game.

Any ideas about the "human perception" of Go it may have gleaned from games included in the initial training dataset have, I suspect, long been supplanted by novel notions gathered during the phase where the neural nets played against themselves. These phases are documented in the AlphaGo blog from DeepMind[1].

I suspect that we may reach "human level intelligence", but that this intelligence will not arise in the same way. That is to say, computers will at some point match us in most tests of intelligence, but the solutions they devise will be completely novel.

[1] https://blog.google/topics/machine-learning/alphago-machine-...


"Meh" news because we had a higher profile match just recently: AlphaGo vs Lee Sedol (both much stronger than DeepZen or Cho).

I mean, it's mildly interesting that DeepZen uses fewer resources (just ~40 CPUs and 4 GPUs, compared to AlphaGo's ~2k CPUs and ~200 GPUs, IIRC). But that's relevant only to hardcore Go enthusiasts who would want to run such AIs on their own hardware.

In contrast, the next interesting milestones in Go AI are:

* AI beating a team of cooperating top pros, say top-10 humans vs AI (easy)

* solving Go (hard)

* finding ways to translate insights gained from observing strong Go AI into real-life situations relevant outside of Go (my personal favourite; Go has that fascinating simple-meets-complex abstracting property [1])

[1] https://rare-technologies.com/go_games_life/


> IIRC, the version without tree search beat the full version 25% of the time.

That would be amazing but it seems hard to believe. Any references?

I found this (which is also impressive):

    AlphaGo team then tested the performance of the policy 
    networks. At each move, they chose the actions that were 
    predicted by the policy networks to give the highest 
    likelihood of a win. Using this strategy, each move took 
    only 3 ms to compute. They tested their best-performing 
    policy network against Pachi, the strongest open-source 
    Go program, and which relies on 100,000 simulations of 
    MCTS at each turn. AlphaGo's policy network won 85% of 
    the games against Pachi! 
1. https://www.tastehit.com/blog/google-deepmind-alphago-how-it...

2. https://gogameguru.com/i/2016/03/deepmind-mastering-go.pdf


Quick info:

> This is where Kellin Pelrine steps in. Pelrine is a good player, but an amateur. Specifically, he's one level below the top amateur ranking. He's also one of the study authors, so he was well aware of the vulnerabilities of KataGo, so he thought why not try his own hand?

> Apparently, it was surprisingly easy to find a way to defeat AI by exploiting its weakness. Pelrine managed to beat KataGo 14 out of 15 times. For comparison, KataGo beat AlphaGo 100 times out of 100, and AlphaGo beat mankind's best player 4-1.

One caveat is that this is vs AlphaGo, which was trained on pro games and so has biases from its training data. I wonder if it would do as well against AlphaZero, which learned only from self-play. The article says KataGo is the strongest, so stronger than AlphaZero. Because of non-transitivity, it's also not known whether the same weaknesses exist in AlphaZero. The warning about weaknesses in AI/ML is still of great importance when applying the tech.


>What is interesting to me is that the computer makes clear mistakes when its on the lead. Since it might find the chances to win equally among different scoring results, it often picks a weaker one.

When they interviewed the devs briefly, they said this is because AlphaGo doesn't really consider the score other than winning, so it will pick a move it thinks is an 81% chance to win by one stone over an 80% chance to win by 10 stones. When it's ahead, these moves can look like mistakes, but a better way of describing them would be hedging.
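The selection rule described here can be sketched in a few lines (the candidate moves and probabilities are hypothetical, mirroring the 81%-by-1 vs 80%-by-10 example):

```python
# Each candidate move maps to (estimated win probability, expected margin).
candidates = {
    "solid_move":      (0.81, 1),    # 81% chance to win by 1 stone
    "aggressive_move": (0.80, 10),   # 80% chance to win by 10 stones
}

# AlphaGo's criterion: maximize the probability of winning at all.
by_win_prob = max(candidates, key=lambda m: candidates[m][0])

# A score-maximizer's criterion: maximize the winning margin.
by_margin = max(candidates, key=lambda m: candidates[m][1])

print(by_win_prob, by_margin)
```

The two criteria disagree on which move is "best", which is exactly why the win-probability move can look like a mistake to a human watching the score.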


Considering he's beaten Lee Sedol and Gu Li, who knows how this will play out.

But no doubt he'll talk to Sedol about his takeaway from playing AlphaGo, especially since part of playing against any computer is reverse-engineering its decision tree.


Correct. It played like a top level human player, pretty evenly matched with Lee Sedol. AlphaGo from yesterday would have wiped the floor with AlphaGo from 6 months ago.

Various commentators mentioned how both players, human and synthetic, made a few mistakes. Even I caught a slow move made by the AI. So whether Lee Sedol was at the top of his performance or not is a bit of a debate. But the AI was clearly on the same level, whatever that means.

It was an intense fight throughout the game, with both players making bold moves and taking risks. Fantastic show.


> AlphaZero won some games in a romantic style

The difference in style is likely influenced by the insertion of historical boards as input to the neural network.

The moves in a sequence are therefore more likely to look related to one another.
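A rough sketch of the history-as-input idea, assuming a small fixed history window (the board size, history length, and encoding here are illustrative, not AlphaZero's actual input format):

```python
# The network's input stacks the last few board positions as extra planes,
# so its output can depend on the recent move sequence, not just the
# current board. Boards here are tiny 3x3 grids of 0/1 for illustration.

HISTORY = 4   # how many past positions to include (assumed)

def make_input(history):
    # Pad with empty boards when the game is younger than HISTORY moves.
    empty = [[0] * 3 for _ in range(3)]
    recent = history[-HISTORY:]
    planes = [empty] * (HISTORY - len(recent)) + recent
    return planes          # shape: HISTORY x 3 x 3

boards = [[[1, 0, 0], [0, 0, 0], [0, 0, 0]],    # position after move 1
          [[1, 0, 0], [0, 1, 0], [0, 0, 0]]]    # position after move 2
planes = make_input(boards)
print(len(planes))   # 4: two empty padding planes plus the two real boards
```

Because the network always sees the recent positions together, features it learns can correlate consecutive moves, which plausibly contributes to the "related-looking" sequences noted above.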


> he will later point out to as achieving very good results some 20 moves later

This. It's a fairly common feature of any AI that uses some form of tree search/minimax, and the effect is very pronounced in chess. Even the best human players can only think 6-8 plies into the future, versus ~18 for a computer. What we can (could?) do is apply smarter evaluation functions to the board states resulting from candidate plays, and stop considering moves that look problematic earlier in the search (game-tree pruning). AI tends to use very simple evaluation functions that can be computed quickly, because 1) that allows for deeper search, and a weak heuristic evaluated far in the future often beats a strong one evaluated a few plies prior, and 2) for some games (like Go) it's really hard to codify the "intuitions" that human players speak of.
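The search structure described here can be sketched as depth-limited minimax with alpha-beta pruning over a toy game (the game, moves, and evaluation function are hypothetical stand-ins, not any real engine's):

```python
# Depth-limited minimax with alpha-beta pruning and a cheap evaluation
# function applied at the search horizon. Toy game: states are numbers,
# the moves from state s are s+1 and s-2, and the "evaluation" is just
# the state value itself.

def minimax(state, depth, alpha, beta, maximizing, moves, evaluate):
    if depth == 0:
        return evaluate(state)          # cheap heuristic at the horizon
    if maximizing:
        best = float("-inf")
        for nxt in moves(state):
            best = max(best, minimax(nxt, depth - 1, alpha, beta, False, moves, evaluate))
            alpha = max(alpha, best)
            if beta <= alpha:           # prune: the opponent won't allow this line
                break
        return best
    else:
        best = float("inf")
        for nxt in moves(state):
            best = min(best, minimax(nxt, depth - 1, alpha, beta, True, moves, evaluate))
            beta = min(beta, best)
            if beta <= alpha:           # prune: we won't choose this line
                break
        return best

score = minimax(0, 6, float("-inf"), float("inf"), True,
                lambda s: [s + 1, s - 2], lambda s: s)
print(score)   # -3: each side plays its best delta for 3 plies apiece
```

Alpha-beta returns the same value as plain minimax; the pruning only skips branches that provably cannot change the result, which is what lets real engines search so much deeper than humans.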

Because search based AI considers board states __very__ far in the future, the results are often completely counterintuitive in a game with an established theory of play. Those theories are born of humans, for humans.

The introduction of MCTS some years back was the first leap towards a human level Go AI (incidentally, MCTS is more human-like than exhaustive tree search in that it prunes aggressively by making early judgement calls as to what merits further consideration). AlphaGo's use of deep policy and evaluation networks to score the board is very cool, and the next step in that journey. What's interesting to me is that, unlike chess AI, AlphaGo might actually advance the human theory of Go. It's possible that these "strange moves" will lead to some very interesting insights if DeepMind traces them through the eval and policy networks and manages to back out a more general theory of play.
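The "early judgement calls" flavor of MCTS can be illustrated by its core ingredient, bandit-style (UCB1) move selection with random rollouts, on a toy game. Everything here is a simplified sketch: real MCTS also grows a search tree, and the game and numbers are hypothetical.

```python
import math
import random

def rollout(total, plies_left, my_turn):
    # Random playout to the end of a toy game where players alternately
    # add or subtract 1-2 from a running total; returns 1 if we win.
    while plies_left > 0:
        delta = random.choice([1, 2])
        total += delta if my_turn else -delta
        my_turn = not my_turn
        plies_left -= 1
    return 1 if total > 0 else 0

def ucb1_choose(moves, plies=6, iters=2000, c=1.4):
    wins = {m: 0 for m in moves}
    visits = {m: 0 for m in moves}
    for t in range(1, iters + 1):
        # Selection: prefer moves with good results so far, plus an
        # exploration bonus that shrinks as a move gets more visits.
        pick = max(moves, key=lambda m: float("inf") if visits[m] == 0
                   else wins[m] / visits[m] + c * math.sqrt(math.log(t) / visits[m]))
        # Simulation + backup: one random rollout, credited to the move.
        wins[pick] += rollout(pick, plies - 1, my_turn=False)
        visits[pick] += 1
    return max(moves, key=lambda m: visits[m])   # most-visited move wins

random.seed(0)
best = ucb1_choose([2, 1, -1])
print(best)
```

Weak openings get abandoned after a handful of rollouts while the strongest one accumulates almost all the visits: that aggressive, statistics-driven pruning is the human-like "judgement call" mentioned above.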


> It is simply using perfect information to optimize its probability of winning.

Perfect information would be the knowledge of all possible outcomes of a given move. That's not even possible in chess, and is fantastically less possible in a game like go. That's why, until AlphaGo, there had never been a go computer program that had ever even come close to beating a professional go player.

Let me emphasize that: As of early 2015, nearly 20 years after Deep Blue beat the world champion Kasparov at chess, there had never been a computer go program that had come close to beating a professional go player. The game starts with literally 361 possible moves, and each move thereafter decreases the possible moves by exactly one. The search space is just so massive that it's impossible, by brute force computation, to do a search with any meaning.
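The scale of that search space follows directly from the move counts given above. Ignoring captures, passes, and illegal repetitions, a crude upper bound on the number of move sequences is:

```python
import math

# 361 options for the first move, 360 for the second, and so on down
# to 1. This ignores captures and passes, so it only illustrates the
# order of magnitude, not an exact count of legal games.
upper_bound = math.factorial(361)
print(len(str(upper_bound)), "digits")   # well over 700 decimal digits
```

A number with hundreds of digits of possible sequences is why brute-force search is hopeless and some form of learned pruning is required.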

What's needed is an intuition about two things. First, for a given board position, which color is more likely to win? Second, of the hundreds of moves you could make, which ones are the best to explore? Go players have some rules, but primarily they develop an intuition for these two things by playing and studying hundreds and hundreds of games.

AlphaGo's primary architecture was the same. It has two neural nets, which spit out board evaluations and move suggestions. These nets don't have explicitly coded rules, but were developed by playing millions of games. How is that any different than our wetware neural networks?
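The division of labor between the two nets can be sketched like this (the lookup tables stand in for trained networks, and all names and numbers are hypothetical):

```python
# One function proposes moves (the "policy"), the other scores the
# resulting positions (the "value"). In AlphaGo these are neural nets;
# here they are hard-coded tables purely for illustration.

def policy(position):
    # Prior probability for each legal move (pretend this is learned).
    return {"A": 0.6, "B": 0.3, "C": 0.1}

def value(move):
    # Estimated win probability of the position after each move.
    return {"A": 0.55, "B": 0.70, "C": 0.20}[move]

def choose_move(position):
    # The policy prunes breadth (drop implausible moves), then the
    # value function ranks the survivors.
    candidates = [m for m, p in policy(position).items() if p >= 0.2]
    return max(candidates, key=value)

move = choose_move("start")
print(move)   # "B": a lower prior than "A", but a better evaluation
```

The interplay matters: the policy keeps the search tractable, while the value function can overrule the policy's first instinct, just as a player's calculation can overrule their initial intuition.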

> In fact, players who have played against it have often gotten worse.

I hadn't heard this, but this is probably not unusual. Take my StarCraft example: Suppose someone had gotten to Diamond league mainly by perfecting their early rush strategies, but then hit a wall. Then they watch that video series which says the foundation of a GrandMaster strategy is having a solid economy first. If they decide to take this advice, it will entail a complete re-building of their skillset; they'll almost certainly drop in their rankings before picking back up again.

You could imagine the same thing happening in Go: for hundreds of years, certain principles have been believed to be true. AlphaGo regularly violates these principles. As people explore these alternate strategies, they will inevitably get the "new" principles wrong while they're learning.

On the other hand, the early versions of AlphaStar clearly dominated mainly by having inhuman micro capabilities (ability to precisely control individual units). A human trying to replicate a strategy that relied on inhuman micro would inevitably fail.

Similarly, it might be that certain moves in go are good moves if you can do the kind of massively deep search that only AlphaGo can do. If that's the case, then of course humans trying to imitate AlphaGo are going to fail. A similar thing happened in chess: computers historically have been amazing at tactics and weak on strategy. A human playing a computer should focus on a good strategy, because they're never going to beat a computer at tactics.

But none of that changes the fact that what AlphaGo is doing, with two neural nets that spit out answers based not on explicit rules but based on millions of games worth of experience, is indistinguishable from human intuition.

