I'm working on an AI for a game and I want to use the MinMax algorithm with the Alpha-Beta pruning.
I have a rough idea on how it works but I'm still not able to write the code from scratch, so I've spent the last two days looking for some kind of pseudocode online.
My problem is that every pseudocode I've found online seems to be based on finding the value for the best move while I need to return the best move itself and not a number.
My current code is based on this pseudocode (source)
minimax(level, player, alpha, beta)
{ // player may be "computer" or "opponent"
if (gameover || level == 0)
return score
children = all valid moves for this "player"
if (player is computer, i.e., max's turn)
{ // Find max and store in alpha for each child
{ score = minimax(level - 1, opponent, alpha, beta)
if (score > alpha)
alpha = score
if (alpha >= beta)
break; // beta cut-off }
return alpha }
else (player is opponent, i.e., min's turn) // Find min and store in beta
for each child {
score = minimax(level - 1, computer, alpha, beta)
if (score < beta) beta = score
if (alpha >= beta)
break; // alpha cut-off }
return beta }
} // Initial call with alpha=-inf and beta=inf
minimax(2, computer, -inf, +inf)
As you can see, this code returns a number and I guess that this is needed to make everything work (since the returned number is used during the recursion).
So I thought that I may use an external variable to store the best move, and this is how I've changed the previous code:
minimax(level, player, alpha, beta)
{ // player may be "computer" or "opponent"
if (gameover || level == 0)
return score
children = all valid moves for this "player"
if (player is computer, i.e., max's turn)
{ // Find max and store in alpha
for each child
{
score = minimax(level - 1, opponent, alpha, beta)
if (score > alpha)
{ alpha = score bestMove = current child // ROW THAT I ADDED TO UPDATE THE BEST MOVE }
if (alpha >= beta)
break; // beta cut-off }
return alpha }
else (player is opponent, i.e., min's turn) // Find min and store in beta
for each child {
score = minimax(level - 1, computer, alpha, beta)
if (score < beta)
beta = score
if (alpha >= beta)
break; // alpha cut-off }
return beta } } // Initial call with alpha=-inf and beta=inf
minimax(2, computer, -inf, +inf)
Now, this is how it makes sense to me because we need to update the best move only if it's a player's turn and if the move is better than the previous.
So, while I think that this one's correct (even if I'm not 100% sure), the source has also a java implementation that updates the best move even in the score < beta case and I don't understand why.
Trying with that implementation led my code to choose as best move a move from the opposition player, which doesn't seem to be correct (assuming that I'm the black player, I'm looking for the best move that I can make so I'm expecting a "black" move and not a "white" one).
I don't know if my pseudocode (the second one) is the correct way to find the best move using MinMax with alpha-beta pruning or if I need to update the best move even in the score < beta case.
Please feel free to suggest any new and better pseudocode if you prefer, I'm not bound to anything and I don't mind rewriting some code if it's better than mine.