1 view

I'm working on an AI for a game and I want to use the MinMax algorithm with the Alpha-Beta pruning.

I have a rough idea on how it works but I'm still not able to write the code from scratch, so I've spent the last two days looking for some kind of pseudocode online.

My problem is that every pseudocode I've found online seems to be based on finding the value for the best move while I need to return the best move itself and not a number.

My current code is based on this pseudocode (source)

minimax(level, player, alpha, beta)

{ // player may be "computer" or "opponent"

if (gameover || level == 0

return score

children = all valid moves for this "player"

if (player is computer, i.e., max's turn)

{ // Find max and store in alpha for each child

{ score = minimax(level - 1, opponent, alpha, beta)

if (score > alpha)

alpha = score

if (alpha >= beta)

break; // beta cut-off }

return alpha }

else (player is opponent, i.e., min's turn) // Find min and store in beta

for each child {

score = minimax(level - 1, computer, alpha, beta)

if (score < beta) beta = score

if (alpha >= beta)

break; // alpha cut-off

return beta }

} // Initial call with alpha=-inf and beta=inf

minimax(2, computer, -inf, +inf)

As you can see, this code returns a number and I guess that this is needed to make everything work (since the returned number is used during the recursion).

So I thought that I may use an external variable to store the best move, and this is how I've changed the previous code:

minimax(level, player, alpha, beta)

{ // player may be "computer" or "opponent"

if (gameover || level == 0)

return score

children = all valid moves for this "player"

if (player is computer, i.e., max's turn)

{ // Find max and store in alpha

for each child

{

score = minimax(level - 1, opponent, alpha, beta)

if (score > alpha)

{ alpha = score bestMove = current child // ROW THAT I ADDED TO UPDATE THE BEST MOVE

if (alpha >= beta)

break; // beta cut-off

return alpha }

else (player is opponent, i.e., min's turn) // Find min and store in beta

for each child {

score = minimax(level - 1, computer, alpha, beta)

if (score < beta)

beta = score

if (alpha >= beta)

break; // alpha cut-off

return beta } } // Initial call with alpha=-inf and beta=inf

minimax(2, computer, -inf, +inf)

Now, this is how it makes sense to me because we need to update the best move only if it's a player's turn and if the move is better than the previous.

So, while I think that this one's correct (even if I'm not 100% sure), the source has also a java implementation that updates the best move even in the score < beta case and I don't understand why.

Trying with that implementation led my code to choose as best move a move from the opposition player, which doesn't seem to be correct (assuming that I'm the black player, I'm looking for the best move that I can make so I'm expecting a "black" move and not a "white" one).

I don't know if my pseudocode (the second one) is the correct way to find the best move using MinMax with alpha-beta pruning or if I need to update the best move even in the score < beta case.

Please feel free to suggest any new and better pseudocode if you prefer, I'm not bound to anything and I don't mind rewriting some code if it's better than mine.

by (108k points)

Provided that you want to get the best move only for one player and that this player, which is the maximizer, is passed to the MinMax function everytime thus need a new move (so that minmax(2, black, a, b) which returns the best move for the black player while minmax(2, white, a, b) returns the best one for the white player).

Following the code finding the best move using minmax in alpha-beta pruning:

public Move returnMove

private class MoveValue

public double returnValue

public MoveValue(

returnValue = 0;

public MoveValue(double returnValue

this.returnValue = returnValue

public MoveValue(double returnValue, Move returnMove

this.returnValue = returnValue

this.returnMove = returnMove;

} }

protected MoveValue minMax(double alpha, double beta, int maxDepth, MarbleType player)

if (!canContinue())

return new MoveValue();

ArrayList<Move> moves = sortMoves(generateLegalMoves(player));

Iterator<Move> movesIterator = moves.iterator();

double value = 0;

boolean isMaximizer = (player.equals(playerType));

if (maxDepth == 0 || board.isGameOver())

value = evaluateBoard();

return new MoveValue(value);

MoveValue returnMove;

MoveValue bestMove = null;

if (isMaximizer)

while (movesIterator.hasNext())

Move currentMove = movesIterator.next();

board.applyMove(currentMove);

returnMove = minMax(alpha, beta, maxDepth - 1, player.opponent()); board.undoLastMove();

if ((bestMove == null) || (bestMove.returnValue < returnMove.returnValue)) { bestMove = returnMove;

bestMove.returnMove = currentMove;

if (returnMove.returnValue > alpha)

alpha = returnMove.returnValue;

bestMove = returnMove;

if (beta <= alpha)

{ bestMove.returnValue = beta;

bestMove.returnMove = null;

return bestMove; // pruning } }

return bestMove; }

else {

while (movesIterator.hasNext())

{

Move currentMove = movesIterator.next();

board.applyMove(currentMove);

returnMove = minMax(alpha, beta, maxDepth - 1, player.opponent()); board.undoLastMove();

if ((bestMove == null) || (bestMove.returnValue > returnMove.returnValue)) { bestMove = returnMove;

bestMove.returnMove = currentMove;

if (returnMove.returnValue < beta)

beta = returnMove.returnValue;

bestMove = returnMove; }

if (beta <= alpha)

{ bestMove.returnValue = alpha;

bestMove.returnMove = null;

return bestMove; // pruning

} }

return bestMove;

} }

Minimax is a decision rule used in Artificial Intelligence so if you wish to learn more about Minimax Decision Rule then visit this Artificial Intelligence Course.