Explore Courses Blog Tutorials Interview Questions
0 votes
in AI and Deep Learning by (50.2k points)

I am implementing the minimax for a Stratego game (where the computer has perfect knowledge of all the pieces). However, I find that the computer will often not attack a piece that it can easily destroy. From what I understand, the minimax scores come from the leaf nodes of a move tree (where each level is a turn and each score for the leaf node is calculated using an evaluation function for the board in that position). So if I have a depth of 3 levels, the computer can choose to attack move 1 or attack on move 3. According to the minimax algorithm, it has the same score associated with it (the resulting board position has the same score). So how do I influence the minimax algorithm to prefer immediate rewards over eventual rewards? i.e. I would like the score to decay over time, but with the way minimax works, I don't see how this is possible. Minimax always uses the leaf nodes to determine the intermediate nodes.

1 Answer

0 votes
by (108k points)

The minimax should be able to notice if there is a risk in delaying to capture a piece automatically and changing the evaluation function to force it so that it can prefer earlier captures.

I think you can start storing extra information in your game states (not only the board). You'll want to store timestamps(A variable containing the date and time at which an event occurred) in memory for every game state which permits you to tell in hindsight exactly at what time (in which turn) a piece was earlier captured. Using that information you could execute a decay factor in the evaluation function used in leaf nodes of the search tree.

Browse Categories