0 votes
1 view
in AI and Deep Learning by (50.5k points)

So I was assigned the problem of writing a 5x5x5 tic-tac-toe player using a genetic algorithm. My approach was to start off with 3x3, get that working, and then extend to 5x5, and then to 5x5x5.

The way it works is this:

  • Simulate a whole bunch of games, and during each turn of each game, lookup in a corresponding table (X table or O table implemented as a c++ std lib maps) for a response. If the board was not there, add the board to the table. Otherwise, make a random response.

  • After I have complete tables, I initialize a bunch of players (each with a copy of the board table, initialized with random responses), and let them play against each other.

  • Using their wins/losses to evaluate fitness, I keep a certain % of the best, and they move on. Rinse and repeat for X generations, and an optimal player should emerge.

For 3x3, discounting boards that were reflections/rotations of other boards, and boards where the move is either 'take the win' or 'block the win', the total number of boards I would encounter was either 53 or 38, depending on whether you go first or second. Fantastic! An optimal player was generated in under an hour. Very cool!

Using the same strategy for 5x5, I knew the size of the table would increase, but did not realize it would increase so drastically. Even discounting rotations/reflections and mandatory moves, my table is ~3.6 million entries, with no end in sight.

Okay, so that's clearly not going to work, I need a new plan. What if I don't enumerate all the boards, but just some boards. Well, it seems like this won't work either, because if each player has just a fraction of possible boards they might see, then they are going to be making a lot of random moves, clearly steering in the opposite direction of optimality.

What is a realistic way of going about this? Am I going to be stuck using board features? The goal is to hard-code as little game functionality as possible.

I've been doing research, but everything I read leads to min/max with A-B pruning as the only viable option. I can certainly do it that way, but the GA is really cool, my current method is just exceeding reality a bit here.

EDIT Problem has been pretty much solved:

Using a similarity function that combines the hamming distance of open spaces, the possible win conditions, and a few other measures have brought the table down to a very manageable 2500 possibilities, which an std:: map handles in a fraction of a second.

1 Answer

0 votes
by (108k points)

Super tic-tac-toe is a board game composed of nine tic-tac-toe boards arranged in a 3-by-3 grid. Players take turns playing in the smaller tic-tac-toe boards until one of them wins in the larger tic-tac-toe board. Each small 3-by-3 tic-tac-toe board is referred to as a local board, and the larger 3-by-3 board is referred to as the global board.

This isn't minimax because you're not looking for the best moves one at a time based on the previous state of the board, you're looking for all the best moves at the same time, hoping to converge on a winning strategy.

The first move of the game has 81 different possibilities. After that, the moves are constrained to a single board. There will be 8 boards (the first board has one filled in already) that have 9 choices when you first arrive. After that, there will be 9 board with 8 choices, 9 with 7 choices, 9 with 6, etc, until you are forced to move in the last spot.

There are 3 strategies for the Tic-Tac-Toe that can be implemented on the GA:

  • A randomized strategy that gives a game state, picks from all valid moves with an equal probability of winning.

  • Monte Carlo Algorithm:  The idea behind this is so that we prefer games that let us win sooner, and that we prefer wins over ties over losses. The Monte Carlo plays extremely well against a randomized strategy, although it takes some time to run. It's win-rate over a 100 games was 0.88(88%) wins, 0.01(1%) losses, and 0.11(11%) ties.

  • Weighted Positions Genetic Algorithm: Each individual is given 81 features. Each feature corresponds to a different box. For each turn, we evaluate all valid moves and pick the one with the highest weight. This strategy was able to achieve a win rate of 0.79 wins, 0.16 losses, and 0.05 ties against a randomized_strategy over 100 games, and won all 100 games against the Monte Carlo algorithm.

If you are looking to learn more about Artificial Intelligence then you visit Artificial Intelligence Tutorial. Also, if you are appearing for job profiles of AI Engineer or AI Expert then you can prepare for the interviews on Artificial Intelligence Interview Questions.
Welcome to Intellipaat Community. Get your technical queries answered by top developers !