Explore Courses Blog Tutorials Interview Questions
0 votes
in AI and Deep Learning by (50.2k points)

First of all: This is not a question about how to make a program play Five in a Row. Been there, done that.

Introductory explanation

I have made a five-in-a-row-game as a framework to experiment with genetically improving AI (ouch, that sounds awfully pretentious). As with most turn-based games, the best move is decided by assigning a score to every possible move and then playing the move with the highest score. The function for assigning a score to a move (a square) goes something like this:

  1. If the square already has a token, the score is 0 since it would be illegal to place a new token in the square.

  2. Each square can be a part of up to 20 different winning rows (5 horizontal, 5 vertical, 10 diagonals). The score of the square is the sum of the score of each of these rows.

  3. The score of a row depends on the number of friendly and enemy tokens already in the row. Examples:

    • A row with four friendly tokens should have an infinite score because if you place a token there you win the game.

    • The score for a row with four enemy tokens should be very high since if you don't put a token there, the opponent will win on his next turn.

    • A row with both friendly and enemy tokens will score 0 since this row can never be part of a winning row.

Given this algorithm, I have declared a type called TBrain:

type TBrain = array[cFriendly..cEnemy , 0..4] of integer;

The values in the array indicate the score of a row with either N friendly tokens and 0 enemy tokens, or 0 friendly tokens and N enemy tokens. If there are 5 tokens in a row there's no score since the row is full.

It's actually quite easy to decide which values should be in the array. Brain[0,4] (four friendly tokens) should be "infinite", let's call that 1.000.000. vBrain[1,4] should be very high, but not so high that the brain would prefer blocking several enemies wins rather than wining itself

Consider the following (improbable) board:

0123456789 +---------- 0|1...1...12 1|.1..1..1.2 2|..1.1.1..2 3|...111...2 4|1111.1111. 5|...111.... 6|..1.1.1... 7|.1..1..1.. 8|1...1...1.

Player 2 should place his token in (9,4), winning the game, not in (4,4) even though he would then block 8 potential winning rows for player 1. Ergo, vBrain[1,4] should be (vBrain[0,4]/8)-1. Working like this we can find optimal values for the "brain", but again, this is not what I'm interested in. I want an algorithm to find the best values.

I have implemented this framework so that it's totally deterministic. There are no random values added to the scores, and if several squares have the same score the top-left will be chosen.

Actual problem

That's it for the introduction, now to the interesting part (for me, at least)

I have two "brains", vBrain1 and vBrain2. How should I iteratively make these better? Imagine something like this:

  1. Initialize vBrain1 and vBrain2 with random values.

  2. Simulate a game between them.

  3. Assign the values from the winner to the loser, then randomly change one of them slightly.

This doesn't seem to work. The brains don't get any smarter. Why?

Should the score-method add some small random values to the result, so that two games between the same two brains would be different? How much should the values change for each iteration? How should the "brains" be initialized? With constant values? With random values?

Also, does this have anything to do with AI or genetic algorithms at all?

PS: The question has nothing to do with Five in a Row. That's just something I chose because I can declare a very simple "Brain" to experiment on.

1 Answer

0 votes
by (108k points)

If you want to approach this problem with the genetic algorithm, then you will need an entire population of "brains". Then assess them against each other, either every combination or use a tournament style. Then you have to select the top percent of X of the population and use those as the parents of the next generation, where offspring are created via mutation or genetic crossover for example swap rows or columns between two "brains".

Take a look at the Neuro Evolution of Augmenting Topologies (NEAT). A fancy acronym which basically means the evolution of neural nets - both their structure (topology) and connection weights. You can get a .Net implementation called SharpNEAT that you may wish to look at. SharpNEAT V1 also has a Tic-Tac-Toe experiment.

If you wish to know more about the Genetic Programming and Genetic Algorithm then visit this Artificial Intelligence Course.

Browse Categories