I am making an AI for a zero-sum 4-player board game. It's not zero-sum (the 4 players will "die" when they lose all their lives, so there will be a player who died first, second, third and a player who survived. However, I am telling the AI that only surviving counts as a win and anything else is a lose) After some research, I figured I would use a minimax algorithm in combination with a heuristic function.
However, my heuristic function is different from the one the OP of that question had. Mine takes 9 weights and is a lot slower, so I can't let the agents play 1000 games (takes too much time) or breed them with the crossover method (how do I do a crossover with 9 weights?).
So I decided to come up with my method of determining fitness and breeding. And this question is only about the fitness function.
Here are my attempts at this.
First Attempt
For each agent A
in a randomly generated population of 50 agents, select 3 more agents from the population (with a replacement but not the same agent as A
itself) and let the 4 agents play a game where A
is the first player. Select another 3 and play a game where A
is the second player, and so on. For each of these 4 games, if A
died first, its fitness does not change. If A
died second, its fitness is increased by 1. If it died third, its fitness is increased by 2. If it survived, its fitness is increased by 3. Therefore, I concluded that the highest fitness one can get is 12 (surviving/winning all 4 games -> 3 + 3 + 3 + 3).
I ran this for 7 generations and starting from the first generation, the highest fitness is as high as 10. And I calculated the average fitness of the top 10 agents, but the average didn't increase a bit throughout the 7 generations. It even decreased a little.
I think the reason why this didn't work is that there's gotta be a few agents that got lucky and got some poor performing agents as its opponents.
Second Attempt
The game setups are the same as my first attempt but instead of measuring the results of each game, I decided to measure how many moves did that agent make before it died.
After 7 generations the average fitness of the top 10 does increase but still not increasing as much as I think it should.
I think the reason why this failed is that the game is finite, so there is a finite number of moves you can make before you die and the top performing agents pretty much reached that limit. There is no room for growth. Another reason is that the fitness of the player who survived and the fitness of the player who died third differs little.
What I want
From my understanding of EAs (correct me if I'm wrong), the average fitness should increase and the top-performing individual's fitness should not decrease over time.
My two attempts failed at both of these. Since the opponents are randomly selected, the top-performing agent in generation 1 might get stronger opponents in the next generation, and thus its fitness decreases.
Notes
In my attempts, the agents play 200 games each generation and each generation takes up to 3 hours, so I don't want to let them play too many games.
How can I write a fitness function like this?