I am working on a neural network based on the NEAT algorithm that learns to play an Atari Breakout clone in Python 2.7. All of the pieces are working, but I think the evolution could be greatly improved with a better algorithm for calculating species fitness.
The inputs to the neural network are:
- X coordinate of the center of the paddle
- X coordinate of the center of the ball
- Y coordinate of the center of the ball
- ball's dx (velocity in X)
- ball's dy (velocity in Y)
The outputs are:
- Move paddle left
- Move paddle right
- Do not move paddle
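As a rough sketch of how this interface can be wired up: the function names, the frame size and maximum ball speed used for scaling, and the rule of picking the strongest output are all illustrative assumptions, not necessarily what my code does.

```python
# Hypothetical frame dimensions and maximum ball speed, used only to scale inputs.
FRAME_WIDTH = 640.0
FRAME_HEIGHT = 480.0
MAX_BALL_SPEED = 10.0

ACTIONS = ("left", "right", "stay")  # same order as the three outputs


def build_inputs(paddle_x, ball_x, ball_y, ball_dx, ball_dy):
    """Pack the five game values into one input vector, scaled to roughly [-1, 1]."""
    return [
        paddle_x / FRAME_WIDTH,
        ball_x / FRAME_WIDTH,
        ball_y / FRAME_HEIGHT,
        ball_dx / MAX_BALL_SPEED,
        ball_dy / MAX_BALL_SPEED,
    ]


def choose_action(outputs):
    """Return the action whose output activation is largest (an assumed rule)."""
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return ACTIONS[best_index]
```

Each frame, the vector from `build_inputs` would be fed to the network and the three activations passed to `choose_action`.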
The parameters I have available to the species fitness calculation are:
- breakout_model.score - int: the final score of the game played by the species
- breakout_model.num_times_hit_paddle - int: the number of times the paddle hit the ball
- breakout_model.hits_per_life - list of int: the number of times the paddle hit the ball in each life; the first element is the value for the first life, the second element is the value for the second life, and so on up to 4
- breakout_model.avg_paddle_offset_from_ball - decimal: the average linear distance in the X direction between the ball and the center of the paddle
- breakout_model.avg_paddle_offset_from_center - decimal: the average linear distance in the X direction between the center of the frame and the center of the paddle
- breakout_model.time - int: the total duration of the game, measured in frames
- breakout_model.stale - boolean: whether or not the game was artificially terminated due to staleness (e.g. the ball gets stuck bouncing straight up and down while the paddle is not moving)
If you think I need more data about the final state of the game than just these, I can likely implement a way to get it very easily.
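For anyone who wants to experiment with fitness functions without running the game, the statistics listed above can be mocked with a small stand-in object. This is only a sketch: `BreakoutStats` is a hypothetical name, and deriving the total hit count from the per-life list is my assumption about how the two fields relate.

```python
class BreakoutStats(object):
    """Hypothetical stand-in for breakout_model, holding only end-of-game statistics."""

    def __init__(self, score, hits_per_life, avg_paddle_offset_from_ball,
                 avg_paddle_offset_from_center, time, stale):
        self.score = score
        self.hits_per_life = list(hits_per_life)
        # Assumption: the total number of hits is the sum of the per-life counts.
        self.num_times_hit_paddle = sum(self.hits_per_life)
        self.avg_paddle_offset_from_ball = avg_paddle_offset_from_ball
        self.avg_paddle_offset_from_center = avg_paddle_offset_from_center
        self.time = time
        self.stale = stale


# Made-up values for one finished game.
example = BreakoutStats(
    score=12,
    hits_per_life=[3, 0, 5, 1],
    avg_paddle_offset_from_ball=42.5,
    avg_paddle_offset_from_center=80.0,
    time=3600,
    stale=False,
)
```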
Here is my current fitness calculation, which I don't think is very good:
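def calculate_fitness(self):
    # Start from the raw game score.
    self.fitness = self.breakout_model.score
    # Small bonus per paddle hit; 10.0 avoids Python 2 integer division.
    if self.breakout_model.num_times_hit_paddle != 0:
        self.fitness += self.breakout_model.num_times_hit_paddle / 10.0
    else:
        self.fitness -= 0.5
    # Penalty proportional to 1 / (average paddle-to-ball offset).
    if self.breakout_model.avg_paddle_offset_from_ball != 0:
        self.fitness -= (1.0 / self.breakout_model.avg_paddle_offset_from_ball) * 100
    # Penalize each life in which the paddle never hit the ball.
    for hits in self.breakout_model.hits_per_life:
        if hits == 0:
            self.fitness -= 0.2
    # Flip the sign if the game was terminated as stale.
    if self.breakout_model.stale:
        self.fitness = 0 - self.fitness
    return self.fitness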
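
To make the behaviour of this calculation easier to inspect, here is a standalone sketch of the same arithmetic (using float division) that can be run on made-up numbers. The function name `current_fitness` and all of the sample values are hypothetical, and the total hit count is assumed to be the sum of the per-life counts.

```python
def current_fitness(score, hits_per_life, avg_offset_from_ball, stale):
    """Standalone copy of the fitness arithmetic, for experimenting."""
    total_hits = sum(hits_per_life)  # assumed equal to num_times_hit_paddle
    fitness = float(score)
    if total_hits != 0:
        fitness += total_hits / 10.0
    else:
        fitness -= 0.5
    if avg_offset_from_ball != 0:
        fitness -= (1.0 / avg_offset_from_ball) * 100
    for hits in hits_per_life:
        if hits == 0:
            fitness -= 0.2
    if stale:
        fitness = 0 - fitness
    return fitness


# Two hypothetical games with the same score and hits; only the tracking differs.
close_tracker = current_fitness(20, [4, 3, 2, 1], avg_offset_from_ball=10.0, stale=False)
loose_tracker = current_fitness(20, [4, 3, 2, 1], avg_offset_from_ball=100.0, stale=False)
print(close_tracker)
print(loose_tracker)
```

With these sample values, the game in which the paddle stays closer to the ball gets the lower fitness, because the offset penalty is proportional to 1 / offset. That is one of the reasons I suspect the calculation is not very good.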