I've been reading through the Monte Carlo Tree Search survey paper by Browne et."A Survey of Monte Carlo Tree Search Methods"
I'm wrestling with just one piece of the pseudocode on p. 9. My question occurs in a similar form in both the Backup and BackupNegamax functions.
Suppose that I'm player 1 in a 2-player zero-sum game. (So, using the BackupNegamax function.) It's my turn to move, and I'm using MCTS to choose my move. In BackupNegamax, why is the delta value negated as you back up the tree? I understand that in a two-player zero-sum game, if the reward is delta for player 1 (me), then it's -delta for player 2. But shouldn't the entire tree be from player 1's perspective? (This would then be similar to how nodes are rated in a minimax tree if I'm not mistaken.)
If the perspective of the Q value switches back and forth depending on what level of the tree you're on, wouldn't that mess up the calculation shown in the BestChild function? Specifically, suppose some node v has a very high Q value because it has often led to high rewards for player 1. The given pseudocode seems to suggest that v's parent, which I'll call u, would likely have a very low (very negative) Q value (of course u's Q value would also account for its other children's Q values.)
So it doesn't make sense to me that u (the parent) would have a very low Q value while v (the child) has a very high one. I know v is from player 1's perspective in the pseudocode, and u is from player 2's perspective, but my question is why. Why aren't both node's Q values stored from player 1's perspective? That way both u and v would have high Q values, and thus high exploitation ratings, and they'd both be considered valuable for further exploitation according to the BestChild function.
(I'm coming at MCTS from experience with minimax, and in minimax, the entire tree is from Max's perspective, so that's why I'm struggling with the different idea here.)
My question also applies to Backup - why is each Q value updated according to the perspective of the player at that level of the tree, instead of everything being updated from "my" perspective?
I hope I've been clear in my question. Thank you very much for your help!