Probably because of stochastic dependence. Once a move has been played, the root of the problem changes, so the search may traverse different paths than it did before. In minimax I would expect that, given a 50-move decision, we could reuse about 1/50 of the data we had already computed (simplified; the loss is huge). In MCTS, though, whether or not we reuse these statistics may actually matter for the mathematical proofs. I think this paper explains this (chapter 5).
Some implementations do indeed retain the information.
For example, the AlphaGo Zero paper says:
"The search tree is reused at subsequent time-steps: the child node corresponding to the played action becomes the new root node; the subtree below this child is retained along with all its statistics, while the remainder of the tree is discarded."
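To make the quoted re-rooting step concrete, here is a minimal Python sketch. It is not AlphaGo Zero's actual code: the `Node` class, its `visits` and `value_sum` fields, and the `advance_root` function are hypothetical names chosen for illustration, and the sketch assumes each node stores its children in a dict keyed by action.

```python
from typing import Dict, Hashable, Optional


class Node:
    """Hypothetical MCTS node holding only the statistics needed here."""

    def __init__(self, visits: int = 0, value_sum: float = 0.0,
                 parent: Optional["Node"] = None) -> None:
        self.visits = visits          # number of simulations through this node
        self.value_sum = value_sum    # sum of backed-up values
        self.parent = parent
        self.children: Dict[Hashable, "Node"] = {}


def advance_root(root: Node, action: Hashable) -> Node:
    """Return the new root after `action` has been played.

    The child reached by `action` is kept together with its whole subtree
    and statistics; the siblings are dropped along with the old root.
    """
    new_root = root.children.get(action)
    if new_root is None:
        # The played move was never expanded, so there is nothing to reuse.
        return Node()
    new_root.parent = None     # detach so the old root is no longer referenced
    root.children.clear()      # release the discarded sibling subtrees
    return new_root
```

After each real move you would call something like `root = advance_root(root, played_action)` and continue searching from the returned node, which starts with the visit counts it accumulated earlier instead of from zero.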