Especially during the Netflix Prize contest, I kept coming across blog posts (and leaderboard forum threads) where people mention that applying a simple SVD step to their data helped reduce its sparsity, or in general improved the performance of the algorithm at hand. I have been trying to work out why for a long time, but I cannot. The data I usually deal with is very noisy (which is also the fun part of big data), and I do know some basic feature-scaling techniques such as log transformation and mean normalization. But how does something like SVD help? Say I have a huge matrix of users rating movies, and on this matrix I implement some version of a recommender system (say, collaborative filtering):

1) Without SVD

2) With SVD

How does SVD help in the second case? Thanks.
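To make the "with SVD" case concrete, here is a toy sketch of what I understand the step to be (the ratings matrix below is made up, and I am assuming the standard truncated-SVD / low-rank approximation approach):

```python
import numpy as np

# Hypothetical user x movie ratings matrix (0 = unrated)
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Factorize: R = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep only the top-k singular values (rank-k approximation)
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# R_k is dense: the cells that were 0 (unrated) now hold values
# inferred from the dominant user/movie patterns
print(np.round(R_k, 2))
```

My (possibly wrong) reading is that the rank-k reconstruction fills in the unrated cells and smooths out noise, and that predictions are then read off `R_k` instead of the raw sparse matrix. Is that the mechanism people mean?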