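You should normalize the data before doing PCA because PCA is sensitive to the scale of the features: a feature measured on a much larger scale has a much larger variance and will dominate the principal components.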
For example, consider a data set X with a known correlation matrix C:
>> C = [1 0.5; 0.5 1];
>> A = chol(C);
>> X = randn(100,2) * A;
Now run PCA to find the principal components (the directions of greatest variance in the data):
>> wts=pca(X)
wts =
0.6659 0.7461
-0.7461 0.6659
Now scale the first feature of the data set by 100:
>> Y = X;
>> Y(:,1) = 100 * Y(:,1);
The principal components are now almost exactly aligned with the coordinate axes, because the rescaled first feature dominates the variance:
>> wts=pca(Y)
wts =
1.0000 0.0056
-0.0056 1.0000
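To see why the scaled feature takes over, it can help to look at the covariance matrix directly. The following is a minimal sketch that assumes only base MATLAB functions (cov, eig) rather than the pca call used above. The names S, V, D, idx and v1 are chosen for this illustration and are not part of the original example.

```matlab
C = [1 0.5; 0.5 1];
X = randn(100,2) * chol(C);      % data with correlation close to C
Y = X;
Y(:,1) = 100 * Y(:,1);           % first feature is now on a much larger scale

S = cov(Y);                      % sample covariance of the scaled data
[V, D] = eig(S);                 % columns of V are the principal directions
[~, idx] = max(diag(D));         % index of the direction with the largest variance
v1 = V(:, idx);

disp(S)                          % S(1,1) is roughly 10^4 times larger than S(2,2)
disp(v1')                        % close to [1 0], up to sign
```

Because the variance of the first column is multiplied by 100^2, the leading direction points almost entirely along that column, whatever the correlation between the two features is.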
There are two ways to resolve this.
First, rescale the data so that each feature has unit standard deviation:
>> Ynorm = bsxfun(@rdivide, Y, std(Y));
Then run PCA on the rescaled data:
>> wts = pca(Ynorm)
wts =
-0.7125 -0.7016
0.7016 -0.7125
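These weights differ slightly from the PCA of the original data, because each feature of Ynorm now has exactly unit standard deviation, which was only approximately true of X.
Second, perform PCA using the correlation matrix of the data instead of the outer product: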
>> wts = pca(Y,'corr')
wts =
0.7071 0.7071
-0.7071 0.7071
This is equivalent to standardizing the data by subtracting the mean and dividing by the standard deviation.
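That equivalence can be checked numerically. The sketch below assumes that PCA is computed as an eigendecomposition of the sample covariance or correlation matrix, using base MATLAB functions only (mean, std, corrcoef, cov, eig). That is not necessarily how the pca call above is implemented, and the names Z, R, Vr and Vz are chosen for this illustration.

```matlab
C = [1 0.5; 0.5 1];
Y = randn(100,2) * chol(C);
Y(:,1) = 100 * Y(:,1);                 % badly scaled first feature

% Standardize: subtract the column means, divide by the column standard deviations
Z = bsxfun(@rdivide, bsxfun(@minus, Y, mean(Y)), std(Y));

R = corrcoef(Y);                       % correlation matrix of the raw data
[Vr, ~] = eig(R);                      % principal directions from the correlation matrix
[Vz, ~] = eig(cov(Z));                 % principal directions from the standardized data

disp(norm(R - cov(Z)))                 % near zero: the two matrices coincide
disp(abs(Vr))                          % every entry is about 0.7071
disp(abs(Vz))                          % same values as abs(Vr)
```

For two features with a nonzero correlation, the eigenvectors of the correlation matrix are always (1,1)/sqrt(2) and (1,-1)/sqrt(2), which is why the weights come out as 0.7071 regardless of how the features were scaled.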
Hope this answer helps.