Source: Eeglablist
I’m trying to find out how much variance of the original data the components (computed by ICA) explain, but I’m not sure how to do this. (My goal is a statement like: "Component 1 explains 60% of the variance in the original data.")
The pvaf is computed as follows (from envtopo.m help):
pvaf(comp) = 100 - 100 * mean(var(data - back_proj)) / mean(var(data));
Here, var() is computed across channels (or across IC activations) at each time point, and mean() then averages the result over time.
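A minimal Python sketch of the formula above may help. It assumes the data are arranged as channels x frames and that var() runs across channels at each frame (MATLAB's default for a matrix). The function names and the toy numbers are made up for illustration; this is not the envtopo.m code itself.

```python
from statistics import mean, variance

def mean_var(x):
    """Variance across channels at each frame, then averaged over frames.

    x is a list of channels, each a list of samples (channels x frames).
    """
    frames = zip(*x)  # regroup samples so each item holds one frame's channel values
    return mean(variance(frame) for frame in frames)

def pvaf(data, back_proj):
    """Percent variance accounted for by back_proj, following the help formula."""
    # Residual left after subtracting the back-projected component(s)
    residual = [[d - p for d, p in zip(d_ch, p_ch)]
                for d_ch, p_ch in zip(data, back_proj)]
    return 100 - 100 * mean_var(residual) / mean_var(data)

# Toy example (hypothetical numbers): 2 channels x 4 frames, two components
proj1 = [[1, -1, 1, -1], [-1, 1, -1, 1]]
proj2 = [[1, 1, -1, -1], [-1, -1, 1, 1]]
data = [[a + b for a, b in zip(c1, c2)] for c1, c2 in zip(proj1, proj2)]

print(pvaf(data, proj1))
print(pvaf(data, proj2))
```

In this toy case each component gets a pvaf of 50 and the two values sum to 100, because the cross terms between the two back-projections cancel on average. With real data they generally do not.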
What exactly does the 'percent variance accounted for (pvaf)' mean? For example if I got a pvaf of about 60, does this mean that this component explains 60% of the original data (or is it more complicated)?
As the equation shows, a pvaf of 60 means that the selected ICs
explain 60% of the temporal average of the EEG variance across channels.
These pvaf’s sum up to about 80. Does this mean that 20% of the original data cannot be explained by all the components?
By all the 'selected' components, that is: the remaining 20% is what the selected components do not account for.
These pvaf’s add up to over 400 (whereas the ones from above sum up to only 80, which seems more reasonable to me). Does this mean that the independent components overlap?
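Variance is not additive when the summed terms are correlated. In general:
var(A+B) = var(A) + var(B) + 2*cov(A,B)
so when the back-projections partly cancel each other (negative covariance):
var(A+B+C) < var(A) + var(B) + var(C)
This is because cancellation happens when you compute (A+B+C). So you cannot expect the individual values to add up to 100; when the components cancel in this way, the summed per-component variances exceed the variance of the data, and the total can be well over 100. I know that this is confusing, although it is totally valid.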
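The cancellation effect can be checked numerically. The sketch below uses two made-up signals that largely cancel, and it assumes the large totals come from a per-component ratio such as var(component)/var(data). The thread does not say which calculation produced the value of 400, so that ratio is only an illustration.

```python
from statistics import mean, pvariance

def pcov(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    return mean((a - mx) * (b - my) for a, b in zip(x, y))

# Hypothetical signals: b nearly cancels a
a = [1, 2, 3, 4]
b = [-1, -2, -3, -3]
total = [x + y for x, y in zip(a, b)]

var_a, var_b, var_total = pvariance(a), pvariance(b), pvariance(total)

# Identity: var(a + b) = var(a) + var(b) + 2 * cov(a, b)
identity_rhs = var_a + var_b + 2 * pcov(a, b)

# Naive per-component shares (assumed formula), summed over both components
naive_sum = 100 * (var_a + var_b) / var_total

print(var_total, identity_rhs)
print(naive_sum)
```

Because cov(a, b) is negative here, var(a + b) is much smaller than var(a) + var(b), and the summed per-component percentages come out far above 100.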
Makoto