SCCAN questions
SCCAN stands for Sparse Canonical Correlation Analysis for Neuroimaging. Strictly speaking, SCCAN is not a statistical method but an optimization routine that follows machine learning principles. Since many users expect SCCAN to be just another voxel-based statistical method, they tend to apply the same concepts and often get confused by the results. Below you can find some frequently asked questions and their answers, which should help you understand this method.
SCCAN finds a set of voxels that together contribute to explaining the behavioral score (a multivariate method). These voxels are found gradually, over a number of iterations, by assigning a weight to each voxel. At each iteration the weights are smoothed, and isolated voxels are set back to zero (the lesion of a single voxel does not cause a deficit, remember?). The extent of the final solution depends largely on the `sparseness` value. When SCCAN is called to find a solution, it is essentially being asked to find a solution of a certain extent (i.e., a certain `sparseness`). However, since you don't know in advance how extensive the results should be, LESYMAP runs an internal 4-fold cross-validation procedure to find the best `sparseness`. This means that your subjects are split into 4 groups: 3/4 are used to identify the voxel weights and 1/4 is used to predict the behavioral scores with those weights. The best `sparseness` is the one that most accurately predicts new patients (the 1/4 chunks). Once the best `sparseness` is found, a final SCCAN is run on all subjects using the optimal value. The map you see at the end is derived from this final SCCAN run on all subjects.
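As a rough illustration of this selection procedure (not LESYMAP's actual internals), the sketch below runs a 4-fold cross-validation over a few candidate sparseness values. A crude "keep the most behavior-correlated voxels" rule stands in for the real SCCAN optimization, and all data and variable names are made up.

```r
# Conceptual sketch only: 4-fold CV to pick a sparseness value.
# A simple "keep the top fraction of voxels" rule stands in for SCCAN itself.
set.seed(1)
n_sub <- 40; n_vox <- 500
lesmat   <- matrix(rbinom(n_sub * n_vox, 1, 0.3), n_sub, n_vox)          # subjects x voxels
behavior <- as.numeric(lesmat[, 1:20] %*% rep(-0.5, 20)) + rnorm(n_sub)  # deficit driven by 20 voxels

cv_score <- function(sparseness, k = 4) {
  folds <- sample(rep(1:k, length.out = n_sub))
  preds <- numeric(n_sub)
  for (f in 1:k) {
    train <- folds != f
    # stand-in "sparse" fit: keep the voxels most correlated with behavior
    r    <- abs(cor(lesmat[train, ], behavior[train]))
    keep <- order(r, decreasing = TRUE)[1:max(1, round(sparseness * n_vox))]
    w    <- cor(lesmat[train, keep, drop = FALSE], behavior[train])      # voxel weights
    preds[!train] <- lesmat[!train, keep, drop = FALSE] %*% w
  }
  cor(preds, behavior)    # how well held-out subjects are predicted
}

candidates <- c(0.01, 0.05, 0.1, 0.3)
scores     <- sapply(candidates, cv_score)
best       <- candidates[which.max(scores)]   # used for the final run on all subjects
```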
SCCAN is not a voxel-wise method, so it does not produce p-values for individual voxels. The only map you should care about is `stats.img`, which contains the voxel weights: the stronger the weight, the more important that voxel is in relation to behavior.
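For instance, if you saved the LESYMAP output to disk, a quick way to look at the surviving weights might be the following (the output path is hypothetical, and the ANTsR calls are just one way to read the image):

```r
library(ANTsR)
wmap <- antsImageRead('lesymap_output/stats.img')   # hypothetical path to the weight map
w    <- as.array(wmap)
sum(w != 0)               # number of voxels retained in the final map
summary(abs(w[w != 0]))   # distribution of the surviving weight magnitudes
```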
There is a single p-value you should look at when running SCCAN: the p-value of the correlation between true and predicted behavioral scores at the best `sparseness` value, called `CVcorrelation.pval`. This is the p-value of the solution as a whole, not of individual voxels. This global p-value is used for one thing only: to decide whether the solution is random or not. For example, if the relationship between lesions and deficit is too weak, the solutions found during cross-validation will be almost random and will not be able to predict new patients. As a result, `CVcorrelation.pval` might be 0.23, which indicates that brain-behavior relationships are too weak to identify voxels that can predict new patients.
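To make the idea concrete, here is a toy illustration of what that p-value corresponds to: the significance of the correlation between observed scores and the scores predicted for held-out subjects. The numbers below are invented; the actual value is computed internally by LESYMAP.

```r
# Toy example: observed behavioral scores vs. scores predicted for held-out subjects
observed  <- c(12, 7, 15, 9, 11, 5, 14, 8, 13, 6)
predicted <- c(10, 8, 14, 10, 12, 6, 13, 9, 12, 7)
ct <- cor.test(observed, predicted)
ct$estimate   # analogous to the cross-validated correlation
ct$p.value    # analogous to CVcorrelation.pval; compared against pThreshold
```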
`pThreshold` is used only to decide whether the global solution is significant or not. It has no effect on the solution itself; you will keep getting the same solution as long as `CVcorrelation.pval` is below `pThreshold`. If you are looking for more extensive or less extensive results, you should change `sparseness` instead.
The only way to get different results is to change `sparseness`. Increasing `sparseness` towards 1 or -1 will produce larger maps, while decreasing it towards 0 will produce very focal maps. Changing `sparseness` is not advised, however, because the optimal value should be found empirically; a manual choice is an arbitrary decision. One of the benefits of our implementation of SCCAN is that the researcher does not have much room to tweak the results (unless you know what you are doing).
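If you still want to override the empirical choice, the call would look something like the sketch below. Note that `lesion.list` and `behavior` are placeholders for your own data, and the argument names `sparseness` and `optimizeSparseness` are assumptions to be checked against `?lesymap` in your installed version.

```r
library(LESYMAP)
# Placeholder inputs: lesion.list = your lesion images, behavior = your scores
result <- lesymap(lesion.list, behavior,
                  method = 'sccan',
                  optimizeSparseness = FALSE,  # assumed flag: skip the internal CV search
                  sparseness = 0.05)           # manually fixed value (not recommended)
```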
No. The effect of `sparseness` is different from the effect of thresholding a map. A larger `sparseness` tells the algorithm that it has more freedom to retain voxels with smaller weights. However, this also has an impact on the other voxels. There will probably be fewer iterations (a faster solution), but since voxel weights are refined at each iteration, the final result will also have different weights around the peaks. Typically, the peaks identified at low `sparseness` start to look like valleys at high `sparseness`. The overall pattern of results may look similar - the most relevant region is still the same - but the relationship of weights with nearby voxels will be different.
No. SCCAN is usually more precise than voxel-wise methods, but it is not a magic method. Simulations show that slight displacements in peaks can be observed with SCCAN, too (see the comparison video). These displacements may occur for various reasons. One might be that the noise introduced in simulations randomly pushes the behavioral score to relate better with a neighboring area than with the area that produced it. Another might be that the spatial smoothing applied iteratively to the SCCAN weights slightly pushes findings outside the brain-air border. Although these effects deserve more investigation, for now you can assume that some displacement may exist with SCCAN, too; it is not something easy to identify or fix with today's standard analyses (it would probably require simulations tailored to your specific dataset).
The voxel weights obtained from SCCAN are typically very small (e.g., 0.0000003). LESYMAP normalizes these weights to the range -1 to 1 and removes voxels with weights close to zero (-0.1 < weight < 0.1). This process removes many voxels whose weights are smaller than 10% of the maximal value. The 10% threshold is an arbitrary choice; it was picked simply because it seemed to produce the most accurate maps in various scenarios. Further investigation is needed to see whether this value is indeed the best one. The developer of the SCCAN method has since added an option to apply a 10% threshold within the internal SCCAN iterations (the `sparseDecom2` function in ANTsR). This new implementation makes SCCAN much faster while apparently producing the same results. However, LESYMAP still uses the traditional post-hoc thresholding because more rigorous tests are needed to make sure the new method produces equivalent results. In conclusion, you should know that the true output of SCCAN is more extensive than what you see in the LESYMAP results. If you are curious about the original non-truncated weights, check the output called `rawWeights.img`.
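A minimal sketch of that post-hoc cleanup, assuming the raw weights are available as a numeric vector (here they are simulated rather than read from `rawWeights.img`):

```r
set.seed(2)
raw_w  <- rnorm(1000, sd = 3e-7)          # raw SCCAN weights are tiny numbers
norm_w <- raw_w / max(abs(raw_w))         # normalize into the -1 to 1 range
norm_w[abs(norm_w) < 0.1] <- 0            # zero out weights below 10% of the maximum
mean(norm_w == 0)                         # fraction of voxels removed by the cutoff
```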