run_pca
This function takes a count matrix (genes/features as rows, samples as columns) and sample-level metadata, and returns a list containing the stats::prcomp object, the metadata, the percent variance explained by each principal component, and the genes (features) chosen for the PCA.
Usage
run_pca(
feature_by_sample,
meta,
method = "prcomp",
ntop = 1000,
hvg_selection = "scran",
hvg_force = NULL,
feature_scale = TRUE,
feature_center = TRUE,
normalization = TRUE,
sample_scale = "cpm",
log1p = TRUE,
remove_regex = "^MT|^RPS|^RPL",
irlba_n = 50
)
Arguments
- feature_by_sample
Raw feature (gene) count matrix (where genes/features are rows and samples are columns).
- meta
Metadata for the samples. The rows must match the columns of feature_by_sample.
- method
Defaults to "prcomp". Use "irlba" for a speed improvement on large matrices.
- ntop
Number of highly variable genes/features to use in the prcomp PCA. Defaults to 1000.
- hvg_selection
Either "classic" or "scran" to select the "ntop" features. "classic" simply uses the top n features by variance. "scran" uses the scran package's strategy of scaling variance by expression, since highly expressed features/genes also tend to have higher variance and thus may be less useful for distinguishing samples.
- hvg_force
Optional vector of features/genes that must be in the stats::prcomp input.
- feature_scale
Default is TRUE, which means features (genes) are scaled with the base::scale function.
- feature_center
Default is TRUE, which means features (genes) are centered
- normalization
Default is TRUE. If set to FALSE, this overrides sample_scale and log1p, and no sample scaling is done.
- sample_scale
Default is cpm; performs cpm scaling on the samples with the scuttle::calculateCPM() function.
- log1p
Default is TRUE; applies log1p scaling to the input count matrix.
- remove_regex
Default regex pattern is '^MT|^RPS|^RPL'. Set to '' to skip.
- irlba_n
Default 50. Only used for irlba - will return this many principal components.
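The preprocessing these arguments describe can be sketched in base R. This is only an illustration, not the package's implementation: the toy count matrix, the object names, and the order of the steps are assumptions, and only the "classic" (top variance) selection is shown.

```r
# Illustrative base R sketch; toy data and all object names are hypothetical.
set.seed(1)
genes <- c("MT-CO1", "RPS3", "RPL5", paste0("GENE", 1:20))
counts <- matrix(
  rpois(length(genes) * 6, lambda = 50),
  nrow = length(genes),
  dimnames = list(genes, paste0("sample", 1:6))
)

# remove_regex: drop features whose names match the pattern
keep <- !grepl("^MT|^RPS|^RPL", rownames(counts))
counts <- counts[keep, , drop = FALSE]

# sample_scale = "cpm": rescale each sample (column) to counts per million
cpm <- t(t(counts) / colSums(counts)) * 1e6

# log1p = TRUE: log(1 + x) transform
logcpm <- log1p(cpm)

# hvg_selection = "classic" with ntop = 10: keep the most variable features
ntop <- 10
feature_var <- apply(logcpm, 1, var)
top_features <- names(sort(feature_var, decreasing = TRUE))[seq_len(ntop)]

# feature_center / feature_scale: prcomp expects samples as rows,
# so the feature-by-sample matrix is transposed first
pca <- prcomp(t(logcpm[top_features, ]), center = TRUE, scale. = TRUE)
```

With the real function, the "scran" selection and the irlba method would replace the variance ranking and the prcomp call shown here.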
Value
A named list with the prcomp output in the $PCA slot, the given metadata in the $meta slot, the percent variance of each PC in the $percentVar slot, a list containing the scaled data's "center" and "scale" values for use in the metamoRph function, and the parameters used in the $params slot.
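As a rough guide to what these values mean, the sketch below computes percent variance from a plain stats::prcomp fit and uses the stored center and scale values to project a new sample. The toy matrix and object names are hypothetical, and the projection step is an assumption about how the center and scale values could be used downstream, not a description of the package's code.

```r
# Illustrative base R sketch; toy data and object names are hypothetical.
set.seed(42)
mat <- matrix(
  rnorm(40),
  nrow = 8,
  dimnames = list(paste0("s", 1:8), paste0("g", 1:5))
)  # 8 samples (rows) by 5 features (columns)

pca <- prcomp(mat, center = TRUE, scale. = TRUE)

# Percent variance explained by each principal component
percent_var <- pca$sdev^2 / sum(pca$sdev^2) * 100

# prcomp keeps the per-feature centering and scaling values it applied
center_scale <- list(center = pca$center, scale = pca$scale)

# Apply the same centering and scaling to a new sample, then rotate it
# into the existing PC space
new_sample <- matrix(rnorm(5), nrow = 1,
                     dimnames = list("new", colnames(mat)))
projected <- scale(new_sample,
                   center = center_scale$center,
                   scale = center_scale$scale) %*% pca$rotation
```

The manual projection gives the same coordinates as predict(pca, new_sample), which is why keeping the center and scale values alongside the rotation is enough to place new data in an existing PCA.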