This function uses training_data and training_labels to build an lm-based model for each label type; the resulting models can be used in model_apply(). The training_data is intended to be a sample (row) by principal component (column) matrix, i.e. the $x output from base R prcomp. We provide precomputed prcomp PCA outputs from the plae.nei.nih.gov resource for adult human eye, adult mouse eye, fetal human eye, and fetal mouse eye (see vignette("pca_download", package = "metamoRph")).
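As a rough illustration of the expected input shape, the sketch below runs base R prcomp on the built-in iris data set and pulls out the $x matrix. The iris data and the choice of center/scale. settings are assumptions made purely for illustration; they are not the plae.nei.nih.gov data or a recommended preprocessing.

```r
# Toy stand-in data (assumption): four numeric iris measurements, 150 samples
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# $x is the sample (row) by principal component (column) matrix
training_data <- pca$x

# One label per row of training_data, in the same row order
training_labels <- as.character(iris$Species)

dim(training_data)       # samples x PCs
colnames(training_data)  # PC1, PC2, ...
```

With only four input features this toy matrix has four PCs, so num_PCs would need to be set to 4 or fewer rather than left at its default of 50.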

Usage

model_build(
  training_data,
  training_labels,
  num_PCs = 50,
  BPPARAM = BiocParallel::SerialParam(),
  model = "lm",
  verbose = TRUE
)

Arguments

training_data

sample (row) by principal component (column) matrix

training_labels

Vector of labels (e.g. cell types), one per sample, in the same order as the rows of training_data.

num_PCs

number of principal components to use from the training_data. Defaults to the first (top) 50.

BPPARAM

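A BiocParallel parameter object that controls how the work is parallelized. Defaults to BiocParallel::SerialParam(), i.e. serial (non-parallel) execution.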

model

Model type to fit. Default is lm; xgboost, glm, rf, and svm are also supported. In our experience, lm and svm perform best.

verbose

Print training status for each label type

Value

A list of models, one for each label type.
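To make the "one model per label type" idea concrete, here is a minimal base R sketch. It is not the package's implementation: the one-vs-rest encoding (1 for the label, 0 otherwise), the iris toy data, and every variable name are assumptions chosen for illustration only.

```r
# Toy stand-ins for training_data / training_labels (assumption: iris, not plae)
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
training_data <- pca$x
training_labels <- as.character(iris$Species)
num_PCs <- 2

# Keep only the first (top) num_PCs principal components
pcs <- as.data.frame(training_data[, seq_len(num_PCs), drop = FALSE])

# Fit one lm per label: response is 1 for samples with that label, 0 otherwise
models <- lapply(
  stats::setNames(nm = unique(training_labels)),
  function(label) {
    df <- cbind(pcs, is_label = as.numeric(training_labels == label))
    lm(is_label ~ ., data = df)
  }
)

names(models)  # one entry per label type
```

The result has the same overall shape as the documented return value, a named list with one fitted model per label, which is what makes it convenient to loop over when scoring new samples.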