The SCVI tutorial notebook currently only shows how to do inference on the same dataset that was used to train the model.
This is a good first step, but arguably an even more common workflow is to embed an unseen dataset using a pre-trained scVI model. Is this currently supported? If it is, it would be great to add it to the tutorial notebook.
Traditionally, in scvi-tools, one would run:
scvi.model.SCVI.prepare_query_anndata(adata, model_path)
model = scvi.model.SCVI.load_query_data(adata, model_path)
Currently, the SCVIDataModule errors out when it sees a value of batch_labels that it has not seen during training:
ValueError: y contains previously unseen labels: 'new_batch'
Of course, as a hacky workaround, I could set the batch label of my new dataset to one of the batch labels seen during training, but that is not biologically correct: the model would treat the new data as if it shared that batch's technical effects. It should be possible to specify a new, unseen batch.
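To make the failure concrete, here is a minimal sketch in plain Python. `BatchEncoder` and its `extend` method are hypothetical stand-ins that I made up for illustration; they are not the actual SCVIDataModule code or any library API. The sketch assumes the data module maps batch labels to integer codes fitted on the training data, which is consistent with the error message above. It shows why an unseen label fails, and how appending new labels after the existing ones (so that the trained indices stay unchanged) could let a query batch be encoded.

```python
class BatchEncoder:
    """Hypothetical stand-in for the data module's batch-label encoder."""

    def __init__(self):
        self.classes_ = []

    def fit(self, labels):
        # Learn the set of batch labels seen during training.
        self.classes_ = sorted(set(labels))
        return self

    def transform(self, labels):
        # Map each label to its integer code; fail on anything unseen.
        index = {label: i for i, label in enumerate(self.classes_)}
        for label in labels:
            if label not in index:
                raise ValueError(
                    f"y contains previously unseen labels: {label!r}"
                )
        return [index[label] for label in labels]

    def extend(self, labels):
        # Append unseen labels at the end so existing codes stay stable.
        for label in labels:
            if label not in self.classes_:
                self.classes_.append(label)
        return self


# Fit on the training batches, then try to encode a query batch.
encoder = BatchEncoder().fit(["batch_a", "batch_b"])
try:
    encoder.transform(["batch_a", "new_batch"])
except ValueError as err:
    error_message = str(err)

# After registering the new batch, encoding succeeds.
encoder.extend(["new_batch"])
codes = encoder.transform(["batch_a", "new_batch"])
```

Extending the encoder only solves the label-encoding half of the problem. The model itself would presumably also need to accept the additional batch category, which is the part that load_query_data handles in the traditional workflow.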