stamp
Main module.
STAMP
__init__(adata, n_topics=20, n_layers=1, hidden_size=128, layer=None, dropout=0.0, train_size=1, rank=None, categorical_covariate_keys=None, continous_covariate_keys=None, time_covariate_keys=None, enc_distribution='mvn', gene_likelihood='nb', mode='sign', verbose=False)
Initialize model
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| adata | AnnData | AnnData object | required |
| n_topics | int | Number of topics to model. Defaults to 20. | 20 |
| n_layers | int | Number of layers used for SGC. Defaults to 1. | 1 |
| hidden_size | int | Number of nodes in the hidden layer of the encoder. Defaults to 128. | 128 |
| layer | _type_ | Layer where the count data are stored. If None, X is used. | None |
| dropout | float | Dropout used for the encoder. Defaults to 0.0. | 0.0 |
| categorical_covariate_keys | _type_ | Categorical batch keys. | None |
| continous_covariate_keys | _type_ | Continuous batch keys. | None |
| verbose | bool | Print out information on the model. Defaults to False. | False |
| batch_size | int | Batch size. Defaults to 1024. | not in signature |
| enc_distribution | str | Encoder distribution. The only documented choice is multivariate normal. Defaults to "mvn". | 'mvn' |
| mode | str | Either "sign" or "sgc" (simplified graph convolutions). | 'sign' |
| beta | float | Beta as in Beta-VAE. Defaults to 1. | not in signature |
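As a rough illustration of how the constructor arguments fit together, the sketch below collects the defaults from the `__init__` signature above and forwards them, with optional overrides, to the class. `make_stamp` and its `stamp_cls` argument are hypothetical names introduced here; the class is passed in as an argument so that no import path is assumed. The check that `mode` is either `'sign'` or `'sgc'` is an assumption based on the parameter description, not confirmed behavior.

```python
# Defaults copied from the __init__ signature documented above.
STAMP_DEFAULTS = {
    "n_topics": 20,
    "n_layers": 1,
    "hidden_size": 128,
    "layer": None,
    "dropout": 0.0,
    "train_size": 1,
    "rank": None,
    "categorical_covariate_keys": None,
    "continous_covariate_keys": None,  # spelling follows the signature
    "time_covariate_keys": None,
    "enc_distribution": "mvn",
    "gene_likelihood": "nb",
    "mode": "sign",
    "verbose": False,
}


def make_stamp(stamp_cls, adata, **overrides):
    """Hypothetical helper: build a model from the documented defaults.

    stamp_cls is the STAMP class, passed in so no import path is assumed.
    """
    unknown = set(overrides) - set(STAMP_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    kwargs = {**STAMP_DEFAULTS, **overrides}
    # Assumption: only the two modes named in the table are valid.
    if kwargs["mode"] not in ("sign", "sgc"):
        raise ValueError("mode must be 'sign' or 'sgc'")
    return stamp_cls(adata, **kwargs)
```

A call such as `make_stamp(STAMP, adata, n_topics=10)` would then change only the number of topics and leave every other argument at its documented default.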
train(max_epochs=800, min_epochs=100, learning_rate=0.01, betas=(0.9, 0.999), not_cov_epochs=5, device='cuda:0', batch_size=256, sampler='R', weight_decay=0, iterations_to_anneal=1, min_kl=1, max_kl=1, early_stop=True, patience=20, shuffle=True, num_particles=1)
Train the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_epochs | int | Maximum number of epochs to run. Defaults to 800. | 800 |
| learning_rate | float | Learning rate of the AdamW optimizer. Defaults to 0.01. | 0.01 |
| device | str | Device to run the model on. Use "cpu" to run on the CPU or a CUDA device such as "cuda:0" to run on the GPU. Defaults to "cuda:0". | 'cuda:0' |
| weight_decay | float | Weight decay of the AdamW optimizer. Defaults to 0. | 0 |
| early_stop | bool | Whether to stop early when training plateaus. Defaults to True. | True |
| patience | int | Number of epochs to wait before stopping once training plateaus. Defaults to 20. | 20 |
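The interaction between `early_stop`, `patience`, `min_epochs`, and `max_epochs` can be pictured with a generic patience-based loop. This is a minimal sketch of the common technique, not STAMP's actual implementation: the function name `epochs_run` is hypothetical, the source does not say which quantity is monitored, and the rule that stopping is only considered after `min_epochs` is an assumption.

```python
def epochs_run(losses, min_epochs=100, patience=20):
    """Return how many epochs run before patience-based early stopping.

    losses holds one monitored value per epoch; its length plays the
    role of max_epochs.
    """
    best = float("inf")
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
        # Assumption: stopping is only considered once min_epochs have run.
        if epoch >= min_epochs and stale >= patience:
            return epoch
    return len(losses)
```

With `losses = [5, 4, 3, 3, 3, 3, 3]`, `min_epochs=2`, and `patience=3`, the loss last improves at epoch 3, so the loop stops after epoch 6.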
get_metrics(topk=20, layer=None, TGC=True, pseudocount=0.1)
Get metrics
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| topk | int | Number of top genes used to score the metrics. Defaults to 20. | 20 |
| layer | _type_ | Which layer to use to score the metrics. If None, X is used. Defaults to None. | None |
| TGC | bool | Whether to calculate the topic-gene correlation. Defaults to True. | True |
Returns:
| Name | Type | Description |
|---|---|---|
| | _type_ | description |
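To make the role of `topk` concrete, the sketch below picks the highest-weighted genes for each topic from a small weight table. It only illustrates what "top genes" means; how `get_metrics` actually ranks genes or computes its scores is not specified here. The function `top_genes` and the dictionary layout of `weights` are hypothetical.

```python
def top_genes(weights, genes, topk=20):
    """Return the topk highest-weighted gene names for every topic.

    weights maps a topic name to a list of per-gene weights aligned
    with genes (a hypothetical layout chosen for this example).
    """
    result = {}
    for topic, w in weights.items():
        # Sort (gene, weight) pairs by weight, largest first.
        ranked = sorted(zip(genes, w), key=lambda gw: gw[1], reverse=True)
        result[topic] = [g for g, _ in ranked[:topk]]
    return result
```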
get_cell_by_topic(adata=None, batch_size=None, device=None)
Get latent topics after training.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | str | Which device to use. Defaults to None. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| | _type_ | A dataframe of cells by topics where each row sums to one. |
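Because each row of the returned table sums to one, a row can be read as one cell's topic proportions. The sketch below uses plain lists of floats as a stand-in for the dataframe and shows two simple follow-up steps: checking the row sums and finding the dominant topic per cell. Both helper names are hypothetical.

```python
def rows_sum_to_one(rows, tol=1e-6):
    """Check that every row of topic proportions sums to one within tol."""
    return all(abs(sum(row) - 1.0) <= tol for row in rows)


def dominant_topics(rows):
    """Return, for each cell (row), the index of its largest proportion."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]
```

For the two cells `[[0.7, 0.2, 0.1], [0.25, 0.25, 0.5]]`, the dominant topics are index 0 and index 2.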
get_feature_by_topic(device='cpu', return_softmax=False, transpose=False, pseudocount=0.1)
Get the gene modules
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| device | str | Which device to use. Defaults to "cpu". | 'cpu' |
| num_samples | int | Number of samples to use for calculation. Defaults to 1000. | not in signature |
| pct | float | Deprecated. Defaults to 0.5. | not in signature |
| return_softmax | bool | Deprecated. Defaults to False. | False |
Returns:
| Name | Type | Description |
|---|---|---|
| | _type_ | description |
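Putting the methods together, a typical session would train the model and then pull out both result tables. The sketch below assumes `model` is an already constructed `STAMP` object; `run_stamp` is a hypothetical wrapper, and every keyword argument it passes is taken from the signatures documented above.

```python
def run_stamp(model, device="cpu"):
    """Hypothetical wrapper: train a constructed model, then fetch results."""
    model.train(
        max_epochs=800,
        min_epochs=100,
        learning_rate=0.01,
        device=device,
        batch_size=256,
        early_stop=True,
        patience=20,
    )
    # Cell-by-topic proportions (each row sums to one).
    cell_by_topic = model.get_cell_by_topic(device=device)
    # Gene modules, one column or row per topic depending on `transpose`.
    feature_by_topic = model.get_feature_by_topic(device=device)
    return cell_by_topic, feature_by_topic
```

Passing `device="cuda:0"` instead would run the same steps on the first GPU, matching the default of `train`.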