Data preparation
Spatial transcriptomics data
SpaGFT adopts scanpy input methods, and the following two files are needed:
Raw count matrix: gene expression information for all spots/pixels. It may be a .h5ad, .h5, or .csv file, any of which scanpy can load to create the AnnData object.
Tissue_positions_list file: a .csv or .txt file containing the coordinate information of the spots. More details can be found in the scanpy documentation.
Optionally, a stained tissue image can be used as the background when visualizing SVGs or TMs. Methods for adding the image to the AnnData object are described in the scanpy and stLearn documentation.
We recommend loading Visium data with:
import scanpy as sc
# path_to_visium_dataset: path to the directory holding the Visium output files
adata = sc.read_visium(path_to_visium_dataset)
For all spatial transcriptomics datasets, the raw count matrix must be stored in adata.X, and the spatial coordinates must be stored in adata.obs or adata.obsm.