HiST: Histological Image Reconstruct Tumor Spatial Transcriptomics via MultiScale Fusion Deep Learning
HiST GitHub Repository Address
Introduction
Spatial transcriptomics (ST) offers valuable insights into the tumor microenvironment by integrating molecular features with spatial context, but its clinical diagnostic application is limited due to its high cost.
To address this, we developed HiST, a multi-scale convolutional deep learning framework that uses ST to learn the relationship between spatially resolved gene expression profiles (GEPs) and histological morphology. HiST accurately predicts tumor regions (e.g., breast cancer, area under the curve: 0.96) that are highly concordant with pathologist annotations. HiST then reconstructs spatially resolved GEPs with an average Pearson correlation coefficient of 0.74 across five cancer types, more than 3-fold higher than that of the best previously reported tool. HiST's application module performs well in predicting patient prognosis for five cancer types from The Cancer Genome Atlas (e.g., a concordance index of 0.78 in breast cancer) and in predicting immunotherapy outcomes. Moreover, the spatial GEPs help reveal regulatory networks and key regulators of the immunotherapy response.
In summary, HiST’s robust performance in tumor identification and reconstruction of spatial GEPs and its applications in prognosis prediction and immunotherapy response offer great potential for advancing tumor profiling and improving personalized cancer treatment.
Installation
- We recommend running HiST on Linux.
To get started, clone the repository and install the required dependencies:
|
|
Method 1: Use the requirements file (not recommended):
|
|
Use nvcc -V to check the CUDA version on your device.
Method 2: Follow the instructions:
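If you prefer to check the CUDA version from a script, the sketch below parses the release number from the text printed by nvcc -V. The helper names `parse_cuda_release` and `detect_cuda_release` are illustrative and are not part of HiST; the sketch assumes the usual "release X.Y" wording in the nvcc output.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' CUDA release found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release() -> Optional[str]:
    """Run `nvcc -V` when nvcc is on PATH and parse its release string."""
    if shutil.which("nvcc") is None:
        return None  # no CUDA toolkit visible in this environment
    result = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=False
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    print("CUDA release:", detect_cuda_release())
```

The detected release can then guide which CUDA-matched build of the deep learning dependencies to install.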
|
|
Install Seurat in R (conda env HiST):
|
|
|
|
For the gene selection method (optional): R package sf installation instructions
|
|
Other dependencies (optional; needed if WSIs are used for training or prediction)
|
|
Usage
We use two samples from the CRC dataset, generated with 10x Visium technology, as an example.
0. Download data
(A) Pre-trained model weights for feature extraction can be downloaded here; please put them in /your_working_directory/HiST/resource/.
(B) Test data for two CRC samples can be downloaded here. Please unzip data.zip and put the contents in /your_working_directory/HiST/data/.
Data folder structure:
- HE: Full resolution HE images.
- hires_HE: High resolution HE images provided by spaceranger.
- seurat_obj: ST sample Seurat objects.
./data
├── HE
│ ├── CRC1.jpg
│ └── CRC2.jpg
├── hires_HE
│ ├── CRC1_tissue_hires_image.png
│ └── CRC2_tissue_hires_image.png
├── seurat_obj
│ ├── CRC1.rds.gz
│ └── CRC2.rds.gz
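Before running the pipeline, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of HiST: the helper names `expected_files` and `missing_files` are hypothetical, and the file naming pattern is inferred only from the two example samples shown in the tree.

```python
from pathlib import Path
from typing import List


def expected_files(data_dir: str, sample: str) -> List[Path]:
    """List the three files the example layout contains for one sample."""
    root = Path(data_dir)
    return [
        root / "HE" / f"{sample}.jpg",                           # full resolution HE image
        root / "hires_HE" / f"{sample}_tissue_hires_image.png",  # spaceranger hires image
        root / "seurat_obj" / f"{sample}.rds.gz",                # Seurat object
    ]


def missing_files(data_dir: str, samples: List[str]) -> List[Path]:
    """Return every expected file that is absent from data_dir."""
    return [
        path
        for sample in samples
        for path in expected_files(data_dir, sample)
        if not path.is_file()
    ]


if __name__ == "__main__":
    for path in missing_files("./data", ["CRC1", "CRC2"]):
        print("missing:", path)
```

An empty result means all six example files are in place.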
1. Preprocess module
In the preprocess module, we extract the histological information and spatial context of the original whole slide image (WSI) while avoiding the high GPU memory requirements of processing a high-resolution WSI directly.
- Step1(Optional): Gene selection
Script: ./R/1.gene_select.R. Sample file: ./resource/CRC_SVG346_list.txt
|
|
- Step2: Create gene matrix and mask matrix
./R/2.get_matrix.R
|
|
- Step3: Prepare mask and patch & feature extraction. Run in Python, referring to the vignette.
2. Prediction module
The prediction module uses an improved U-Net framework for two prediction tasks: tumor spot identification and tumor spatial transcriptomics prediction.
- Please refer to the vignette for specific steps.
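To compare predicted and measured expression, the Introduction reports an average Pearson correlation coefficient. The sketch below shows one common way to compute such a score, assuming the correlation is taken per gene across spots and then averaged over genes; whether HiST aggregates it exactly this way is an assumption, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def mean_gene_pcc(observed: List[List[float]], predicted: List[List[float]]) -> float:
    """Average per-gene PCC; both inputs are spots x genes (rows are spots)."""
    n_genes = len(observed[0])
    scores = []
    for g in range(n_genes):
        obs = [row[g] for row in observed]
        pred = [row[g] for row in predicted]
        r = pearson(obs, pred)
        if not math.isnan(r):  # skip genes with no variation
            scores.append(r)
    return sum(scores) / len(scores) if scores else float("nan")
```

Genes that are constant in either matrix are skipped rather than counted as zero, which is a design choice of this sketch.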
3. Application module
We use the ST profiles obtained from the prediction module as molecular features of the HE histology images and train models to predict disease prognosis and immunotherapy response.
A. Survival model
- Step0: Download slide images from TCGA.
- Step1: Prepare WSI patches.
(i) Cut WSIs into patches. Output: HE (resized, smaller TCGA HE images) and tiles. Usage:
|
|
(ii) Clean up tiles (optional). Source: wsi-tile-cleanup. Output: tiles containing only tissue sections. Installation:
|
|
Usage:
|
|
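The two sub-steps above (cutting a slide into tiles, then keeping only tiles with tissue) can be illustrated with a small, dependency-free sketch. It is not the HiST script and it does not reproduce the wsi-tile-cleanup method: it assumes non-overlapping square tiles and swaps in a plain brightness threshold, on the assumption that HE background is close to white. The tile size, both thresholds, and all function names are hypothetical.

```python
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower), as in Pillow's Image.crop


def tile_boxes(width: int, height: int, tile_size: int,
               keep_partial: bool = False) -> Iterator[Box]:
    """Yield non-overlapping tile boxes in row-major order."""
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            lower = min(top + tile_size, height)
            is_partial = (right - left < tile_size) or (lower - top < tile_size)
            if is_partial and not keep_partial:
                continue  # drop edge tiles smaller than tile_size
            yield (left, top, right, lower)


def tissue_fraction(gray_tile: List[List[int]], background_threshold: int = 220) -> float:
    """Fraction of pixels darker than the (assumed) near-white background."""
    total = 0
    tissue = 0
    for row in gray_tile:
        for value in row:
            total += 1
            if value < background_threshold:
                tissue += 1
    return tissue / total if total else 0.0


def keep_tile(gray_tile: List[List[int]], min_tissue: float = 0.5,
              background_threshold: int = 220) -> bool:
    """Keep a tile only when enough of it looks like tissue."""
    return tissue_fraction(gray_tile, background_threshold) >= min_tissue
```

In practice each box would be cropped from the slide image and converted to grayscale before calling `keep_tile`.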
Please refer to the vignette for the following steps.
- Step3: Feature extraction.
- Step4: Spatial gene profiles prediction by HiST gene prediction module.
- Step5: Training survival model.
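Survival models are commonly evaluated with the concordance index reported in the Introduction. The following is a minimal sketch of Harrell's concordance index in plain Python, offered as an illustration rather than HiST's evaluation code; it assumes higher risk scores mean shorter expected survival, and it ignores pairs with tied survival times.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) and a strictly shorter time than subject j.
    Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk.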
B. Immunotherapy response model
Please refer to the vignette for the following steps.
- Step0: Download slide images from NGDC.
- Step1: Prepare WSI patches.
- Step3: Feature extraction.
- Step4: Spatial gene profiles prediction by HiST gene prediction module.
- Step5: Training classification model.
Credits and Acknowledgments
The ground truth for tumor segmentation was inferred with Cottrazm.
Pretrained model weights are from CTransPath.
Tile cleanup uses wsi-tile-cleanup.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Citation
(Not yet published)
@article{HiST,
title={HiST: Histological Image Reconstruct Tumor Spatial Transcriptomics via MultiScale Fusion Deep Learning},
author={Wei Li#, Dong Zhang#, Eryu Peng, Shijun Shen, Yao Liu*, Junke Zheng*, Cizhong Jiang*, Youqiong Ye*},
journal={XX},
year={2025},
doi={xx}
}