Multi-Scale Learning for Analysis of Spatial Transcriptomics Data
Sergei Maslov (BioE)
Alvaro Hernandez (Biotechnology Center)
Maxim Raginsky, Olgica Milenkovic, and Ilan Shomorony (ECE)
Sihai Zhao (Statistics)
Hee Sun Han (Chemistry)
Hanghang Tong (CS)
Michael Robben (Animal Sciences)
Research Problem
The field of Spatial Transcriptomics (ST) holds great promise for elucidating complex spatiotemporal interactions of genes, cells, and tissues, but it has not yet fully benefited from specialized learning techniques capable of interpreting and extracting knowledge from heterogeneous data at different scales. There is therefore an urgent need to combine this breakthrough technology with state-of-the-art machine learning. The ability to spatially resolve transcriptomics data at both intracellular and intercellular scales is critical because many biological processes depend fundamentally on the spatial proximity of interacting components. The key problem is that important phenomena occur at and between different levels of biological organization and at different spatial scales. Few methods exist to quantitatively define and extract these higher-order phenomena from the high-resolution, multimodal data generated by spatial omics technologies.
ST Vision
Our key idea is to model spatial omics data using complex graph or hypergraph structures whose nodes are themselves graphs or other sophisticated mathematical entities. Spatial information is most easily represented by nearest-neighbor graphs, which connect cells that are physically close to each other. At the same time, each cell is a node in the nearest-neighbor graph, and each node has its own functional representations in the form of graphs (e.g., those describing intracellular proximities between mRNAs, chromatin, and protein-protein interactions) or other data structures, such as distributions or dynamic trajectories. This new modeling paradigm uses graph-theoretic and network-analytic approaches to learn biological patterns and predict structure and function.
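As a rough illustration of the nested structure described above, the following minimal Python sketch builds a k-nearest-neighbor graph over cells, where each cell node carries its own small intracellular graph. The class name Cell, the function knn_graph, the 2D coordinates, and the molecule labels are all hypothetical choices made for this example; they are not part of the proposed methods, and a real pipeline would likely use a spatial index rather than the brute-force search shown here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One node of the tissue-level graph (hypothetical structure)."""
    cell_id: str
    xy: tuple  # assumed 2D centroid coordinates of the cell
    # Assumed intracellular graph: a set of undirected edges between molecule labels
    intra_edges: set = field(default_factory=set)


def knn_graph(cells, k):
    """Map each cell id to the ids of its k nearest other cells (Euclidean distance)."""
    graph = {}
    for c in cells:
        others = [o for o in cells if o.cell_id != c.cell_id]
        # Sort by distance; break ties by id so the result is deterministic
        others.sort(key=lambda o: (math.dist(c.xy, o.xy), o.cell_id))
        graph[c.cell_id] = [o.cell_id for o in others[:k]]
    return graph


# Toy data: four cells, one of which has a single intracellular proximity edge
cells = [
    Cell("a", (0.0, 0.0), {frozenset({"mRNA1", "mRNA2"})}),
    Cell("b", (1.0, 0.0)),
    Cell("c", (0.0, 2.0)),
    Cell("d", (5.0, 5.0)),
]
tissue = knn_graph(cells, k=2)
```

Here tissue holds the intercellular (outer) graph, while each Cell.intra_edges holds an intracellular (inner) graph, so the two scales stay linked through the cell identifier.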
Larger Impact
Given the importance of spatial transcriptomics across a wide range of disciplines, the successful development of our Center for Multi-Scale Machine Learning for Spatial Transcriptomics will be invaluable to several interdisciplinary research communities at the University of Illinois, including the neuroscience community at UIUC, the Gene networks in Neural & Developmental Plasticity theme at the Carl R. Woese Institute for Genomic Biology (IGB), the Biomedical Imaging Center at the Beckman Institute, and the Cancer Center at Illinois.