Research Overview

Molecular regulation in cellular systems is central to health and disease. The Computational Systems Biology Group, led by Dr Pengyi Yang, focuses on developing computational and statistical models to reconstruct molecular networks and model their regulation in differentiation and development. To translate computational predictions into biological findings, the group also focuses on experimentally validating hypotheses generated from computational models.

Lab Head

Pengyi Yang

Group Leader, Computational Systems Biology
Available for Student Supervision

Team Members

Han Kim
PhD Student in Computational Biology
Di Xiao
PhD Student in Computational Biology

Research Projects


Molecular trans-regulatory networks (TRNs), comprising cell signalling, transcriptional, translational, and (epi)genomic regulation, are central to health and disease. A major initiative in our group is to integrate trans-omic datasets generated by state-of-the-art mass spectrometry (MS) and next-generation sequencing (NGS) technologies from various cell systems to reconstruct TRNs and understand how different regulatory machineries (e.g. signalling, transcription, and epigenomics) co-operate to define cell states, functions, and fates.

We have previously developed various computational methods to integrate the multi-layered trans-omic datasets generated during the naive-to-formative pluripotency transition in embryonic stem cells (ESCs) (Yang et al. Cell Systems, 2019). Our current research project aims to extend this study by developing methods to characterise signalling cascades, transcriptional networks, and protein networks, and their cross-talk, with the aim of answering the following questions:

  • How do different layers of regulations talk to each other in controlling stem cell fate?
  • Can we accurately predict stem cell differentiation trajectories based on their TRNs?
  • What are the key mechanisms by which stem/progenitor cells establish identities and make cell fate decisions?

Single-cell omics are becoming the next wave of development in biotechnology, promising to revolutionise our ability to study biological systems at unprecedented resolution. Our group is working on multiple methodological development and laboratory experiment projects with the goal of characterising cellular systems and diseases at the single-cell level.

On the methodology front, we have recently developed a computational method, together with Prof. Jean Yang's group, for integrating multiple single-cell RNA-seq datasets (Lin et al. PNAS, 2019). Our current research project aims to extend this work by developing a suite of data processing, cell type characterisation, and network reconstruction methods and tools for single-cell omic data. In parallel, we are planning to conduct experiments to profile single cells in ESC populations and during their differentiation into multiple cell lineages. Research findings from these projects will directly contribute to our aim of addressing the three questions raised in Theme I.

Computational and statistical methods are at the core of our research. To tackle complex biological questions using heterogeneous omic data generated by various biotechnologies, our group specialises in developing novel computational methods for analysing (i) MS-based proteomic and phosphoproteomic data, and (ii) NGS-based RNA-seq, ChIP-seq, and Hi-C data.

Building on our long-term success in computational methodology innovation, the group is developing various machine learning and deep learning methods with targeted application to biological questions and omic data types. Examples of our recent developments include a knowledge-based unsupervised learning method for kinase identification (Yang et al. PLoS Computational Biology, 2015) and a semi-supervised learning method for kinase-substrate prediction (Yang et al. Bioinformatics, 2016) from phosphoproteomic data. Continued innovation in computational and statistical methods will be a key strength of our group in answering fundamental biological questions.

Note on Publications Below

Bold: CSB group member

✢: Co-first author
†: Corresponding/Co-corresponding author

Key Publications

Full NCBI Bibliography.

View all publications by Pengyi Yang.

PhosR enables processing and functional analysis of phosphoproteomic data.

Kim, H., Kim, T., Hoffman, N., Xiao, D., James, D., Humphrey, S. & Yang, P. (2021) Cell Reports, 34(8), 108771. [BioC R package]

Transcriptional network dynamics during the progression of pluripotency revealed by integrative statistical learning.

Kim, H., Osteil, P., Humphrey, S., Cinghu, S., Oldfield, A., Patrick, E., Wilkie, E., Peng, G., Suo, S., Jothi, R., Tam, P. & Yang, P. (2020) Nucleic Acids Research, 48(4), 1828-1842.

Multi-omic profiling reveals dynamics of the phased progression of pluripotency.

Yang, P.✢†, Humphrey, S.✢†, Cinghu, S., Pathania, R., Oldfield, A., Kumar, D., Perera, D., Yang, J., James, D., Mann, M. & Jothi, R. (2019) Cell Systems, 8(5), 427-445. [The Stem Cell Atlas]

scMerge leverages factor analysis, stable expression, and pseudoreplication to merge multiple single-cell RNA-seq datasets.

Lin, Y., Ghazanfar, S., Wang, K., Gagnon-Bartsch, J., Lo, K., Su, X., Han, Z., Ormerod, J., Speed, T., Yang, P. & Yang, J. (2019) Proceedings of the National Academy of Sciences of the United States of America, 116(20), 9775-9784. [BioC R package]

Intragenic enhancers attenuate host gene expression.

Cinghu, S., Yang, P., Kosak, J., Conway, A., Kumar, D., Oldfield, A., Adelman, K. & Jothi, R. (2017) Molecular Cell, 68(1), 104-117. [PDF]

Histone-fold domain protein NF-Y promotes chromatin accessibility for cell type-specific master transcription factors.

Oldfield, A., Yang, P., Conway, A., Cinghu, S., Freudenberg, J., Yellaboina, S. & Jothi, R. (2014) Molecular Cell, 55(5), 708-722. [PDF]