I am a computational biologist focusing on biomolecular fitness landscapes and on biomolecular interaction and regulation, which form the molecular basis of biomedical science and the foundation of future in silico (virtual) biology. I am currently a postdoc with Yufeng Shen at Columbia University, where I investigate protein fitness landscapes from both evolutionary and biophysical (protein dynamics) perspectives. I developed SeqDance and ESMDance, two protein language models (pLMs) trained on protein dynamics data. I also explained the scaling behavior of pLMs for fitness prediction (why larger models do not always perform better) and developed MotifAE for unsupervised discovery of functional motifs from pLMs. I received my Ph.D. in Biomedical Informatics from Peking University with Tingting Li. During my doctoral work, I studied biomolecular interactions involved in degradation regulation and protein localization mediated by phase separation.
I was born in Zhucheng, Shandong, China (Chinese name: 侯超). Outside of research, I enjoy basketball, photography, travel, and food.
Updated: April 2026
Postdoc, 2023.09-
Columbia University
PhD in Biomedical Informatics, 2020.09-2023.07
Peking University
Bachelor of Medicine and Economics, 2015.09-2020.07
Peking University

Unsupervised discovery of functional motifs from pLMs

Why do larger models NOT always perform better?

SeqDance/ESMDance: pLMs trained on protein dynamic properties

PhaSepDB, PhaSePred and MloDisDB

Degpred predicts degrons and their binding E3s via deep learning