Xudong Sun

xudongs3@illinois.edu
Google Scholar | DBLP | GitHub | LinkedIn

I'm an incoming assistant professor in the Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE) at the University of Toronto, starting in Fall 2026. I'm currently spending my gap year at the University of Wisconsin-Madison as a postdoc working with Tej Chajed. I received my PhD from the Siebel School of Computing and Data Science at the University of Illinois Urbana-Champaign, where I spent six wonderful years advised by Tianyin Xu. I work on computer systems, and my research focuses on system reliability. I received my bachelor's degree from Nanjing University in 2019.

I have interned at VMware Research (2020, 2022), mentored by Lalith Suresh, and at Microsoft Research (2021), mentored by Suman Nath.

My group at the University of Toronto ECE is actively looking for PhD and master's students for Fall 2026.

photo taken in 2025

Research

The goal of my research is to make computer systems reliable. Even critical computer systems have bugs, and those bugs have serious consequences, e.g., an outage of your favorite app or the loss of your personal data. My research is about making sure such bugs never happen (again)!

My research is at the intersection of computer systems, formal methods, and software testing. I build (1) testing techniques to detect the presence of bugs in real-world systems and develop a deep understanding of them, and (2) formal verification techniques to rigorously guarantee the absence of bugs in real-world systems.

As a systems researcher, I care about real-world systems that support everyone's daily life, and about developers who struggle with bugs and failures of these systems. My PhD research has improved the reliability of many critical distributed and operating systems. Recently, I have been interested in formally verifying real-world cloud systems, e.g., Kubernetes.

Systems research should be relevant, practical, and beautiful. That's what I always aim for.

Prospective Students: I'm looking for Fall'26 PhD/MS students with strong backgrounds in systems, formal methods, or testing. Send me an email with your CV if you are interested in working with me.

Publications

Who Watches the Watchers? On the Reliability of Softwarizing Cloud Application Management
NSDI 2026
Jiawei Tyler Gu, Zhen Tang, Yiming Su, Bogdan A. Stoica, Xudong Sun, William X. Zheng, Yue Zhang, Akond Rahman, Chen Wang, and Tianyin Xu
PDF  |  BibTex  |  Code

Converos: Practical Model Checking for Verifying Rust OS Kernel Concurrency
ATC 2025
Ruize Tang, Minghua Wang, Xudong Sun, Lin Huang, Yu Huang, and Xiaoxing Ma
PDF  |  BibTex

Multi-Grained Specifications for Distributed System Model Checking and Verification
EuroSys 2025
Lingzhi Ouyang, Xudong Sun, Ruize Tang, Yu Huang, Madhav Jivrajani, Xiaoxing Ma, and Tianyin Xu
PDF  |  BibTex  |  Code

Anvil: Verifying Liveness of Cluster Management Controllers
OSDI 2024
Xudong Sun, Wenjie Ma, Jiawei Tyler Gu, Zicheng Ma, Tej Chajed, Jon Howell, Andrea Lattuada, Oded Padon, Lalith Suresh, Adriana Szekeres, and Tianyin Xu
PDF  |  BibTex  |  Talk  |  Code
Jay Lepreau Best Paper Award
Invited to publish at USENIX ;login: (article)
Covered by IllinoisCS News

SandTable: Scalable Distributed System Model Checking with Specification-Level State Exploration
EuroSys 2024
Ruize Tang, Xudong Sun, Yu Huang, Yuyang Wei, Lingzhi Ouyang, and Xiaoxing Ma
PDF  |  BibTex  |  Code

Acto: Automatic End-to-End Testing for Operation Correctness of Cloud System Management
SOSP 2023
Jiawei Tyler Gu, Xudong Sun, Wentao Zhang, Yuxuan Jiang, Chen Wang, Mandana Vaziri, Owolabi Legunsen, and Tianyin Xu
PDF  |  BibTex  |  Code
Selected for solution showcase at KubeCon + CloudNativeCon Europe 2024
Invited to publish at USENIX ;login: (article)
Covered by IllinoisCS News

Push-Button Reliability Testing for Cloud-Backed Applications with Rainmaker
NSDI 2023
Yinfang Chen, Xudong Sun, Suman Nath, Ze Yang, and Tianyin Xu
PDF  |  BibTex  |  Talk  |  Slides  |  Code
Featured by The Weekend Read

Automatic Reliability Testing for Cluster Management Controllers
OSDI 2022
Xudong Sun, Wenqing Luo, Jiawei Tyler Gu, Aishwarya Ganesan, Ramnatthan Alagappan, Michael Gasch, Lalith Suresh, and Tianyin Xu
PDF  |  BibTex  |  Talk  |  Slides  |  Code
Selected for presentation at KubeCon + CloudNativeCon North America 2021
Invited for presentation at KBE Insider (Episode 14)
Covered by VMware Office of CTO Blog (endorsement from Kit Colbert), Hacker News, Paper Review by Micah Lerner, IllinoisCS News
Invited to publish at USENIX ;login: (article)

Reasoning about modern datacenter infrastructures using partial histories
HotOS 2021
Xudong Sun, Lalith Suresh, Aishwarya Ganesan, Ramnatthan Alagappan, Michael Gasch, Lilia Tang, and Tianyin Xu
PDF  |  BibTex  |  Talk
Covered by "A CAP tradeoff in the wild" by Professor Lindsey Kuper

Testing Configuration Changes in Context to Prevent Production Failures
OSDI 2020
Xudong Sun*, Runxiang Cheng*, Jianyan Chen, Ran Ang, Owolabi Legunsen, and Tianyin Xu (*co-primary)
PDF  |  BibTex  |  Talk  |  Slides  |  Code

Service

Teaching

Mentoring

It's my pleasure and a lot of fun to work with the following talents:

Honors & Awards

How to pronounce my name