Xudong Liao
Founding Engineer at Netpreme | Ph.D., HKUST
I am a founding engineer at Netpreme, where we build next-generation computer systems to break the memory wall for AI.
I received my Ph.D. from The Hong Kong University of Science and Technology (HKUST), where I was advised by Prof. Kai Chen. Before that, I earned my B.Eng. in Software Engineering from Wuhan University in 2020, graduating as an Outstanding Graduate.
My research focuses on building high-performance systems that bridge applications, algorithms, and hardware. In particular, I work on:
- application-aware optimization for distributed systems. I design systems that push the hardware-software boundary by tailoring architectures and algorithms to workload behavior:
  - MixNet (SIGCOMM’25) is a runtime-reconfigurable optical-electrical fabric for distributed Mixture-of-Experts training. It exploits dynamic, sparse, and localized traffic patterns to adapt topology on the fly, enabling scalable and cost-efficient training across thousands of GPUs while maintaining near-ideal training speed.
  - Herald (NSDI’24) is an embedding-aware scheduler for DLRM training. It leverages predictable and infrequent in-cache embedding access patterns to eliminate a substantial portion of communication overhead and accelerate training.
  - Pallas (ATC’25) is a rack-scale CPU scheduling system that combines switch programmability with request-level predictability to enable efficient in-network workload shaping and achieve near-optimal microsecond-level tail latency.
- practical, high-performance learning-based congestion control. This line of work includes MOCC (EuroSys’22), Spine (CoNEXT’22), Astraea (EuroSys’24), Jury (EuroSys’25), Learn-to-Probe (EuroSys’26), and PolicyCache (NSDI’26). Across these projects, we address challenges such as multi-objective optimization, runtime overhead, fairness, convergence, signal distinguishability, and performance generalization, with the goal of making learning-driven transport practical in real deployments.
During my time at Wuhan University (WHU), I was fortunate to be advised by Prof. Yanjiao Chen. I have also had the opportunity to collaborate closely with Prof. Guyue Liu from Peking University and Dr. Zhizhen Zhong from MIT on several recent projects.
Research Interests
- Machine Learning Systems
- Datacenter Networking
- Congestion Control
- Optical Networking
news
| Date | News |
|---|---|
| Sep 25, 2025 | Passed PhD thesis defense! |
| Aug 22, 2025 | Two co-authored papers LTP and MFS accepted to EuroSys 2026! |
| Jul 12, 2025 | MixNet accepted to SIGCOMM 2025! |
| Apr 25, 2025 | Pallas accepted to ATC 2025! |
| Jan 10, 2024 | Astraea accepted to EuroSys 2024! |
selected publications
* equal contribution
- ATC: Towards Optimal Rack-scale μs-level CPU Scheduling through In-Network Workload Shaping. In 2025 USENIX Annual Technical Conference (ATC 2025), 2025
- OSDI: Enabling Efficient GPU Communication over Multiple NICs with FuseLink. In Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2025), 2025
- EuroSys: Achieving Fairness Generalizability for Learning-based Congestion Control with Jury. In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025
- NSDI: PolicyCache: Intra-flow Learning in Congestion Control. In Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 2026), 2026
- EuroSys: Learn-to-Probe: Achieving Signal Distinguishability in Learning-based Congestion Control. In Proceedings of the 21st ACM European Conference on Computer Systems (EuroSys 2026), 2026