Xudong Liao

PhD Candidate, HKUST, Hong Kong SAR, China


I am a Ph.D. candidate at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Kai Chen. Before that, I received my B.Eng. in Software Engineering from Wuhan University (Outstanding Graduate) in 2020.

In my research projects, I focus on:

  • application-aware optimizations for distributed systems. I build systems that push the hardware–software boundary by tailoring architectures and algorithms to each workload’s unique behavior:
    • MixNet (SIGCOMM’25) – a runtime reconfigurable optical–electrical fabric that leverages the dynamic, sparse, and localized traffic patterns of distributed Mixture-of-Experts training to adapt its topology regionally on the fly, enabling scalable and cost-efficient training across thousands of GPUs with near-ideal training speed and significantly reduced networking costs.
    • Herald (NSDI’24) – an embedding-aware scheduler that exploits the predictable and infrequent in-cache embedding access patterns of DLRM training to schedule embedding accesses, eliminating a substantial portion of communication and speeding up training.
    • Pallas (ATC’25) – a rack-scale CPU scheduling system that utilizes switch programmability and request-level predictability to enable efficient in-network workload shaping, driving near-optimal microsecond-level tail latency.
  • designing performant and practical DRL-driven congestion control (CC) algorithms – including MOCC (EuroSys’22), Spine (CoNEXT’22), Astraea (EuroSys’24), and Jury (EuroSys’25). These projects tackle key obstacles such as multi-objective optimization, overhead, fairness and convergence, and performance generalizability, paving the way for real-world deployment of DRL-based transport protocols.

I was fortunate to be advised by Prof. Yanjiao Chen during my time at WHU. Additionally, I am fortunate to collaborate closely with Prof. Guyue Liu from Peking University and Dr. Zhizhen Zhong from MIT on several recent projects.

Research Interests

  • Machine Learning Systems
  • Datacenter Networking
  • Congestion Control
  • Optical Networks

news

Aug 22, 2025 Two co-authored papers, LTP and MFS, accepted to EuroSys 2026!
Jul 12, 2025 MixNet accepted to SIGCOMM 2025!
Apr 25, 2025 Pallas accepted to ATC 2025!
Jan 10, 2024 Astraea accepted to EuroSys 2024!
Dec 07, 2023 Herald accepted to NSDI 2024!

selected publications

* equal contribution

  1. SIGCOMM
    MixNet: A Runtime Reconfigurable Optical-Electrical Fabric for Distributed Mixture-of-Experts Training
    Xudong Liao, Yijun Sun, Han Tian, Xinchen Wan, Yilun Jin, Zilong Wang, Zhenghang Ren, Xinyang Huang, Wenxue Li, Kin Fai Tse, Zhizhen Zhong, Guyue Liu, Ying Zhang, Xiaofeng Ye, Yiming Zhang, and Kai Chen
    In Proceedings of the 2025 ACM SIGCOMM Conference (SIGCOMM 2025), 2025
  2. ATC
    Towards Optimal Rack-scale μs-level CPU Scheduling through In-Network Workload Shaping
    Xudong Liao, Han Tian, Xinchen Wan, Chaoliang Zeng, Hao Wang, Junxue Zhang, Mengyu Ma, Guyue Liu, and Kai Chen
    In 2025 USENIX Annual Technical Conference (ATC 2025), 2025
  3. OSDI
    Enabling Efficient GPU Communication over Multiple NICs with FuseLink
    Zhenghang Ren, Yuxuan Li, Zilong Wang, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Xudong Liao, Yijun Sun, Bowen Liu, Han Tian, Junxue Zhang, Mingfei Wang, Zhizhen Zhong, Guyue Liu, Ying Zhang, and Kai Chen
    In Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2025), 2025
  4. EuroSys
    Achieving Fairness Generalizability for Learning-based Congestion Control with Jury
    Han Tian, Xudong Liao, Decang Sun, Chaoliang Zeng, Yilun Jin, Junxue Zhang, Xinchen Wan, Zilong Wang, Yong Wang, and Kai Chen
    In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025
  5. EuroSys
    Astraea: Towards Fair and Efficient Learning-based Congestion Control
    Xudong Liao*, Han Tian*, Chaoliang Zeng, Xinchen Wan, and Kai Chen
    In Proceedings of the 19th ACM European Conference on Computer Systems (EuroSys 2024), 2024
  6. NSDI
    Accelerating Neural Recommendation Training with Embedding Scheduling
    Chaoliang Zeng*, Xudong Liao*, Xiaodian Cheng, Han Tian, Xinchen Wan, Hao Wang, and Kai Chen
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024
  7. SIGMOD
    Scalable and Efficient Full-Graph GNN Training for Large Graphs
    Xinchen Wan, Kaiqiang Xu, Xudong Liao, Yilun Jin, Kai Chen, and Xin Jin
    In Proceedings of the ACM on Management of Data (SIGMOD 2023), 2023
  8. CoNEXT
    Spine: An Efficient DRL-Based Congestion Control with Ultra-Low Overhead
    Han Tian*, Xudong Liao*, Chaoliang Zeng, Junxue Zhang, and Kai Chen
    In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT 2022), 2022
  9. EuroSys
    Multi-Objective Congestion Control
    Yiqing Ma, Han Tian, Xudong Liao, Junxue Zhang, Weiyan Wang, Kai Chen, and Xin Jin
    In Proceedings of the 17th European Conference on Computer Systems (EuroSys 2022), 2022