Xudong Liao

PhD Candidate, HKUST, Hong Kong SAR, China


I am a Ph.D. candidate at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Kai Chen. Before that, I received my B.Eng. in Software Engineering from Wuhan University (Outstanding Graduate) in 2020.

In my research projects, I focus on:

  • developing application-oriented optimizations for distributed systems, including Herald. These systems improve performance by exploiting application-specific characteristics; Herald, for example, exploits embedding access patterns in DLRM training.
  • building performant congestion control (CC) schemes using reinforcement learning techniques, including Astraea, Spine, MOCC, and Jury. These projects are driven by my goal of making Deep Reinforcement Learning (DRL)-based CC schemes fair, efficient, and practical for real-world deployment.

I was fortunate to be advised by Prof. Yanjiao Chen during my time at Wuhan University (WHU). I am also fortunate to collaborate closely with Prof. Guyue Liu from Peking University and Dr. Zhizhen Zhong from MIT on several recent projects.

Research Interests

  • Machine Learning Systems
  • Optical Networks
  • Congestion Control
  • Datacenter Networking

news

Jan 10, 2024 Our paper Astraea was accepted to EuroSys 2024!
Dec 07, 2023 Co-first-authored paper Herald was accepted to NSDI 2024!
Feb 24, 2023 Co-authored paper G3 was accepted to SIGMOD 2023!
Nov 30, 2022 Co-first-authored paper Spine was accepted to CoNEXT 2022!

selected publications

* equal contribution

  1. arXiv
    mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training
    Xudong Liao, Yijun Sun, Han Tian, Xinchen Wan, Yilun Jin, Zilong Wang, Zhenghang Ren, Xinyang Huang, Wenxue Li, Kin Fai Tse, Zhizhen Zhong, Guyue Liu, Ying Zhang, Xiaofeng Ye, Yiming Zhang, and Kai Chen
    arXiv:2501.03905, 2025
  2. OSDI
    Enabling Efficient GPU Communication over Multiple NICs with FuseLink
    Zhenghang Ren, Yuxuan Li, Zilong Wang, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Xudong Liao, Yijun Sun, Bowen Liu, Han Tian, Junxue Zhang, Mingfei Wang, Zhizhen Zhong, Guyue Liu, Ying Zhang, and Kai Chen
    In Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2025), 2025
  3. INFOCOM
    A Generic and Efficient Communication Framework for Message-level In-Network Computing
    Xinchen Wan, Luyang Li, Han Tian, Xudong Liao, Xinyang Huang, Chaoliang Zeng, Zilong Wang, Xinyu Yang, Ke Cheng, Qingsong Ning, Guyue Liu, Layong Luo, and Kai Chen
    In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2025), 2025
  4. EuroSys
    Achieving Fairness Generalizability for Learning-based Congestion Control with Jury
    Han Tian, Xudong Liao, Decang Sun, Chaoliang Zeng, Yilun Jin, Junxue Zhang, Xinchen Wan, Zilong Wang, Yong Wang, and Kai Chen
    In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025
  5. EuroSys
    Astraea: Towards Fair and Efficient Learning-based Congestion Control
    Xudong Liao*, Han Tian*, Chaoliang Zeng, Xinchen Wan, and Kai Chen
    In Proceedings of the 19th ACM European Conference on Computer Systems (EuroSys 2024), 2024
  6. NSDI
    Accelerating Neural Recommendation Training with Embedding Scheduling
    Chaoliang Zeng*, Xudong Liao*, Xiaodian Cheng, Han Tian, Xinchen Wan, Hao Wang, and Kai Chen
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024
  7. SIGMOD
    Scalable and Efficient Full-Graph GNN Training for Large Graphs
    Xinchen Wan, Kaiqiang Xu, Xudong Liao, Yilun Jin, Kai Chen, and Xin Jin
    In Proceedings of the ACM on Management of Data (SIGMOD 2023), 2023
  8. CoNEXT
    Spine: An Efficient DRL-Based Congestion Control with Ultra-Low Overhead
    Han Tian*, Xudong Liao*, Chaoliang Zeng, Junxue Zhang, and Kai Chen
    In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT 2022), 2022
  9. EuroSys
    Multi-Objective Congestion Control
    Yiqing Ma, Han Tian, Xudong Liao, Junxue Zhang, Weiyan Wang, Kai Chen, and Xin Jin
    In Proceedings of the 17th European Conference on Computer Systems (EuroSys 2022), 2022