Xudong Liao

PhD Candidate, HKUST, Hong Kong SAR, China

I am a Ph.D. candidate at the Hong Kong University of Science and Technology (HKUST), advised by Prof. Kai Chen. Before that, I received my B.Eng. in Software Engineering from Wuhan University (Outstanding Graduate) in 2020.

In my research projects, I focus on:

  • developing application-oriented optimizations for distributed systems, such as Herald. These systems improve performance by exploiting application-specific characteristics; Herald, for example, exploits embedding access patterns in DLRM training.
  • building performant congestion control (CC) schemes using reinforcement learning techniques, including Astraea, Spine, and MOCC. This work is driven by my goal of making deep reinforcement learning (DRL)-based CC schemes fair, efficient, and practical for real-world deployment.

I was fortunate to be advised by Prof. Yanjiao Chen during my time at Wuhan University (WHU). I am also fortunate to collaborate closely with Prof. Guyue Liu from Peking University and Dr. Zhizhen Zhong from MIT on several recent projects.

Research Interests

  • Machine Learning Systems
  • Optical Networks
  • Congestion Control
  • Datacenter Networking

news

Jan 10, 2024 Our paper Astraea was accepted to EuroSys 2024!
Dec 07, 2023 Co-first-author paper Herald was accepted to NSDI 2024!
Feb 24, 2023 Co-authored paper G3 was accepted to SIGMOD 2023!
Nov 30, 2022 Co-first-author paper Spine was accepted to CoNEXT 2022!

selected publications

* equal contribution

  1. arXiv
    mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training
    Xudong Liao, Yijun Sun, Han Tian, Xinchen Wan, Yilun Jin, Zilong Wang, Zhenghang Ren, Xinyang Huang, Wenxue Li, Kin Fai Tse, Zhizhen Zhong, Guyue Liu, Ying Zhang, Xiaofeng Ye, Yiming Zhang, and Kai Chen
    arXiv:2501.03905, 2025
  2. INFOCOM
    A Generic and Efficient Communication Framework for Message-level In-Network Computing
    Xinchen Wan, Luyang Li, Han Tian, Xudong Liao, Xinyang Huang, Chaoliang Zeng, Zilong Wang, Xinyu Yang, Ke Cheng, Qingsong Ning, Guyue Liu, Layong Luo, and Kai Chen
    In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2025), 2025
  3. ASPLOS
    Design and Operation of Shared Machine Learning Clusters on Campus
    Kaiqiang Xu, Decang Sun, Hao Wang, Zhenghang Ren, Xinchen Wan, Xudong Liao, Zilong Wang, Junxue Zhang, and Kai Chen
    In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025), 2025
  4. EuroSys
    Achieving Fairness Generalizability for Learning-based Congestion Control with Jury
    Han Tian, Xudong Liao, Decang Sun, Chaoliang Zeng, Yilun Jin, Junxue Zhang, Xinchen Wan, Zilong Wang, Yong Wang, and Kai Chen
    In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025
  5. EuroSys
    Astraea: Towards Fair and Efficient Learning-based Congestion Control
    Xudong Liao*, Han Tian*, Chaoliang Zeng, Xinchen Wan, and Kai Chen
    In Proceedings of the 19th ACM European Conference on Computer Systems (EuroSys 2024), 2024
  6. NSDI
    Accelerating Neural Recommendation Training with Embedding Scheduling
    Chaoliang Zeng*, Xudong Liao*, Xiaodian Cheng, Han Tian, Xinchen Wan, Hao Wang, and Kai Chen
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024
  7. SIGMOD
    Scalable and Efficient Full-Graph GNN Training for Large Graphs
    Xinchen Wan, Kaiqiang Xu, Xudong Liao, Yilun Jin, Kai Chen, and Xin Jin
    In Proceedings of the ACM on Management of Data (SIGMOD 2023), 2023
  8. CoNEXT
    Spine: An Efficient DRL-Based Congestion Control with Ultra-Low Overhead
    Han Tian*, Xudong Liao*, Chaoliang Zeng, Junxue Zhang, and Kai Chen
    In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (CoNEXT 2022), 2022
  9. EuroSys
    Multi-Objective Congestion Control
    Yiqing Ma, Han Tian, Xudong Liao, Junxue Zhang, Weiyan Wang, Kai Chen, and Xin Jin
    In Proceedings of the 17th European Conference on Computer Systems (EuroSys 2022), 2022