Rong Chen (陈榕)

Professor at School of Software, Shanghai Jiao Tong University
Member of Institute of Parallel And Distributed Systems
Email: rongchen at sjtu dot edu dot cn (rongchen@sjtu.edu.cn)

Chinese Homepage


About Me

I have been a professor in the School of Software at Shanghai Jiao Tong University and a member of the Institute of Parallel and Distributed Systems (IPADS) since February 2012. I received a Ph.D. degree (2011) from Fudan University, under the guidance of Prof. Binyu Zang. My research interests include operating systems, distributed systems, and hardware-software co-design, with a current focus on building high-performance, scalable, and working systems. My work has been recognized with Best Paper Awards from EuroSys (2024 and 2015), ICPP (2007), and APSys (2017), as well as the Huawei OlympusMons Pioneer Award (2020). I am an ACM Distinguished Member and a CCF Distinguished Member.

I am currently looking for self-motivated students with a good background in systems-related areas. Drop me an email if you are interested in working with me.

Google Scholar | DBLP | CSRankings | Systems Circus | SOSP/OSDI Hall of Fame

Research

  • Operating System for GPU, XPU, and Beyond (2021–)
    • REEF: Microsecond-scale preemption for concurrent GPU-based DNN inferences. OSDI'22
    • UGache: A unified multi-GPU cache for embedding-based deep learning. SOSP'23
    • New! XSched: A general preemptive scheduling framework for diverse XPUs. OSDI'25
    • XDAG: Heterogeneous scheduling using fine-grained, multi-XPU abstraction.
    • POS: Concurrent OS-level GPU checkpointing and restoring. arXiv
    • PICKER: Accurate and fast launch-time validation for GPU kernels. arXiv
  • Modern AI Infrastructure (2020–)
    • GNNLab: Accelerating GNN training through systems-level innovation. EuroSys'21 | EuroSys'22 | VLDB'24
    • DISB: A new DNN inference serving benchmark for emerging AI applications.
    • Sirius: Colocating ML inference and training with fast GPU memory handover. ATC'25
    • Characterizing and Optimizing KVCache Cache at a Large Cloud Provider. ATC'25
    • BlitzServing: Fast and scalable LLM serving over serverless platform. OSDI'25
    • Fast and elastic LLM serving with parameter-centric memory management. arXiv
    • Fast MoE-based LLM serving using proactive caching. arXiv
  • Advanced Datacenter Networks (2020–)
    • librdpma: Characterizing and optimizing remote persistent memory. ATC'21
    • KRCORE: A microsecond-scale RDMA control plane for elastic computing. ATC'22
    • SmartnicKit: Characterizing SmartNICs/DPUs for accelerating distributed systems. OSDI'23
    • FissLock: Fast and scalable in-network lock management using lock fission. OSDI'24
    • Characterizing network requirements for GPU API remoting in AI applications. arXiv | github
  • Emerging Distributed Applications (2020–)
    • Vegito/GART: Retrofitting high-availability mechanism for hybrid transactional/analytical processing (HTAP). OSDI'21 | ATC'23
    • MITOSIS: Advancing serverless computing (FaaS) with kernel-level capabilities. OSDI'23 | EuroSys'24 | ACM TOCS | NSDI'25
    • Transactional index on disaggregated memory (DM) using repairable transaction. arXiv
    • An efficient, scalable, and fair locking mechanism for disaggregated memory (DM). arXiv
    • An efficient, scalable, and coherent caching framework for disaggregated memory (DM). arXiv
  • Wukong: Fast and Concurrent Querying over Big Graphs (2015–2021) ACM SIGOPS OSR
    • Fast and concurrent RDF queries with RDMA-based CPU/GPU graph exploration. OSDI'16 | ATC'18 | IEEE TPDS
    • Sub-millisecond stateful stream querying over fast-evolving linked data. SOSP'17
    • Type-centric optimizer for query processing on graph. APSys'18 | SOCC'21
    • Split live migration for traversal workloads on graph databases. ATC'19 | IEEE TPDS
  • Operating System Scalability, Reliability, and Availability (2004–2012)

Professional Services

  • Program Committee Members
  • Conference Organizers
    • ChinaSys: 2018 (PC Co-chair)
    • SOSP: 2021 (AMA Co-organizer)
    • ICPADS: 2021 (Track Co-chair for High-Performance Computing & Architecture)

Awards and Honors

More details

  • ACM Distinguished Member, 2024. Link
  • Best Paper Award of EuroSys, 2024. Link
  • Second Prize of the State Technological Invention Award, 2023. News
  • Honorable Mention of Gilles Muller Best Artifact Award of EuroSys, 2022. Link
  • Honorable Mention of Dennis M. Ritchie Doctoral Dissertation Award (Co-advisor), 2021. Link
  • CCF Distinguished Member, 2021. Link
  • Young Scholar of Chang Jiang Scholars Program by Ministry of Education, 2021.
  • Huawei OlympusMons Pioneer Award, 2020. Link
  • First Prize of Shanghai Science and Technology Award for Technical Invention, 2019.
  • First Prize of Technology Invention Award by Ministry of Education, 2018.
  • ACM ChinaSys Rising Star, 2018. Link
  • Candlelight Award of SJTU, 2017.
  • Best Paper Award of APSys, 2017. Link
  • Excellent Bachelor Thesis (Top 1%) of SJTU (Supervisor), 2017. Link
  • New Teaching Star of SJTU, 2016.
  • Best Paper Award of EuroSys, 2015. Link
  • Excellent Bachelor Thesis (Top 1%) of SJTU (Supervisor), 2014. Link
  • Best Paper Award of ICPP, 2007. Message

Visiting Experiences

  • 2014.9 - 2015.3 Visiting Professor, National University of Singapore
  • 2008.1 - 2008.6 Research Intern, Microsoft Research Asia (Advisor: Prof. Frans Kaashoek)

Selected Publications

* denotes the corresponding author

2025

  • [OSDI] XSched: Preemptive Scheduling for Diverse XPUs. Weihang Shen, Mingcong Han, Jialong Liu, Rong Chen*, and Haibo Chen. The 19th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, US, July 2025.
    paper | AE | github
    Awarded with USENIX Badges: Artifacts Available, Artifacts Functional, Results Reproduced. Link
  • [OSDI] BlitzScale: Fast and Live Large Model Autoscaling with O(1) Host Caching. Dingyan Zhang, Haotian Wang, Yang Liu, Xingda Wei, Yizhou Shan, Rong Chen, and Haibo Chen. The 19th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, US, July 2025.
    paper | AE | github
    Awarded with USENIX Badges: Artifacts Available. Link
  • [USENIX ATC] Colocating ML Inference and Training with Fast GPU Memory Handover. Jiali Wang, Yankui Wang, Mingcong Han, and Rong Chen*. 2025 USENIX Annual Technical Conference, Boston, MA, US, July 2025.
    paper | AE | github
    Awarded with USENIX Badges: Artifacts Available, Artifacts Functional, Results Reproduced.
  • [USENIX ATC] KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider. Jiahao Wang, Jinbo Han, Xingda Wei, Sijie Shen, Dingyan Zhang, Chenguang Fang, Rong Chen, Wenyuan Yu, and Haibo Chen. 2025 USENIX Annual Technical Conference, Boston, MA, US, July 2025.
    paper | traces
  • [NSDI] ODRP: On-Demand Remote Paging with Programmable RDMA. Zixuan Wang, Xingda Wei, Jinyu Gu, Hongrui Xie, Rong Chen, and Haibo Chen. The 22nd USENIX Symposium on Networked Systems Design and Implementation, Philadelphia, PA, US, April 2025.
    paper | usenix.org | slides | talk

2024

  • [OSDI] Fast and Scalable In-network Lock Management Using Lock Fission. Hanze Zhang, Ke Cheng, Rong Chen*, and Haibo Chen. The 18th USENIX Symposium on Operating Systems Design and Implementation, Santa Clara, CA, US, July 2024.
    paper | usenix.org | slides | talk | github
  • [EuroSys] Serialization/Deserialization-free State Transfer in Serverless Workflows. Fangming Lu, Xingda Wei, Zhuobin Huang, Rong Chen, Mingyu Wu, and Haibo Chen. The 19th ACM SIGOPS European Conference on Computer Systems, Athens, Greece, April 2024.
    paper | ACM DL | AE | github
    Best Paper Award | certificate | Link
    Awarded with ACM Badges v1.1: Artifact Available. Link
    Full version paper: Towards Serialization/Deserialization-free State Transfer in Serverless Workflows. ACM TOCS

2023

  • [SOSP] UGACHE: A Unified GPU Cache for Embedding-based Deep Learning Systems. Xiaoniu Song, Yiwen Zhang, Rong Chen*, and Haibo Chen. The 29th ACM Symposium on Operating Systems Principles, Koblenz, Germany, October 2023.
    paper | ACM DL | AE | github
    Awarded with ACM Badges v1.1: Artifact Available, Artifact Functional, Results Reproduced. link
  • [OSDI] No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing. Xingda Wei, Fangming Lu, Tianxia Wang, Jinyu Gu, Yuhan Yang, Rong Chen*, and Haibo Chen. The 17th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, US, July 2023.
    paper | usenix.org | slides | talk | MitosisOS
  • [OSDI] Characterizing Off-path SmartNIC for Accelerating Distributed Systems. Xingda Wei, Rongxin Cheng, Yuhan Yang, Rong Chen*, and Haibo Chen. The 17th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, US, July 2023.
    paper | usenix.org | slides | talk | Smartbench
  • [OSDI] Automated Verification of Idempotence for Stateful Serverless Applications. Haoran Ding, Zhaoguo Wang, Zhuohao Shen, Rong Chen, and Haibo Chen. The 17th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, US, July 2023.
    paper | usenix.org | slides | talk
  • [USENIX ATC] Bridging the Gap between Relational OLTP and Graph-based OLAP. Sijie Shen, Zihang Yao, Lin Shi, Lei Wang, Longbin Lai, Qian Tao, Li Su, Rong Chen*, Wenyuan Yu, Haibo Chen, Binyu Zang, and Jingren Zhou. 2023 USENIX Annual Technical Conference, Boston, MA, US, July 2023.
    paper | usenix.org | slides | talk | github
    Awarded with USENIX Badges: Artifacts Available. Link

2022

  • [OSDI] Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences. Mingcong Han, Hanze Zhang, Rong Chen*, and Haibo Chen. The 16th USENIX Symposium on Operating Systems Design and Implementation, Carlsbad, CA, US, July 2022.
    paper | usenix.org | slides | talk | AE | github | DISB
    Awarded with USENIX Badges: Artifacts Available, Artifacts Functional, Results Reproduced. link
  • [USENIX ATC] KRCORE: A Microsecond-scale RDMA Control Plane for Elastic Computing. Xingda Wei, Fangming Lu, Rong Chen*, and Haibo Chen. 2022 USENIX Annual Technical Conference, Carlsbad, CA, US, July 2022.
    paper | usenix.org | slides | talk | AE | github
    Awarded with USENIX Badges: Artifacts Available, Artifacts Functional, Results Reproduced. link
  • [EuroSys] GNNLab: A Factored System for Sample-based GNN Training over GPUs. Jianbang Yang, Dahai Tang, Xiaoniu Song, Lei Wang, Qiang Yin, Rong Chen*, Wenyuan Yu, and Jingren Zhou. The 17th ACM SIGOPS European Conference on Computer Systems, Rennes, France, April 2022.
    paper | ACM DL | AE | github
    Best Artifact Award (Honorable Mention) | certificate | link
    Awarded with ACM Badges v1.1: Artifact Available, Artifact Functional, Results Reproduced. link

2021

  • [OSDI] Retrofitting High Availability Mechanism to Tame Hybrid Transaction/Analytical Processing. Sijie Shen, Rong Chen*, Haibo Chen, and Binyu Zang. The 15th USENIX Symposium on Operating Systems Design and Implementation, Santa Clara, CA, US, July 2021.
    paper | usenix.org | slides | talk | github
    Awarded with USENIX Badge: Artifacts Available. link
  • [USENIX ATC] Characterizing and Optimizing Remote Persistent Memory with RDMA and NVM. Xingda Wei, Xiating Xie, Rong Chen*, Haibo Chen, and Binyu Zang. 2021 USENIX Annual Technical Conference, July 2021.
    paper | usenix.org | slides | talk | github
  • [EuroSys] FlexGraph: A Flexible and Efficient Distributed Framework for GNN Training. Lei Wang, Qiang Yin, Chao Tian, Jianbang Yang, Rong Chen*, Wenyuan Yu, Zihang Yao, and Jingren Zhou. The 16th ACM SIGOPS European Conference on Computer Systems, Edinburgh, Scotland, UK, April 2021.
    paper | ACM DL | slides | long talk | short talk
  • [NSDI] Unifying Timestamp with Transaction Ordering for MVCC with Decentralized Scalar Timestamp. Xingda Wei, Rong Chen*, Haibo Chen, Zhaoguo Wang, Zhenhan Gong, and Binyu Zang. The 18th USENIX Symposium on Networked Systems Design and Implementation, Boston, MA, US, April 2021.
    paper | usenix.org | slides | talk | github

2020

  • [OSDI] Fast RDMA-based Ordered Key-Value Store using Remote Learned Cache. Xingda Wei, Rong Chen*, and Haibo Chen. The 14th USENIX Symposium on Operating Systems Design and Implementation, Banff, Alberta, Canada, November 2020.
    paper | ACM DL | usenix.org | slides | talk | github
    Awarded with USENIX Badges: Artifacts Available, Artifacts Functional, Results Reproduced. link
    Full version paper: XStore: Fast RDMA-Based Ordered Key-Value Store Using Remote Learned Cache. ACM TOS (selected for fast-track publication for OSDI 2020)

2019

  • [USENIX ATC] Pragh: Locality-preserving Graph Traversal with Split Live Migration. Xiating Xie, Xingda Wei, Rong Chen*, and Haibo Chen. 2019 USENIX Annual Technical Conference, Renton, WA, US, July 2019.
    paper | usenix.org | lightning slides | slides | talk
    Full version paper: Locality-preserving Graph Traversal with Split Live Migration. IEEE TPDS

2018

  • [OSDI] Deconstructing RDMA-enabled Transaction Processing: Hybrid is Better! Xingda Wei, Zhiyuan Dong, Rong Chen*, and Haibo Chen. The 13th USENIX Symposium on Operating Systems Design and Implementation, Carlsbad, CA, US, October 2018.
    paper | usenix.org | slides | talk | github
  • [USENIX ATC] Fast and Concurrent RDF Queries using RDMA-assisted GPU Graph Exploration. Siyuan Wang, Chang Lou, Rong Chen*, and Haibo Chen. 2018 USENIX Annual Technical Conference, Boston, MA, US, July 2018.
    paper | usenix.org | slides | talk | github
    Full version paper: Wukong+G: Fast and Concurrent RDF Query Processing Using RDMA-Assisted GPU Graph Exploration. IEEE TPDS

2017

  • [SOSP] Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data. Yunhao Zhang, Rong Chen*, and Haibo Chen. The 26th ACM Symposium on Operating Systems Principles, Shanghai, China, October 2017.
    updated paper | poster | slides | github | ACM DL
  • [USENIX ATC] Replication-driven Live Reconfiguration for Fast Distributed Transaction Processing. Xingda Wei, Sijie Shen, Rong Chen*, and Haibo Chen. 2017 USENIX Annual Technical Conference, Santa Clara, CA, US, July 2017.
    paper | usenix.org | slides | talk
    Full version paper: DrTM+B: Replication-Driven Live Reconfiguration for Fast and General Distributed Transaction Processing. IEEE TPDS

2016

  • [OSDI] Fast and Concurrent RDF Queries with RDMA-based Distributed Graph Exploration. Jiaxin Shi, Youyang Yao, Rong Chen*, Haibo Chen, and Feifei Li. The 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, US, November 2016.
    paper | ACM DL | usenix.org | slides | poster | talk | homepage | github
    Abridged version paper of Wukong project: Wukong: A Distributed Framework for Fast and Concurrent Graph Querying. ACM SIGOPS OSR
  • [EuroSys] Fast and General Distributed Transactions Using RDMA and HTM. Yanzhe Chen, Xingda Wei, Jiaxin Shi, Rong Chen*, and Haibo Chen. The 11th ACM European Conference on Computer Systems, London, UK, April 2016.
    paper | ACM DL | slides

2015

  • [SOSP] Fast In-memory Transaction Processing using RDMA and HTM. Xingda Wei, Jiaxin Shi, Yanzhe Chen, Rong Chen*, and Haibo Chen. The 25th ACM Symposium on Operating Systems Principles, Monterey, CA, US, October 2015.
    paper | slides | poster | ACM DL | github | Featured on "The Morning Paper"
    Full version paper: Fast In-Memory Transaction Processing Using RDMA and HTM. ACM TOCS
  • [EuroSys] PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs. Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. The 10th ACM SIGOPS European Conference on Computer Systems, Bordeaux, France, April 2015.
    paper | slides | poster | ACM DL | homepage | github
    Best Paper Award | certificate | link
    Full version paper: PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs. ACM TOPC

2011

  • [EuroSys] A Case for Scaling Applications to Many-core with OS Clustering. Xiang Song, Haibo Chen, Rong Chen, Yuanxuan Wang, and Binyu Zang. The 6th ACM SIGOPS European Conference on Computer Systems, Salzburg, Austria, April 2011.
    paper | ACM DL

2010

  • [PACT] Tiled MapReduce: Optimizing Resource Usages of Data-parallel Applications on Multicore with Tiling. Rong Chen, Haibo Chen, and Binyu Zang. The 19th International Conference on Parallel Architectures and Compilation Techniques, Vienna, Austria, September 2010.
    pdf | slides | ACM DL
    Full version paper: Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling. ACM TACO

2008

  • [OSDI] Corey: An Operating System for Many Cores. Silas Boyd-Wickizer, Haibo Chen, Rong Chen, Yandong Mao, Frans Kaashoek, Robert Morris, Aleksey Pesterev, Lex Stein, Ming Wu, Yuehua Dai, Yang Zhang, and Zheng Zhang. The 8th USENIX Symposium on Operating Systems Design and Implementation, San Diego, CA, US, December 2008.
    paper | ACM DL | homepage | code

Talks

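  • [Chinese] Research on Key Technologies for Scheduling Heterogeneous Computing Hardware, 2025. talk
  • [Chinese] Scheduling and Management of Computing Hardware for Intelligent Applications, 2024. talk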
  • Towards “Intelligence, Storage, Network”: Characterizing, Optimizing, and Outlooking, 2023. talk
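  • [Chinese] Scheduling Challenges for New Computing Hardware Architectures: An Operating System Perspective, 2023. talk
  • [Chinese] System Software Research Driven by New Hardware: Performance, Functionality, and Intelligence, 2022. talk
  • [Chinese] System Software Support and Optimization Techniques for Heterogeneous Hardware Architectures, 2022. talk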
  • Evaluation-centric Research: Experiences from A Decade of Systems Research, 2022. talk
  • GNNLab: A Factored System for Sample-based GNN Training over GPUs, 2022. talk
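  • [Chinese] Intelligent Caching for RDMA-based Key-Value Stores, 2021. talk
  • [Chinese] Course Development and Teaching Practice for "Distributed Systems", 2021. talk
  • [Chinese] Building Cost-effective Distributed Systems through Function Reuse: The Case of Distributed Data Replication, 2020. talk
  • [Chinese] A Graph Computing System Stack Based on New Hardware Architectures, 2020. talk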
  • Building In-memory Graph Store for Fast and Concurrent Querying, 2018. talk
  • Fast and Concurrent RDF Queries using RDMA-assisted GPU Graph Exploration, 2017. talk
  • Building Efficient In-memory Computing Systems by Combining Hardware Features, 2017. talk

Teaching

Students

Alumni

PhD

Master

  • 2025: Ke Cheng (CFFEX), Guanwen Peng (Kunlun Core), Yiwen Zhang (Nvidia)
  • 2024: Baorong Ding (Jump Trading), Fangming Lu (Apple), Lin Shi (AliCloud), Tianxia Wang (Kunlun Core)
  • 2023: Yushen Xu (JQ Investments), Zihang Yao (Optiver)
  • 2022: Jinqi Wu (ByteDance), Jianbang Yang (Baidu)
  • 2021: Zhenhan Gong (Huawei), Xuehan Ke (Tencent), Yilun Wu (ABC), Xiating Xie (Baidu), Wenhao Zhang (Microsoft)
  • 2020: Ning Wang (Huawei), Siyuan Wang (Huawei), Yaozeng Zeng (Paypal)
  • 2019: Youyang Yao (Alibaba), Xiaoli Zhou (Alibaba)
  • 2018: Xiang Xue (Alibaba)
  • 2017: Yanzhe Chen (Google), Jiaxin Shi (Baidu), Xingda Wei (Ph.D @IPADS/SJTU)
  • 2015: Peng Wang (Alibaba), Chenning Xie (Google)

Bachelor

  • Ke Zhong, Ph.D Student at University of Pennsylvania (2019)
  • Chang Lou (visiting), Ph.D Student at Johns Hopkins University (2018)
  • Yunhao Zhang, Ph.D Student at Cornell University (2017)
  • Kaiyuan Zhang, Ph.D Student at University of Washington (2015)
pub/members/rong_chen.txt · Last modified: 2025/06/13 01:56 by realstolz