Yinwei Dai

I am a third-year Computer Science Ph.D. student at Princeton University, working with Prof. Ravi Netravali. I am affiliated with the Princeton Systems for AI Lab (SAIL).

I obtained my M.S.E. and B.S.E. in Computer Science from the University of Michigan, where I worked with Prof. Mosharaf Chowdhury and Prof. Harsha V. Madhyastha on networked systems, and my B.S.E. in Electrical and Computer Engineering from Shanghai Jiao Tong University.

Email  /  CV  /  Google Scholar  /  X  /  GitHub

Research

My research interests lie at the intersection of networked systems and machine learning. Recently, I have been working on improving the efficiency and scalability of LLM inference systems.

Publications
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
Yinwei Dai*, Rui Pan*, Anand Iyer, Kai Li, Ravi Netravali
SOSP 2024 (Acceptance Rate: 17.34%) / GitHub / Paper / Slides

We present Apparate, the first system that automatically injects and manages Early Exits for serving a wide range of models.

Improving DNN Inference Throughput Using Practical, Per-Input Compute Adaptation
Anand Iyer, Mingyu Guan, Yinwei Dai, Rui Pan, Swapnil Gandhi, Ravi Netravali
SOSP 2024 (Acceptance Rate: 17.34%) / GitHub / Paper

We present E3, which addresses the detrimental trade-off that Early Exits introduce in early-exit DNNs (EE-DNNs) between compute savings (from exits) and resource utilization (from batching).

Auxo: Efficient Federated Learning via Scalable Client Clustering
Jiachen Liu, Fan Lai, Yinwei Dai, Aditya Akella, Harsha Madhyastha, Mosharaf Chowdhury
SoCC 2023 (Acceptance Rate: 31%) / GitHub / Paper

We propose Auxo, a scalable FL system that enables the server to decompose a large-scale FL task into cohorts with smaller intra-cohort heterogeneity.

ModelKeeper: Accelerating DNN Training via Automated Training Warmup
Fan Lai, Yinwei Dai, Harsha Madhyastha, Mosharaf Chowdhury
NSDI 2023 (Acceptance Rate: 18.38%) / GitHub / Paper / Talk

We introduce ModelKeeper, a cluster-scale model service framework that accelerates DNN training by reducing the computation needed to reach the same model performance via automated model transformation.

FedScale: Benchmarking Model and System Performance of Federated Learning
Fan Lai, Yinwei Dai, Sanjay Singapuram, Jiachen Liu, Xiangfeng Zhu, Harsha Madhyastha, Mosharaf Chowdhury
ICML 2022 (Acceptance Rate: 21.94%) / Website / GitHub
Deployed at LinkedIn. Best Paper Award at SOSP ResilientFL 2021.

We present FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research.

Work Experience
Microsoft Research, 2025/05 - 2025/08

Research Intern, Intelligent Networked Systems Group.

Teaching
COS 316: Principles of Computer System Design, Fall 2023

COS 418: Distributed Systems, Winter 2024

EECS 442: Computer Vision, Winter 2022

EECS 489: Computer Networks, Fall 2021

Service
Conference Reviewer: NeurIPS (Datasets and Benchmarks) 2022, 2023, 2024

Journal Reviewer: IEEE Transactions on Mobile Computing, 2022

Artifact Evaluation Committee: SIGCOMM 2022, MLSys 2023

Misc
My name in Chinese: [image]

If you want to chat with me, please send me an email!