Yinwei Dai

I am a fourth-year Computer Science Ph.D. student at Princeton University, working with Prof. Ravi Netravali. I am affiliated with the Princeton Systems for AI Lab (SAIL).

I obtained my M.S.E. and B.S.E. in Computer Science from the University of Michigan, where I worked with Prof. Mosharaf Chowdhury and Prof. Harsha V. Madhyastha on projects related to networked systems, and my B.S.E. in Electrical and Computer Engineering from Shanghai Jiao Tong University.

Research

My research lies at the intersection of systems and machine learning. I build efficient and scalable systems for serving/training AI through algorithm–system co-design—treating algorithms and the systems that run them as a joint design space, rather than two fixed layers. My work spans the growing landscape of pervasive AI: from monolithic models to compound agentic systems, and from digital to physical AI.

I am on the job market for full-time industry positions starting in Fall 2026, targeting research scientist roles in AI systems and infrastructure—particularly serving or training systems for LLMs, agents, and physical AI. If you think there's a good fit, please don't hesitate to reach out.

Selected Publications / All
Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows
Yinwei Dai, Zhuofu Chen, Anand Iyer, Ravi Netravali
arXiv, 2025

We present Aragog, a system that progressively adapts request configurations throughout execution for scalable serving of agentic workflows.

Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
Yinwei Dai*, Rui Pan*, Anand Iyer, Kai Li, Ravi Netravali
SOSP, 2024 (Acceptance Rate: 17.34%) / GitHub / Paper / Slides

We present Apparate, the first system that automatically injects and manages Early Exits for serving a wide range of models.

ModelKeeper: Accelerating DNN Training via Automated Training Warmup
Fan Lai, Yinwei Dai, Harsha Madhyastha, Mosharaf Chowdhury
NSDI, 2023 (Acceptance Rate: 18.38%) / GitHub / Paper / Talk

We introduce ModelKeeper, a cluster-scale model service framework that accelerates DNN training by using automated model transformation to reduce the computation needed to reach the same model performance.

FedScale: Benchmarking Model and System Performance of Federated Learning
Fan Lai, Yinwei Dai, Sanjay Singapuram, Jiachen Liu, Xiangfeng Zhu, Harsha Madhyastha, Mosharaf Chowdhury
ICML, 2022 (Acceptance Rate: 21.94%) / Website / GitHub
Deployed at LinkedIn / Best Paper Award at SOSP ResilientFL 2021

We present FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research.

Work Experience
Meta, 2026/05 - 2026/08

Research Intern, AI and Systems Co-Design Team.
Training Infra for LLMs

Microsoft Research, 2025/05 - 2025/08

Research Intern, Intelligent Networked Systems Group.
Serving Infra for Physical AI

Teaching
COS 316: Principles of Computer System Design, Fall 2023

COS 418: Distributed Systems, Winter 2024

EECS 442: Computer Vision, Winter 2022

EECS 489: Computer Networks, Fall 2021
Service
Conference Reviewer: NeurIPS (Main and D&B Tracks) 2022-2025; MLSys 2026

Journal Reviewer: Transactions on Mobile Computing 2022, 2025

Artifact Evaluation Committee: SIGCOMM 2022, MLSys 2023

Misc
My name in Chinese: (shown as an image)

If you want to chat with me, please send me an email!