Research
My research lies at the intersection of systems and machine learning. I build efficient and scalable systems for AI serving and training through algorithm-system co-design, treating algorithms and the systems that run them as a joint design space rather than two fixed layers. My work spans the growing landscape of pervasive AI: from monolithic models to compound agentic systems, and from digital to physical AI.
I am on the job market for full-time industry positions starting in Fall 2026, targeting research scientist roles in AI systems and infrastructure—particularly serving or training systems for LLMs, agents, and physical AI. If you think there's a good fit, please don't hesitate to reach out.
Selected Publications / All
Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows
Yinwei Dai, Zhuofu Chen, Anand Iyer, Ravi Netravali
arXiv, 2025
We present Aragog, a system that progressively adapts request configurations throughout execution for scalable serving of agentic workflows.
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
Yinwei Dai*, Rui Pan*, Anand Iyer, Kai Li, Ravi Netravali
SOSP, 2024 (Acceptance Rate: 17.34%)
GitHub / Paper / Slides
We present Apparate, the first system that automatically injects and manages Early Exits for serving a wide range of models.
ModelKeeper: Accelerating DNN Training via Automated Training Warmup
Fan Lai, Yinwei Dai, Harsha Madhyastha, Mosharaf Chowdhury
NSDI, 2023 (Acceptance Rate: 18.38%)
GitHub / Paper / Talk
We introduce ModelKeeper, a cluster-scale model service framework that accelerates DNN training by using automated model transformation to reduce the computation needed to reach the same model performance.
FedScale: Benchmarking Model and System Performance of Federated Learning
Fan Lai, Yinwei Dai, Sanjay Singapuram, Jiachen Liu, Xiangfeng Zhu, Harsha Madhyastha, Mosharaf Chowdhury
ICML, 2022 (Acceptance Rate: 21.94%)
Website / GitHub
Deployed at LinkedIn
Best Paper Award at SOSP ResilientFL 2021
We present FedScale, a diverse set of challenging and realistic benchmark datasets to facilitate scalable, comprehensive, and reproducible federated learning (FL) research.
Conference Reviewer: NeurIPS (Main and D&B Track) 2022-2025; MLSys 2026
Journal Reviewer: Transactions on Mobile Computing 2022, 2025
Artifact Evaluation Committee: SIGCOMM 2022, MLSys 2023
My name in Chinese:
If you want to chat with me, please send me an email!