Research
My research interests are at the intersection of networked systems and
machine learning. Recently, I have been working on improving the efficiency
and scalability of systems for LLM inference.
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML
Serving
Yinwei Dai*,
Rui Pan*,
Anand Iyer,
Kai Li,
Ravi Netravali
SOSP, 2024
Acceptance Rate: 17.34% / Github / Paper / Slides
We present Apparate, the first system that automatically injects and manages
Early Exits for serving a wide range of models.
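As background (this is a generic illustration, not Apparate's actual mechanism), an early exit attaches a lightweight classifier head to an intermediate layer and returns that head's prediction as soon as its confidence clears a threshold, skipping the remaining layers. A minimal sketch in Python, where `layers` and `exit_heads` are hypothetical stand-ins for real model components:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_infer(x, layers, exit_heads, threshold=0.9):
    """Run layers in order; after each, ask its exit head for a prediction
    and stop as soon as the head's confidence reaches the threshold."""
    h = x
    for i, (layer, head) in enumerate(zip(layers, exit_heads)):
        h = layer(h)
        probs = softmax(head(h))
        if probs.max() >= threshold:
            return probs.argmax(), i          # exited early after layer i
    return probs.argmax(), len(layers) - 1    # fell through to the final layer

# Toy demo: identity layers, with heads that grow sharper (more confident)
# deeper in the network, so the input exits before the last layer.
layers = [lambda h: h] * 3
heads = [lambda h, s=s: h * s for s in (1.0, 5.0, 50.0)]
label, exit_layer = early_exit_infer(np.array([0.1, 0.9]), layers, heads)
```

The latency win comes from inputs that exit early; the serving-side complication (which Apparate targets) is that requests exiting at different layers disrupt uniform batching.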
Improving DNN Inference Throughput Using Practical, Per-Input Compute Adaptation
Anand Iyer,
Mingyu Guan,
Yinwei Dai,
Rui Pan,
Swapnil Gandhi,
Ravi Netravali
SOSP, 2024
Acceptance Rate: 17.34% / Github / Paper
We present E3, which addresses the detrimental trade-off that early exits
introduce in EE-DNNs between compute savings (from exiting early) and
resource utilization (from batching).
Auxo: Efficient Federated Learning via Scalable Client Clustering
Jiachen Liu,
Fan Lai,
Yinwei Dai,
Aditya Akella,
Harsha Madhyastha,
Mosharaf Chowdhury
SoCC, 2023
Acceptance Rate: 31% / Github / Paper
We propose Auxo, a scalable FL system that enables the server to decompose a
large-scale FL task into cohorts with smaller intra-cohort heterogeneity.
ModelKeeper: Accelerating DNN Training via Automated Training Warmup
Fan Lai,
Yinwei Dai,
Harsha Madhyastha,
Mosharaf Chowdhury
NSDI, 2023
Acceptance Rate: 18.38% / Github / Paper / Talk
We introduce ModelKeeper, a cluster-scale model service framework that
accelerates DNN training by using automated model transformation to reduce
the computation needed to reach the same model performance.
FedScale: Benchmarking Model and System Performance of Federated Learning
Fan Lai,
Yinwei Dai,
Sanjay Singapuram,
Jiachen Liu,
Xiangfeng Zhu,
Harsha Madhyastha,
Mosharaf Chowdhury
ICML, 2022
Acceptance Rate: 21.94% / Website / Github
Deployed at LinkedIn
Best Paper Award at SOSP ResilientFL 2021
We present FedScale, a diverse set of challenging and realistic benchmark datasets
to facilitate scalable, comprehensive, and reproducible federated learning (FL)
research.
Service
Conference Reviewer: NeurIPS (Datasets and Benchmarks) 2022, 2023, 2024
Journal Reviewer: Transactions on Mobile Computing 2022
Artifact Evaluation Committee: SIGCOMM 2022, MLSys 2023
My name in Chinese:
If you want to chat with me, please send me an email!