Sailor: Automating Distributed Training over Dynamic, Heterogeneous, and Geo-distributed Clusters.
SOSP 2025
Foteini Strati, Zhendong Zhang, George Manos, Ixeia Sánchez Périz, Qinghao Hu, Tiancheng Chen, Berk Buzcu, Song Han, Pamela Delgado, Ana Klimovic
Paper: https://doi.org/10.1145/3731569.3764839
Published on