skip to main content
10.1145/3230543.3230569acmconferencesArticle/Chapter ViewAbstractPublication PagescommConference Proceedingsconference-collections
research-article
Free Access

Sincronia: near-optimal network design for coflows

Published:07 August 2018Publication History

ABSTRACT

We present Sincronia, a near-optimal network design for coflows that can be implemented on top on any transport layer (for flows) that supports priority scheduling. Sincronia achieves this using a key technical result --- we show that given a "right" ordering of coflows, any per-flow rate allocation mechanism achieves average coflow completion time within 4X of the optimal as long as (co)flows are prioritized with respect to the ordering.

Sincronia uses a simple greedy mechanism to periodically order all unfinished coflows; each host sets priorities for its flows using corresponding coflow order and offloads the flow scheduling and rate allocation to the underlying priority-enabled transport layer. We evaluate Sincronia over a real testbed comprising 16-servers and commodity switches, and using simulations across a variety of workloads. Evaluation results suggest that Sincronia not only admits a practical, near-optimal design but also improves upon state-of-the-art network designs for coflows (sometimes by as much as 8X).

References

  1. 2018. Sincronia Repository. https://github.com/sincronia-coflow.Google ScholarGoogle Scholar
  2. Saksham Agarwal, Shijin Rajakrishnan, Akshay Narayan, Rachit Agarwal, David Shmoys, and Amin Vahdat. 2018. Sincronia: Near-Optimal Network Design for Coflows. In Tech Report.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Saba Ahmadi, Samir Khuller, Manish Purohit, and Sheng Yang. 2017. On scheduling coflows. In MOS IPCO.Google ScholarGoogle Scholar
  4. Mohammad Alizadeh, Shuang Yang, Milad Sharif, Sachin Katti, Nick McKeown, Balaji Prabhakar, and Scott Shenker. 2013. pFabric: Minimal Near-optimal Datacenter Transport. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Nikhil Bansal and Subhash Khot. 2010. Inapproximability of hyper-graph vertex cover and applications to scheduling problems. In EATCS ICALP. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Kwok Ho Chan, Jozef Babiarz, and Fred Baker. 2006. Configuration Guidelines for DiffServ Service Classes. https://tools/ietf.org/html/rfc4594.Google ScholarGoogle Scholar
  7. Mosharaf Chowdhury and Ion Stoica. 2012. Coflow: A networking abstraction for cluster applications. In ACM HotNets. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Mosharaf Chowdhury and Ion Stoica. 2015. Efficient coflow scheduling without prior knowledge. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Mosharaf Chowdhury, Matei Zaharia, Justin Ma, Michael I Jordan, and Ion Stoica. 2011. Managing data transfers in computer clusters with orchestra. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Mosharaf Chowdhury, Yuan Zhong, and Ion Stoica. 2014. Efficient coflow scheduling with varys. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: simplified data processing on large clusters. In USENIX OSDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Fahad R Dogar, Thomas Karagiannis, Hitesh Ballani, and Antony Rowstron. 2014. Decentralized task-aware scheduling for data center networks. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Peter X Gao, Akshay Narayan, Gautam Kumar, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2015. phost: Distributed near-optimal datacenter transport over commodity network fabric. In ACM CoNEXT. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Naveen Garg, Amit Kumar, and Vinayaka Pandit. 2007. Order scheduling models: hardness and algorithms. In IARCS FSTTCS. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Mark Handley, Costin Raiciu, Alexandru Agache, Andrei Voinescu, Andrew Moore, Gianni Antichi, and Marcin Wojcik. 2017. Re-architecting datacenter networks and stacks for low latency and high performance. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Chi-Yao Hong, Matthew Caesar, and P Godfrey. 2012. Finishing flows quickly with preemptive scheduling. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. 2007. Dryad: distributed data-parallel programs from sequential building blocks. In ACM EuroSys. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Hamidreza Jahanjou, Erez Kantor, and Rajmohan Rajaraman. 2017. Asymptotically Optimal Approximation Algorithms for Coflow Scheduling. In ACM SPAA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Samir Khuller, Jingling Li, Pascal Sturmfels, Kevin Sun, and Prayaag Venkat. 2018. Select and Permute: An Improved Online Framework for Scheduling to Minimize Weighted Completion Time. In LATIN.Google ScholarGoogle Scholar
  20. Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M Hellerstein. 2012. Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8): 716--727. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Grzegorz Malewicz, Matthew H Austern, Aart JC Bik, James C Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: a system for large-scale graph processing. In ACM SIGMOD. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Zhen Qiu, Cliff Stein, and Yuan Zhong. 2015. Minimizing the total weighted completion time of coflows in datacenter networks. In ACM SPAA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Thomas A. Roemer. 2006. A note on the complexity of the concurrent open shop problem. In Journal of Scheduling, 9(4): 389--396. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Sushant Sachdeva and Rishi Saket. 2013. Optimal inapproximability for scheduling problems via structural hardness for hypergraph vertex cover. In IEEE CCC.Google ScholarGoogle Scholar
  25. Christo Wilson, Hitesh Ballani, Thomas Karagiannis, and Ant Rowtron. 2011. Better never than late: Meeting deadlines in datacenter networks. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. 2008. DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language.. In USENIX OSDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J Franklin, Scott Shenker, and Ion Stoica. 2010. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In USENIX NSDI. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Hong Zhang, Li Chen, Bairen Yi, Kai Chen, Mosharaf Chowdhury, and Yanhui Geng. 2016. CODA: Toward automatically identifying and scheduling coflows in the dark. In ACM SIGCOMM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Yangming Zhao, Kai Chen, Wei Bai, Minlan Yu, Chen Tian, Yanhui Geng, Yiming Zhang, Dan Li, and Sheng Wang. 2015. Rapier: Integrating routing and scheduling for coflow-aware data center networks. In IEEE INFOCOM.Google ScholarGoogle Scholar

Index Terms

  1. Sincronia: near-optimal network design for coflows

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGCOMM '18: Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication
        August 2018
        604 pages
        ISBN:9781450355674
        DOI:10.1145/3230543

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 7 August 2018

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate554of3,547submissions,16%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader