DOI: 10.1145/2371536.2371558
research-article

On the design of decentralized control architectures for workload consolidation in large-scale server clusters

Published: 18 September 2012

ABSTRACT

This paper develops a fully decentralized control architecture to address the workload consolidation problem in large-scale server clusters, wherein the cluster's processing capacity is dynamically tuned to satisfy the service level agreements (SLAs) associated with the incoming workload while consolidating that workload onto the fewest servers. In a decentralized setting, this problem is decomposed into simpler subproblems, each of which is mapped to a server and solved by a controller assigned to that server. Though the control loops on different servers run independently of each other, they are implicitly coupled via the shared high-level performance goal, and interactions between controllers may result in undesired system behavior such as SLA violations and frequent switching of cores on and off. Using the proposed architecture as the reference, we analyze how the organization of individual controllers within the control structure affects its overall performance for large clusters of up to a thousand servers. Our studies indicate that the control structure, when organized as a causal system in which a precedence relation exists among the individual controllers, achieves a high degree of SLA satisfaction (> 98%) while significantly reducing the corresponding switching cost.
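
The precedence idea in the abstract can be illustrated with a small sketch. It is not the paper's controller: the class `ServerController`, the function `control_round`, the per-core capacity figure, and the rule "serve whatever the preceding servers left over" are all assumptions made here for illustration. The sketch only shows how a fixed ordering among otherwise independent per-server controllers tends to pack load onto the first servers and keeps later servers from switching cores needlessly.

```python
from dataclasses import dataclass
from math import ceil
from typing import List, Tuple


@dataclass
class ServerController:
    """Hypothetical per-server controller (illustrative, not from the paper)."""
    cores: int             # total cores available on this server
    core_capacity: float   # assumed requests/s one core can serve within the SLA
    active: int = 0        # cores currently switched on

    def step(self, residual_load: float) -> Tuple[float, int]:
        """Cover as much of the residual load as possible.

        Returns (load left for later servers, number of cores switched).
        """
        needed = min(self.cores, ceil(residual_load / self.core_capacity))
        switches = abs(needed - self.active)   # cores turned on or off
        self.active = needed
        served = min(residual_load, needed * self.core_capacity)
        return residual_load - served, switches


def control_round(controllers: List[ServerController],
                  load: float) -> Tuple[float, int]:
    """Run one control period, visiting controllers in precedence order."""
    total_switches = 0
    for controller in controllers:
        load, switches = controller.step(load)
        total_switches += switches
    return load, total_switches   # unserved load, total switching activity


if __name__ == "__main__":
    cluster = [ServerController(cores=4, core_capacity=100.0) for _ in range(3)]
    print(control_round(cluster, 550.0), [c.active for c in cluster])
    print(control_round(cluster, 520.0), [c.active for c in cluster])
```

In this toy setting, a small change in load between two periods leaves the active-core counts unchanged, which is the kind of reduced switching the causal organization is meant to encourage. A real controller would also need workload prediction and an explicit SLA model, which are omitted here.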

        • Published in

          ICAC '12: Proceedings of the 9th international conference on Autonomic computing
          September 2012
          222 pages
          ISBN:9781450315203
          DOI:10.1145/2371536

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

          Publisher

          Association for Computing Machinery

          New York, NY, United States
