ABSTRACT
Designers of content distribution networks often need to determine how changes to infrastructure deployment and configuration, such as deploying a new data center, changing ISP peering, or changing the mapping of clients to servers, affect service response times. Today, designers rely on coarse back-of-the-envelope calculations or costly field deployments; they need better ways to answer such hypothetical "what-if" questions before actual deployment. This paper presents What-If Scenario Evaluator (WISE), a tool that predicts the effects of possible configuration and deployment changes in content distribution networks. WISE makes three contributions: (1) an algorithm that uses traces from existing deployments to learn causality among factors that affect service response-time distributions; (2) an algorithm that uses the learned causal structure to estimate a dataset that is representative of the hypothetical scenario a designer may wish to evaluate, and uses this dataset to predict future response-time distributions; and (3) a scenario specification language that allows a network designer to easily express hypothetical deployment scenarios without being cognizant of the dependencies between variables that affect service response times. Our evaluation, both in a controlled setting and in a real-world field deployment at a large, global CDN, shows that WISE can quickly and accurately predict service response-time distributions for many practical what-if scenarios.