DOI: 10.1145/3341301.3359654
Research Article · Open Access · Artifacts Available · Artifacts Evaluated & Functional

Parity models: erasure-coded resilience for prediction serving systems

Published: 27 October 2019

ABSTRACT

Machine learning models are becoming the primary workhorses for many applications. Services deploy models through prediction serving systems that take in queries and return predictions by performing inference on models. Prediction serving systems are commonly run on many machines in cluster settings, and thus are prone to slowdowns and failures that inflate tail latency. Erasure coding is a popular technique for achieving resource-efficient resilience to data unavailability in storage and communication systems. However, existing approaches for imparting erasure-coded resilience to distributed computation apply only to a severely limited class of functions, precluding their use for many serving workloads, such as neural network inference.
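
For context, here is a minimal sketch (illustrative, not from the paper) of the classic single-parity erasure code underlying this style of resilience: a parity block computed as the elementwise sum of k data blocks lets any one unavailable block be reconstructed from the parity and the remaining k-1 blocks.

```python
import numpy as np

def encode(blocks):
    # Parity block: elementwise sum of the k data blocks.
    return np.sum(blocks, axis=0)

def reconstruct_missing(available_blocks, parity):
    # Recover a single unavailable block by subtracting the
    # remaining blocks from the parity block.
    return parity - np.sum(available_blocks, axis=0)

# Example: k = 3 data blocks; block 1 becomes unavailable.
blocks = [np.random.rand(4) for _ in range(3)]
parity = encode(blocks)
recovered = reconstruct_missing([blocks[0], blocks[2]], parity)
assert np.allclose(recovered, blocks[1])
```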

We introduce parity models, a new approach for enabling erasure-coded resilience in prediction serving systems. A parity model is a neural network trained to transform erasure-coded queries into a form that enables a decoder to reconstruct slow or failed predictions. We implement parity models in ParM, a prediction serving system that makes use of erasure-coded resilience. ParM encodes multiple queries into a "parity query," performs inference over parity queries using parity models, and decodes approximations of unavailable predictions using the outputs of parity models. We showcase the applicability of parity models to image classification, speech recognition, and object localization tasks. Using parity models, ParM reduces the gap between 99.9th-percentile and median latency by up to 3.5x while maintaining the same median latency. These results demonstrate the potential of parity models to open a new avenue for imparting resource-efficient resilience to prediction serving systems.
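
One simple instantiation consistent with this description uses addition as the encoder and subtraction as the decoder: the parity model is trained so that its prediction on the sum of k queries approximates the sum of the deployed model's k predictions. The PyTorch-style sketch below illustrates that flow for k = 2; the model and parity_model networks, input sizes, and training are placeholders for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Placeholder networks (illustrative only). The parity model is assumed to be
# trained so that parity_model(x1 + ... + xk) ≈ model(x1) + ... + model(xk).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
parity_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# Two queries (k = 2), e.g., CIFAR-sized images.
x1 = torch.randn(1, 3, 32, 32)
x2 = torch.randn(1, 3, 32, 32)

# Encode: the parity query is the elementwise sum of the k queries.
parity_query = x1 + x2

# x1, x2, and parity_query are dispatched to separate model instances.
# Suppose the server computing model(x2) is slow or has failed.
pred1 = model(x1)
parity_pred = parity_model(parity_query)

# Decode: approximate the unavailable prediction by subtraction.
pred2_approx = parity_pred - pred1
```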


Published in

SOSP '19: Proceedings of the 27th ACM Symposium on Operating Systems Principles
October 2019, 615 pages
ISBN: 9781450368735
DOI: 10.1145/3341301

Copyright © 2019 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 131 of 716 submissions, 18%
