
Do you know your IQ?: a research agenda for information quality in systems

Published: 21 January 2010

Abstract

Information quality (IQ) is a measure of how fit information is for a purpose. Sometimes called Quality of Information (QoI) by analogy with Quality of Service (QoS), it quantifies whether the correct information is being used to make a decision or take an action. Not understanding when information is of adequate quality can lead to bad decisions and catastrophic effects, including system outages, increased costs, lost revenue -- and worse. Quantifying information quality can help improve decision making, but the ultimate goal should be to select or construct information producers that have the appropriate balance between information quality and the cost of providing it. In this paper, we provide a brief introduction to the field, argue the case for applying information quality metrics in the systems domain, and propose a research agenda to explore this space.
