DOI: 10.1145/2908131.2908198
Extended Abstract

Towards web documents quality assessment for digital humanities scholars

Published: 22 May 2016

ABSTRACT

We present a framework for assessing the quality of Web documents, along with a baseline of three quality dimensions: trustworthiness, objectivity, and basic scholarly quality. Assessing Web document quality is a "deep data" problem, requiring approaches that can handle both the size and the complexity of the data.
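As a rough, purely illustrative sketch of how such a framework might represent its baseline dimensions, the Python below models a per-document assessment with one score per dimension and a simple weighted aggregate. The extended abstract does not publish an implementation; every name and the aggregation rule here are our own assumptions.

    from dataclasses import dataclass

    # Hypothetical sketch only: the paper specifies no implementation, so the
    # class, field names, and weighted-mean aggregation are illustrative.
    @dataclass
    class QualityAssessment:
        """Scores in [0, 1] for the three baseline quality dimensions."""
        trustworthiness: float
        objectivity: float
        scholarly_quality: float

        def overall(self, weights=(1.0, 1.0, 1.0)) -> float:
            """Weighted mean of the dimension scores (assumed aggregation)."""
            scores = (self.trustworthiness, self.objectivity, self.scholarly_quality)
            return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

    # Example: a document judged trustworthy but of middling scholarly quality.
    doc = QualityAssessment(trustworthiness=0.9, objectivity=0.7, scholarly_quality=0.5)
    print(f"overall quality: {doc.overall():.2f}")  # 0.70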


Published in

WebSci '16: Proceedings of the 8th ACM Conference on Web Science
May 2016
392 pages
ISBN: 9781450342087
DOI: 10.1145/2908131

Copyright © 2016 Owner/Author

          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


Acceptance Rates

WebSci '16 paper acceptance rate: 13 of 70 submissions, 19%
Overall acceptance rate: 218 of 875 submissions, 25%
