ABSTRACT
We present a framework for assessing the quality of Web documents, together with a baseline of three quality dimensions: trustworthiness, objectivity, and basic scholarly quality. Assessing Web document quality is a "deep data" problem, requiring approaches that handle both the size and the complexity of the data.
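The abstract does not specify how the three baseline dimensions are computed or combined. As a rough illustration only, an assessment along those dimensions could be represented as a normalized score vector; all names and the clamping rule below are hypothetical, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class QualityAssessment:
    """Hypothetical container for the three baseline quality dimensions."""
    trustworthiness: float    # e.g. from provenance/reputation signals
    objectivity: float        # e.g. from language/sentiment features
    scholarly_quality: float  # e.g. from citation and structure features

def assess(scores: dict) -> QualityAssessment:
    # Clamp each raw signal into [0, 1] so the dimensions stay comparable.
    def clamp(x: float) -> float:
        return max(0.0, min(1.0, x))
    return QualityAssessment(
        trustworthiness=clamp(scores.get("trustworthiness", 0.0)),
        objectivity=clamp(scores.get("objectivity", 0.0)),
        scholarly_quality=clamp(scores.get("scholarly_quality", 0.0)),
    )

doc = assess({"trustworthiness": 0.8, "objectivity": 1.2,
              "scholarly_quality": 0.5})
print(doc.trustworthiness, doc.objectivity, doc.scholarly_quality)  # 0.8 1.0 0.5
```

Keeping the dimensions as separate scores, rather than collapsing them into a single number, matches the paper's framing of quality as multi-dimensional.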
Index Terms
- Towards web documents quality assessment for digital humanities scholars