research-article

Who Models the World?: Collaborative Ontology Creation and User Roles in Wikidata

Published: 01 November 2018

Abstract

Wikidata is a collaborative knowledge graph that is central to many academic and industry IT projects. Its users are responsible for maintaining the schema that organises this knowledge into classes, properties, and attributes, which together form the Wikidata 'ontology'. In this paper, we study the relationship between different Wikidata user roles and the quality of the Wikidata ontology. To do so, we first propose a framework to evaluate the ontology as it evolves. We then cluster editing activities to identify user roles in monthly time frames. Finally, we explore how each role impacts the ontology. Our analysis shows that the Wikidata ontology has uneven breadth and depth. We identified two user roles: contributors and leaders. The latter role is positively associated with ontology depth, with no significant effect on other features. Further work should investigate other dimensions to define user profiles and their influence on the knowledge graph.


      • Published in

        Proceedings of the ACM on Human-Computer Interaction, Volume 2, Issue CSCW
        November 2018
        4104 pages
        EISSN: 2573-0142
        DOI: 10.1145/3290265

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States
