Abstract
Wikidata is a collaborative knowledge graph which is central to many academic and industry IT projects. Its users are responsible for maintaining the schema that organises this knowledge into classes, properties, and attributes, which together form the Wikidata 'ontology'. In this paper, we study the relationship between different Wikidata user roles and the quality of the Wikidata ontology. To do so, we first propose a framework to evaluate the ontology as it evolves. We then cluster editing activities to identify user roles in monthly time frames. Finally, we explore how each role impacts the ontology. Our analysis shows that the Wikidata ontology has uneven breadth and depth. We identified two user roles: contributors and leaders. The latter role is positively associated with ontology depth, with no significant effect on other features. Further work should investigate other dimensions to define user profiles and their influence on the knowledge graph.
Who Models the World?: Collaborative Ontology Creation and User Roles in Wikidata