DOI: 10.1145/2872427.2874809
research-article
Open Access

From Freebase to Wikidata: The Great Migration

Published: 11 April 2016

ABSTRACT

Collaborative knowledge bases that make their data freely available in a machine-readable form are central to the data strategy of many projects and organizations. The two major collaborative knowledge bases are Wikimedia's Wikidata and Google's Freebase. Due to the success of Wikidata, Google decided in 2014 to offer the content of Freebase to the Wikidata community. In this paper, we report on the ongoing transfer efforts and data mapping challenges, and provide an analysis of the effort so far. We describe the Primary Sources Tool, which aims to facilitate this and future data migrations. Throughout the migration, we have gained deep insights into both Wikidata and Freebase, and share and discuss detailed statistics on both knowledge bases.


Published in

WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016
1482 pages
ISBN: 9781450341431

Copyright © 2016. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland

Acceptance Rates

WWW '16 paper acceptance rate: 115 of 727 submissions, 16%
Overall acceptance rate: 1,899 of 8,196 submissions, 23%
