research-article

Leveraging Knowledge Bases for Contextual Entity Exploration

Authors:
Joonseok Lee, Google Inc., Mountain View, CA, USA
Ariel Fuxman, Google Inc., Mountain View, CA, USA
Bo Zhao, LinkedIn, Mountain View, CA, USA
Yuanhua Lv, Microsoft Research, Redmond, WA, USA

KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2015, Pages 1949–1958
https://doi.org/10.1145/2783258.2788564

Published: 10 August 2015

ABSTRACT

Users today constantly switch between applications where they consume or create content (such as e-books and productivity suites like Microsoft Office and Google Docs) and search engines where they satisfy their information needs. This leads to a suboptimal user experience, because the search engine knows nothing about the content that the user is authoring or consuming in the application. As a result, productivity suites are starting to incorporate features that let users "explore while they work". Existing work that can be applied to this problem takes a standard bag-of-words information retrieval approach: it automatically creates a query that includes not only the target phrase or entity chosen by the user but also relevant terms from the context. While these approaches have been successful, they are inherently limited to returning results (documents) that have a syntactic match with the keywords in the query.
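As a rough illustration of the bag-of-words approach described above, the following Python sketch builds a query from a user-selected target plus the most frequent context terms, then checks for a purely syntactic match. It is a toy, not any published system: the function names (`build_contextual_query`, `syntactic_match`), the stopword list, the frequency-based term selection, and the example sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset({
    "the", "a", "an", "of", "in", "on", "and", "to", "is", "was",
    "for", "with", "by", "at", "as", "it", "that",
})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_contextual_query(target, context, num_terms=3):
    """Return the target phrase followed by the most frequent context terms.

    Stopwords and the tokens of the target itself are excluded from the
    expansion terms.
    """
    target_tokens = set(tokenize(target))
    counts = Counter(
        tok for tok in tokenize(context)
        if tok not in STOPWORDS and tok not in target_tokens
    )
    expansion = [term for term, _ in counts.most_common(num_terms)]
    return [target] + expansion


def syntactic_match(query_terms, document):
    """True if the document shares at least one token with the query."""
    doc_tokens = set(tokenize(document))
    return any(
        tok in doc_tokens
        for term in query_terms
        for tok in tokenize(term)
    )


if __name__ == "__main__":
    context = (
        "Lewis Carroll wrote Alice in Wonderland. Alice follows a rabbit, "
        "and the rabbit leads Alice to Wonderland."
    )
    query = build_contextual_query("Lewis Carroll", context)
    print(query)
    # A document about the same person under his real name shares no keywords.
    print(syntactic_match(query, "Charles Dodgson taught mathematics at Oxford"))
```

The second call shows the limitation the abstract points to: a document that is clearly about the same person, but shares no keyword with the query, is never retrieved by keyword matching alone.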

We argue that the limitations of these approaches can be overcome by leveraging semantic signals from a knowledge graph built from knowledge bases such as Wikipedia. We present Lewis, a system that uses a knowledge graph to retrieve contextually relevant entity results. A large-scale crowdsourcing experiment in an e-reader scenario shows that Lewis can outperform state-of-the-art contextual entity recommendation systems by more than 20% in terms of the MAP score.
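For readers unfamiliar with the metric, MAP (mean average precision) can be computed as in the sketch below. This is the textbook binary-relevance definition; the abstract does not say which variant, rank cutoff, or relevance scale the experiment used, so those details are assumptions here, and the entity identifiers in the example are made up.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list under binary relevance.

    `ranked` is the list of returned items, best first; `relevant` is the
    set of items judged relevant for this query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical judgments for two queries.
    runs = [
        (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
        (["x", "y"], {"y"}),                 # AP = 1/2
    ]
    print(mean_average_precision(runs))
```

Because average precision rewards placing relevant items near the top of the list, a higher MAP means relevant entities tend to appear earlier in the ranking.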

Supplemental Material

p1949.m4v (m4v, 2.8 GB)


Index Terms

Leveraging Knowledge Bases for Contextual Entity Exploration
1. Information systems
  1. Information systems applications


Published in
KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2015
2378 pages
ISBN: 9781450336642
DOI: 10.1145/2783258
General Chairs: Longbing Cao (University of Technology, Sydney), Chengqi Zhang (University of Technology, Sydney)
Program Chairs: Thorsten Joachims (Cornell University), Geoff Webb (Monash University), Dragos D. Margineantu (Boeing Research), Graham Williams (Australian Taxation Office)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
context
context-selection betweenness
entity recommendation
knowledge base
semantic

Acceptance Rates
KDD '15 Paper Acceptance Rate: 160 of 819 submissions, 20%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%