ABSTRACT
Users today constantly switch between applications where they consume or create content (such as e-books and productivity suites like Microsoft Office and Google Docs) and search engines where they satisfy their information needs. This leads to a suboptimal user experience, because the search engine knows nothing about the content that the user is authoring or consuming in the application. As a result, productivity suites are starting to incorporate features that let users "explore while they work". Existing work in the literature that can be applied to this problem takes a standard bag-of-words information retrieval approach: it automatically creates a query that includes not only the target phrase or entity chosen by the user but also relevant terms from the context. While these approaches have been successful, they are inherently limited to returning results (documents) that have a syntactic match with the keywords in the query.
We argue that the limitations of these approaches can be overcome by leveraging semantic signals from a knowledge graph built from knowledge bases such as Wikipedia. We present Lewis, a system that uses a knowledge graph to retrieve contextually relevant entity results. A large-scale crowdsourcing experiment in an e-reader scenario shows that Lewis can outperform state-of-the-art contextual entity recommendation systems by more than 20% in terms of mean average precision (MAP).
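For readers unfamiliar with the metric, the following is a minimal sketch of how a MAP score is conventionally computed from ranked entity lists and binary relevance judgments. It uses the textbook definition only; the evaluation protocol actually used for Lewis (rank cutoffs, how crowdsourced labels are aggregated) is not stated in this abstract, and the function names and entity identifiers below are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list under binary relevance.

    Precision is sampled at each rank where a relevant item appears,
    then divided by the total number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of average precision over (ranked list, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical judgments for two user selections (illustrative only).
example_runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = (1/2) / 1
]
map_score = mean_average_precision(example_runs)
```

Under this definition, a relative improvement of more than 20% means the new system's MAP is more than 1.2 times the baseline's MAP on the same judgments.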