ABSTRACT
Tracking developments in the highly dynamic data-technology landscape is vital to keeping up with novel technologies and tools in the various areas of Artificial Intelligence (AI). However, it is difficult to keep track of all the relevant technology keywords. In this paper, we propose a novel system that addresses this problem. The tool automatically detects new technologies and tools mentioned in text and extracts the terms used to describe them. The extracted terms can be logged as new AI technologies as they are found on the fly on the web, and can subsequently be classified into the relevant semantic labels and AI domains. Our proposed tool is based on a two-stage cascading model: the first stage classifies whether a sentence contains a technology term, and the second stage identifies the technology keyword in the sentence. We obtain competitive accuracy on both tasks, sentence classification and text identification.
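The two-stage cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not specify the classifiers, so both stages below are simple rule-based stand-ins, and the names `CUE_WORDS`, `contains_technology_term`, `extract_technology_terms`, and `cascade` are hypothetical. The sketch only shows the control flow, in which the second stage runs solely on sentences accepted by the first.

```python
import re
from typing import List

# Hypothetical cue words standing in for a learned sentence classifier.
CUE_WORDS = {"framework", "library", "toolkit", "platform", "model", "tool"}


def tokenize(sentence: str) -> List[str]:
    """Split a sentence into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def contains_technology_term(sentence: str) -> bool:
    """Stage 1 (stand-in): accept a sentence if it contains any cue word."""
    tokens = {t.lower() for t in tokenize(sentence)}
    return bool(tokens & CUE_WORDS)


def extract_technology_terms(sentence: str) -> List[str]:
    """Stage 2 (stand-in): return capitalized tokens that are not sentence-initial."""
    tokens = tokenize(sentence)
    return [t for i, t in enumerate(tokens) if i > 0 and t[0].isupper()]


def cascade(sentence: str) -> List[str]:
    """Run stage 2 only on sentences that stage 1 accepts."""
    if not contains_technology_term(sentence):
        return []
    return extract_technology_terms(sentence)


# Accepted by stage 1 ("model", "library"), so stage 2 extracts the term.
print(cascade("We trained the model with the TensorFlow library."))  # ['TensorFlow']
# Rejected by stage 1, so "Dublin" is never proposed as a term.
print(cascade("The weather was pleasant in Dublin."))  # []
```

The second example shows why a cascade can help: the extraction stage alone would propose "Dublin", but the sentence-level filter discards the sentence first.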
TEST: A Terminology Extraction System for Technology Related Terms