Unsupervised Approaches for Textual Semantic Annotation, A Survey

Authors:
Xiaofeng Liao, University of Amsterdam, Amsterdam, Netherlands (ORCID: 0000-0002-4706-1084)
Zhiming Zhao, University of Amsterdam, Amsterdam, Netherlands

ACM Computing Surveys, Volume 52, Issue 4, Article No. 66, pp. 1–45. https://doi.org/10.1145/3324473

Published: 30 August 2019

Abstract

Semantic annotation is a crucial part of achieving the vision of the Semantic Web and has long been a research topic in various communities. The most challenging obstacle to reaching the Semantic Web's real potential is the gap between the large amount of unlabeled data, both existing and new, and the limited annotation capability available. To close this gap, numerous works have sought to increase the degree of automation of semantic annotation, from manual to semi-automatic to fully automatic. These works have been well investigated by numerous surveys focusing on different aspects of the problem. However, a comprehensive survey targeting unsupervised approaches for semantic annotation is still missing and is urgently needed. To better understand the state of the art of unsupervised semantic annotation in the textual domain, this article investigates the existing literature and presents a survey to answer three research questions: (1) To what extent can semantic annotation be performed in a fully automatic manner using unsupervised approaches? (2) What kinds of unsupervised approaches for semantic annotation already exist in the literature? (3) What characteristics and relationships do these approaches have?

In contrast to existing surveys, this article gives the reader insight into the state of the art of semantic annotation using unsupervised approaches. While examining the literature, this article also addresses the inconsistent terminology used to describe the degree of automation of semantic annotation tools and proposes more consistent terminology. On this basis, it presents a uniform summary of the degree of automation of the many semantic annotation tools that have previously been investigated.

References

Mohamed Farouk Abdel Hady, Abubakrelsedik Karali, Eslam Kamal, and Rania Ibrahim. 2014. Unsupervised active learning of CRF model for cross-lingual named entity recognition. In Artificial Neural Networks in Pattern Recognition, Neamat El Gayar, Friedhelm Schwenker, and Cheng Suen (Eds.). Springer International Publishing, Cham, 23--34. Google ScholarDigital Library
Saminda Abeyruwan, Ubbo Visser, Vance Lemmon, and Stephan Schürer. 2013. PrOntoLearn: Unsupervised lexico-semantic ontology generation using probabilistic methods. In Uncertainty Reasoning for the Semantic Web II. Springer, Berlin, 217--236. Google ScholarDigital Library
Eugene Agichtein and Luis Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of the 5th ACM Conference on Digital Libraries (DL’00). ACM, New York, NY, 85--94. Google ScholarDigital Library
Alan Akbik, Larysa Visengeriyeva, Priska Herger, Holmer Hemsen, and Alexander Löser. 2012. Unsupervised discovery of relations and discriminative extraction patterns. Proceedings of COLING 2012, 17--32.Google Scholar
Saeed Albukhitan, Tarek Helmy, and Ahmed Alnazer. 2017. Arabic ontology learning using deep learning. In Proceedings of the International Conference on Web Intelligence (WI’17). ACM, New York, NY, 1138--1142. Google ScholarDigital Library
Anita Alicante, Anna Corazza, Francesco Isgrò, and Stefano Silvestri. 2016. Unsupervised entity and relation extraction from clinical records in Italian. Computers in Biology and Medicine 72 (2016), 263--275. Google ScholarDigital Library
Periklis Andritsos, Panayiotis Tsaparas, Renée J. Miller, and Kenneth C. Sevcik. 2004. LIMBO: Scalable Clustering of Categorical Data. Springer, Berlin, 123--146.Google Scholar
Isabelle Augenstein, Andreas Vlachos, and Diana Maynard. 2015. Extracting relations between non-standard entities using distant supervision and imitation learning. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 747--757.Google ScholarCross Ref
Amit Bagga and Breck Baldwin. 1998. Entity-based cross-document coreferencing using the vector space model. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1 (ACL’98/COLING’98). Association for Computational Linguistics, 79--85. Google ScholarDigital Library
Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the 20th International Joint Conference on Artifical Intelligence (IJCAI’07). Morgan Kaufmann Publishers Inc., 2670--2676. http://dl.acm.org/citation.cfm?id=1625275.1625705 Google ScholarDigital Library
David S. Batista, Bruno Martins, and Mário J. Silva. 2015. Semi-supervised bootstrapping of relationship extractors with distributional semantics. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 499--504.Google Scholar
Tim Berners-Lee, James Hendler, and Ora Lassila. 2001. The semantic web. A new form of web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American 284, 5 (2001), 24--30.Google ScholarCross Ref
Indrajit Bhattacharya and Lise Getoor. 2006. A latent Dirichlet model for unsupervised entity resolution. In Proceedings of the 2006 SIAM International Conference on Data Mining. 47--58.Google ScholarCross Ref
Christian Bizer, Tom Heath, and Tim Berners-Lee. 2011. Linked data: The story so far. In Semantic Services, Interoperability and Web Applications: Emerging Concepts. IGI Global, 205--227.Google Scholar
David M. Blei, Andre Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research (2003), 993--1022. Google Scholar
Danushka Bollegala, Takanori Maehara, and Ken-ichi Kawarabayashi. 2015. Embedding semantic relations into word representations. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI’15). AAAI Press, 1222--1228. Google ScholarDigital Library
Danushka Tarupathi Bollegala, Yutaka Matsuo, and Mitsuru Ishizuka. 2010. Relational duality: Unsupervised extraction of semantic relations between entities on the web. In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, NY, 151--160. Google ScholarDigital Library
Kalina Bontcheva and Hamish Cunningham. 2011. Semantic Annotations and Retrieval: Manual, Semiautomatic, and Automatic Generation. Springer, Berlin, 77--116.Google Scholar
Adrien Bougouin, Florian Boudin, and Béatrice Daille. 2016. Keyphrase annotation with graph co-ranking. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. 2945--2955.Google Scholar
Svetla Boytcheva. 2018. Indirect association rules mining in clinical texts. In Artificial Intelligence: Methodology, Systems, and Applications, Gennady Agre, Josef van Genabith, and Thierry Declerck (Eds.). Springer International Publishing, Cham, 36--47.Google Scholar
Sergey Brin. 1999. Extracting patterns and relations from the World Wide Web. In The World Wide Web and Databases: International Workshop WebDB’98. Selected Papers, Paolo Atzeni, Alberto Mendelzon, and Giansalvatore Mecca (Eds.). Springer, Berlin, 172--183. Google ScholarDigital Library
Paul Buitelaar and Philipp Cimiano. 2008. Ontology Learning and Population: Bridging the Gap Between Text and Knowledge. Frontiers in Artificial Intelligence and Applications, Vol. 167. IOS Press, Amsterdam. Google ScholarDigital Library
Paul Buitelaar and Srikanth Ramaka. 2005. Unsupervised ontology-based semantic tagging for knowledge markup. In Workshop on Learning in Web Search at 22nd International Conference on Machine Learning (ICML’05), Vol. 5. 26--32.Google Scholar
Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R. Hruschka, Jr., and Tom M. Mitchell. 2010. Toward an architecture for never-ending language learning. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI’10). AAAI Press, 1306--1313. http://dl.acm.org/citation.cfm?id=2898607.2898816 Google ScholarDigital Library
Mercedes Arguello Casteleiro, George Demetriou, Warren Read, Maria Jesus Fernandez Prieto, Nava Maroto, Diego Maseda Fernandez, Goran Nenadic, Julie Klein, John Keane, and Robert Stevens. 2018. Deep learning meets ontologies: Experiments to anchor the cardiovascular disease ontology in the biomedical literature. Journal of Biomedical Semantics 9, 1 (2018), 13, 1--24.Google Scholar
Sam Chapman, Alexiei Dingli, and Fabio Ciravegna. 2004. Armadillo: Harvesting information for the semantic web. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’04). ACM, New York, NY, 598--598. Google ScholarDigital Library
Chaitanya Chemudugunta, America Holloway, Padhraic Smyth, and Mark Steyvers. 2008. Modeling documents by combining semantic concepts with unsupervised statistical learning. In The Semantic Web - ISWC 2008: Proceedings of the 7th International Semantic Web Conference (ISWC’08), Amit Sheth, Steffen Staab, Mike Dean, Massimo Paolucci, Diana Maynard, Timothy Finin, and Krishnaprasad Thirunarayan (Eds.). Springer, Berlin, 229--244. Google ScholarDigital Library
Jinxiu Chen, Donghong Ji, Chew Lim Tan, and Zhengyu Niu. 2005. Unsupervised feature selection for relation extraction. In Proceedings of the International Joint Conference on Natural Language Processing (and workshops) (IJCNLP’05). 262--267.Google Scholar
Ying Chen and James Martin. 2007. Towards robust unsupervised personal name disambiguation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL’07). 190--198.Google Scholar
Ziyan Chen, Yu Huang, Yuexian Liang, Yang Wang, Xingyu Fu, and Kun Fu. 2017. RGloVe: An improved approach of global vectors for distributional entity relation representation. Algorithms 10, 2 (2017), 1--11.Google ScholarCross Ref
Xiao Cheng and Dan Roth. 2013. Relational inference for wikification. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 1787--1796.Google Scholar
Pao-Yu Chien and Pu-Jen Cheng. 2015. Semantic tagging of mathematical expressions. In Proceedings of the 24th International Conference on World Wide Web (WWW’15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 195--204. Google ScholarDigital Library
Emil Şt Chifu and Ioan Alfred Letia. 2010. Unsupervised semantic annotation of Web service datatypes. In Proceedings of the 2010 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP’10). IEEE, 43--50. Google ScholarDigital Library
Massimiliano Ciaramita, Aldo Gangemi, Esther Ratsch, Jasmin Šarić, and Isabel Rojas. 2008. Unsupervised learning of semantic relations for molecular biology ontologies. Frontiers in Artificial Intelligence and Applications 167, 1 (2008), 91--104. Google ScholarDigital Library
Philipp Cimiano, Siegfried Handschuh, and Steffen Staab. 2004. Towards the self-annotating web. In Proceedings of the 13th International Conference on World Wide Web (WWW’04). ACM, New York, NY, 462--471. Google ScholarDigital Library
Philipp Cimiano, Günter Ladwig, and Steffen Staab. 2005. Gimme’ the context: Context-driven automatic semantic annotation with C-PANKOW. In Proceedings of the 14th International Conference on World Wide Web (WWW’05). ACM, New York, NY, 332--341. Google ScholarDigital Library
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, and Yorick Wilks. 2004. Learning to Harvest Information for the Semantic Web. Springer, Berlin, 312--326.Google Scholar
Angel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga, and Dan Roth. 2016. Litewi: A combined term extraction and entity linking method for eliciting educational ontologies from textbooks. Journal of the Association for Information Science and Technology 67, 2 (2016), 380--399. Google ScholarDigital Library
Jenny Copara, Jose Ochoa, Camilo Thorne, and Goran Glavas. 2016. Exploring unsupervised features in conditional random fields for spanish named entity recognition. In Proceedings of the 2016 5th Brazilian Conference on Intelligent Systems (BRACIS’16). 283--288.Google ScholarCross Ref
Stéphane Corlosquet, Renaud Delbru, Tim Clark, Axel Polleres, and Stefan Decker. 2009. Produce and consume linked data with Drupal! In The Semantic Web - ISWC 2009, Abraham Bernstein, David R. Karger, Tom Heath, Lee Feigenbaum, Diana Maynard, Enrico Motta, and Krishnaprasad Thirunarayan (Eds.). Springer, Berlin, 763--778. Google ScholarDigital Library
Alessandro Cucchiarelli and Paola Velardi. 2001. Unsupervised named entity recognition using syntactic and semantic contextual evidence. Computational Linguistics 27, 1 (2001), 123--131. Google ScholarDigital Library
Silviu Cucerzan. 2007. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL’07). 708--716.Google Scholar
Hong Cui, David Boufford, and Paul Selden. 2010. Semantic annotation of biosystematics literature without training examples. Journal of the American Society for Information Science and Technology 61, 3 (2010), 522--542. Google ScholarDigital Library
Andrew M. Dai and Amos J. Storkey. 2011. The grouped author-topic model for unsupervised entity resolution. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6791 LNCS, Part 1 (2011), 241--249. Google ScholarDigital Library
Bhavana Bharat Dalvi, William W. Cohen, and Jamie Callan. 2012. WebSets: Extracting sets of entities from the web using unsupervised information extraction. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM’12). ACM, New York, NY, 243--252. Google ScholarDigital Library
Grant DeLozier, Jason Baldridge, and Loretta London. 2015. Gazetteer-independent toponym resolution using geographic word profiles. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15). AAAI Press, 2382--2388. http://dl.acm.org/citation.cfm?id=2886521.2886652 Google ScholarDigital Library
Stephen Dill, Nadav Eiron, David Gibson, Daniel Gruhl, Ramanathan Guha, Anant Jhingran, Tapas Kanungo, Kevin S. McCurley, Sridhar Rajagopalan, Andrew Tomkins, John A. Tomlin, and Jason Y. Zien. 2003a. A case for automated large-scale semantic annotation. Web Semantics: Science, Services and Agents on the World Wide Web 1, 1 (2003), 115--132.Google ScholarCross Ref
Stephen Dill, Nadav Eiron, David Gibson, Daniel Gruhl, Ramanathan Guha, Anant Jhingran, Tapas Kanungo, Sridhar Rajagopalan, Andrew Tomkins, John A. Tomlin, and Jason Y. Zien. 2003b. SemTag and seeker: Bootstrapping the semantic web via automated semantic annotation. In Proceedings of the 12th International Conference on World Wide Web (WWW’03). ACM, New York, NY, 178--186. Google ScholarDigital Library
Alexiei Dingli, Fabio Ciravegna, David Guthrie, and Yorick Wilks. 2003b. Mining web sites using adaptive information extraction. In Proceedings of the 10th Conference on European Chapter of the Association for Computational Linguistics, Vol. 2. Association for Computational Linguistics, 75--78. Google ScholarDigital Library
Alexiei Dingli, Fabio Ciravegna, and Yorick Wilks. 2003a. Automatic semantic annotation using unsupervised information extraction and integration. In Proceedings of the K-CAP 2003 Workshop on Knowledge Markup and Semantic Annotation. 1--8.Google Scholar
John Domingue and Peter Scott. 1998. KMi planet: A web based news server. In Proceedings of the 3rd Asia Pacific Computer Human Interaction (Cat. No. 98EX110). 324--330. Google ScholarDigital Library
Andres Duque, Mark Stevenson, Juan Martinez-Romo, and Lourdes Araujo. 2018. Co-occurrence graphs for word sense disambiguation in the biomedical domain. Artificial Intelligence in Medicine 87 (2018), 9--19.Google ScholarCross Ref
Sourav Dutta and Gerhard Weikum. 2015. C3EL: A joint model for cross-document co-reference resolution and entity linking. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 846--856.Google ScholarCross Ref
Riloff Ellen and Shepherd Jessica. 1997. A corpus-based approach for building semantic lexicons. In Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP-2’97). 117--124.Google Scholar
Hady Elsahar, Elena Demidova, Simon Gottschalk, Christophe Gravier, and Frederique Laforest. 2017. Unsupervised open relation extraction. In The Semantic Web: ESWC 2017 Satellite Events, Eva Blomqvist, Katja Hose, Heiko Paulheim, Agnieszka Ławrynowicz, Fabio Ciravegna, and Olaf Hartig (Eds.). Springer International Publishing, Cham, 12--16.Google Scholar
Oren Etzioni, Michele Banko, Stephen Soderland, and Daniel S. Weld. 2008. Open information extraction from the web. Communications of the ACM 51, 12 (Dec. 2008), 68--74. Google ScholarDigital Library
Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. 2004a. Web-scale information extraction in knowitall: (Preliminary results). In Proceedings of the 13th International Conference on World Wide Web (WWW’04). ACM, New York, NY, 100--110. Google ScholarDigital Library
Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. 2004b. Methods for domain-independent information extraction from the web: An experimental comparison. In Proceedings of the 19th National Conference on Artifical Intelligence (AAAI’ 04). AAAI Press, 391--398. http://dl.acm.org/citation.cfm?id=1597148.1597213 Google ScholarDigital Library
Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. 2005. Unsupervised named-entity extraction from the Web: An experimental study. Artificial Intelligence 165, 1 (2005), 91--134. Google ScholarDigital Library
Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’11). Association for Computational Linguistics, Stroudsburg, PA, 1535--1545. http://dl.acm.org/citation.cfm?id=2145432.2145596 Google ScholarDigital Library
Johannes Fähndrich, Sebastian Ahrndt, and Sahin Albayrak. 2015. Self-explanation through semantic annotation: A survey. In FedCSIS Position Papers. 17--24.Google Scholar
Ronen Feldman and Benjamin Rosenfeld. 2006. Boosting unsupervised relation extraction by using NER. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP’06). Association for Computational Linguistics, Stroudsburg, PA, 473--481. http://dl.acm.org/citation.cfm?id=1610075.1610141 Google ScholarDigital Library
Norberto Fernández, Jesus A. Fisteus, Luis Sánchez, and Damaris Fuentes-Lorenzo. 2012. WikiIdRank: An unsupervised approach for entity linking based on instance co-occurrence. International Journal of Innovative Computing, Information and Control 8, 11 (2012), 7519--7541.Google Scholar
Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005. Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 363--370. Google ScholarDigital Library
John Rupert Firth. 1957. A Synopsis of Linguistic Theory 1930--1955. The Philological Society, Oxford. 1--32 pages.Google Scholar
Shoji Fujiwara and Satoshi Sekine. 2011. Self-adjusting bootstrapping. In Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing - Volume Part II (CICLing’11). Springer-Verlag, Berlin, 188--201. http://dl.acm.org/citation.cfm?id=1964750.1964767 Google ScholarDigital Library
Pablo Gamallo and Marcos Garcia. 2015. Multilingual open information extraction. In Progress in Artificial Intelligence, Francisco Pereira, Penousal Machado, Ernesto Costa, and Amílcar Cardoso (Eds.). Springer International Publishing, Cham, 711--722.Google Scholar
Fabien Gandon. 2018. A survey of the first 20 years of research on semantic Web and linked data. Revue des Sciences et Technologies de l’Information-Série ISI: Ingénierie des Systèmes d’Information. 11--56.Google Scholar
Artur d’Avila Garcez, Tarek R. Besold, Luc De Raedt, Peter Földiak, Pascal Hitzler, Thomas Icard, Kai-Uwe Kühnberger, Luis C. Lamb, Risto Miikkulainen, and Daniel L. Silver. 2015. Neural-symbolic learning and reasoning: Contributions and challenges. In 2015 AAAI Spring Symposium Series. 18--21.Google Scholar
Mattijs Ghijsen, Jeroen van der Ham, Paola Grosso, Cosmin Dumitru, Hao Zhu, Zhiming Zhao, and Cees de Laat. 2013. A semantic-web approach for modeling computing infrastructures. Computers 8 Electrical Engineering 39, 8 (2013), 2553--2565. Google ScholarDigital Library
Giorgos Giannopoulos, Nikos Bikakis, Theodore Dalamagas, and Timos Sellis. 2010. GoNTogle: A Tool for Semantic Annotation and Search. Springer, Berlin, 376--380.Google Scholar
Edgar Gonzalez and Jordi Turmo. 2009. Unsupervised relation extraction by massive clustering. In 2009 Ninth IEEE International Conference on Data Mining. 782--787. Google ScholarDigital Library
Ralph Grishman and Beth Sundheim. 1996. Message understanding conference-6: A brief history. In Proceedings of the 16th Conference on Computational Linguistics - Volume 1 (COLING’96). Association for Computational Linguistics, Stroudsburg, PA, 466--471. Google ScholarDigital Library
Ramanathan V. Guha, Rob McCool, and Eric Miller. 2003. Semantic search. In Proceedings of the 12th International Conference on World Wide Web (WWW’03). ACM, New York, NY, 700--709. Google ScholarDigital Library
Anupama Gupta, Imon Banerjee, and Daniel L. Rubin. 2018. Automatic information extraction from unstructured mammography reports using distributed semantics. Journal of Biomedical Informatics 78 (2018), 78--86.Google ScholarCross Ref
Sonal Gupta and Christopher D. Manning. 2015. Distributed representations of words to guide bootstrapped entity classifiers. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1215--1220.Google Scholar
Xianpei Han and Le Sun. 2012. An entity-topic model for entity linking. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL’12). Association for Computational Linguistics, Stroudsburg, PA, 105--115. http://dl.acm.org/citation.cfm?id=2390948.2390962 Google ScholarDigital Library
Xianpei Han and Jun Zhao. 2009. Named entity disambiguation by leveraging wikipedia semantic knowledge. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM’09). ACM, New York, NY, 215--224. Google ScholarDigital Library
Siegfried Handschuh and Steffen Staab. 2002. Authoring and annotation of web pages in CREAM. In Proceedings of the 11th International Conference on World Wide Web (WWW’02). ACM, New York, NY, 462--473. Google ScholarDigital Library
Hany Hassan, Ahmed Hassan, and Ossama Emam. 2006. Unsupervised information extraction approach using graph mutual reinforcement. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP’06). Association for Computational Linguistics, Stroudsburg, PA, 501--508. http://dl.acm.org/citation.cfm?id=1610075.1610144 Google ScholarDigital Library
Ying He and Mehmet Kayaalp. 2008. Biological entity recognition with conditional random fields. In AMIA Annual Symposium Proceedings, Vol. 2008. American Medical Informatics Association, 293--297.Google Scholar
Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th Conference on Computational Linguistics - Volume 2 (COLING’92). Association for Computational Linguistics, Stroudsburg, PA, 539--545. Google ScholarDigital Library
Aron Henriksson, Maria Kvist, Hercules Dalianis, and Martin Duneld. 2015. Identifying adverse drug event information in clinical notes with distributional semantic representations of context. Journal of Biomedical Informatics 57 (2015), 333--349. Google ScholarDigital Library
Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 782--792. Google ScholarDigital Library
Andrew Hogue and David Karger. 2005. Thresher: Automating the unwrapping of semantic content from the world wide web. In Proceedings of the 14th International Conference on World Wide Web (WWW’05). ACM, New York, NY, 86--95. Google ScholarDigital Library
Julia Hoxha, Guoqian Jiang, and Chunhua Weng. 2016. Automated learning of domain taxonomies from text using background knowledge. Journal of Biomedical Informatics 63 (2016), 295--306.Google ScholarCross Ref
Roman Ivanitskiy, Alexander Shipilo, and Liubov Kovriguina. 2016. Russian named entities recognition and classification using distributed word and phrase representations. In Symposium on Information Management and Big Data. 150--156.Google Scholar
Shengbin Jia, Shijia E, Maozhen Li, and Yang Xiang. 2018. Chinese open relation extraction and knowledge base establishment. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 17, 3 (Feb. 2018), Article 15, 22 pages. Google ScholarDigital Library
MD Jiménez, N. Fernández, Jesús Arias Fisteus, and Luis Sánchez. 2013. WikiIdRank++: Extensions and improvements of the WikiIdRank system for entity linking. International Journal on Artificial Intelligence Tools 22, 3 (2013).Google ScholarCross Ref
Ehsan Kamalloo and Davood Rafiei. 2018. A coherent unsupervised model for toponym resolution. In Proceedings of the 2018 World Wide Web Conference (WWW’18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1287--1296. Google ScholarDigital Library
Nanda Kambhatla. 2004. Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations. In Proceedings of the ACL 2004 on Interactive Poster and Demonstration Sessions (ACLdemo’04). Association for Computational Linguistics, Stroudsburg, PA, Article 22 (2004), 178--181. Google ScholarDigital Library
Lobna Karoui, Marie-aude Aufaure, and Nacera Bennacer. 2007. Context-based hierarchical clustering for the ontology learning. Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings) (WI’06). 420--427. Google ScholarDigital Library
Ramakanth Kavuluru, Sifei Han, and Daniel Harris. 2013. Unsupervised extraction of diagnosis codes from EMRs using knowledge-based and extractive text summarization techniques. In Advances in Artificial Intelligence, Osmar R. Zaïane and Sandra Zilles (Eds.). Springer, Berlin, 77--88.Google Scholar
Atanas Kiryakov, Borislav Popov, Ivan Terziev, Dimitar Manov, and Damyan Ognyanoff. 2004. Semantic annotation, indexing, and retrieval. Web Semantics: Science, Services and Agents on the World Wide Web 2, 1 (2004), 49--79. Google ScholarDigital Library
Nadzeya Kiyavitskaya, Nicola Zeni, James R. Cordy, Luisa Mich, and John Mylopoulos. 2009. Cerno: Light-weight tool support for semantic annotation of textual documents. Data 8 Knowledge Engineering 68, 12 (2009), 1470--1492. Google ScholarDigital Library
Paul Kogut and William Holmes. 2001. AeroDAML: Applying information extraction to generate DAML annotations from web pages. In Proceedings of the 1st International Conference on Knowledge Capture (KCAP’01). 1--3.Google Scholar
Teuvo Kohonen, Samuel Kaski, Krista Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela. 2000. Self organization of a massive document collection. IEEE Transactions on Neural Networks 11, 3 (May 2000), 574--585. Google ScholarDigital Library
Prodromos Kolyvakis, Alexandros Kalousis, Barry Smith, and Dimitris Kiritsis. 2018. Biomedical ontology alignment: An approach based on representation learning. Journal of Biomedical Semantics 9, 1 (Aug. 2018), 21, 1--20.Google ScholarCross Ref
Michal Konkol, Tomáš Brychcín, and Miloslav Konopík. 2015. Latent semantics in named entity recognition. Expert Systems with Applications 42, 7 (May 2015), 3470--3479. Google ScholarDigital Library
Zornitsa Kozareva and Sujith Ravi. 2011. Unsupervised name ambiguity resolution using a generative model. In Proceedings of the 1st Workshop on Unsupervised Learning in NLP (EMNLP’11). Association for Computational Linguistics, Stroudsburg, PA, 105--112. http://dl.acm.org/citation.cfm?id=2140458.2140471 Google ScholarDigital Library
Mikalai Krapivin, Maurizio Marchese, Andrei Yadrantsau, and Yanchun Liang. 2008. Unsupervised key-phrases extraction from scientific papers using domain and linguistic knowledge. In Proceedings of the 2008 3rd International Conference on Digital Information Management. 105--112.Google ScholarCross Ref
Vitaveska Lanfranchi, Fabio Ciravegna, and Daniela Petrelli. 2005. Semantic web-based document: Editing and browsing in AktiveDoc. In The Semantic Web: Research and Applications, Asunción Gómez-Pérez and Jérôme Euzenat (Eds.). Springer, Berlin, 623--632. Google ScholarDigital Library
Nadège Lechevrel, Kata Gábor, Isabelle Tellier, Thierry Charnois, Haïfa Zargayouna, and Davide Buscaldi. 2017. Combining syntactic and sequential patterns for unsupervised semantic relation extraction. In DMNLP Workshop@ ECML-PKDD. 81--84.Google Scholar
Jens Lehmann and Johanna Völker. 2014. Perspectives on Ontology Learning. Vol. 18. IOS Press.Google Scholar
Yanen Li, Bo-June Paul Hsu, ChengXiang Zhai, and Kuansan Wang. 2013a. Mining entity attribute synonyms via compact clustering. In Proceedings of the 22nd ACM International Conference on Conference on Information 8 Knowledge Management. ACM, 867--872. Google ScholarDigital Library
Yang Li, Chi Wang, Fangqiu Han, Jiawei Han, Dan Roth, and Xifeng Yan. 2013b. Mining evidences for named entity disambiguation. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, New York, NY, 1070--1078. Google ScholarDigital Library
Xiaofeng Liao, Qingshan Jiang, Wei Zhang, and Kai Zhang. 2014. BiModal latent dirichlet allocation for text and image. In Proceedings of the 2014 4th IEEE International Conference on Information Science and Technology. 736--739.Google ScholarCross Ref
Yongxin Liao, Mario Lezoche, Hervé Panetto, Nacer Boudjlida, and Eduardo Rocha Loures. 2015. Semantic annotation for knowledge explicitation in a product lifecycle management context: A survey. Computers in Industry 71 (2015), 24--34.
Thomas Lin, Mausam, and Oren Etzioni. 2012. Entity linking at web scale. In Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX’12). Association for Computational Linguistics, Stroudsburg, PA, 84--88. http://dl.acm.org/citation.cfm?id=2391200.2391216
Xiao Ling, Sameer Singh, and Daniel S. Weld. 2015. Design challenges for entity linking. Transactions of the Association for Computational Linguistics 3 (2015), 315--328.
Xiao Ling and Daniel S. Weld. 2012. Fine-grained entity recognition. In Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI’12). AAAI Press, 94--100. http://dl.acm.org/citation.cfm?id=2900728.2900742
Shengyu Liu, Buzhou Tang, Qingcai Chen, and Xiaolong Wang. 2015b. Effects of semantic features on machine learning-based drug name recognition systems: Word embeddings vs. manually constructed dictionaries. Information 6, 4 (2015), 848--865.
Yudong Liu, Clinton Burkhart, James Hearne, and Liang Luo. 2015a. Enhancing Sumerian lemmatization by unsupervised named-entity recognition. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 1446--1451.
Yang Liu, Shulin Liu, Kang Liu, Guangyou Zhou, and Jun Zhao. 2013. Exploring distinctive features in distant supervision for relation extraction. In Information Retrieval Technology, Rafael E. Banchs, Fabrizio Silvestri, Tie-Yan Liu, Min Zhang, Sheng Gao, and Jun Lang (Eds.). Springer, Berlin, 344--355.
Gideon S. Mann and David Yarowsky. 2003. Unsupervised personal name disambiguation. In Proceedings of the 7th Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4 (CONLL’03). Association for Computational Linguistics, Stroudsburg, PA, 33--40.
Diana Maynard, Kalina Bontcheva, and Isabelle Augenstein. 2016. Natural Language Processing for the Semantic Web. Morgan & Claypool Publishers.
Luke K. McDowell and Michael Cafarella. 2006. Ontology-driven information extraction with OntoSyphon. In Proceedings of the 5th International Conference on The Semantic Web (ISWC’06). Springer-Verlag, Berlin, 428--444.
Paul McNamee and Hoa Trang Dang. 2009. Overview of the TAC 2009 knowledge base population track. In Text Analysis Conference (TAC), Vol. 17. National Institute of Standards and Technology (NIST), Gaithersburg, Maryland, 111--113.
Pablo N. Mendes, Max Jakob, Andrés García-Silva, and Christian Bizer. 2011. DBpedia spotlight: Shedding light on the web of documents. In Proceedings of the 7th International Conference on Semantic Systems (I-Semantics’11). ACM, New York, NY, 1--8.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. In International Conference on Learning Representations Workshop. 1--12.
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013b. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. 3111--3119.
Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013c. Linguistic regularities in continuous space word representations. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT-2013). Association for Computational Linguistics, 746--751. https://www.aclweb.org/anthology/N13-1090.
Bonan Min, Shuming Shi, Ralph Grishman, and Chin-Yew Lin. 2012. Ensemble semantics for large-scale unsupervised relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL’12). Association for Computational Linguistics, Stroudsburg, PA, 1027--1037. http://dl.acm.org/citation.cfm?id=2390948.2391062
Tom M. Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Justin Betteridge, Andrew Carlson, Bhavana Dalvi Mishra, Matthew Gardner, Bryan Kisiel, Jayant Krishnamurthy, Ni Lao, Kathryn Mazaitis, Thahir Mohamed, Ndapa Nakashole, Emmanouil Antonios Platanios, Alan Ritter, Mehdi Samadi, Burr Settles, Richard Wang, Derry Wijaya, Abhinav Gupta, Xinlei Chen, Abulhair Saparov, Malcolm Greaves, and Joel Welling. 2015. Never-ending learning. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15). 2302--2310.
Tom M. Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Bishan Yang, Justin Betteridge, Andrew Carlson, Bhavana Dalvi Mishra, Matthew Gardner, Bryan Kisiel, Jayant Krishnamurthy, Ni Lao, Kathryn Mazaitis, Thahir Mohamed, Ndapa Nakashole, Emmanouil Antonios Platanios, Alan Ritter, Mehdi Samadi, Burr Settles, Richard Wang, Derry Wijaya, Abhinav Gupta, Xinlei Chen, Abulhair Saparov, Malcolm Greaves, and Joel Welling. 2018. Never-ending learning. Communications of the ACM 61, 5 (April 2018), 103--115.
Thahir P. Mohamed, Estevam R. Hruschka, Jr., and Tom M. Mitchell. 2011. Discovering relations between noun categories. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’11). Association for Computational Linguistics, Stroudsburg, PA, 1447--1455. http://dl.acm.org/citation.cfm?id=2145432.2145586
Yusra Mosallam, Alaa Abi-Haidar, and Jean-Gabriel Ganascia. 2014. Unsupervised named entity recognition and disambiguation: An application to Old French journals. In Advances in Data Mining. Applications and Theoretical Aspects, Petra Perner (Ed.). Springer International Publishing, Cham, 12--23.
Saikat Mukherjee, I. V. Ramakrishnan, and Amarjeet Singh. 2005. Bootstrapping semantic annotation for content-rich HTML documents. In Proceedings of the 21st International Conference on Data Engineering (ICDE’05). IEEE Computer Society, Washington, DC, 583--593.
Robert Munro and Christopher D. Manning. 2012. Accurate unsupervised joint named-entity extraction from unaligned parallel text. In Proceedings of the 4th Named Entity Workshop (NEWS’12). Association for Computational Linguistics, Stroudsburg, PA, 21--29. http://dl.acm.org/citation.cfm?id=2392777.2392781
David Nadeau and Satoshi Sekine. 2007. A survey of named entity recognition and classification. Lingvisticæ Investigationes 30, 1 (2007), 3--26.
David Nadeau, Peter D. Turney, and Stan Matwin. 2006. Unsupervised named-entity recognition: Generating gazetteers and resolving ambiguity. In Conference of the Canadian Society for Computational Studies of Intelligence. Springer, 266--277.
Thien Huu Nguyen and Ralph Grishman. 2015. Relation extraction: Perspective from convolutional neural networks. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. 39--48.
Hilário Oliveira, Rinaldo Lima, João Gomes, Rafael Ferreira, Fred Freitas, and Evandro Costa. 2012. A confidence-weighted metric for unsupervised ontology population from web texts. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7446 LNCS, 1 (2012), 176--190.
Pedro Oliveira and João Rocha. 2013. Semantic annotation tools survey. In Proceedings of the 2013 IEEE Symposium on Computational Intelligence and Data Mining (CIDM’13). IEEE, 301--307.
Chandra Pandey, Zina Ibrahim, Honghan Wu, Ehtesham Iqbal, and Richard Dobson. 2017. Improving RNN with attention and embedding for adverse drug reactions. In Proceedings of the 2017 International Conference on Digital Health (DH’17). ACM, New York, NY, 67--71.
Ted Pedersen, Anagha Kulkarni, Roxana Angheluta, Zornitsa Kozareva, and Thamar Solorio. 2006. An unsupervised language independent method of name discrimination using second order co-occurrence features. In Computational Linguistics and Intelligent Text Processing, Alexander Gelbukh (Ed.). Springer, Berlin, 208--222.
Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1532--1543.
Borislav Popov, Atanas Kiryakov, Damyan Ognyanoff, Dimitar Manov, and Angel Kirilov. 2004. KIM—A semantic platform for information extraction and retrieval. Natural Language Engineering 10, 3--4 (Sept. 2004), 375--392.
Janardhana Punuru and Jianhua Chen. 2012. Learning non-taxonomical semantic relations from domain texts. Journal of Intelligent Information Systems 38, 1 (Feb. 2012), 191--207.
Delip Rao, Paul McNamee, and Mark Dredze. 2013. Entity Linking: Finding Extracted Entities in a Knowledge Base. Springer, Berlin, 93--115.
Lawrence Reeve and Hyoil Han. 2005. Survey of semantic annotation platforms. In Proceedings of the 2005 ACM Symposium on Applied Computing (SAC’05). ACM, New York, NY, 1634--1638.
Lawrence Reeve and Hyoil Han. 2006. A comparison of semantic annotation systems for text-based web documents. Web Semantics and Ontology (2006), 165--187.
Steffen Remus. 2014. Unsupervised relation extraction of in-domain data from focused crawls. In Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 11--20.
Alan Ritter, Sam Clark, Mausam, and Oren Etzioni. 2011. Named entity recognition in tweets: An experimental study. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’11). Association for Computational Linguistics, Stroudsburg, PA, 1524--1534. http://dl.acm.org/citation.cfm?id=2145432.2145595
Benjamin Rosenfeld and Ronen Feldman. 2006. URES: An unsupervised web relation extraction system. In Proceedings of the COLING/ACL on Main Conference Poster Sessions (COLING-ACL’06). Association for Computational Linguistics, Stroudsburg, PA, 667--674. http://dl.acm.org/citation.cfm?id=1273073.1273159
Benjamin Rosenfeld and Ronen Feldman. 2007. Clustering for unsupervised relation identification. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM’07). ACM, New York, NY, 411--418.
Binjamin Rozenfeld and Ronen Feldman. 2006. High-performance unsupervised relation extraction from large corpora. In Proceedings of the 6th International Conference on Data Mining (ICDM’06). 1032--1037.
Rune Sætre, Amund Tveit, Tonje S. Steigedal, and Astrid Lægreid. 2005. Semantic annotation of biomedical literature using Google. In Proceedings of the 2005 International Conference on Computational Science and Its Applications - Volume Part III (ICCSA’05). Springer-Verlag, Berlin, 327--337.
Saurav Sahay, Sougata Mukherjea, Eugene Agichtein, Ernest V. Garcia, Shamkant B. Navathe, and Ashwin Ram. 2008. Discovering semantic biomedical relations utilizing the web. ACM Transactions on Knowledge Discovery from Data (TKDD) 2, 1 (April 2008), Article 3, 15 pages.
Vladimir Salin, Maria Slastihina, Ivan Ermilov, René Speck, Sören Auer, and Sergey Papshev. 2015. Semantic clustering of website based on its hypertext structure. In Knowledge Engineering and Semantic Web, Pavel Klinov and Dmitry Mouromtsev (Eds.). Springer International Publishing, Cham, 182--194.
David Sánchez, David Isern, and Miquel Millan. 2011. Content annotation for the semantic web: An automatic web-based approach. Knowledge and Information Systems 27, 3 (2011), 393--418.
David Sánchez and Antonio Moreno. 2006. Discovering non-taxonomic relations from the web. In Intelligent Data Engineering and Automated Learning -- IDEAL 2006, Emilio Corchado, Hujun Yin, Vicente Botti, and Colin Fyfe (Eds.). Springer, Berlin, 629--636.
Wei Shen, Jianyong Wang, and Jiawei Han. 2015. Entity linking with a knowledge base: Issues, techniques, and solutions. IEEE Transactions on Knowledge and Data Engineering 27, 2 (Feb. 2015), 443--460.
Wei Shen, Jianyong Wang, Ping Luo, and Min Wang. 2012b. A graph-based approach for ontology population with named entities. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’12). ACM, New York, NY, 345--354.
Wei Shen, Jianyong Wang, Ping Luo, and Min Wang. 2012a. LINDEN: Linking named entities with knowledge base via semantic knowledge. In Proceedings of the 21st International Conference on World Wide Web (WWW’12). ACM, New York, NY, 449--458.
Utpal Kumar Sikdar and Björn Gambäck. 2017. Named entity recognition for Amharic using stack-based deep learning. In International Conference on Computational Linguistics and Intelligent Text Processing. Springer, 276--287.
Maria Skeppstedt. 2014. Enhancing medical named entity recognition with features derived from unsupervised methods. In Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 21--30.
Hassan A. Sleiman and Rafael Corchuelo. 2014. Trinity: On using trinary trees for unsupervised web data extraction. IEEE Transactions on Knowledge and Data Engineering 26, 6 (June 2014), 1544--1556.
David Sánchez. 2012. Domain Ontology Learning from the Web: An Unsupervised, Automatic and Domain Independent Approach. AV Akademikerverlag.
Hye-Jeong Song, Byeong-Cheol Jo, Chan-Young Park, Jong-Dae Kim, and Yu-Seop Kim. 2018. Comparison of named entity recognition methodologies in biomedical documents. BioMedical Engineering OnLine 17, 2 (Nov. 2018), 21--34.
Vitór Souza, Nicola Zeni, Nadzeya Kiyavitskaya, Periklis Andritsos, Luisa Mich, and John Mylopoulos. 2008. Automating the Generation of Semantic Annotation Tools Using a Clustering Technique. Springer, Berlin, 91--96.
Matthew C. Swain and Jacqueline M. Cole. 2016. ChemDataExtractor: A toolkit for automated extraction of chemical information from the scientific literature. Journal of Chemical Information and Modeling 56, 10 (2016), 1894--1904.
Minna Tamper. 2016. Extraction of entities and concepts from Finnish texts. Master's thesis. Aalto University, Espoo, Finland.
Jie Tang, Duo Zhang, Limin Yao, and Yi Li. 2008. Automatic Semantic Annotation Using Machine Learning. IGI Global. 106--149.
Hilário Tomaz, Rinaldo Lima, João Emanoel, and Fred Freitas. 2012. An unsupervised method for ontology population from the web. In Advances in Artificial Intelligence -- IBERAMIA 2012, Juan Pavón, Néstor D. Duque-Méndez, and Rubén Fuentes-Fernández (Eds.). Springer, Berlin, 41--50.
Victoria Uren, Philipp Cimiano, José Iria, Siegfried Handschuh, Maria Vargas-Vera, Enrico Motta, and Fabio Ciravegna. 2006. Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Web Semant. 4, 1 (Jan. 2006), 14--28.
Hans Uszkoreit, Feiyu Xu, and Hong Li. 2010. Analysis and improvement of minimally supervised machine learning for relation extraction. In Proceedings of the 14th International Conference on Applications of Natural Language to Information Systems (NLDB’09). Springer-Verlag, Berlin, 8--23.
Leslie G. Valiant. 2008. Knowledge infusion: In pursuit of robustness in artificial intelligence. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science. Schloss Dagstuhl-Leibniz-Zentrum für Informatik. 1--8.
Maria Vargas-Vera, Enrico Motta, John Domingue, Mattia Lanzoni, Arthur Stutt, and Fabio Ciravegna. 2002. MnM: Ontology driven semi-automatic and automatic support for semantic markup. Knowledge Engineering and Knowledge Management: Ontologies and the Semantic Web, 213--221.
Marie Verbanck, Sébastien Lê, and Jérôme Pagès. 2013. A new unsupervised gene clustering algorithm based on the integration of biological knowledge into expression data. BMC Bioinformatics 14, 1 (Feb. 2013), 1--11.
Juan C. Vidal, Manuel Lama, Estefanía Otero-García, and Alberto Bugarín. 2014. Graph-based semantic annotation for enriching educational content with linked data. Knowledge-Based Systems 55, Supplement C (2014), 29--42.
Maximilian Walther. 2012. Unsupervised extraction of product information from semi-structured sources. In Proceedings of 2012 IEEE 13th International Symposium on Computational Intelligence and Informatics (CINTI’12). 257--262.
Fang Wang, Wei Wu, Zhoujun Li, and Ming Zhou. 2017b. Named entity disambiguation for questions in community question answering. Knowledge-Based Systems 126, C (June 2017), 68--77.
Wei Wang, Romaric Besançon, Olivier Ferret, and Brigitte Grau. 2011. Filtering and clustering relations for unsupervised information extraction in open domain. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM’11). ACM, New York, NY, 1405--1414.
Yingxu Wang, Mehrdad Valipour, Omar D. Zatarain, Marina L. Gavrilova, Amir Hussain, Newton Howard, and Shushma Patel. 2017a. Formal ontology generation by deep machine learning. In Proceedings of the 2017 IEEE 16th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC). 6--15.
Zheng Wang, Shuo Xu, and Lijun Zhu. 2018. Semantic relation extraction aware of N-gram features from unstructured biomedical text. Journal of Biomedical Informatics 86 (2018), 59--70.
Wentao Wu, Hongsong Li, Haixun Wang, and Kenny Q. Zhu. 2017. Semantic bootstrapping: A theoretical perspective. IEEE Transactions on Knowledge and Data Engineering 29, 2 (2017), 446--457.
Zhifeng Xiao. 2017. Towards a two-phase unsupervised system for cybersecurity concepts extraction. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD’17). 2161--2168.
Vikas Yadav and Steven Bethard. 2018. A survey on recent advances in named entity recognition from deep learning models. In Proceedings of the 27th International Conference on Computational Linguistics. 2145--2158.
Yang Yan, Tingwen Liu, Li Guo, Jiapeng Zhao, and Jinqiao Shi. 2016. An unsupervised framework towards sci-tech compound entity recognition. In Knowledge Science, Engineering and Management, Franz Lehner and Nora Fteimi (Eds.). Springer International Publishing, Cham, 110--122.
Yulan Yan, Naoaki Okazaki, Yutaka Matsuo, Zhenglu Yang, and Mitsuru Ishizuka. 2009. Unsupervised relation extraction by mining Wikipedia texts using information from the web. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 (ACL’09). Association for Computational Linguistics, Stroudsburg, PA, 1021--1029. http://dl.acm.org/citation.cfm?id=1690219.1690289
Hsin-Chang Yang. 2006. A method for automatic construction of learning contents in semantic web by a text mining approach. International Journal of Knowledge and Learning 2, 1--2 (2006), 89--105.
Hsin-Chang Yang. 2009. Automatic generation of semantically enriched web pages by a text mining approach. Expert Systems with Applications 36, 6 (2009), 9709--9718.
Alexander Yates and Oren Etzioni. 2009. Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research 34 (2009), 255--296.
Dongxiang Zhang, Long Guo, Xiangnan He, Jie Shao, Sai Wu, and Heng Tao Shen. 2018. A graph-theoretic fusion framework for unsupervised entity resolution. In 2018 IEEE 34th International Conference on Data Engineering (ICDE’18). 713--724.
Jing Zhang, Yixin Cao, Lei Hou, Juanzi Li, and Hai-Tao Zheng. 2017. XLink: An unsupervised bilingual entity linking system. In Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. Springer, 172--183.
Min Zhang, Jian Su, Danmei Wang, Guodong Zhou, and Chew Lim Tan. 2005. Discovering relations between named entities from a large raw corpus using tree similarity-based clustering. In Natural Language Processing -- IJCNLP 2005, Robert Dale, Kam-Fai Wong, Jian Su, and Oi Yee Kwong (Eds.). Springer, Berlin, 378--389.
Yaoyun Zhang, Jun Xu, Hui Chen, Jingqi Wang, Yonghui Wu, Manu Prakasam, and Hua Xu. 2016. Chemical named entity recognition in patents by domain knowledge and unsupervised feature learning. Database 2016, Article baw049 (2016), 1--10.
Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, and Ji-Rong Wen. 2009. StatSnowball: A statistical approach to extracting entity relationships. In Proceedings of the 18th International Conference on World Wide Web (WWW’09). ACM, New York, NY, 101--110.

Published in
ACM Computing Surveys Volume 52, Issue 4
July 2020
769 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3359984
Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering
Copyright © 2019 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 30 August 2019
- Accepted: 1 April 2019
- Revised: 1 March 2019
- Received: 1 April 2018
Author Tags
Semantic annotation
entity linking
entity recognition
information extraction
machine learning
relation extraction
unsupervised
Qualifiers
- survey
- Research
- Refereed