Joint inference of named entity recognition and normalization for tweets

Authors:
Xiaohua Liu (School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, and Microsoft Research Asia, Beijing, China)
Ming Zhou (Microsoft Research Asia, Beijing, China)
Furu Wei (Microsoft Research Asia, Beijing, China)
Zhongyang Fu (Shanghai Jiao Tong University, Shanghai, China)
Xiangyang Zhou (Shandong University, Jinan, China)

ACL '12: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, July 2012, pages 526–535

Published: 8 July 2012

ABSTRACT

Tweets represent a critical source of fresh information, in which named entities occur frequently and in many surface variations. We study the problem of named entity normalization (NEN) for tweets. The two main challenges are errors propagated from named entity recognition (NER) and the dearth of information in a single tweet. To address them, we propose a novel graphical model that conducts NER and NEN simultaneously on multiple tweets. In particular, our model introduces a binary random variable for each pair of words with the same lemma across similar tweets, whose value indicates whether the two words are mentions of the same entity. We evaluate our method on a manually annotated data set and show that it outperforms a baseline that handles the two tasks separately, raising F1 from 80.2% to 83.6% for NER and accuracy from 79.4% to 82.6% for NEN.
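
The pairing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `lemma` function is a toy stand-in for a real lemmatizer, and the Jaccard overlap and the `threshold` value are assumptions made only for illustration, since the abstract does not say how tweet similarity is measured. The sketch only enumerates the cross-tweet word pairs to which a binary "same entity?" variable would be attached; it performs no inference.

```python
from itertools import combinations


def lemma(word: str) -> str:
    """Toy stand-in for a lemmatizer (assumption): lowercase, strip leading '#'/'@'."""
    return word.lower().lstrip("#@")


def jaccard(a: list[str], b: list[str]) -> float:
    """Jaccard overlap of lemma sets; a placeholder similarity measure (assumption)."""
    sa, sb = {lemma(w) for w in a}, {lemma(w) for w in b}
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def candidate_pairs(tweets: list[list[str]], threshold: float = 0.3):
    """Return ((tweet_i, pos_i), (tweet_j, pos_j)) for same-lemma words in similar tweets.

    Each returned pair marks where one binary variable would be introduced.
    """
    pairs = []
    for (i, ti), (j, tj) in combinations(enumerate(tweets), 2):
        if jaccard(ti, tj) < threshold:
            continue  # tweets too dissimilar: no variables linking them
        for p, wi in enumerate(ti):
            for q, wj in enumerate(tj):
                if lemma(wi) == lemma(wj):
                    pairs.append(((i, p), (j, q)))
    return pairs


# Hypothetical example tweets, already tokenized.
tweets = [
    ["Gaga", "concert", "tonight"],
    ["loving", "the", "gaga", "concert"],
    ["rainy", "day", "today"],
]
pairs = candidate_pairs(tweets)
print(pairs)
```

In this toy input only the first two tweets are similar enough to be linked, so variables are created for the "Gaga"/"gaga" and "concert"/"concert" pairs, and the third tweet contributes none.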


Published in
ACL '12: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
July 2012, 1100 pages
General Chair: Haizhou Li (Institute for Infocomm Research)
Program Chairs: Chin-Yew Lin (Microsoft Research Asia), Miles Osborne (University of Edinburgh)
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 85 of 443 submissions, 19%