Automated Geoparsing of Paris Street Names in 19th Century Novels

Authors:
Ludovic Moncla, Naval Academy Research Institute, Brest - Ecole navale, France
Mauro Gaio, Laboratoire LIUPPA, Université de Pau et des Pays de l'Adour, France
Thierry Joliveau, Laboratoire EVS, Université de Saint-Etienne, France
Yves-François Le Lay, Laboratoire EVS, ENS Lyon, France
GeoHumanities '17: Proceedings of the 1st ACM SIGSPATIAL Workshop on Geospatial Humanities, November 2017, Pages 1–8. https://doi.org/10.1145/3149858.3149859

Published: 7 November 2017

ABSTRACT

Our project aims to build a platform that can retrieve, map, and analyze occurrences of place names in novels published between 1800 and 1914 whose action takes place wholly or partly in Paris. We first describe a proof of concept that extracts street names using queries run on the TXM textual analysis platform. We then propose a fully automatic process based on the named entity recognition (NER) components of the PERDIDO platform. We report encouraging initial results obtained by combining NLP approaches (NER methods) with textometric tools for the automated geoparsing of street names.
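
To give a rough idea of what pattern-based street-name extraction looks like, here is a minimal Python sketch. It is not the TXM query language and not the PERDIDO pipeline: it uses a plain regular expression as a stand-in for both. The list of street-type words, the list of linking words, the function names, and the sample sentence are all assumptions made for illustration. The pattern is deliberately simple: it accepts one optional linking word after the street type, so a name such as "rue Neuve des Petits-Champs" would be truncated.

```python
import re
from collections import Counter

# Assumed street-type trigger words (illustrative, not taken from the paper).
STREET_TYPES = ["rue", "boulevard", "avenue", "place", "quai",
                "passage", "impasse", "faubourg"]

# Assumed linking words allowed between the street type and the name.
LINKER = r"(?:de la|de l[’']|du|des|de|d[’'])"

# A capitalized word, possibly hyphenated (e.g. "Saint-Honoré").
NAME_WORD = r"[A-ZÀ-ÖØ-Þ][\w’'-]*"

_types = "|".join(STREET_TYPES)
STREET_PATTERN = re.compile(
    rf"\b(?P<type>(?i:{_types}))\s+"           # street type, any case
    rf"(?P<name>(?:{LINKER}\s*)?{NAME_WORD}"    # optional linker + first word
    rf"(?:\s+{NAME_WORD})*)"                    # further capitalized words
)


def extract_street_names(text: str) -> list[str]:
    """Return street-name candidates in order of appearance."""
    return [
        f"{m.group('type').lower()} {m.group('name')}"
        for m in STREET_PATTERN.finditer(text)
    ]


def count_street_names(text: str) -> Counter:
    """Count how often each candidate occurs in the text."""
    return Counter(extract_street_names(text))


if __name__ == "__main__":
    sample = ("Il descendit la rue Saint-Honoré, traversa la place Vendôme "
              "et gagna le boulevard des Italiens.")
    for name, freq in count_street_names(sample).items():
        print(name, freq)
```

A sketch like this only yields candidates. Ambiguous words such as "place" produce false positives, and each candidate would still need to be matched against a gazetteer of historical Paris streets before it could be mapped.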

Published in

GeoHumanities '17: Proceedings of the 1st ACM SIGSPATIAL Workshop on Geospatial Humanities
November 2017
60 pages
ISBN: 9781450354967
DOI: 10.1145/3149858

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Digital Humanities
Geographical Information Retrieval
Geoparsing
Named Entity Recognition
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall acceptance rate: 15 of 21 submissions, 71%