DOI: 10.1145/3149858.3149859
research-article

Automated Geoparsing of Paris Street Names in 19th Century Novels

Published: 07 November 2017

ABSTRACT

Our project aims to build a platform that can retrieve, map, and analyze occurrences of place names in novels published between 1800 and 1914 whose action takes place wholly or partly in Paris. We first describe a proof of concept that extracts street names using queries run on the TXM textual analysis platform. We then propose a fully automatic process based on the named entity recognition (NER) components of the PERDIDO platform. We report encouraging initial results obtained by combining NLP approaches (NER methods) with textometric tools for the automated geoparsing of street names.
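To give a concrete sense of what pattern-based street-name extraction can look like, the following is a minimal sketch in Python. It is not the TXM query or the PERDIDO pipeline described in the paper: the list of street-type words, the pattern, and the function names `extract_street_names` and `count_street_names` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical list of French street-type words (an assumption, not from the paper).
STREET_TYPES = ("rue", "boulevard", "avenue", "quai", "place", "passage", "impasse")

# Optional linking particle between the street type and the proper name.
LINK = r"(?:de la |de l'|des |du |de |d')"
# One capitalized word; accented letters, apostrophes and hyphens are allowed.
CAP = r"[A-ZÀ-ÖØ-Þ][\w'-]*"

# Street type (case-insensitive), then one or more capitalized words,
# each optionally preceded by a linking particle.
STREET_RE = re.compile(
    rf"\b(?i:{'|'.join(STREET_TYPES)}) {LINK}?{CAP}(?: {LINK}?{CAP})*"
)


def extract_street_names(text):
    """Return every candidate street name, in order of appearance."""
    return [m.group(0) for m in STREET_RE.finditer(text)]


def count_street_names(text):
    """Count how often each candidate street name occurs in the text."""
    return Counter(extract_street_names(text))


if __name__ == "__main__":
    sample = (
        "Il quitta la rue de Rivoli, traversa le boulevard Saint-Germain "
        "et revint rue de Rivoli."
    )
    for name, freq in count_street_names(sample).items():
        print(name, freq)
```

A pattern of this kind over-matches when a capitalized word directly follows a street name and misses names written without a street-type word, which is one reason a dedicated NER component is attractive for a fully automatic process.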



Published in

GeoHumanities '17: Proceedings of the 1st ACM SIGSPATIAL Workshop on Geospatial Humanities
November 2017
60 pages
ISBN: 9781450354967
DOI: 10.1145/3149858

Copyright © 2017 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 15 of 21 submissions, 71%
