Retrieval and interpretation of textual geolocalized information based on semantic geolocalized relations

Wojciech Korczynski

Abstract


This paper describes a method for retrieving geolocalized information from natural language text and interpreting it by assigning it geographic coordinates. A proof-of-concept implementation is discussed, along with a geolocalized dictionary stored in a PostGIS/PostgreSQL spatial relational database. The research focuses on Polish, a strongly inflectional language, which introduced additional complexity that had to be taken into account. The method was evaluated using a range of metrics.

Keywords


geolocalization; geolocalized dictionary; geolocalized relations; natural language processing


DOI: https://doi.org/10.7494/csci.2015.16.4.395
