ENHANCING REGULAR EXPRESSIONS FOR POLISH TEXT PROCESSING

Krzysztof Dorosz; Anna Szczerbińska

doi:10.7494/csci.2009.10.3.19

Authors

Krzysztof Dorosz AGH University of Science and Technology, Jagiellonian University, Krakow
Anna Szczerbińska AGH University of Science and Technology

DOI:

https://doi.org/10.7494/csci.2009.10.3.19

Keywords:

regular expressions, regex, natural language, Polish language processing, CLP library

Abstract

The paper proposes a regular expression engine for Polish language processing, based on a modified Thompson’s algorithm. A Polish inflectional dictionary is used to enhance the engine and its syntax. Instead of using characters as the basic element of regular expression patterns (as in the BRE and ERE standards), the presented tool allows words from a natural language, or labels describing the grammatical properties of words, to be used in regex syntax.
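
The idea of matching over words and grammatical labels instead of characters can be illustrated with a minimal sketch. It is not the authors' engine: the pattern syntax (`<ADJ>` for a label, a bare word for a lemma, postfix `*`, `+`, `?`), the label names, and the tiny `TOY_DICT` are all assumptions made for illustration, and no CLP library call is used. The matcher follows the general Thompson approach of advancing a set of active states one input token at a time, without backtracking.

```python
# Toy inflectional dictionary: word form -> (lemma, grammatical label).
# Hypothetical stand-in data; a real system would query a dictionary such as CLP.
TOY_DICT = {
    "kot": ("kot", "NOUN"),
    "kota": ("kot", "NOUN"),
    "kotem": ("kot", "NOUN"),
    "czarny": ("czarny", "ADJ"),
    "czarnego": ("czarny", "ADJ"),
    "mały": ("mały", "ADJ"),
    "małego": ("mały", "ADJ"),
    "widzę": ("widzieć", "VERB"),
}


def parse(pattern):
    """Split a pattern such as 'widzieć <ADJ>* kot' into (kind, value, quantifier)."""
    items = []
    for tok in pattern.split():
        quant = tok[-1] if tok[-1] in "*+?" else ""
        core = tok[:-1] if quant else tok
        if core.startswith("<") and core.endswith(">"):
            items.append(("label", core[1:-1], quant))
        else:
            items.append(("lemma", core, quant))
    return items


def token_matches(item, word):
    """Check one pattern item against one word via the dictionary."""
    lemma, label = TOY_DICT.get(word.lower(), (word.lower(), None))
    kind, value, _ = item
    return value == (label if kind == "label" else lemma)


def closure(items, states):
    """Add positions reachable without consuming a word (skipping '*' and '?' items)."""
    result = set()
    stack = list(states)
    while stack:
        s = stack.pop()
        if s in result:
            continue
        result.add(s)
        if s < len(items) and items[s][2] in ("*", "?"):
            stack.append(s + 1)
    return result


def fullmatch(pattern, words):
    """Return True if the whole word sequence matches the pattern."""
    items = parse(pattern)
    current = closure(items, {0})  # state s means "about to match item s"
    for word in words:
        nxt = set()
        for s in current:
            if s < len(items) and token_matches(items[s], word):
                nxt.add(s + 1)
                if items[s][2] in ("*", "+"):
                    nxt.add(s)  # repeated items may consume further words
        current = closure(items, nxt)
        if not current:
            return False
    return len(items) in current
```

With this toy data, `fullmatch("widzieć <ADJ>* kot", ["Widzę", "małego", "czarnego", "kota"])` succeeds because every inflected form is reduced to its lemma or label before comparison, which a character-level pattern cannot do without enumerating all forms.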

Author Biographies

Krzysztof Dorosz, AGH University of Science and Technology, Jagiellonian University, Krakow

PhD Student, Institute of Computer Science, Computational Linguistics Department

Anna Szczerbińska, AGH University of Science and Technology

MSc student, Institute of Computer Science

