Adapting a Constituency Parser to User-Generated Content in Polish Opinion Mining


  • Agnieszka Pluwak, Institute of Slavic Studies, Polish Academy of Sciences, Warsaw; Fido Intelligence, Gdansk
  • Wojciech Korczynski, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow
  • Marek Kisiel-Dorohinicki, AGH University of Science and Technology, Faculty of Computer Science, Electronics and Telecommunications, Department of Computer Science, Krakow



user-generated content, text normalization, parsing, sentiment analysis


The paper focuses on adapting NLP tools for Polish, such as morphological analyzers and parsers, to user-generated content (UGC). The authors discuss two rule-based techniques applied to improve the tools' efficiency: pre-processing (text normalization) and parser adaptation (modified segmentation and parsing rules). A new solution for handling out-of-vocabulary words (OOVs), based on inflectional translation, is also offered.




